
Redirect pages should not create multiple rows in pagelinks table. Causes invalid entries on WhatLinksHere.
Closed, Resolved (Public)

Description

When a template is transcluded into a redirect page and the template contains
one or more links, multiple rows are created in the pagelinks table. For the page
"Price, UT" on enwiki (ID=829925), 3 rows are created (this can be checked in the
db), but the page only renders the category link at the bottom.


Version: unspecified
Severity: normal
URL: http://en.wikipedia.org/w/index.php?title=Price%2C_UT&redirect=no

Details

Reference
bz7304

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 9:27 PM
bzimport set Reference to bz7304.
bzimport added a subscriber: Unknown Object (MLST).

Example:
http://en.wikipedia.org/w/index.php?title=Special:Whatlinkshere&target=United_States_postal_abbreviations
shows that there are lots of redirects pointing to
[[United_States_postal_abbreviations]], whereas in reality they are not
redirecting to it.
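
To make the failure mode concrete, here is a minimal sketch of how a backlinks listing that relies on page_is_redirect alone ends up mislabelling a page. It uses Python's sqlite3 as a stand-in for the real MySQL tables. The sample rows are invented for illustration: the redirect target "Price,_Utah" and the page "Utah" are assumptions, and page_title is assumed to be the title column of the page table. The query is a simplification, not the actual Special:WhatLinksHere code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_is_redirect INTEGER);
CREATE TABLE pagelinks (pl_from INTEGER, pl_namespace INTEGER, pl_title TEXT);

INSERT INTO page VALUES (1, 'Price,_UT', 1), (2, 'Utah', 0);

-- Page 1 is a redirect (assumed target: Price,_Utah) that also transcludes
-- a template linking to United_States_postal_abbreviations.
INSERT INTO pagelinks VALUES
  (1, 0, 'Price,_Utah'),
  (1, 0, 'United_States_postal_abbreviations'),
  (2, 0, 'United_States_postal_abbreviations');
""")

def naive_backlinks(conn, title):
    """List pages linking to `title`, flagging every linking page that is a redirect."""
    return conn.execute(
        "SELECT page_title, page_is_redirect "
        "FROM pagelinks JOIN page ON pl_from = page_id "
        "WHERE pl_namespace = 0 AND pl_title = ? "
        "ORDER BY page_id",
        (title,),
    ).fetchall()

print(naive_backlinks(conn, "United_States_postal_abbreviations"))
# → [('Price,_UT', 1), ('Utah', 0)]
```

"Price,_UT" comes back flagged as a redirect to United_States_postal_abbreviations even though, in this sample data, it only links there through a template. The flag describes the linking page, not the individual link.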

Needs a separate parser entry point that registers only the redirect link plus
any categories. We could probably include language links as well.

My recommendation is a redirectlinks table; then the other links tables could do
whatever they liked. (A redirect target field on the page table has also been
suggested, but would be a more painful upgrade and may waste table space.)

Agree with Brion: let redirect pages have any links people want, but putting a
redirect link into the page table would waste tons of space. Hence I'm all for
the new table. Also, the actual process of redirecting could be more efficient
with a single-value lookup.

Added the proposed table structure at
http://meta.wikimedia.org/wiki/Proposed_Database_Schema_Changes :
CREATE TABLE wikidb.redirectlinks (
  `rl_from` int(8) unsigned NOT NULL default '0',
  `rl_namespace` int(11) NOT NULL default '0',
  `rl_title` varchar(255) character set latin1 collate latin1_bin NOT NULL default '',
  PRIMARY KEY  (`rl_from`),
  KEY `rl_namespace` (`rl_namespace`,`rl_title`,`rl_from`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Added patch-redirect.sql to the archive dir:

CREATE TABLE /*$wgDBprefix*/redirect (
  rd_from int(8) unsigned NOT NULL default '0',
  rd_namespace int NOT NULL default '0',
  rd_title varchar(255) binary NOT NULL default '',
  PRIMARY KEY rd_from (rd_from),
  KEY rd_ns_title (rd_namespace,rd_title,rd_from)
) TYPE=InnoDB;

The table will be pre-populated from pagelinks table using:

INSERT IGNORE
  INTO /*$wgDBprefix*/redirect (rd_from,rd_namespace,rd_title)
  SELECT pl_from,pl_namespace,pl_title
  FROM /*$wgDBprefix*/pagelinks, /*$wgDBprefix*/page
  WHERE pl_from=page_id AND page_is_redirect=1;
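
One property of this pre-population step is worth spelling out: because rd_from is the primary key, the IGNORE clause keeps only one row per redirect page and silently drops the rest. The sketch below reproduces that with Python's sqlite3, where the equivalent syntax is INSERT OR IGNORE. The sample rows are illustrative (the Companies/Company/Plural case is reported elsewhere in this bug; the page IDs are invented), and SQLite is only a stand-in for MySQL here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_is_redirect INTEGER NOT NULL);
CREATE TABLE pagelinks (pl_from INTEGER, pl_namespace INTEGER, pl_title TEXT);
CREATE TABLE redirect (
  rd_from INTEGER PRIMARY KEY,
  rd_namespace INTEGER NOT NULL DEFAULT 0,
  rd_title TEXT NOT NULL DEFAULT ''
);

-- Page 1 is a redirect with two pagelinks rows; page 2 is an ordinary article.
INSERT INTO page VALUES (1, 1), (2, 0);
INSERT INTO pagelinks VALUES (1, 0, 'Company'), (1, 0, 'Plural'), (2, 0, 'Plural');
""")

# SQLite spelling of the INSERT IGNORE ... SELECT statement above.
conn.execute("""
INSERT OR IGNORE INTO redirect (rd_from, rd_namespace, rd_title)
SELECT pl_from, pl_namespace, pl_title
FROM pagelinks JOIN page ON pl_from = page_id
WHERE page_is_redirect = 1
""")

rows = conn.execute("SELECT rd_from, rd_namespace, rd_title FROM redirect").fetchall()
print(len(rows))  # one row for redirect page 1; the ordinary article is skipped
```

Which of the two links survives for page 1 depends on the order in which the rows are scanned, not on which one is the real redirect target. Pages affected by this bug may therefore start out with a wrong rd_title until they are saved again.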

The code to begin using the redirect table will be posted after the wiki tables
get modified, to avoid scap problems.

The table has been created in the DB, and a patch (yurik r17291
/trunk/phase3/includes/Article.php) has been checked in. The patch will update
the redirect table on each save. Eventually the table should be used to find the
target of every redirect, both for API and regular web navigation.
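
As a rough illustration of the two ideas in this comment (refresh the row on every save, then resolve a redirect with one primary-key lookup), here is a small sketch using Python's sqlite3 in place of MySQL. The helper names on_save and redirect_target are hypothetical and the logic is an assumption about the general approach; it is not the code from r17291.

```python
import sqlite3

def open_db():
    """Create an in-memory stand-in for the redirect table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE redirect (
            rd_from INTEGER PRIMARY KEY,
            rd_namespace INTEGER NOT NULL DEFAULT 0,
            rd_title TEXT NOT NULL DEFAULT ''
        )""")
    return conn

def on_save(conn, page_id, target):
    """Hypothetical save hook: `target` is (namespace, title), or None if the
    saved text is no longer a redirect."""
    if target is None:
        conn.execute("DELETE FROM redirect WHERE rd_from = ?", (page_id,))
    else:
        # One row per page: replace whatever target was recorded before.
        conn.execute(
            "INSERT OR REPLACE INTO redirect (rd_from, rd_namespace, rd_title) "
            "VALUES (?, ?, ?)",
            (page_id, target[0], target[1]),
        )

def redirect_target(conn, page_id):
    """Single primary-key lookup; returns (namespace, title) or None."""
    return conn.execute(
        "SELECT rd_namespace, rd_title FROM redirect WHERE rd_from = ?",
        (page_id,),
    ).fetchone()

conn = open_db()
on_save(conn, 1, (0, "Company"))
on_save(conn, 1, (0, "Corporation"))  # the redirect is retargeted on a later save
print(redirect_target(conn, 1))       # → (0, 'Corporation')
on_save(conn, 1, None)                # the page is turned into a normal article
print(redirect_target(conn, 1))       # → None
```

Keeping exactly one row per rd_from is what makes the lookup independent of however many other links the redirect page carries in pagelinks.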

cbm.wikipedia wrote:

*** Bug 12971 has been marked as a duplicate of this bug. ***

mike.lifeguard+bugs wrote:

tbl == TitleBlackList; let's use the whole word

mike.lifeguard+bugs wrote:

(In reply to comment #9)

tbl == TitleBlackList; let's use the whole word

Sorry, that was unclear. The summary previously used 'tbl' for 'title' which is misleading since TBL normally means TitleBlackList. I've changed the summary to use the full word for clarity.

cbm.wikipedia wrote:

*** Bug 16218 has been marked as a duplicate of this bug. ***

avir wrote:

This also causes the output of the backlinks list in the API to include incorrect links.

For example: http://en.wikipedia.org/w/api.php?action=query&list=backlinks&blredirect&bllimit=100&blnamespace=0&blfilterredir=redirects&bltitle=Plural shows that there are many articles that redirect to "Plural" when in fact they redirect to other articles.
For example 'Companies' is the plural of 'Company' so the article 'Companies' naturally redirects to 'Company', but is also listed as a redirect to 'Plural'.

yishaig wrote:

Another example similar to that in Comment #12:

http://en.wikipedia.org/w/api.php?action=query&list=backlinks&blredirect&bllimit=100&blnamespace=0&blfilterredir=redirects&bltitle=Common%20name
shows that there are thousands of redirects to the article 'Common name'

yishaig wrote:

Some of the most problematic articles (the ones with many redirects) are:
Trade name
New Jersey
New York City
Television
Common Name
Abbreviation
Geocode

ari.brodsky wrote:

Here's a very problematic one:
http://en.wikipedia.org/wiki/Special:WhatLinksHere/wp:redirect
Lots of redirect templates contain a link to [[wp:redirect]], and each of those templates is added to many redirect pages. So the WhatLinksHere shows all of those pages as redirects to [[wp:redirect]], and shows second-level backlinks for all of them as well. This makes it impossible to find a list of pages that actually redirect to [[wp:redirect]].

ari.brodsky wrote:

It is not necessarily bad (and may even be desirable) for the pagelinks table to include additional links from redirect pages after the redirect. The problem is that the table must distinguish the link that is actually the redirect target from other links after the redirect. Only the former should be marked as "redirect" in the table.

From the comments above, there was discussion of a patch way back in 2006. What happened?

Created attachment 9133
special:whatlinkshere using redirect table instead of page_is_redirect

(In reply to comment #18)

It is not necessarily bad (and may even be desirable) for the pagelinks table
to include additional links from redirect pages after the redirect. The
problem is that the table must distinguish the link that is actually the
redirect target from other links after the redirect. Only the former should be
marked as "redirect" in the table.

From the comments above, there was discussion of a patch way back in 2006.
What happened?

We have a redirect table. It's not currently used by special:whatlinkshere, but it probably could be.

Attached is a quick patch doing that. Haven't really tested it. (It's just rough, and might not use the proper db funcs in all places.)

(Also changing status back to new(?) since it's only assigned to wikibugs.)

sumanah wrote:

Adding "need-review" keyword to signal to other developers to give Bawolff feedback on his approach. Thanks for the patch, Bawolff.

(In reply to comment #19)

We have a redirect table. It's not currently used by special:whatlinkshere,
but it probably could be.

Attached is a quick patch doing that. Haven't really tested it. (It's just
rough, and might not use the proper db funcs in all places.)

Patch worked just fine as it was. Applied verbatim in r103758.
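
For readers who want to see the idea behind "using redirect table instead of page_is_redirect", the following sketch marks a backlink as a redirect only when the redirect table names the same target as the pagelinks row. It uses Python's sqlite3 in place of MySQL and is an assumption about the general approach, not the query from attachment 9133 or r103758. The pages "Plurals" and "Grammar", the page IDs, and the page_title column are invented for illustration; Companies redirecting to Company comes from the report above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_is_redirect INTEGER);
CREATE TABLE pagelinks (pl_from INTEGER, pl_namespace INTEGER, pl_title TEXT);
CREATE TABLE redirect (rd_from INTEGER PRIMARY KEY, rd_namespace INTEGER, rd_title TEXT);

INSERT INTO page VALUES (1, 'Companies', 1), (2, 'Plurals', 1), (3, 'Grammar', 0);

-- Companies redirects to Company but also links to Plural through a template.
INSERT INTO pagelinks VALUES
  (1, 0, 'Company'), (1, 0, 'Plural'), (2, 0, 'Plural'), (3, 0, 'Plural');
INSERT INTO redirect VALUES (1, 0, 'Company'), (2, 0, 'Plural');
""")

def what_links_here(conn, namespace, title):
    """Backlinks to (namespace, title); is_redirect is 1 only when the
    redirect table records this exact title as the page's target."""
    return conn.execute(
        """
        SELECT page_title, rd_from IS NOT NULL AS is_redirect
        FROM pagelinks
        JOIN page ON page_id = pl_from
        LEFT JOIN redirect
          ON rd_from = pl_from
         AND rd_namespace = pl_namespace
         AND rd_title = pl_title
        WHERE pl_namespace = ? AND pl_title = ?
        ORDER BY page_id
        """,
        (namespace, title),
    ).fetchall()

print(what_links_here(conn, 0, "Plural"))
# → [('Companies', 0), ('Plurals', 1), ('Grammar', 0)]
```

Companies still shows up as a page that links to Plural, which matches the earlier point that extra links from redirect pages are not necessarily bad, but it is no longer reported as a redirect to it.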