
DoubleRedirects erroneously influenced by interlinks
Open, Low, Public

Description

Author: t.brain

Description:
It appears that the DoubleRedirects special page lists pages according to the last internal link they contain,
which is not necessarily the redirect link.

For example, see this DoubleRedirect entry:
http://en.wikipedia.org/w/index.php?title=Special:DoubleRedirects&limit=1&offset=782

The page [[VRC St Leger Stakes]] actually redirects to [[VRC St Leger]] but contains additional content, in which
the last internal link is to [[Group I]]. DoubleRedirects lists this page as a redirect to [[Group I]].

In fact, it seems that Whatlinkshere for [[Group I]] lists [[VRC St Leger Stakes]], so the problem isn't specific
to the DoubleRedirects page, but rather to how links are stored.
http://en.wikipedia.org/w/index.php?title=Special:Whatlinkshere&target=Group_I

Additionally, [[Wikipedia:Redirect]] states that "Everything after the redirect line will be blanked when you save
the page.", but this appears to be incorrect.


Version: unspecified
Severity: normal

Details

Reference
bz6363

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 9:19 PM
bzimport set Reference to bz6363.
bzimport added a subscriber: Unknown Object (MLST).

t.brain wrote:

Reviewing this again, it seems I read too much into what was happening. To the best
of my understanding, the wiki keeps a record of all links from one page to another
in the database. So when a redirect page has other content besides the redirect
code, those links are stored as links from the redirect page (even though they are
never actually visible). When the DoubleRedirects page looks for double redirects,
it tries to find any link between one redirect page and another, even if that link
isn't actually a redirect link.
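
The behaviour described above can be reproduced with a small sketch. This is not MediaWiki's actual schema or query: the `pages` and `links` tables, their columns, and the function names are simplified stand-ins invented for illustration, and the page "Group One" is a hypothetical target for the [[Group I]] redirect. The sketch only assumes that every internal link on a page is recorded, and that a double redirect is reported whenever a redirect page has any recorded link to another redirect page.

```python
import sqlite3


def build_example():
    """Create an in-memory database with simplified, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT UNIQUE,
                            is_redirect INTEGER);
        CREATE TABLE links (from_id INTEGER, to_title TEXT);
    """)
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?)", [
        (1, "VRC St Leger Stakes", 1),  # redirect with extra content
        (2, "VRC St Leger", 0),         # the real redirect target
        (3, "Group I", 1),              # itself a redirect
        (4, "Group One", 0),            # hypothetical target of "Group I"
    ])
    conn.executemany("INSERT INTO links VALUES (?, ?)", [
        (1, "VRC St Leger"),  # the actual redirect link
        (1, "Group I"),       # link from the content after the redirect line
        (3, "Group One"),     # redirect link of "Group I"
    ])
    return conn


def find_double_redirects(conn):
    """Report A -> B -> C whenever redirect A has ANY link to redirect B."""
    query = """
        SELECT a.title, b.title, c.title
        FROM links AS ab
        JOIN pages AS a ON a.id = ab.from_id
        JOIN pages AS b ON b.title = ab.to_title
        JOIN links AS bc ON bc.from_id = b.id
        JOIN pages AS c ON c.title = bc.to_title
        WHERE a.is_redirect = 1 AND b.is_redirect = 1
        ORDER BY a.title, b.title
    """
    return conn.execute(query).fetchall()


rows = find_double_redirects(build_example())
print(rows)
```

Under these assumptions the query reports [[VRC St Leger Stakes]] as a double redirect through [[Group I]], even though its redirect link points at a normal page.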

The way I see it, the solution is one of the following:

  • Mark redirect links in the database differently from regular links. This would
    make finding them in DoubleRedirects simpler, as you won't need to check if the link
    is from/to a redirect 'page' as is done now.

  • Don't process other code in redirect pages for links, to not list those links in
    the link database.

  • Clear all content from a redirect page as stated in [[Wikipedia:Redirect]].

All of the above are just suggestions based on my quick review of the involved code.
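
A rough sketch of the first suggestion, recording redirect targets separately from ordinary links, might look like the following. The `redirect_targets` table, its columns, and the page "St Leger (VRC)" are hypothetical names made up for this example; "Group One" is again an assumed target for [[Group I]]. Because only redirect links are stored in the table, the extra content on a redirect page cannot affect the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT UNIQUE);
    -- Hypothetical table: exactly one row per redirect page, holding its target.
    CREATE TABLE redirect_targets (from_id INTEGER PRIMARY KEY,
                                   target_title TEXT);
""")
conn.executemany("INSERT INTO pages VALUES (?, ?)", [
    (1, "VRC St Leger Stakes"),
    (2, "VRC St Leger"),
    (3, "Group I"),
    (4, "Group One"),        # hypothetical target of "Group I"
    (5, "St Leger (VRC)"),   # hypothetical genuine double redirect
])
conn.executemany("INSERT INTO redirect_targets VALUES (?, ?)", [
    (1, "VRC St Leger"),         # content links such as [[Group I]] are not stored
    (3, "Group One"),
    (5, "VRC St Leger Stakes"),  # points at another redirect
])


def double_redirects_from_targets(db):
    """Report A -> B -> C using redirect targets only, never ordinary links."""
    query = """
        SELECT a.title, b.title, bc.target_title
        FROM redirect_targets AS ab
        JOIN pages AS a ON a.id = ab.from_id
        JOIN pages AS b ON b.title = ab.target_title
        JOIN redirect_targets AS bc ON bc.from_id = b.id
        ORDER BY a.title
    """
    return db.execute(query).fetchall()


found = double_redirects_from_targets(conn)
print(found)
```

Here only the hypothetical "St Leger (VRC)" page is reported. [[VRC St Leger Stakes]] is no longer listed, since its redirect target is not itself a redirect.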

It should retrieve the *first* link in the page. Since #REDIRECT only works when it
comes first on the page, the first link will always be the redirect target.
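
As an illustration of the "first link" idea, here is a minimal sketch that reads the redirect target from the start of the page text. The regular expression is a simplification and not the pattern MediaWiki itself uses, the function names are invented for this example, and the sample page text is made up rather than copied from the real page.

```python
import re

# Simplified pattern: "#REDIRECT" at the very start, followed by a [[link]].
REDIRECT_RE = re.compile(r"\s*#REDIRECT\s*:?\s*\[\[([^\]\|#]+)", re.IGNORECASE)
LINK_RE = re.compile(r"\[\[([^\]\|#]+)")


def redirect_target(wikitext):
    """Return the redirect target, or None if the text does not start with one."""
    match = REDIRECT_RE.match(wikitext)
    return match.group(1).strip() if match else None


def all_link_targets(wikitext):
    """Return every internal link target in order of appearance."""
    return [target.strip() for target in LINK_RE.findall(wikitext)]


# Made-up page text in the shape described in this report.
sample = "#REDIRECT [[VRC St Leger]]\nThe race is a [[Group I]] event."

print(redirect_target(sample))    # VRC St Leger
print(all_link_targets(sample))   # ['VRC St Leger', 'Group I']
print(redirect_target("Some text\n#REDIRECT [[VRC St Leger]]"))  # None
```

The last call returns None because the redirect line is not at the top of the page, which matches the observation that #REDIRECT does not work when placed below other content.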

Are links stored in the database in the order they appear on the page?

No, there is no ordering whatsoever.

A redirect should contain no markup other than the redirect; garbage in, garbage out.

Eventually we should add separate tracking for redirect targets, though.

t.brain wrote:

It also appears that this has a bad influence on the BrokenRedirects special page, in two respects:

  • First, and most obviously, a redirect page's content being incorrectly interpreted for links could cause
    the page to show up in BrokenRedirects because its content has a broken link, even when the redirect
    link itself is in fact correct.

  • Second, some redirect templates contain broken links. The most obvious example is
    [[Template:R from alternate spelling]], which uses the [[Template:lts]] template to link to
    [[Template:R from spelling]]. The "R from spelling" template doesn't have a talk page, causing lots of
    proper redirect pages to show up as broken redirects because they link to the non-existent
    [[Template talk:R from spelling]].
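
The two cases above can be sketched as a comparison between checking every recorded link and checking only the redirect target. The data structure, the function names, and the pages "Example redirect", "Example target", "Dead redirect" and "Missing page" are all hypothetical; only the template talk page name comes from this report.

```python
def broken_by_any_link(redirects, existing):
    """Flag a redirect if ANY of its recorded links points to a missing page."""
    return sorted(
        title for title, info in redirects.items()
        if any(link not in existing for link in info["links"])
    )


def broken_by_target(redirects, existing):
    """Flag a redirect only if its actual redirect target is missing."""
    return sorted(
        title for title, info in redirects.items()
        if info["target"] not in existing
    )


existing_pages = {"Example target", "Example redirect", "Dead redirect"}

redirects = {
    # Valid redirect that picks up a broken link from a transcluded template.
    "Example redirect": {
        "target": "Example target",
        "links": ["Example target", "Template talk:R from spelling"],
    },
    # Genuinely broken redirect: its target does not exist.
    "Dead redirect": {
        "target": "Missing page",
        "links": ["Missing page"],
    },
}

print(broken_by_any_link(redirects, existing_pages))
print(broken_by_target(redirects, existing_pages))
```

With this made-up data the link-based check flags both pages, while the target-based check flags only "Dead redirect".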

Should be fixable with the upcoming redirect table changes.

*** Bug 26531 has been marked as a duplicate of this bug. ***

bud0011 wrote:

(In reply to comment #5)
> Should be fixable with the upcoming redirect table changes.

Did anything happen with these table changes?

The tables were added a while ago (bug 14418).

The special page still needs fixing.

bud0011 wrote:

ah. Ok. Thank you.