
Special:DoubleRedirects *removes* "fixed" items, breaking paging
Closed, ResolvedPublic

Description

Compare

http://en.wikipedia.org/w/index.php?title=Special:DoubleRedirects (nothing)

http://en.wikipedia.org/w/index.php?title=Special:DoubleRedirects&limit=250 (nothing)

and

http://en.wikipedia.org/w/index.php?title=Special:DoubleRedirects&limit=1000 (entries)

I suspect the fixed entries (double redirects fixed earlier) are pulled through the SQL query but are not displayed because they are flagged as "fixed".


Version: unspecified
Severity: normal

Details

Reference
bz10985

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 9:50 PM
bzimport set Reference to bz10985.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)


The page does say it is "showing below up to 624 results starting with #1," even though only 60 items are shown (as of this report).

thomas.dalton wrote:

"I suspect the fixed entries (double redirects fixed earlier) are pulled through
the SQL query but are not displayed because they are flagged as "fixed"."

That would make sense. If that's what's causing it, there isn't much that can be done about it. We could have the SQL query fetch a few extra rows, but how many extra to fetch would be guesswork: it's a balance between having enough unfixed rows to reach the limit and keeping the query efficient.

thomas.dalton wrote:

After actually looking at the code, that does appear to be what's happening. It's also what's causing the misreporting of the number of results. The current code contains a rather untidy hack in the formatResult method that rechecks results if they came from the cache, but this happens too late to be taken into account anywhere except the output list. Unfortunately, using the cache on expensive queries like this one is rather essential. With some effort (i.e. rewriting the page not to use the standard QueryPage class), I think the reported number of results could be corrected, but making the limit parameter work properly looks pretty impossible to me. The only idea I have is running the query multiple times until enough results accumulate, but that's likely to be very slow.

The SQL query should be pulling the list of double redirects, so it appears there is some post-SQL-query processing. Of course, this is a wild guess.

thomas.dalton wrote:

There isn't a stored list of double redirects. The SQL query looks through the entire list of redirects and works out which are doubles, which is very inefficient. That query's result is then cached (which would, I suppose, constitute a list of double redirects). When someone loads the page, it checks for a recent cached version and uses it if one exists; otherwise it runs the inefficient query.

Then it takes the first however many results you requested and formats them one at a time. The formatting code checks whether a result came from the cache and, if it did, whether it is still a double redirect. If it is, it outputs the name of the page; if not, it outputs the empty string (i.e. nothing), and that output is appended to the end of the list. The code building the list doesn't know whether what it appended actually contained any text, so it assumes the list holds the same number of results as it took from the query.

The problem arises because it takes the appropriate number of results from the query *before* it tests them, so afterwards there is the wrong number of results left.
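The mechanism described above can be condensed into a few lines. This is an illustrative sketch in Python, not MediaWiki's actual PHP; render_page and its arguments are hypothetical stand-ins for QueryPage's cached-result handling:

```python
# Hypothetical sketch of the bug: `limit` rows are taken from the
# cached query result *before* the per-row re-check, so rows fixed
# since the cache was built vanish from the output without being
# replaced, and the reported count stays wrong.

def render_page(cached_rows, limit, is_still_double):
    """Return (reported_count, displayed_titles)."""
    taken = cached_rows[:limit]                 # rows consumed here...
    displayed = [t for t in taken
                 if is_still_double(t)]         # ...but filtered here
    # The caller only knows how many rows it took, not how many survived:
    return len(taken), displayed

# Suppose 624 entries were cached, but the first 250 have since been fixed.
cache = [f"Redirect_{i}" for i in range(624)]
fixed = set(cache[:250])

count, shown = render_page(cache, 250, lambda t: t not in fixed)
print(count, len(shown))    # reports 250 results, displays 0

count, shown = render_page(cache, 1000, lambda t: t not in fixed)
print(count, len(shown))    # reports 624 results, displays 374
```

This matches the symptom in the report: low limits show nothing while a high limit shows entries, and the "showing up to N results" count reflects rows taken, not rows displayed.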

robchur wrote:

Probably the better behaviour here would be to strike through the result, rather than removing it completely, which confuses paging.

thomas.dalton wrote:

That would, at least, make more sense to the user. It's unfortunate that we can't give them 100 double redirects if that's what they ask for, though.

"The problem arises because it takes the appropriate number of results from the query *before* it tests them, so afterwards there is the wrong number of results left."

That is what I was desperately trying to explain was the problem. :)

robchur wrote:

*** Bug 9263 has been marked as a duplicate of this bug. ***

*** Bug 7875 has been marked as a duplicate of this bug. ***

thomas.dalton wrote:

Your fix has just stopped it double-checking the cached results at all, so it will now simply include fixed double redirects. That makes it worse than it was before. What happened to the striking-out idea?

robchur wrote:

Fixed again in r24246; now striking out fixed entries.

russblau wrote:

I liked the old behavior better, and there is bot software that expected it to work that way. However, since you can't please everybody, how about replacing the hard-coded <s> element with a CSS class on the <li> element (e.g., <li class="fixed-redirect-entry">), so that we can use our personal CSS files to show or hide these entries as we prefer?
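A sketch of what that user CSS might look like, assuming the page emitted the class name russblau gives as an example (nothing here is committed code):

```css
/* Hypothetical personal CSS, assuming the special page emits
   <li class="fixed-redirect-entry">...</li> for fixed entries. */

/* Keep the struck-out look: */
li.fixed-redirect-entry { text-decoration: line-through; }

/* Or hide fixed entries entirely: */
/* li.fixed-redirect-entry { display: none; } */
```

Either rule would live in a user's per-skin stylesheet, so each user (or bot operator) could pick the presentation they prefer without further software changes.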

thomas.dalton wrote:

Could the class go on the <s> element instead? That way we keep the current behaviour without having to change any CSS, but users who would like to get rid of the fixed entries can still do so.

russblau wrote:

I wouldn't complain if monobook.css were revised to make the default appearance of this class struck-out.

thomas.dalton wrote:

Not everyone uses monobook. Anyone using a non-standard skin would end up with the page showing all the results, including fixed ones. That's a pretty major regression.