List allpages has endless loop with case-sensitive titles, apfrom broken
Closed, Resolved (Public)

Description

XML http://wiki.ubuntu.com.cn/api.php?action=query&list=allpages&apnamespace=101&aplimit=500&apfrom=Wiki/BugReports&format=xml

The URL returns <allpages apfrom="Wiki/BugReports" /> as the continuation, even though that is the page I started from. The first row is <p pageid="34776" ns="101" title="UbuntuWiki:Wiki/FrontPage" /> and the last is <p pageid="11237" ns="101" title="UbuntuWiki:wiki" />, so this is perhaps a case-sensitivity problem.


Version: unspecified
Severity: normal
URL: http://wiki.ubuntu.com.cn/api.php?action=query&list=allpages&apnamespace=101&aplimit=500&apfrom=Wiki/BugReports

Details

Reference
bz36987

Event Timeline

bzimport raised the priority of this task to Low. (Nov 22 2014, 12:24 AM)
bzimport set Reference to bz36987.
bzimport added a subscriber: Unknown Object (MLST).

It's using MediaWiki 1.15.3, which isn't a supported version.

It works fine on stable releases (1.17 onwards).

Note (after chat with Reedy): we don't actually know the origin of the problem, nor whether it has been fixed in later releases.
I'll try divination to reproduce the bug elsewhere. ;)

I think this bug may still occur, even in git master. But to see the bug, you need a wiki where namespace 0 is set for case-insensitivity on the first letter (which means titles will automatically be transformed to have an uppercase first letter) and the namespace being processed has pages with a lowercase first letter.

What appears to be happening is that the API is trying to return a continuation for the page "UbuntuWiki:wiki/BugReports" (page id 11238 on that wiki). But the API passes just "wiki/BugReports", without the namespace prefix, in the continuation parameter, and it also feeds that text through the Title class to "normalize" the title. Title (of course) treats the text as a title in namespace 0 and therefore uppercases the first letter as part of that normalization, so the continuation parameter comes out as "Wiki/BugReports" instead. That leads to an endless loop, because "Wiki/BugReports" sorts 500 pages earlier than "wiki/BugReports".
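The mechanism described above can be illustrated with a toy model. This is only a sketch, not MediaWiki code: `ucfirst`, `allpages` and `crawl` are hypothetical helpers, the title list is made up, a limit of 4 stands in for aplimit=500, and titles are assumed to sort in plain code-point order (uppercase before lowercase).

```python
def ucfirst(text):
    """Uppercase the first character, as namespace-0 normalization would."""
    return text[:1].upper() + text[1:]


def allpages(titles, apfrom, limit, normalize_continue):
    """Toy model of list=allpages: return one batch and the continuation."""
    ordered = sorted(titles)  # code-point order: "W..." sorts before "w..."
    remaining = [t for t in ordered if t >= apfrom]
    batch, rest = remaining[:limit], remaining[limit:]
    if not rest:
        return batch, None
    cont = rest[0]
    return batch, (ucfirst(cont) if normalize_continue else cont)


def crawl(titles, apfrom, limit, normalize_continue, max_requests=20):
    """Follow continuations; report "loop" if a continuation value repeats."""
    seen = set()
    for _ in range(max_requests):
        _batch, cont = allpages(titles, apfrom, limit, normalize_continue)
        if cont is None:
            return "done"
        if cont in seen:
            return "loop"
        seen.add(cont)
        apfrom = cont
    return "gave up"


# Made-up titles in a namespace that keeps first-letter case.
titles = [
    "Wiki/BugReports", "Wiki/FrontPage", "Wiki/Help",
    "wiki", "wiki/BugReports", "wiki/zzz",
]

print(crawl(titles, "Wiki/BugReports", 4, normalize_continue=True))   # loop
print(crawl(titles, "Wiki/BugReports", 4, normalize_continue=False))  # done
```

With normalization on, the continuation "wiki/BugReports" is turned into "Wiki/BugReports", which is where the crawl started, so the same batch is served forever.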

This vaguely resembles bug 29290, and if I'm right the fix should be similar: don't try to normalize the value passed through the continue parameter. We should probably review all API continuation parameters to ensure this incorrect normalization is not going on.
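One way such a review could be approached is to check, for each paginated listing, that every continuation value sorts strictly after the last item already returned. The sketch below is an assumption about how that check might look, not existing MediaWiki test code; `check_continuation` and `make_listing` are hypothetical names, and the listing is a stand-in for a real API module.

```python
def check_continuation(list_batch, start, max_requests=1000):
    """Walk a paginated listing and verify each continuation makes progress.

    list_batch(start) must return (items, continuation_or_None).
    Returns None if the listing completes, else the offending continuation.
    """
    for _ in range(max_requests):
        items, cont = list_batch(start)
        if cont is None:
            return None
        # A sound continuation sorts strictly after everything returned so far.
        if items and cont <= items[-1]:
            return cont
        start = cont
    return start  # never finished within the request budget


def make_listing(titles, limit, mangle):
    """Build a toy listing; mangle=True uppercases the continuation's first letter."""
    ordered = sorted(titles)

    def list_batch(start):
        remaining = [t for t in ordered if t >= start]
        batch, rest = remaining[:limit], remaining[limit:]
        cont = rest[0] if rest else None
        if cont is not None and mangle:
            cont = cont[:1].upper() + cont[1:]
        return batch, cont

    return list_batch


pages = ["A", "B", "a", "b"]
print(check_continuation(make_listing(pages, 2, mangle=True), "A"))   # A
print(check_continuation(make_listing(pages, 2, mangle=False), "A"))  # None
```

The check flags the mangled listing on its first request, because the continuation "A" does not sort after the last returned item "B".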

So, if your hypothesis is correct, what would the steps be to reproduce this bug on another wiki?

You'd have to somehow set up the conditions I mentioned: namespace 0 is case-insensitive on the first letter, and the target namespace has titles beginning with a lowercase letter (i.e. the target namespace is case-sensitive, or impossible titles have been inserted into the database for the target namespace).

Then you construct an API allpages query with an appropriate apfrom and aplimit so that the continuation is supposed to be a page with a lowercase title, and it will be output with an uppercase title instead.

For example, I just created a "Foo" namespace on my local test wiki with this configuration:

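// Custom namespace pair (IDs 500/501) for the test.
$wgExtraNamespaces[500] = "Foo";
$wgExtraNamespaces[501] = "Foo_talk";
// Do not force an uppercase first letter in these namespaces.
$wgCapitalLinkOverrides[500] = false;
$wgCapitalLinkOverrides[501] = false;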

Then I created pages Foo:A through Foo:Z and Foo:a through Foo:z, and a query like http://localhost/w/api.php?action=query&list=allpages&apnamespace=500&apfrom=Y&aplimit=4 gives me a continuation of "C" rather than "c".
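The result of that query can be traced with a small model of the Foo namespace. This is a sketch under assumptions: `query` is a hypothetical stand-in for the allpages module, titles are assumed to sort in code-point order (A to Z, then a to z), and the `normalize` flag imitates the first-letter uppercasing applied to the continuation.

```python
import string

# Pages Foo:A..Foo:Z and Foo:a..Foo:z, without the namespace prefix.
titles = sorted(string.ascii_uppercase + string.ascii_lowercase)


def query(apfrom, aplimit, normalize):
    """Return (batch, continuation) for a toy allpages request."""
    remaining = [t for t in titles if t >= apfrom]
    batch = remaining[:aplimit]
    nxt = remaining[aplimit] if len(remaining) > aplimit else None
    if nxt is not None and normalize:
        nxt = nxt[:1].upper() + nxt[1:]  # namespace-0 style normalization
    return batch, nxt


print(query("Y", 4, normalize=True))   # (['Y', 'Z', 'a', 'b'], 'C')
print(query("Y", 4, normalize=False))  # (['Y', 'Z', 'a', 'b'], 'c')
```

The batch Y, Z, a, b should be followed by "c". The normalized continuation "C" instead points back to a page that sorts before the start of the query.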

Looks confirmed, then. What version is your wiki?

Git master as of sometime this morning. I'll post a Gerrit changeset that cleans up this issue in something like 14 core API modules in a few hours.