Wrong keys in Title::$titleCache because the text passed to Title::newFromText() is not canonical
Closed, Declined (Public)

Description

Author: twinchic

Description:
When Title::newFromText() is called, the $text argument is used as the key in the $titleCache array. However, the same title can be created from many text variants. For example, "tEST", "Test" and "TEST" all resolve to one canonical title key, "Test". Nevertheless, Title::newFromText() returns a different object for each of these three variants, because a new Title instance is created for each variant and stored in $titleCache. The $mDbkeyform property is the same for all of them.

Continuing to work with the system in this incorrect $titleCache state causes unexpected behaviour (in the results of Article::exists(), for example) when developing complex extensions.


Version: 1.22.0
Severity: normal
URL: мит
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=36233

Details

Reference
bz47282

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 22 2014, 1:41 AM)
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz47282.
bzimport added a subscriber: Unknown Object (MLST).

guadalupeocampo01 wrote:

content hidden as private in Bugzilla

(In reply to Sergey Nevmerzhitsky from comment #0)

"tEST", "Test" and "TEST" direct to one canonical title key "Test".

This isn't the case. "tEST" and "TEST" are the same, but "Test" isn't, since titles are case-sensitive except for the first letter.
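
The rule can be illustrated with a minimal Python sketch. It is a simplified model, not MediaWiki's code: the helper `to_dbkey` is hypothetical, it only uppercases the first character and replaces spaces with underscores, and it ignores namespaces, trimming, and wikis configured to keep the first letter case-sensitive.

```python
def to_dbkey(text: str) -> str:
    """Simplified model of title normalization (hypothetical helper)."""
    key = text.replace(" ", "_")
    # Only the first character is case-folded; the rest is kept as typed.
    return key[:1].upper() + key[1:]


for raw in ("tEST", "TEST", "Test"):
    print(raw, "->", to_dbkey(raw))
```

Under these assumptions "tEST" and "TEST" map to the same key, while "Test" maps to a different one.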

Krinkle claimed this task.
Krinkle subscribed.

This cache stores the result of parsing a title and applying the various normalization rules. One of those rules is the handling of first-letter case. Caching each input string separately is therefore on purpose: we can't know the canonical form ahead of time, only after the text has been parsed.

As mentioned, aside from the first character, the other characters are very much case-sensitive. Test and TEST are separate page titles and do not correlate to the same dbkey.

For example, https://en.wikipedia.org/wiki/Gui is a disambiguation page about various places and entities named "Gui", whereas https://en.wikipedia.org/wiki/GUI is a redirect to Graphical User Interface, as the editors considered that the most notable meaning of the abbreviation (most of the others were names, not abbreviations).