Namespaces with diacritics are ignored on search suggestions
Closed, Resolved (Public)

Description

On https://pt.wikipedia.org, if I type "modulo" in the search box, I get the article "Módulo" as a suggestion, but if I type "modulo:string", there is no suggestion.

Notice that there is a namespace called "Módulo" (in English, "Module"), and the page called "Módulo:String" exists.


Version: 1.23.0
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=67521

Details

Reference
bz62322

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:05 AM
bzimport set Reference to bz62322.

We're starting to try tracking stuff with Phabricator so, for science!: https://phabricator.wikimedia.org/T720

This looks to be caused by how the namespaces are resolved: Cirrus uses MediaWiki to resolve the namespaces, but lsearchd used Lucene. Let's see what we can do about that.
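
The mismatch can be illustrated with a small sketch. This is not the CirrusSearch or MediaWiki code: `fold`, `split_namespace`, and the `NAMESPACES` table are hypothetical names, and the namespace list and IDs are toy data. It assumes the desired behaviour is to compare the typed prefix with the namespace names after stripping case and diacritics:

```python
import unicodedata

# Toy subset of namespace names; the IDs are illustrative assumptions.
NAMESPACES = {"Módulo": 828, "Predefinição": 10}


def fold(text: str) -> str:
    """Strip combining marks and case, so "Módulo" becomes "modulo"."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def split_namespace(query: str):
    """Return (namespace_id, rest) if the text before ':' names a namespace."""
    prefix, sep, rest = query.partition(":")
    if not sep:
        return None, query
    folded = fold(prefix.strip())
    for name, ns_id in NAMESPACES.items():
        if fold(name) == folded:
            return ns_id, rest
    # No namespace matched: treat the whole query as a title prefix.
    return None, query


print(split_namespace("modulo:string"))
```

With exact matching, "modulo" would not equal "Módulo" and the query would fall through to a plain title search, which is the behaviour reported in this task.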

One way to "fix" this right away would be to add "Modulo" as a namespace alias for "Módulo", but I'll see if I can make it work in search.
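
The alias workaround mentioned above can be modelled as an explicit lookup table. On a real wiki this would be a configuration change (MediaWiki's `$wgNamespaceAliases` setting) rather than code; the Python below is only a model of the idea, and `CANONICAL`, `ALIASES`, `_key`, and `resolve` are hypothetical names with illustrative IDs:

```python
import unicodedata


def _key(name: str) -> str:
    """Normalize composition and case only; diacritics are kept."""
    return unicodedata.normalize("NFC", name.strip()).casefold()


# Toy data: one canonical namespace name plus one hand-added alias.
CANONICAL = {_key("Módulo"): 828}
ALIASES = {_key("Modulo"): 828}


def resolve(prefix: str):
    """Exact lookup against canonical names first, then aliases."""
    key = _key(prefix)
    return CANONICAL.get(key, ALIASES.get(key))


print(resolve("Modulo"))   # matches through the alias
print(resolve("Módulo"))   # matches the canonical name
print(resolve("Modulô"))   # no alias was added for this spelling
```

The trade-off is that every unaccented variant has to be added by hand for each namespace, whereas handling it in search covers all namespaces at once.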

Patch is coming along. Should be ready in a bit.

Change 168167 had a related patch set uploaded by Manybubbles:
Add hook to extract namespace in prefix search

https://gerrit.wikimedia.org/r/168167

Change 168167 merged by jenkins-bot:
Add hook to extract namespace in prefix search

https://gerrit.wikimedia.org/r/168167

Resolved. It'll take some time for this to arrive on a wiki near you, because it requires rebuilding the search index, which can only happen after the code is deployed.