
Use proper language codes in Language instead of $wgDummyLanguageCodes
Closed, Resolved (Public)

Description

Author: p.selitskas

Description:
If a language code is just a fallback to another language, the grammar rules defined for that other language in $wgGrammarForms are not applied to it.

For instance, be-x-old is a fallback to be-tarask. Grammar rules are set for be-tarask only. be-x-old should therefore use the grammar rules of its fallback, but it does not.

This can be checked by switching between the be-x-old and be-tarask locales in Preferences.

The issue has been present since MediaWiki was updated on the WMF servers. The PLURAL keyword was not affected.


Version: 1.17.x
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=44747

Details

Reference
bz27571

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:24 PM
bzimport set Reference to bz27571.

Can you give examples where you are seeing these issues?

p.selitskas wrote:

There are three links in the footer: Privacy Policy, About Wikipedia, and Disclaimer (Правілы адносна прыватнасьці, Пра Вікіпэдыю, Адмова ад адказнасьці).

About Wikipedia (Пра Вікіпэдыю) uses the GRAMMAR function: 'Пра {{GRAMMAR:вінавальны|{{SITENAME}}}}', where {{SITENAME}} == 'Вікіпэдыя'. In the be-x-old locale it shows the uninflected 'Пра Вікіпэдыя' instead of 'Пра Вікіпэдыю'.
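To make the reported symptom concrete, here is a minimal sketch of how it could arise if grammar forms were looked up by the exact language code. This is an illustrative Python model, not MediaWiki code: `GRAMMAR_FORMS` and `convert_grammar` are hypothetical names, and the only data used is the footer example above.

```python
# Illustrative model only; not MediaWiki code. All names here are hypothetical.

# Grammar forms are registered for be-tarask only, as described in this report.
GRAMMAR_FORMS = {
    "be-tarask": {"вінавальны": {"Вікіпэдыя": "Вікіпэдыю"}},
}


def convert_grammar(lang_code, word, case):
    """Return the inflected form registered for lang_code, or the word unchanged."""
    forms = GRAMMAR_FORMS.get(lang_code, {})
    return forms.get(case, {}).get(word, word)


# The lookup succeeds for the real code and silently misses for the alias.
about_tarask = "Пра " + convert_grammar("be-tarask", "Вікіпэдыя", "вінавальны")
about_x_old = "Пра " + convert_grammar("be-x-old", "Вікіпэдыя", "вінавальны")

print(about_tarask)
print(about_x_old)
```

Under that assumption, a lookup keyed by 'be-x-old' finds nothing and returns the site name uninflected, which matches what the footer shows.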

p.selitskas wrote:

The issue affects not only GRAMMAR rules but also messages overridden in the MediaWiki namespace. For example, MediaWiki:Sharedupload-desc-here has been overridden on the Belarusian (Taraškievica) Wikipedia (be-x-old.wikipedia.org), which seems to use the 'be-tarask' locale. In the 'be-x-old' locale, it falls back to the default MediaWiki message:

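Гэты файл паходзіць з Wikimedia Commons і можа выкарыстоўвацца іншымі праектамі. Апісаньне са старонкі апісаньня файла пададзенае ніжэй.

(In English: "This file comes from Wikimedia Commons and may be used by other projects. The description from its file description page is given below.")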

I can see how the issue would occur if they are using $wgGrammarForms, but if they are using programmatic conversions it should work. However, I see no reason why this would have changed when WMF was updated. The code has been like this forever.

Lowering priority since we are having trouble getting to the root of this.

I think the real bug is that the compatibility alias "be-x-old" is listed in the language selector in Special:Preferences to begin with. It should just be left out of the list and silently and automatically replaced with "be-tarask" for any users who happen to have selected it before. (The same should be done for other language code aliases, such as "als" -> "gsw".)

I'm trying to think of a way to fix this with minimal code bloat. One way to do it would be to remove these aliases from $wgLanguageNames and move them to a new list, say, $wgLanguageAliases or something. Of course, this would probably involve changes to other parts of MediaWiki which use $wgLanguageNames, such as interwiki link handling.

Or maybe just leave $wgLanguageNames as it is, but include a list of obsolete language codes and their replacements so that they can be filtered out of language selectors.

(Actually, I'm not even sure what simply removing these aliases from $wgLanguageNames would break. Fixing language selectors for users who have selected an alias before should be easy, since we could just check if their chosen language is in the list and use the fallback if not. Hmm... looks like wfGetLangObj() assumes all valid languages are listed in either $wgLanguageNames or $wgExtraLanguageNames, even though Language::factory() doesn't. We could probably just drop that check, or add an extra check for aliases.)
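A rough sketch of the second option (keep the name list, add a separate alias list, and filter on it) might look like the following. It is written in Python purely for illustration: `LANGUAGE_ALIASES`, `KNOWN_CODES`, `selectable_codes` and `effective_user_language` are hypothetical names, not existing MediaWiki settings or functions, and only the two aliases mentioned above are included.

```python
# Illustrative model only; not MediaWiki code. All names here are hypothetical.

# Obsolete codes and their replacements, kept separate from the name list.
LANGUAGE_ALIASES = {"be-x-old": "be-tarask", "als": "gsw"}

# Stand-in for the codes a wiki knows about, aliases included.
KNOWN_CODES = ["als", "be-tarask", "be-x-old", "en", "gsw"]


def selectable_codes(known, aliases):
    """Codes to offer in a language selector: everything except aliases."""
    return [code for code in known if code not in aliases]


def effective_user_language(stored, aliases):
    """Map a stored preference to its replacement if it is an alias."""
    return aliases.get(stored, stored)


print(selectable_codes(KNOWN_CODES, LANGUAGE_ALIASES))
print(effective_user_language("be-x-old", LANGUAGE_ALIASES))
```

With this shape nothing has to be removed from the existing name list, so code that relies on it (such as interwiki link handling) would not need to change; only the selector and the preference lookup consult the alias list.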

Gerard.meijssen wrote:

Hoi,
be-tarask and be-x-old are the same. be would be the fall back language for it. However the localisation for be-tarask is (almost) complete.
Thanks,

GerardM

p.selitskas wrote:

(In reply to comment #7)

> Hoi,
> be-tarask and be-x-old are the same. be would be the fall back language for it.
> However the localisation for be-tarask is (almost) complete.
> Thanks,
>
> GerardM

The MediaWiki core translation is complete for 'be-tarask', so there is no need for a 'be' fallback. Our users hardly ever see English labels, so the 'en' fallback is OK.

(In reply to comment #6)

> I think the real bug is that the compatibility alias "be-x-old" is listed in
> the language selector in Special:Preferences to begin with. It should just be
> left out of the list and silently and automatically replaced with "be-tarask"
> for any users who happen to have selected it before. (The same should be done
> for other language code aliases, such as "als" -> "gsw".)
>
> I'm trying to think of a way to fix this with minimal code bloat. One way to
> do it would be to remove these aliases from $wgLanguageNames and move them to a
> new list, say, $wgLanguageAliases or something. Of course, this would probably
> involve changes to other parts of MediaWiki which use $wgLanguageNames, such as
> interwiki link handling.
>
> Or maybe just leave $wgLanguageNames as it is, but include a list of obsolete
> language codes and their replacements so that they can be filtered out of
> language selectors.
>
> (Actually, I'm not even sure what simply removing these aliases from
> $wgLanguageNames would break. Fixing language selectors for users who have
> selected an alias before should be easy, since we could just check if their
> chosen language is in the list and use the fallback if not. Hmm... looks like
> wfGetLangObj() assumes all valid languages are listed in either
> $wgLanguageNames or $wgExtraLanguageNames, even though Language::factory()
> doesn't. We could probably just drop that check, or add an extra check for
> aliases.)

I'm not sure, but I think 'be-x-old' exists because of the domain name be-x-old.wikipedia.org. If the default MediaWiki language is not tied to the domain at WMF, then 'be-x-old' could be safely deleted. However, third-party tools may use the domain name for language identification, so it can't be done.

According to $wgDummyLanguageCodes, be-x-old should not be shown in the language selector in Preferences. That still doesn't help us migrate user preferences away from it.

p.selitskas wrote:

I made a GRAMMAR test page about a year ago, and thanks to it I noticed some strange behaviour. I'm not sure about this, but perhaps it could be a starting point (and maybe the fallback mechanism actually works fine).

https://be-x-old.wikipedia.org/wiki/User:Wizardist/GrammarTest

If you see the green checks, then the GRAMMAR worked the right way (default interface language: be-tarask). In the footer you can also see the link "Пра Вікіпэдыю" ({{int:aboutsite}}).

Now, let's switch to be-x-old (I checked both with 'uselang' and by changing the language in Preferences).

If GRAMMAR did not fall back to the be-tarask rules, we would see a lot of red fail signs, and we don't. But if we look at the footer, we can see that {{int:aboutsite}} didn't actually work: {{SITENAME}} was left as it is and the GRAMMAR rules were not applied.

So maybe the fallback mechanism is not to blame.

P.S. I dug into the code and I think that Skin::footerLink() just doesn't support magic words.

p.selitskas wrote:

I'm reclassifying this bug as a more general issue (which is the cause of the broken GRAMMAR with dummy codes).

Sticking Gerrit change #35383 here.

Gerard.meijssen wrote:

The only reason be-x-old.wikipedia.org still has that name is that it has not been renamed yet.
There are more projects waiting for a rename. It does not have much of a priority.
Thanks,

Gerard

p.selitskas wrote:

Fixed in Gerrit change #35383.

Dummy language codes now disappear for good: they go into Language, and the rest of the system never hears from them again. From that point on it works only with the real language codes (e.g. be-tarask rather than be-x-old, en rather than simple, and so on).
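The general shape of such a fix can be sketched as follows. This is an illustrative Python model and not the contents of the Gerrit change, which are not shown in this thread: the `Language` class, its `factory` method and `DUMMY_LANGUAGE_CODES` are stand-ins modelled on the names used above, and the mapping holds only the two pairs mentioned here.

```python
# Illustrative model only; not the actual Gerrit change. Names are stand-ins.

# Dummy code -> real code, limited to the pairs mentioned in this thread.
DUMMY_LANGUAGE_CODES = {"be-x-old": "be-tarask", "simple": "en"}


class Language:
    _cache = {}

    def __init__(self, code):
        self.code = code

    @classmethod
    def factory(cls, code):
        """Resolve a dummy code once, at the entry point, then reuse one object."""
        code = DUMMY_LANGUAGE_CODES.get(code, code)
        if code not in cls._cache:
            cls._cache[code] = cls(code)
        return cls._cache[code]


lang = Language.factory("be-x-old")
print(lang.code)
```

Because the alias is resolved before the object is created, anything that later reads the object's code (grammar rules, message lookups, fallbacks) sees only the real code.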