Display changed content-language messages in preference to default interface-language messages
Closed, ResolvedPublic

Description

Author: arnomane

Description:
Some time ago, the I18n system of MediaWiki worked the following way in multilingual wikis like Wikimedia Commons:

  • Check whether a customised English message exists on the corresponding MediaWiki namespace page (stored directly in the SQL database, not transcluded from the I18n file), regardless of the user's language setting.
  • If the English message exists in the MediaWiki namespace, do not use the transcluded I18n-file text for that message in any language; use only the corresponding /<ISO-CODE> subpage, if it exists directly.
  • If that subpage does not exist, fall back to the English message.
  • If neither the English message nor the translated message exists in the MediaWiki namespace, use the transcluded string from the I18n file.

For some months now, MediaWiki has worked the following way:

  • If the specific localised message does not exist in the MediaWiki namespace, take the transcluded message from the I18n file.

This is one of the worst UI changes, and we noticed it on Commons only by chance, seriously. We relied heavily on
the old behaviour. If a message did not exist in the database, the user got the English one. Fine. Changes to
important UI messages were easy to follow and were an acceptable amount of work. We did not need to fear that
people got totally useless UI messages if they chose a less common language that none of us admins could speak.

It took several months until we noticed that http://commons.wikimedia.org/wiki/Special:Upload showed only useless
information, and nothing of what is absolutely crucial to remember during upload, if you had a less common language
setting (the corresponding default message MediaWiki:Uploadtext/<ISO-code> from the I18n files is totally useless
on Commons). Within that period many people uploaded a lot of copyvios simply because they could not have known
better. These are the nightmares of Commons admins: people flaming them because they were not informed that fair
use and the like is not allowed, and a Commons admin who somehow has to arrive at the idea that they really were
not properly informed and are not just lying.

The result of that severe bug was this huge quick-and-dirty overwrite workaround, for this single upload message
only:
http://commons.wikimedia.org/w/index.php?offset=&limit=50&target=Arnomane&title=Special%3AContributions&namespace=8
(all messages with the "in order to supress a inadequate default mediawiki message" change comment).

There are many more important MediaWiki messages that would need such a time-consuming and painful hack ASAP.

So please fix this bug immediately. I wouldn't have been so bold as to switch priority and severity to the maximum
if this weren't the case.

The most elegant solution would be as follows:

  • If the default-language message is more current (by timestamp) than a translation, take the default-language message (usually English in multilingual wikis). I18n-file messages are always the oldest ones.
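
The timestamp rule above can be sketched roughly as follows. This is an illustration in Python, not MediaWiki code: `pick_message` and the `Page` tuple are hypothetical names, and the sketch assumes each on-wiki page carries a last-edit timestamp while I18n-file text has none and therefore always loses to any on-wiki page.

```python
from datetime import datetime
from typing import Optional, Tuple

# (text, last-edited timestamp) of an on-wiki message page; None if the page does not exist.
Page = Optional[Tuple[str, datetime]]


def pick_message(default_page: Page, translated_page: Page, file_text: str) -> str:
    """Prefer the newer of the two on-wiki pages; I18n-file text counts as oldest."""
    if default_page is None and translated_page is None:
        return file_text
    if translated_page is None:
        return default_page[0]
    if default_page is None:
        return translated_page[0]
    # Both pages exist: the default-language page wins only if it is strictly newer.
    if default_page[1] > translated_page[1]:
        return default_page[0]
    return translated_page[0]
```

Under this rule a translation subpage edited after the last change to the default message keeps winning, so translators can bring their language back simply by updating the subpage.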

If this elegant solution isn't doable easily or within the next few days please please consider using the old
behaviour in the meantime.


Version: unspecified
Severity: major

Details

Reference
bz8188

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 9:33 PM)
bzimport set Reference to bz8188.
bzimport added a subscriber: Unknown Object (MLST).

rotemliss wrote:

I don't fully understand the problem. Please provide more details. Generally, if
the user (interface) language differs from the content language, and a
custom message page (i.e. MediaWiki:Key/lang) does not exist (and the message
exists in the language file and is not an extension message), the message from
the messages file is used by default, not the English message, and I don't think
this has changed in the last several months.

A possible solution is to specify in LocalSettings.php a list of messages which
should be taken from the content language as a fallback if they were changed
from their source and a localised message page (i.e. MediaWiki:Key/lang) does
not exist (recentchangestext and uploadtext should probably be there, along with
all the messages which tend to be changed a lot). It's probably too expensive,
though.

By the way, it's better to use {{int:message_name}} instead of copying the
message contents in the workaround for that.

arnomane wrote:

At a certain point which I cannot date exactly (at least some day in early 2006), I18n certainly worked differently,
as described above. I can't say when this problem first came up, as we Commons admins are simply struggling with
just about everything (you don't switch your language to something exotic every day) and every time notice broken
i18n things only afterwards, by chance (this wasn't the first time that certain aspects of i18n were changed in a
way that was surprising for multilingual wikis). So I am not talking about a change within the last two months (see
my contribs link, it is dated 30 September). I am talking about a change I noticed on 30 September, and I honestly
couldn't believe that people had programmed such a bug (and I am personally not really excited about the MediaWiki
Bugzilla).

It's not about several messages; just about every interface message can be important. And I know how much friction
it causes to ask devs to change certain aspects of i18n for specific messages (at minimum ~2 weeks of constantly
getting on people's nerves per single item, something I simply don't want to do, because you can imagine that this
causes not-so-nice feelings towards me). So something like "let us define some messages that get treated specially"
simply does not work, as this will mean further work for server devs too.

And I have often heard "this is too expensive for the servers": honestly, either Wikimedia Commons is truly
multilingual, or officially declare that Commons is English only. I want a clear decision, not something in between,
which causes us *a lot* of additional work and internal friction. I am just sick of being made responsible for every
i18n-shit by random Wikipedia idiots and thus have lost the fun of working on Commons.

The template approach wouldn't really have made a difference in the number of edits; it's just that further changes
would have been applied automatically in the other places. And templates, or some "template-like syntax", in the
MediaWiki namespace are fun of their own (random parser bugs coming up and disappearing). That's why I
intentionally filled in the message itself.

Sorry for my ranting but such problems are driving me crazy.

arnomane wrote:

And furthermore this bug is a "major loss of functionality" and thus cannot be a "normal priority".

rotemliss wrote:

As far as I can see in the file from December 2005, the same order was used
(though some elements may have changed), and a test of version 1.5.8 gave
the same results (the interface message from the messages file was used when
the page for the specified interface language did not exist, but the content
language page *did* exist, for the message MediaWiki:Uploadtext on Special:Upload).

I think that it's a misunderstanding - currently, the following general
algorithm (let's call it Algorithm 1) is used to get messages:

  1. Check for a page in the MediaWiki namespace and use it as the message:
     1a. If the interface language ("uselang" page parameter or from preferences) is not the content language, use the page "MediaWiki:Message/lang".
     1b. If the interface language is the same as the content language, use the page "MediaWiki:Message".
  2. If the message has still not been found, check the extension messages (irrelevant, as uploadtext and the others are not extension messages).
  3. If the message has still not been found, check for it in the localised language file. The fallback language's message from the language file may be used if it does not exist, and the English message from the language file may be used if that does not exist either.
  4. If the message has still not been found and the key contains the character '/', use the message from the appropriate messages file (probably irrelevant, as such message calls occur either in explicit {{int:message/lang}} calls or when viewing pages from the MediaWiki namespace, not in Special:Upload or anything like it).
  5. If the message has still not been found (which means it was not in any language file, even the English one, and is not an extension message, so it is a custom message, which is probably not the case here), try to use the default English message in the DB. Useful for links like the Village Pump in the sidebar, but probably irrelevant for this.
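
For readers who prefer code, the lookup order of Algorithm 1 can be sketched like this. It is a simplified Python illustration, not the actual MediaWiki implementation: the function name, the plain dictionaries standing in for the database, the extension messages and the language files, and the single-level `fallbacks` map are all assumptions made for the example.

```python
from typing import Dict, Optional


def get_message_algorithm_1(
    key: str,
    user_lang: str,
    content_lang: str,
    wiki_pages: Dict[str, str],             # page titles without the "MediaWiki:" prefix
    extension_msgs: Dict[str, str],
    lang_files: Dict[str, Dict[str, str]],  # language code -> {key: text}
    fallbacks: Dict[str, str],              # language code -> fallback language code
) -> Optional[str]:
    # Step 1: on-wiki page; "Key/lang" when the interface language is not the content language.
    title = key if user_lang == content_lang else f"{key}/{user_lang}"
    if title in wiki_pages:
        return wiki_pages[title]
    # Step 2: extension messages.
    if key in extension_msgs:
        return extension_msgs[key]
    # Step 3: the language file, then its fallback language, then English.
    for lang in (user_lang, fallbacks.get(user_lang), "en"):
        if lang and key in lang_files.get(lang, {}):
            return lang_files[lang][key]
    # Step 4: a key such as "uploadtext/de" names its own language.
    if "/" in key:
        base, lang = key.split("/", 1)
        if base in lang_files.get(lang, {}):
            return lang_files[lang][base]
    # Step 5: a custom message that exists only as the default on-wiki page.
    return wiki_pages.get(key)
```

With a customised "uploadtext" page and German as the interface language, step 1 looks for "uploadtext/de", finds nothing, and step 3 returns the generic German text from the language file. That is the behaviour the report complains about.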

This is the most current algorithm. As far as I understand, you want to use the
following algorithm (Algorithm 2) - though I may be wrong:

  1. Do step 1 of Algorithm 1.
  2. If the message has still not been found, check for the page "MediaWiki:Message" and use it instead, regardless of the user interface setting.
  3. Continue with steps 2-5 of Algorithm 1.

Using such an algorithm makes all the messages use the content language if an
interface message page is not found. Since most message pages for interface
messages do not exist, because they are trivial and should not be changed, this
will actually prevent the use of an interface language. Please correct Algorithm
2 if it is wrong and not what you meant to say; I just don't understand how this
can be solved.
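
The only difference in Algorithm 2 is one extra lookup before the language files are consulted. A minimal Python sketch, with hypothetical names throughout; `remaining_steps` is an assumed callback standing in for steps 2-5 of Algorithm 1:

```python
from typing import Callable, Dict, Optional


def get_message_algorithm_2(
    key: str,
    user_lang: str,
    content_lang: str,
    wiki_pages: Dict[str, str],  # page titles without the "MediaWiki:" prefix
    remaining_steps: Callable[[str, str], Optional[str]],  # steps 2-5 of Algorithm 1
) -> Optional[str]:
    # Step 1: the language-specific on-wiki page, exactly as in Algorithm 1.
    title = key if user_lang == content_lang else f"{key}/{user_lang}"
    if title in wiki_pages:
        return wiki_pages[title]
    # Step 2 (new): any content-language page wins over the language files.
    if key in wiki_pages:
        return wiki_pages[key]
    # Step 3: extension messages, language files and so on.
    return remaining_steps(key, user_lang)
```

The sketch also shows where the concern comes from: wherever a "MediaWiki:Message" page exists at all, step 2 returns it and the translated language-file text is never reached.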

Thanks for the report, anyway.

arnomane wrote:

Well, maybe something else made it look to me as if this algorithm were in use. The i18n files have been enlarged a
lot lately thanks to the work of many people. Before that, these files perhaps contained nothing for the message in
question, and so it worked the way I witnessed many times.

Yes, your described "Algorithm 2" is what I was suggesting as a quick-and-dirty solution. Not ideal, but probably
easy to change in the source code and at least orders of magnitude better than the current one. Your concern that a
minor change (like a typo fix) to the English default message would render many valid translated messages invalid
is not a real problem. I only remember two typos. The only issue would be that there are old English messages in
the database which are the same as or even worse than the messages from the i18n files. This could be solved with a
joint effort at cleaning up the MediaWiki namespace, something which needs to be done anyway (by the way, even more
so in the Wikipedias). Every other customization I can recall (and there are quite a lot) was nothing "trivial".
Faking good language support for a less common language on Wikimedia Commons by not making the huge customization
differences visible is really bad. For example, I made a small but effective change in Commons: "articles" are no
longer "articles" but "galleries". From outside you would probably have regarded it as minor, but it definitely is
not, because a gallery is not filled with plain text but with image, audio, or whatever galleries (we had the
problem of quite a few people abusing Commons as an encyclopedia or a random text core dump in the article
namespace).

Let me theorize about the perfect solution (which involves more changes in the source code):

  • MediaWiki pages (or the current versions of the corresponding pages) could be tagged with a flag that indicates "message up to date".
  • Every time a default-language message in the MediaWiki namespace gets changed and the change is not flagged as "minor", all messages in /<Code> pages lose the "message up to date" flag (transcluded messages from the i18n files are always regarded as not up to date if the default-language message was customized in the MediaWiki namespace).
  • MediaWiki only uses message strings that have the "message up to date" flag. If a certain message does not have it, the message of the default language will be used as the fallback.
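
A rough sketch of how such a flag could behave, in Python and with entirely hypothetical names (`MessagePage`, `MessageStore`); it assumes an in-memory store keyed by "key" or "key/lang" and is meant only to make the three rules above concrete:

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MessagePage:
    text: str
    up_to_date: bool = True


class MessageStore:
    def __init__(self) -> None:
        self.pages: Dict[str, MessagePage] = {}  # "key" or "key/lang" -> page

    def save(self, title: str, text: str, minor: bool = False) -> None:
        self.pages[title] = MessagePage(text)  # a freshly edited page is up to date
        if "/" not in title and not minor:
            # Non-minor change to the default message: all translations go stale.
            for other, page in self.pages.items():
                if other.startswith(title + "/"):
                    page.up_to_date = False

    def get(self, key: str, lang: str, file_text: Optional[str]) -> Optional[str]:
        default = self.pages.get(key)
        translated = self.pages.get(f"{key}/{lang}")
        if translated and translated.up_to_date:
            return translated.text
        if default:
            # A customised default beats stale subpages and i18n-file text alike.
            return default.text
        return translated.text if translated else file_text
```

Re-saving a translation subpage sets its flag again, which matches the idea that translators decide when their text has caught up with the default.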

ayg wrote:

Note: mid-air collision here, and I have to go now, but it seems Daniel Arnold
and I were thinking fairly similar things.

Step 2 in Algorithm 2 should have the added proviso "if it's not the same as the
language file message", presumably. Also, the interface language should
presumably try to inherit from on-wiki messages as usual, rather than going
straight to English.
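
The two refinements above (skip the content-language page when it is identical to the shipped default, and let the interface language inherit from on-wiki pages along its fallback chain) might look like this in a Python sketch. The function `resolve`, the dictionaries, and the idea of passing the fallback chain as a list are assumptions for illustration only:

```python
from typing import Dict, List, Optional


def resolve(
    key: str,
    lang_chain: List[str],                  # interface language, then its fallbacks
    content_lang: str,
    wiki_pages: Dict[str, str],             # "key" or "key/lang" -> text
    lang_files: Dict[str, Dict[str, str]],  # language code -> {key: text}
) -> Optional[str]:
    # On-wiki pages for the interface language and each of its fallbacks in turn.
    for lang in lang_chain:
        title = key if lang == content_lang else f"{key}/{lang}"
        if title in wiki_pages:
            return wiki_pages[title]
    # Content-language page, but only if it really differs from the shipped default.
    local = wiki_pages.get(key)
    shipped = lang_files.get(content_lang, {}).get(key)
    if local is not None and local != shipped:
        return local
    # Otherwise the language files, following the same chain.
    for lang in lang_chain + [content_lang]:
        if key in lang_files.get(lang, {}):
            return lang_files[lang][key]
    return local
```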

Regardless, this would more or less totally destroy the ability to use different
interface languages on large monolingual wikis like enwiki, which have a high
number of messages superficially changed. So two major possibilities that I can
see:

  1. Use Algorithm 2 for multilingual wikis only.
  2. Somehow allow admins to specify an override on a per-message basis that forces the use of the on-wiki message.

#2 at least is not at all trivial. #1 might be doable in a reasonable time
frame without generating masses of complaints from wiki users about not being
able to use their preferred interface language, but the consequences of that
still need to be considered, and it doesn't really fix the issue except
superficially.

(In reply to comment #2)
> And I have often heard "this is too expensive for the servers": honestly,
> either Wikimedia Commons is truly multilingual, or officially declare that
> Commons is English only.

I would say that forcing display of superficially-changed English messages to
people who speak another language is more Anglocentric than the opposite.

(In reply to comment #3)
> And furthermore this bug is a "major loss of functionality" and thus cannot be
> a "normal priority".

No functionality was ever lost, so this is an enhancement request and not a bug.
However, it does need to be taken seriously, I agree. Let's just not implement
a solution that breaks five things we hadn't thought of because we were in a
rush. Whether you realize it or not you've survived with this since the Commons
was started, and important English messages can be bot-written to the necessary
places, so this isn't of such dire urgency that it has to be fixed right now.
It's a pretty major change.

arnomane wrote:

Comment 6:
> rather than going straight to English.

I never meant going straight to English, but to the wiki's default language. Also, in MediaWiki there is no such
thing as a multilingual wiki (this exists only as a common social definition), and as the default language of
Wikimedia Commons is English, I used the word "English" when I meant "wiki default language".

> Regardless, this would more or less totally destroy the ability to use different interface languages on large
> monolingual wikis like enwiki, [...]

No, it won't, not even in "monolingual" wikis. What value does the faked language competence of a given wiki
community have if the customizations of that wiki aren't translated/visible in that language? I myself am German, I
love my mother tongue and use it wherever I can, even in software (and all my wiki accounts, including en.wikipedia,
have the German interface language setting), but it is really bad if I can't immediately notice the interface
changes just because I stick to my "unsupported" German where possible. If I see the differences, I am interested
in making my interface show "clean" German again and thus will care about translation.

Besides that: it is better to have 5 up-to-date translations than 80 so-so translations that cause a lot of
lost-in-translation friction. Nobody needs to define these 5 languages in advance (note that "5" is just an example
number). The set defines itself through maintenance within a community.

And furthermore, outdated i18n-file messages are dangerous even in monolingual wikis. I am very sure that there is a
significant number of people using the English interface on de.wikipedia. Now let's look, for example, at the upload
form there with the German and English interface settings and note the difference:

http://de.wikipedia.org/wiki/Spezial:Upload?uselang=de (the standard text with the necessary information)
http://de.wikipedia.org/wiki/Spezial:Upload?uselang=en (don't show this to an admin who deleted 500 fair use
images that day).

Honestly, I expect MediaWiki to make people notice differences even if they don't understand German, so that they
are aware: "stop, there is something important here, I'd better ask first".

Maybe you want to have a look in en.wikipedia too:

This is what I get with my German interface: http://en.wikipedia.org/wiki/Special:Upload?uselang=de
This is what I should get somehow: http://en.wikipedia.org/wiki/Special:Upload?uselang=en

> Whether you realize it or not you've survived with this since the Commons was started, and important English
> messages can be bot-written to the necessary places, so this isn't of such dire urgency that it has to be fixed
> right now. It's a pretty major change.

Well, Commons has survived, but how? I don't want to push myself into the foreground, but I contributed a
significant part to the fact that Commons still exists (there are other "project founders" of Commons who have just
disappeared from there and are following their new pet idea instead of caring about their baby). Also, there were
some other i18n things that just got b0rked (like the i18n of important global JavaScript add-ons that I coded in
large part). I don't want to fix my once-working stuff every time, because I want to proceed with other problems
that are waiting for their solution too.

Also, as I said, in the past i18n worked better on Commons (see also my comment 5 regarding that), so there is a
loss of functionality even if it turns out in the end that no relevant code was actually changed.

But if people are afraid that switching the default search order for interface messages could be bad for them, a
switch like $wgPreferDatabaseMessages = true/false in LocalSettings.php would do it.

ayg wrote:

(In reply to comment #7)
> I never meant going straight to English, but to the wiki's default language.

Right, that's what I meant to say too.

> This is what I get with my German interface:
> http://en.wikipedia.org/wiki/Special:Upload?uselang=de
> This is what I should get somehow:
> http://en.wikipedia.org/wiki/Special:Upload?uselang=en

But look at the list of other things that have been changed at
[[Special:Allmessages]]. The inclination is to tweak, not just to overhaul.
Instead of "Über Wikipedia", you'd see "About Wikipedia". Instead of
"Kombinierte Anzeige der Datei-, Lösch-, Seitenschutz-, Benutzerblockaden- und
Rechte-Logbücher.<br />Sie können die Anzeige durch die Auswahl des Logbuchtyps,
des Benutzers oder des Seitentitels einschränken.", you'd see the equivalent in
English, but with a now-incorrect and never massively useful note tacked on the
end. Instead of "Dies ist eine Liste aller möglichen Texte im
MediaWiki-Namensraum" you'd see the same in English, because someone thought
there should be a colon after "MediaWiki:".

And so on. I'd guess that maybe 20% or more of the messages you'd see would be
in English, and most of the changes would be trivial. Some would be important,
and so this needs to be addressed. But the simple approach of overwriting *all*
the user language messages where content messages have been updated is
unacceptable and will result in a mass of complaints.

> But if people are afraid that switching the default search order for interface
> messages could be bad for them, a switch like $wgPreferDatabaseMessages =
> true/false in LocalSettings.php would do it.

Yes, that's one possibility. Having sysops decide message by message would be
better but harder. Either way it would then be up to each wiki to decide
whether they wanted the option.

arnomane wrote:

The issue of minor tweaks hiding still-valid translations could be addressed the following way: if the message
string of the default language has only "minor"-tagged versions, look in the i18n files first.
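
As a sketch of that rule, assuming each revision of the default-language page records whether it was flagged as minor (the `Revision` class and the `choose` function are hypothetical, and the code is Python rather than MediaWiki's PHP):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Revision:
    text: str
    minor: bool


def choose(
    default_history: List[Revision],  # revisions of the default-language page, oldest first
    translated_page: Optional[str],   # text of the /<Code> subpage, if it exists
    file_text: Optional[str],         # translation shipped in the i18n file, if any
) -> Optional[str]:
    if translated_page is not None:
        return translated_page
    only_minor = all(rev.minor for rev in default_history)
    if file_text is not None and only_minor:
        # Only minor tweaks (or no local page at all): the shipped translation still applies.
        return file_text
    if default_history:
        return default_history[-1].text
    return file_text
```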

However, there is a transition problem with that idea. In all Wikimedia wikis the MediaWiki namespace needs to be
cleaned up, as most such edits are probably not tagged as minor.

  • Useful generic changes (like stylistic unification and so on) need to be moved upstream into the i18n files, and the custom local MediaWiki page gets deleted afterwards.
  • Check whether customized messages in the local wiki hide an identical message (probably easy to do automatically; probably a lot of them, as many local messages were moved upstream).
  • Besides that: clean up and delete messages that are no longer used (sadly not visible in the interface, but that's another bug ;-).

Anyway, simply switching the message search order is not a perfect solution, but it would be quick and, for Commons
and for important messages/important message changes in the other wikis, very effective. I thus strongly recommend
doing it now with the $wgPreferDatabaseMessages switch option.

A really well-working long-term solution (which could still be done after the quick one) is what I described in
comment 5 (with the addition that initially all locally translated messages get tagged as up to date by the
software, and later people check for themselves).

rotemliss wrote:

Simetrical is completely right about the problems which will be introduced by
the hack. I strongly recommend not to use this hack (though a limited version of
it, which includes a specific list of messages, could be used). About the
proposal with minor changes: I'm not sure, but I think it is too expensive,
especially if you check it for every message.

Note that there is a script named maintenance/rebuildMessages.php, which creates
and updates the default messages using a user named "MediaWiki default". It is
currently in use, though I'm not sure such updates are necessary: the default
message is shown on the message and edit pages anyway, and the whole namespace
is now protected instead of specific messages, so vandals can't create messages
by hand when the script hasn't run. The current assumption is that every
default message should have a MediaWiki page which sysops can edit. You will
first have to check whether it's the same as in the messages file.

About deleting obsolete messages, see Bug 5665.

The proposal in comment 5 seems OK to me.

arnomane wrote:

As I mentioned above, it is impossible to have a specific list of important messages that get treated specially.
I know this because I know the pain with i18n of link targets (honestly, I really wonder why people get paranoid
about URLs but not about plainly wrong legal interface texts).

Commons now needs a quick solution within the next few days. I don't want to wait three months for the perfect
solution without any progress in between, although I want the perfect solution too (but I know how long it takes).
I am not overestimating the problem. We simply are not able to maintain that wiki with the current software. And
no: more admins won't do it. We have quite a few admins. But every admin job on Commons takes 10 times longer and
is 10 times more annoying than elsewhere because of the multiwiki and multilingual implications. This is a serious
threat to Commons. Just look at the plain upload vs. delete statistics for just one number. Most people cannot
imagine how damn hard it is to maintain Commons. Commons admins simply lose too much time because of crappy
software, and yes, I am really angry at Erik Möller because he at least co-founded Commons but did not care about
its software problems with the same energy, proceeding with new ideas instead.

The problems mentioned by Simetrical are negligible on Commons (I can say this for sure), so a quick hack with
$wgPreferDatabaseMessages wouldn't hurt the other wikis if they don't want it.

brianna.laugher wrote:

Messages seem to be behaving at random.

http://commons.wikimedia.org/wiki/special:upload?uselang=wikinews

The messages that are different to the MediaWiki default ones, are
MediaWiki:Uploadtext, MediaWiki:Fileuploadsummary, MediaWiki:Nolicense
(the default option in the drop down list), and MediaWiki:Licenses
(the drop-down list contents).

None of the 'wikinews' messages are defined. I assume there is no
MessagesWikinews.php file.
All of the messages show the MediaWiki default, instead of the current
MediaWiki:Key page, *except* for MediaWiki:Licenses.

*** This bug has been marked as a duplicate of bug 1495 ***