Incorrect use of Russian translation on Ukrainian wiki when Ukrainian translation available (incorrect fallback merging?)
Closed, ResolvedPublic

Description

This is an apparent separate bug discovered in the course of bug 67805.

Niklas suggested, "LC issue is possibly caused by incorrect fallback merging, because L10nupdate contains update for ru, but not for uk (because the translation is already up to date for uk):"


Version: wmf-deployment
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=67805
https://bugzilla.wikimedia.org/show_bug.cgi?id=70686

Details

Reference
bz68781

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 3:30 AM
bzimport added projects: Deployments, I18n.
bzimport set Reference to bz68781.
bzimport added a subscriber: Unknown Object (MLST).

Further detail from bug 67805: "I tracked that one as far as observing that the json file for uk that the cdb file gets built from has "Править вики-текст" as the value for messages:visualeditor-ca-editsource rather than "Редагувати код"."

So the problem seems to be in the (new JSON-based) l10nupdate process.

(In reply to Brad Jorsch from comment #1)

So the problem seems to be in the (new JSON-based) l10nupdate process.

Moving to "Deployment systems"; feel free to move back.

Niklas: Can you give any pointers to Brad to help him get up to speed on this?

(In reply to Greg Grossmeier from comment #3)

Niklas: Can you give any pointers to Brad to help him get up to speed on this?

More specifically, I tracked the problem down as far as "the .json files are wrong". When I asked around, you were the only person anyone could point to who knows anything about the json-generating code. Where do I find this code? If it's not going to be obvious, how generally does that code work? Is there a known way to run just the json-generator and output files to some temporary directory?

(In reply to Brad Jorsch from comment #4)

point to who knows anything about the json-generating code.

https://git.wikimedia.org/blob/mediawiki%2Ftools%2Fscap.git/HEAD/bin%2FrefreshCdbJsonFiles (authors Bryan and Antoine, AFAICS).

To clarify, are we talking about the json created by LocalisationUpdate (which are then included in the CDB files by LocalisationCache), or the json files created from the CDB files by the deployment scripts for efficient syncing?

(In reply to Nemo from comment #5)

https://git.wikimedia.org/blob/mediawiki%2Ftools%2Fscap.git/HEAD/bin%2FrefreshCdbJsonFiles (authors Bryan and Antoine, AFAICS).

Yeah, I asked them. They said they didn't know anything about the actual generation of the json stuff.

(In reply to Niklas Laxström from comment #6)

To clarify, are we talking about the json created by LocalisationUpdate (which are then included in the CDB files by LocalisationCache), or the json files created from the CDB files by the deployment scripts for efficient syncing?

There are two different sets of json files? I'm glad I asked for help: I didn't know that it generated json files for the updated messages, then built cdbs, then generated more json files from the cdbs, and then generated cdbs again from the json files.

I was referring to the json files that are created from the cdbs and then used to recreate the cdbs. Now that I know better how things work, I'm digging further into it.

Change 158146 had a related patch set uploaded by Anomie:
LocalisationCache: Process one fallback at a time

https://gerrit.wikimedia.org/r/158146

Change 158147 had a related patch set uploaded by Anomie:
Use new LocalisationCacheRecacheFallback hook

https://gerrit.wikimedia.org/r/158147

The problem is over-optimization: in LocalisationUpdate you're storing only the changed messages for each language, but that's causing problems when the current language doesn't have a change but a fallback language does. We've had this bug before, and it was worked around in r104041.

Rather than repeating the r104041 workaround and having this recur the next time someone tries to optimize things, let's really fix it: LocalisationUpdate needs a hook that is called for each fallback before it's merged into the final array, rather than one hook after we've already lost the information as to which language each message came from. But to do that, we have to rearrange the processing in LocalisationCache so it accumulates the data for each fallback language
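
To make the ordering problem concrete, here is a minimal Python sketch of the two merge strategies described above. It is an illustration only, not the MediaWiki or LocalisationUpdate code: the function names, the dictionaries, and the old Russian base string are all hypothetical, and the real hook operates on the full localisation cache data rather than plain message dictionaries.

```python
KEY = "visualeditor-ca-editsource"

# Hypothetical base translations shipped with the deployed code.
BASE = {
    "uk": {KEY: "Редагувати код"},
    "ru": {KEY: "(old ru text)"},  # placeholder, not a real message value
}

# Sparse updates: only messages that changed are stored per language.
# There is no "uk" entry because the uk translation was already up to date.
UPDATES = {
    "ru": {KEY: "Править вики-текст"},
}

# Fallback chain: uk falls back to ru.
FALLBACKS = {"uk": ["ru"]}


def merge_then_update(code):
    """Merge all fallbacks first, then apply updates in a single pass.

    By the time updates are applied, the merged array no longer records
    which language each message came from, so a fallback's update can
    overwrite a perfectly good translation in the requested language.
    """
    chain = [code] + FALLBACKS.get(code, [])
    merged = {}
    for lang in chain:
        for key, value in BASE.get(lang, {}).items():
            merged.setdefault(key, value)  # earlier languages win
    # Apply updates from the furthest fallback up to the language itself.
    for lang in reversed(chain):
        merged.update(UPDATES.get(lang, {}))
    return merged


def update_each_then_merge(code):
    """Apply each language's updates to its own data before merging."""
    chain = [code] + FALLBACKS.get(code, [])
    merged = {}
    for lang in chain:
        data = dict(BASE.get(lang, {}))
        data.update(UPDATES.get(lang, {}))  # per-fallback "hook"
        for key, value in data.items():
            merged.setdefault(key, value)  # earlier languages still win
    return merged


buggy = merge_then_update("uk")
fixed = update_each_then_merge("uk")
```

In this sketch `buggy[KEY]` ends up with the Russian update even though a Ukrainian translation exists, while `fixed[KEY]` keeps the Ukrainian string, because the Russian update is applied only to the Russian data before it is considered as a fallback.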

Change 158147 merged by jenkins-bot:
Use new LocalisationCacheRecacheFallback hook

https://gerrit.wikimedia.org/r/158147

Change 158146 merged by jenkins-bot:
LocalisationCache: Process one fallback at a time

https://gerrit.wikimedia.org/r/158146

Now theoretically FIXED (once the code is deployed)?

Change 158421 had a related patch set uploaded by Reedy:
LocalisationCache: Process one fallback at a time

https://gerrit.wikimedia.org/r/158421

Change 158422 had a related patch set uploaded by Reedy:
Use new LocalisationCacheRecacheFallback hook

https://gerrit.wikimedia.org/r/158422

Change 158423 had a related patch set uploaded by Reedy:
LocalisationCache: Process one fallback at a time

https://gerrit.wikimedia.org/r/158423

Change 158422 merged by jenkins-bot:
Use new LocalisationCacheRecacheFallback hook

https://gerrit.wikimedia.org/r/158422

Change 158421 merged by jenkins-bot:
LocalisationCache: Process one fallback at a time

https://gerrit.wikimedia.org/r/158421

Change 158423 merged by jenkins-bot:
LocalisationCache: Process one fallback at a time

https://gerrit.wikimedia.org/r/158423

I think it's safe to mark this fixed now, although we still have to force-update all the l10n caches to make sure brokenness doesn't stay cached somewhere. Reedy says on IRC that he'll do that once he finishes the train deploy that's going on right now.