Write git localization import scripts for translatewiki.net
Closed, Resolved | Public

Description

One thing that the git migration might make more complicated is imports from translatewiki.net. This task is about fixing up those scripts.


Version: unspecified
Severity: major

Details

Reference
bz34137

Event Timeline

bzimport raised the priority of this task to Unbreak Now! (Nov 22 2014, 12:16 AM)
bzimport added a project: Gerrit.
bzimport set Reference to bz34137.

sumanah wrote:

The scripts to change to use Gerrit are:

translatewiki.net:/home/betawiki/update-mediawiki-ext
translatewiki.net:/home/betawiki/bpmw

And step 5 in https://www.mediawiki.org/wiki/Git/Conversion/issues#Localization needs to be automated. Antoine, Chad will write up his notes for you today so you can do this as early as possible next week.

sumanah wrote:

Antoine will be doing this next week. From Chad's notes:

"There's 2 major things to be done to this process.

  1. Update their update-mediawiki-ext script to work on a git clone
     instead of a svn checkout. This should be pretty trivial in practice.
     A sub-step here is making the FUZZY process more automatic,
     but I wasn't convinced that this wasn't out of scope...

  2. Right now, step #5 (on-wiki) is manually applying the message files
     that were exported onto their local working copies and then committing.
     This is annoying so we'd like to automate that (and probably just fold it
     into the export step). This is probably the most time-consuming part.
     We're also going to need a dedicated l10n-update user in gerrit with a
     private/public key pair that they will push with. That shouldn't be too
     bad..."
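
The second item in Chad's notes (commit the exported message files in a clone, then push them as a dedicated user with its own key) could be scripted roughly as below. This is only a sketch under assumptions: the remote name `origin`, the target `HEAD:master`, the commit message, and all function names are hypothetical, and nothing here reflects what the actual translatewiki.net scripts do.

```python
import os
import shlex
import subprocess


def build_commit_commands(message="Localisation updates from translatewiki.net"):
    """Return the git commands that stage and commit changed message files."""
    return [
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
    ]


def build_push_command(remote="origin", refspec="HEAD:master"):
    """Return the push command; remote and refspec are assumptions."""
    return ["git", "push", remote, refspec]


def ssh_environment(key_path):
    """Point git at the dedicated user's private key via GIT_SSH_COMMAND."""
    quoted = shlex.quote(str(key_path))
    return {"GIT_SSH_COMMAND": f"ssh -i {quoted} -o IdentitiesOnly=yes"}


def has_changes(repo_dir):
    """True if the working copy differs from HEAD."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())


def commit_and_push(repo_dir, key_path, dry_run=True):
    """Commit and push exported files; with dry_run, only return the commands."""
    commands = build_commit_commands() + [build_push_command()]
    if dry_run:
        return commands
    if not has_changes(repo_dir):
        return []  # nothing exported since the last run
    env = {**os.environ, **ssh_environment(key_path)}
    for command in commands:
        subprocess.run(command, cwd=repo_dir, env=env, check=True)
    return commands
```

Calling `commit_and_push(repo, key, dry_run=True)` lists the commands without touching the repository, which makes it easy to review what the export step would do before enabling it.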

(Roan agrees that automating the FUZZY process is probably a bit out of scope for the 21 March migration.)

The fuzzy part is indeed out of scope. No idea how that crept in; probably because I mentioned that as a risk to Chad.

I have been granted shell access to translatewiki.net, and now need to rethink the workflow for git.

sumanah wrote:

Antoine's thoughts right now (hope you don't mind me sharing them from email, Antoine!) as to the workflow he's implementing for the short term:

"based on a central repository holding all i18n files. This way, TranslateWiki will be able to fetch all the i18n files from that single repository.

For the reverse process:

Have the scripts that export translatewiki translations to push
changes in that single repository.

That repository could be configured to automatically approve changes
submitted by translatewiki.

Then we will have to split the files back into the various extensions.
This could be done from time to time, for example when releasing a
new version."

Why the sudden change of plans? I see no point in that, because the English source messages cannot be taken away (atomic commits and stuff), so we still need to fetch all the repositories.

Also taking translation updates out of the source is unacceptable if there is no alternative (like LocalisationUpdate) shipped with MediaWiki.

Sumana asked me to jump into this bug. Some thoughts from my side: I do most of the MediaWiki translation meta-stuff at translatewiki.net.

This is my workflow:

  1. Reading every commit (core and ALL extensions). That means I need a tool in Git to go linearly from commit to commit, like in the current CR tool.
  2. I open every commit which touches an i18n file in a new browser tab.
  3. A few times per day I run our script "bupdate" on translatewiki.net. It contains an "svn up" for core, the extensions used on TWN, and ALL i18n files.
  4. Sync new English message keys (and sometimes committed translations too) with TWN using the script "sync-group.php --lang=en --group="ext-..."".
  5. Based on 2, I have to decide:
     5a. FUZZY translations in case of relevant message changes to the English original.
     5b. For core messages, check whether the committer has updated "messages.inc" and/or "messageTypes.inc" accordingly. Otherwise the rebuild of the core language files in 8 breaks.
     5c. Identify message keys that should be ignored or are optional to translate, and add them to the file "mediawiki-defines.txt".
     5d. Register new extensions for TWN.
  6. Once per day I export new translations using the "bpmw xxx" script (xxx = time in hours since the last export).
  7. Transferring the new i18n files to my local machine.
  8. Running a magic script on my local machine to rebuild the i18n files (for core only: rebuildLanguage.php).
  9. Create a diff of all changes for a (short) review. Sometimes I find errors in the files, especially in i18n files of new extensions or from newly approved translators.
  10. Commit core and extension i18n files to SVN.
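
Steps 2 and 5b of this workflow (picking out commits that touch i18n files, and noticing core changes that may need a matching "messages.inc" update) could be approximated with a small filter like the one below. It is a sketch, not the tooling used on TWN: the file patterns (`*.i18n.php`, `languages/messages/Messages*.php`), the input shape (a dict of commit id to changed paths), and the function names are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Assumed naming conventions for message files; adjust to the real layout.
I18N_PATTERNS = (
    "*.i18n.php",
    "*languages/messages/Messages*.php",
)


def i18n_paths(paths):
    """Return the changed paths that look like i18n files."""
    return [
        path for path in paths
        if any(fnmatchcase(path, pattern) for pattern in I18N_PATTERNS)
    ]


def commits_touching_i18n(commits):
    """Map commit id -> i18n paths, keeping only commits that touch any."""
    result = {}
    for commit_id, paths in commits.items():
        hits = i18n_paths(paths)
        if hits:
            result[commit_id] = hits
    return result


def needs_messages_inc_check(paths):
    """Flag core commits that change MessagesEn.php without messages.inc.

    This is only a hint for manual review: not every change to the English
    messages requires a messages.inc update.
    """
    changed_english = any(
        path.endswith("languages/messages/MessagesEn.php") for path in paths
    )
    changed_inc = any(
        path == "messages.inc" or path.endswith("/messages.inc")
        for path in paths
    )
    return changed_english and not changed_inc
```

The changed paths per commit could come from something like `git log --name-only`; the filter itself stays independent of how the list is produced.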

Hope that helps.

I have deployed a basic script which fetches i18n files from git. The source code is currently versioned on my computer only :/

The script is fetch-i18n-files.pl and has a wrapper named update-mediawiki-ext-git

It will fetch the extensions in projects/wmf-ext

Currently pending installation of package libio-socket-ssl-perl on translatewiki.net

Reply to comment 8: That is not an acceptable solution. It's going back in flexibility, it's error prone and implements L10n updates as an afterthought, instead of as a part of the main development process.

Please propose a solution that lies closer to the current situation.

sumanah wrote:

Antoine said today:

"I got scripts that would fetch the changes from mediawiki/core.git as well as from all extensions hosted on gerrit; will then write basic documentation for the language team"

Still waiting for a solution for exporting/committing/pushing. For reading I like solution 2 better.

(In reply to comment #13)

> Still waiting for a solution for exporting/committing/pushing. For reading I like
> solution 2 better.

Exporting changes is very similar. While studying that script, I made a copy of bpmw as bpmw-git. I have changed it to reflect the submodule tree hierarchy, which is:

./core/ # points to mediawiki/core.git
./extensions/
./extensions/A # points to submodule mediawiki/extensions/A.git
./extensions/B # points to submodule mediawiki/extensions/B.git

Exported PHP files are moved to ./core/, and exported directories to ./extensions/.

bpmw-git exports files to /home/betawiki/export-git and publishes a tarball.
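
The mapping from the export directory onto the submodule tree can be illustrated with a short sketch. It takes the description above literally (top-level PHP files go to ./core/, directories go to ./extensions/); the function name `plan_moves` and the flat export layout are assumptions, and the real bpmw-git script may place files differently.

```python
from pathlib import Path


def plan_moves(export_dir, checkout_dir):
    """Return (source, destination) pairs for everything in the export dir.

    Assumes a flat export directory: top-level .php files belong to core,
    and each top-level directory is one extension. Other files are skipped.
    """
    export_dir = Path(export_dir)
    checkout_dir = Path(checkout_dir)
    moves = []
    for entry in sorted(export_dir.iterdir()):
        if entry.is_dir():
            moves.append((entry, checkout_dir / "extensions" / entry.name))
        elif entry.suffix == ".php":
            moves.append((entry, checkout_dir / "core" / entry.name))
    return moves
```

Returning a plan instead of moving files directly keeps the step reviewable: the pairs can be printed first and only applied (for example with `shutil.move`) once they look right.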

This was fixed last week with Raymond and NikeRabbit.

Thanks for the help, everyone. One major hurdle taken.