
Stray newlines at end of some messages break .po -> .mo compilation
Closed, Resolved (Public)

Description

msgfmt fails when it sees a message whose original ends with a newline but whose translation does not, something that I've seen creep into translations a number of times.

Error messages look like this:

locale/cs/LC_MESSAGES/statusnet.po:5904: `msgid' and `msgstr' entries do not both end with '\n'
locale/ka/LC_MESSAGES/statusnet.po:5443: `msgid' and `msgstr' entries do not both end with '\n'
locale/ka/LC_MESSAGES/statusnet.po:5568: `msgid' and `msgstr' entries do not both end with '\n'
locale/ka/LC_MESSAGES/statusnet.po:5606: `msgid' and `msgstr' entries do not both end with '\n'
locale/ka/LC_MESSAGES/statusnet.po:5649: `msgid' and `msgstr' entries do not both end with '\n'

Siebrand thinks this may be due to extra whitespace inserted by the JS code, so you end up with " \" instead of "\" for the final line.

They can be manually cleaned up once noticed, but there are a few things that could help prevent them:

  • during editing, see if we can avoid the extra space insertion
  • at save time, see if we can detect the mismatch and throw a warning for the translator to fix it
  • at .po export time, see if we can detect the mismatch and either throw a warning or silently fix it in the output
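The save-time and export-time checks suggested above could look roughly like the following. This is only a sketch in Python, not the Translate extension's code; the function names `newline_mismatch` and `fix_trailing_newline` are made up for illustration. It assumes untranslated (empty) entries should be skipped, and it deliberately leaves any other trailing whitespace alone.

```python
def newline_mismatch(msgid: str, msgstr: str) -> bool:
    """Return True when exactly one of the two strings ends with a newline."""
    if not msgstr:
        # Assumption of this sketch: empty (untranslated) entries are not checked.
        return False
    return msgid.endswith("\n") != msgstr.endswith("\n")


def fix_trailing_newline(msgid: str, msgstr: str) -> str:
    """Return msgstr with its trailing newline made to match msgid.

    Only the newline itself is added or removed; trailing spaces are kept,
    since they might be intentional message content.
    """
    if not newline_mismatch(msgid, msgstr):
        return msgstr
    if msgid.endswith("\n"):
        return msgstr + "\n"
    return msgstr.rstrip("\n")


if __name__ == "__main__":
    # Warn at save time, or silently repair at export time.
    source, translation = "Notice saved.\n", "Oznámení uloženo."
    if newline_mismatch(source, translation):
        print("warning: msgid and msgstr do not both end with a newline")
        translation = fix_trailing_newline(source, translation)
```

Whether to warn or to repair silently is a policy choice; the repair above is the conservative variant that never strips spaces.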

Version: unspecified
Severity: enhancement

Details

Reference
bz25120

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 11:15 PM
bzimport set Reference to bz25120.
bzimport added a subscriber: Unknown Object (MLST).

It still puzzles me why this is only a problem for StatusNet. Preferably, messages should not end with whitespace.

What is mentioned in comment 1 is exactly the issue, in particular for messages that end with a newline ("\n"), which was denoted in Translate by a single "\" without leading whitespace. However, we used to have TM suggestions that inserted whitespace at the beginning of every line, or, if translators copy-pasted the TM suggestion instead of using the JavaScript insert arrow, the result might also have leading whitespace. Annoying, but almost impossible to fix because, however unlikely, " \" (with a single leading space) might be valid message content in gettext.

The only fix that I see is save-time gettext validation (source against translation) for exactly this issue: if the "entries do not both end with '\n'" condition applies, just mark the translation fuzzy. Is that possible, Niklas?

This should maybe be a JS based check, based on the source message for gettext groups, or done "on save".
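For illustration, a save-time check that marks the translation fuzzy could be sketched as below. This is Python rather than the extension's own code, and the `Entry` class and `validate_on_save` function are hypothetical stand-ins, not part of Translate or the validator framework.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical stand-in for one gettext message and its translation."""
    msgid: str
    msgstr: str
    flags: set = field(default_factory=set)


def validate_on_save(entry: Entry) -> bool:
    """Mark the entry fuzzy if msgid and msgstr disagree on a trailing newline.

    Returns True when the entry passes the check, False when it was flagged.
    """
    if not entry.msgstr:
        # Nothing translated yet, so there is nothing to compare.
        return True
    if entry.msgid.endswith("\n") != entry.msgstr.endswith("\n"):
        entry.flags.add("fuzzy")
        return False
    return True


if __name__ == "__main__":
    # A translation that ends with a stray space instead of the newline.
    entry = Entry(msgid="Notice saved.\n", msgstr="Oznámení uloženo. ")
    if not validate_on_save(entry):
        print("marked fuzzy:", sorted(entry.flags))
```

Flagging instead of rewriting keeps the translator's text untouched, which matters here because trailing whitespace may be legitimate content.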

Nikerabbit assigned this task to abi_.
Nikerabbit added subscribers: abi_, Nikerabbit.

This can now be enforced with the new validator framework added by @abi_.