Page MenuHomePhabricator

mw-update-l10n failure does not abort scap
Closed, ResolvedPublic

Description

mw-update-l10n wraps the l10n cache update in parentheses, causing it to run in a sub-shell.

The first problem is that 'set -e' doesn't work predictably in sub-shells. Its precise behavior is documented here: http://fvue.nl/wiki/Bash:_Error_handling#Caveat_1:_.60Exit_on_error.27_ignoring_subshell_exit_status.

The second problem is that the exit status of the sub-shell is ||'d with a call to 'echo' to emit an error message. Because echo typically returns success, the exit status of mw-update-l10n is always success, and so l10n failures do not reach scap.

A good solution is for the parent shell to explicitly check the exit status of any important command it invokes.

This pattern works well:

some_command || {

st="$?"
echo "some_command failed"
exit $st

}

It can be neatly encapsulated in a bash function, as shown in http://mywiki.wooledge.org/BashFAQ/101.


Version: wmf-deployment
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=50371

Details

Reference
bz50347

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 1:45 AM
bzimport set Reference to bz50347.
bzimport added a subscriber: Unknown Object (MLST).

A further consequence of the misplaced confidence in 'set -e' is that scap will sync an empty l10n file if mergeMessageFileList.php fails for whatever reason.

The problem is in lines 42-46 of mw-update-l10n:

mwTempDest=$(sudo -u apache mktemp)
sudo -u apache $BINDIR/mwscript mergeMessageFileList.php --wiki="$mwDbName" \
  --list-file=$MW_COMMON_SOURCE/wmf-config/extension-list $QUIET --output="$mwTempDest"
sudo -u apache chmod 664 "$mwTempDest"
cp "$mwTempDest" $MW_COMMON_SOURCE/wmf-config/ExtensionMessages-"$mwVerNum".php

An empty file is created by 'mktemp'. Because we're executing in a subshell, if mergeMessageFileList.php fails, the script continues to execute, and control eventually flows to 'cp' on line 46, which copies the temp file to ExtensionMessages-X.XXwmfX.php. The empty message file is then synced.

in case of WikibaseDataModel, we can try explicitly including it:

https://gerrit.wikimedia.org/r/#/c/71004/

I don't know if this really helps but seems to work better when i try mergeMessageFileList.php

if there is anything else we should do differently, please let us know!

Change 71056 had a related patch set uploaded by Aude:
(bug 50347) have mergeMessageFileList check if extension files in file-list are available

https://gerrit.wikimedia.org/r/71056

among other issues, I think we should not rely on include_once and have the mergeMessageFileList check for and omit any missing extension files.

see https://gerrit.wikimedia.org/r/71056

(In reply to comment #4)

among other issues, I think we should not rely on include_once and have the
mergeMessageFileList check for and omit any missing extension files.

see https://gerrit.wikimedia.org/r/71056

That is fine, but note that it wouldn't have avoided downtime in the recent case. We have to abort when there is a PHP fatal error, which causes include_once() to return false, so we have to check the return value from include_once. But WikibaseDataModel does an explicit "return" from the file scope, which causes include_once() to return null. That's why mergeMessageFileList.php continued to fail even after WikibaseDataModel was checked out.

The bug is fixed as described, I confirmed it on the live site. The problem with the WikibaseDataModel extension registration can be split out to another report if necessary.

Change 71056 abandoned by Tim Starling:
(bug 50347) have mergeMessageFileList check if extension files in file-list are available

Reason:
Per author.

https://gerrit.wikimedia.org/r/71056