Import of Commons partial export fails with error about plurals-mediawiki.xml
Closed, Resolved (Public)

Description

I exported the latest version of every page whose title starts with MediaWiki:Lang/ (MediaWiki:Lang/de, etc.) to test Template:LangSwitch. However, on import I get repeated warnings like the one below, I believe once for every revision:

Warning: DOMDocument::load(): I/O warning : failed to load external entity "/vagrant/mediawiki/languages/data/plurals-mediawiki.xml" in /vagrant/mediawiki/includes/cache/LocalisationCache.php on line 588

Call Stack:

 0.0045     778856   1. {main}() /vagrant/mediawiki/maintenance/importDump.php:0
 0.0121    1396960   2. require_once('/vagrant/mediawiki/maintenance/doMaintenance.php') /vagrant/mediawiki/maintenance/importDump.php:291
 0.3200   17065176   3. BackupReader->execute() /vagrant/mediawiki/maintenance/doMaintenance.php:113
 0.3202   17065624   4. BackupReader->importFromStdin() /vagrant/mediawiki/maintenance/importDump.php:97
 0.3206   17066456   5. BackupReader->importFromHandle() /vagrant/mediawiki/maintenance/importDump.php:253
 0.3279   17824496   6. WikiImporter->doImport() /vagrant/mediawiki/maintenance/importDump.php:286
98.5509  207738984   7. WikiImporter->handlePage() /vagrant/mediawiki/includes/Import.php:473
98.5575  207748280   8. WikiImporter->handleRevision() /vagrant/mediawiki/includes/Import.php:609
98.5603  207754536   9. WikiImporter->processRevision() /vagrant/mediawiki/includes/Import.php:658
98.5607  207758064  10. WikiImporter->revisionCallback() /vagrant/mediawiki/includes/Import.php:706
98.5607  207758696  11. call_user_func_array() /vagrant/mediawiki/includes/Import.php:351
98.5608  207758840  12. BackupReader->handleRevision() /vagrant/mediawiki/includes/Import.php:0
98.5608  207758920  13. call_user_func() /vagrant/mediawiki/maintenance/importDump.php:165
98.5608  207758976  14. WikiImporter->importRevision() /vagrant/mediawiki/maintenance/importDump.php:0
98.5610  207759760  15. DatabaseBase->deadlockLoop() /vagrant/mediawiki/includes/Import.php:257
98.5623  207760720  16. call_user_func_array() /vagrant/mediawiki/includes/db/Database.php:3004
98.5623  207761088  17. WikiRevision->importOldRevision() /vagrant/mediawiki/includes/db/Database.php:0
98.5918  207781560  18. WikiPage->doEditUpdates() /vagrant/mediawiki/includes/Import.php:1434
98.5927  207782424  19. WikiPage->prepareContentForEdit() /vagrant/mediawiki/includes/WikiPage.php:2069
98.6048  207799064  20. WikitextContent->getParserOutput() /vagrant/mediawiki/includes/WikiPage.php:2026
98.6048  207799384  21. Parser->parse() /vagrant/mediawiki/includes/content/WikitextContent.php:300
98.6058  207812752  22. Parser->internalParse() /vagrant/mediawiki/includes/parser/Parser.php:395
98.6079  207813128  23. Parser->replaceInternalLinks() /vagrant/mediawiki/includes/parser/Parser.php:1230
98.6079  207813264  24. Parser->replaceInternalLinks2() /vagrant/mediawiki/includes/parser/Parser.php:1825
98.6080  207815560  25. Parser->getTargetLanguage() /vagrant/mediawiki/includes/parser/Parser.php:1861
98.6081  207815560  26. Title->getPageLanguage() /vagrant/mediawiki/includes/parser/Parser.php:837
98.6086  207815624  27. ContentHandler->getPageLanguage() /vagrant/mediawiki/includes/Title.php:4824
98.6097  207815920  28. wfGetLangObj() /vagrant/mediawiki/includes/content/ContentHandler.php:620
98.6107  207922328  29. Language::factory() /vagrant/mediawiki/includes/GlobalFunctions.php:1304
98.6107  207922328  30. Language::newFromCode() /vagrant/mediawiki/languages/Language.php:196
98.6122  207922688  31. Language::getFallbacksFor() /vagrant/mediawiki/languages/Language.php:237
98.6123  207922928  32. LocalisationCache->getItem() /vagrant/mediawiki/languages/Language.php:4120
98.6123  207922928  33. LocalisationCache->loadItem() /vagrant/mediawiki/includes/cache/LocalisationCache.php:259
98.6124  207922928  34. LocalisationCache->initLanguage() /vagrant/mediawiki/includes/cache/LocalisationCache.php:324
98.6203  207923208  35. LocalisationCache->recache() /vagrant/mediawiki/includes/cache/LocalisationCache.php:442
98.6667  208613368  36. LocalisationCache->readSourceFilesAndRegisterDeps() /vagrant/mediawiki/includes/cache/LocalisationCache.php:772
98.7115  210162200  37. LocalisationCache->getPluralRuleTypes() /vagrant/mediawiki/includes/cache/LocalisationCache.php:630
98.7115  210162200  38. LocalisationCache->loadPluralFiles() /vagrant/mediawiki/includes/cache/LocalisationCache.php:558
98.7188  210162576  39. LocalisationCache->loadPluralFile() /vagrant/mediawiki/includes/cache/LocalisationCache.php:578
98.7188  210163240  40. DOMDocument->load() /vagrant/mediawiki/includes/cache/LocalisationCache.php:588

Version: 1.23.0
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=62368

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 2:27 AM
bzimport set Reference to bz56439.

Created attachment 13641
Dump that shows problem

Created attachment 13643
Comparison trivial import that works

The import that produces the warning also does not seem to work, even partially (though I have interrupted it with Ctrl-C every time).

A trivial one-page import (attached) works fine.

This only happens with the full-history version of the dump; the current-versions-only dump imports fine.

*** Bug 60435 has been marked as a duplicate of this bug. ***

I wonder if this has to do with insufficient caching: if we're hammering the localisation cache in MySQL for hundreds of revisions per second, I wouldn't be surprised if something goes wrong.
In an instance set up by role::mediawiki-install::labs, I have now uncommented

$wgCacheDirectory = "$IP/cache";

and

php maintenance/rebuildLocalisationCache.php --force

took only a few seconds instead of minutes because it uses CDB files (I first checked write permissions on the cache/ directory, which is owned by www-data rather than root).
From a cursory view of top, it seems that now the php process running importDump is as busy as before, but mysqld is one order of magnitude less loaded.
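
A sketch of the LocalSettings.php fragment this describes. The $wgLocalisationCacheConf line is an addition that does not appear in the comment above: the default store setting ('detect') should already select the file-based store once $wgCacheDirectory is set, so treat the explicit 'files' value as optional and as an assumption about this MediaWiki version.

```php
// LocalSettings.php fragment (not a standalone script; $IP is defined by MediaWiki).

// Directory for the on-disk localisation cache. It must be writable by the
// user that runs the web server and the maintenance scripts.
$wgCacheDirectory = "$IP/cache";

// Assumption: force the file-based store explicitly instead of relying on
// the default 'detect' behaviour.
$wgLocalisationCacheConf['store'] = 'files';
```

After changing this, the cache can be rebuilt with the rebuildLocalisationCache.php command shown above.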

(In reply to Yuvi Panda from comment #3)
> Also affects testwikidatawiki

What caching system did it use?

Adding libxml_disable_entity_loader(false); to the top of LocalisationCache::loadPluralFile makes the warning go away. We disable the entity loader for a reason (it's insecure), so we can't use that as a solution. But it does point to entity loading as the most likely underlying cause.

Apparently caused by the fact that libxml_disable_entity_loader is not thread-safe. Great one, PHP!

See https://bugs.php.net/bug.php?id=64938 and http://pyd.io/f/topic/failed-to-load-external-entity-boot-confmanifest-xml

Change 149507 had a related patch set uploaded by TTO:
Use file_get_contents to load plurals XML data

https://gerrit.wikimedia.org/r/149507

*** Bug 65116 has been marked as a duplicate of this bug. ***

According to the comments on the patch, it is not clear whether it actually fixes the issue.

Not sure if any of those commenters actually tested it...

Change 149507 had a related patch set uploaded (by TTO):
LocalisationCache: Use file_get_contents instead of DOMDocument::load

https://gerrit.wikimedia.org/r/149507

Patch-For-Review

Change 149507 merged by jenkins-bot:
LocalisationCache: Use file_get_contents instead of DOMDocument::load

https://gerrit.wikimedia.org/r/149507
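
The merged change's title names the technique: read the file with file_get_contents and parse the resulting string, rather than letting DOMDocument::load open the file itself. Below is a minimal sketch of that idea, not the actual patch. The function name loadXmlViaString and the error handling are hypothetical and chosen only for illustration.

```php
<?php
/**
 * Hypothetical helper (not MediaWiki code): load an XML file by reading it
 * into a string first and then parsing the string, so that libxml does not
 * have to open the file through its own external-entity loader.
 */
function loadXmlViaString( string $fileName ): DOMDocument {
	// Read the raw bytes with PHP's own file layer.
	$xml = file_get_contents( $fileName );
	if ( $xml === false || $xml === '' ) {
		throw new RuntimeException( "Unable to read contents of $fileName" );
	}

	// Parse the in-memory string instead of calling DOMDocument::load().
	$doc = new DOMDocument();
	if ( !$doc->loadXML( $xml ) ) {
		throw new RuntimeException( "Unable to parse XML in $fileName" );
	}
	return $doc;
}
```

A call such as loadXmlViaString( "$IP/languages/data/plurals-mediawiki.xml" ) would then stand in for the DOMDocument::load call at the bottom of the stack trace. Because the file is opened by PHP rather than by libxml, the entity loader can presumably stay disabled while the document is parsed.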