
API requests to commons frequently return 500
Closed, Resolved, Public

Description

Author: daniel

Description:
Similar to bug 30201, API requests using pywikipediabot now frequently return error 500. This screws up my Quality Images bot completely, as it fails in a half-finished state, creating a terrible mess.


Version: unspecified
Severity: normal

Details

Reference
bz32598

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 12:07 AM
bzimport set Reference to bz32598.
bzimport added a subscriber: Unknown Object (MLST).

daniel wrote:

What queries? It is a simple page.put() in pywikipedia:

Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via API
HTTPError: 500 Internal Server Error

WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server is down. Retrying in 1 minutes...
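
For context, QICbot ran on the old pywikipedia framework; the following is only a minimal sketch of the failing step written against today's pywikibot successor library, with the retry loop and 60-second wait mirroring the log above. All names here are illustrative, not QICbot's actual code.

```python
# Illustrative sketch (not QICbot's actual code) of the failing step,
# using modern pywikibot rather than 2011-era pywikipedia.
import time
import pywikibot
from pywikibot.exceptions import ServerError

site = pywikibot.Site('commons', 'commons')
page = pywikibot.Page(
    site, 'Commons:Quality images/Subject/Places/Man made structures')

new_text = page.text + '\n<!-- newly promoted images go here -->'
for attempt in range(5):
    try:
        page.put(new_text, summary='sorted into the subject pages')
        break
    except ServerError:
        # An HTTP 500 from api.php surfaces as a server error; the bot
        # log above shows it waiting a minute before retrying.
        time.sleep(60)
```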

(In reply to comment #2)

> What queries? It is a simple page.put() in pywikipedia:
>
> Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via API
> HTTPError: 500 Internal Server Error
>
> WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server is down. Retrying in 1 minutes...

Could you get on IRC some time and find me (RoanKattouw)? Then we can do a controlled experiment where you try to reproduce the error while I watch the server logs.

(In reply to comment #2)

> What queries? It is a simple page.put() in pywikipedia:
>
> Updating page [[Commons:Quality images/Subject/Places/Man made structures]] via API
> HTTPError: 500 Internal Server Error
>
> WARNING: Could not open 'http://commons.wikimedia.org/w/api.php'. Maybe the server is down. Retrying in 1 minutes...

Right, but you didn't say this in your original post, so you could've been doing one of many different things. We're not mind readers ;)

daniel wrote:

The page in question is very large (3.3 MB). Might this be a problem?

I just went through the logs of the last few weeks and 17 out of 18 times the error happens when the bot is trying to update that particular page! Sometimes it works after a few retries, sometimes all retries fail (pywikipediabot seems to ignore the increased retry count I set).

(In reply to comment #5)

> The page in question is very large (3.3 MB). Might this be a problem?

You mean you were trying to save 3.3 megs of *wikitext*? That should just be rejected, because $wgMaxArticleSize is set to 2000 KB.

daniel wrote:

Oops, scratch that. I was looking at the rendered HTML. The wikitext is 'just' 276K. However, the observation still holds that this particular page causes most of the bot failures (and it is not the first page the bot accesses!)
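
A cheap way for a bot to rule this failure mode out up front is to compare the wikitext size against the limit before saving. A minimal sketch; the 2000 KB constant is taken from the $wgMaxArticleSize value quoted above, and the function name is invented:

```python
# Hypothetical pre-save guard mirroring MediaWiki's $wgMaxArticleSize
# check (quoted above as 2000 KB; the limit counts bytes of wikitext).
MAX_ARTICLE_SIZE_KB = 2000

def assert_saveable(wikitext: str) -> None:
    size_kb = len(wikitext.encode('utf-8')) / 1024
    if size_kb > MAX_ARTICLE_SIZE_KB:
        raise ValueError(
            f'wikitext is {size_kb:.0f} KB, over the '
            f'{MAX_ARTICLE_SIZE_KB} KB $wgMaxArticleSize limit; '
            f'the save would be rejected anyway')

# This page is ~276 KB of wikitext, so it passes; a genuine 3.3 MB
# body would have been refused before ever hitting api.php.
```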

I worked with Daniel to obtain a backtrace; it's an OOM error. Offhand I'd say there are three explanations for this:

  1. the object fetched from memcached is large, e.g. because it includes metadata (I now realize that the "(tried to allocate 8208 bytes)" part of the error message disproves this)
  2. GlobalUsage fetches many file objects, and there's a memory leak somewhere
  3. we're in the post-parse LinksUpdate hook while the parser has just finished parsing a large wiki page, so the wikitext and all sorts of metadata and parser data structures are still in memory

[23-Nov-2011 15:41:47] Fatal error: Allowed memory size of 125829120 bytes exhausted (tried to allocate 8208 bytes) at /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php on line 918
Server: srv300
Method: POST
URL: http://commons.wikimedia.org/w/api.php
Cookie: commonswiki_session=CENSORED; commonswikiUserID=126604; commonswikiToken=CENSORED; commonswikiUserName=QICbot;
Backtrace:
#0 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php(918): unserialize('a:15:{s:7:"vers...')
#1 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedClient.php(436): MWMemcached->_load_items(Resource id #146, Array)
#2 /usr/local/apache/common-local/php-1.18/includes/objectcache/MemcachedPhpBagOStuff.php(63): MWMemcached->get('commonswiki:fil...')
#3 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(184): MemcachedPhpBagOStuff->get('commonswiki:fil...')
#4 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(340): LocalFile->loadFromCache()
#5 /usr/local/apache/common-local/php-1.18/includes/filerepo/LocalFile.php(569): LocalFile->load()
#6 /usr/local/apache/common-local/php-1.18/includes/filerepo/FileRepo.php(126): LocalFile->exists()
#7 /usr/local/apache/common-local/php-1.18/includes/filerepo/FileRepo.php(179): FileRepo->findFile('Case_??_la_chef...', Array)
#8 /usr/local/apache/common-local/php-1.18/extensions/GlobalUsage/GlobalUsageHooks.php(20): FileRepo->findFiles(Array)
#9 [internal function]: GlobalUsageHooks::onLinksUpdateComplete(Object(LinksUpdate))
#10 /usr/local/apache/common-local/php-1.18/includes/Hooks.php(216): call_user_func_array('GlobalUsageHook...', Array)
#11 /usr/local/apache/common-local/php-1.18/includes/GlobalFunctions.php(3626): Hooks::run('LinksUpdateComp...', Array)
#12 /usr/local/apache/common-local/php-1.18/includes/LinksUpdate.php(113): wfRunHooks('LinksUpdateComp...', Array)
#13 /usr/local/apache/common-local/php-1.18/includes/WikiPage.php(2022): LinksUpdate->doUpdate()
#14 /usr/local/apache/common-local/php-1.18/includes/WikiPage.php(1200): WikiPage->doEditUpdates(Object(Revision), Object(User), Array)
#15 [internal function]: WikiPage->doEdit('==Man made stru...', 'sorted into the...', 118)
#16 /usr/local/apache/common-local/php-1.18/includes/Article.php(1921): call_user_func_array(Array, Array)
#17 [internal function]: Article->__call('doEdit', Array)
#18 /usr/local/apache/common-local/php-1.18/includes/EditPage.php(1214): Article->doEdit('==Man made stru...', 'sorted into the...', 118)
#19 /usr/local/apache/common-local/php-1.18/includes/api/ApiEditPage.php(273): EditPage->internalAttemptSave(Array, true)
#20 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(692): ApiEditPage->execute()
#21 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(358): ApiMain->executeAction()
#22 /usr/local/apache/common-local/php-1.18/includes/api/ApiMain.php(342): ApiMain->executeActionWithErrorHandling()
#23 /usr/local/apache/common-local/php-1.18/api.php(115): ApiMain->execute()
#24 /usr/local/apache/common-local/live-1.5/api.php(3): require('/usr/local/apac...')
#25 {main}
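
Spelling out the two numbers in that fatal error shows why theory #1 is ruled out: the limit is 120 MB, and the allocation that finally failed is only ~8 KB, so memory was exhausted by accumulation rather than by one oversized read.

```python
# The two numbers from the fatal error, spelled out.
memory_limit = 125829120     # "Allowed memory size" in bytes
failed_alloc = 8208          # "tried to allocate" in bytes

print(memory_limit / 2**20)  # 120.0 -> the PHP memory_limit is 120 MB
print(failed_alloc / 1024)   # ~8 KB -> a tiny allocation failed, so the
                             # heap was already full of accumulated
                             # objects (pointing at theories 2 and 3)
```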

(In reply to comment #8)

> I worked with Daniel to obtain a backtrace; it's an OOM error. Offhand I'd say there are three explanations for this:
>
>   1. the object fetched from memcached is large, e.g. because it includes metadata (I now realize that the "(tried to allocate 8208 bytes)" part of the error message disproves this)
>   2. GlobalUsage fetches many file objects, and there's a memory leak somewhere
>   3. we're in the post-parse LinksUpdate hook while the parser has just finished parsing a large wiki page, so the wikitext and all sorts of metadata and parser data structures are still in memory

Daniel blanked the page in question, upon which the bot edit worked fine. He mentioned the page used to contain a gallery with 7000 images, so I'm now leaning towards theory #2.

(In reply to comment #9)

> Daniel blanked the page in question, upon which the bot edit worked fine. He mentioned the page used to contain a gallery with 7000 images, so I'm now leaning towards theory #2.

Turns out I was right. Fixed in r104049 and deployed.
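
The r104049 diff itself isn't quoted in this task; purely as an illustration of the theory-#2 failure mode, the general shape of such a fix is to handle the page's image links in bounded batches instead of materializing one cached file object per gallery entry.

```python
# Illustrative only -- not the actual r104049 diff. With ~7000 gallery
# images, instantiating a LocalFile-style object per title exhausts the
# 120 MB limit; bounded batches keep peak memory roughly O(BATCH).
from typing import Callable, Iterator, List

BATCH = 500

def batched(titles: List[str], size: int = BATCH) -> Iterator[List[str]]:
    for i in range(0, len(titles), size):
        yield titles[i:i + size]

def record_global_usage(
    image_titles: List[str],
    fetch_rows: Callable[[List[str]], list],  # one bulk lookup per batch
    store: Callable[[list], None],
) -> None:
    for chunk in batched(image_titles):
        store(fetch_rows(chunk))
        # the fetched rows go out of scope on each iteration, so memory
        # use no longer grows with the size of the gallery
```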