
very slow interwiki.py and urlopen error with disambiguations
Closed, Declined (Public)

Description

After the change to how disambiguation pages are handled, the interwiki bot is very slow. Loading 60 pages from the source language takes more than one minute (loading from other languages takes the usual time).

Additionally, the bot often freezes (sometimes every minute, sometimes after 10 minutes) because it gets no response from the server. See the log:

interwiki.py -new -family:wiktionary -wiktionary -async -autonomous -cleanup

Getting 60 pages from wiktionary:cs...

NOTE: [[cs:synowa]] does not have any interwiki links
...
ERROR: URLError: <urlopen error [Errno 10060] ...>
WARNING: Could not open 'https://cs.wiktionary.org/w/api.php?action=query&format=json&titles=steteramus&ppprop=disambiguation&prop=pageprops'. Maybe the server or your connection is down. Retrying in 1 minutes...

Version: compat-(1.0)
Severity: normal

Details

Reference
bz55374

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 22 2014, 2:38 AM)
bzimport set Reference to bz55374.

That seems to be because, for each existing page, it makes a separate API request for the page's disambiguation property. I wonder whether this could be done in the first page request, when checking for the page's existence, instead of making an additional query.
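A minimal sketch of the idea in the comment above: ask for the disambiguation property of many titles in one query, and read existence from the same response, rather than sending one request per page. This is not the compat code. The helper names (`batches`, `build_query_url`, `classify`) are hypothetical, and the sketch assumes the standard MediaWiki `action=query` behaviour of accepting several titles joined with `|` and of marking nonexistent pages with a `missing` key in the JSON output.

```python
import json
from urllib.parse import urlencode

# Assumption: the usual MediaWiki limit of 50 titles per request for
# accounts without higher API limits.
BATCH_SIZE = 50


def batches(titles, size=BATCH_SIZE):
    """Yield successive slices of at most `size` titles."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def build_query_url(api_url, titles):
    """Build one pageprops query that covers every title in the batch."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "pageprops",
        "ppprop": "disambiguation",
        "titles": "|".join(titles),
    }
    return api_url + "?" + urlencode(params)


def classify(response_text):
    """Return (existing, disambiguation) title sets from one JSON response."""
    pages = json.loads(response_text).get("query", {}).get("pages", {})
    existing, disambig = set(), set()
    for page in pages.values():
        # Nonexistent or invalid titles carry a marker key and no pageprops.
        if "missing" in page or "invalid" in page:
            continue
        existing.add(page["title"])
        if "disambiguation" in page.get("pageprops", {}):
            disambig.add(page["title"])
    return existing, disambig
```

With this shape, the 60 pages from the log would need two requests (50 + 20 is the pattern for 70, so 50 + 10 here) instead of 60, and each response answers both "does the page exist" and "is it a disambiguation page". Fetching the URL and retrying on `URLError` are left out of the sketch on purpose.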

I checked, and even with my very slow internet connection (100 KB/s) it works quickly and correctly. The way disambiguation is checked has changed, so I am closing this bug as fixed, but if it is still really slow, feel free to reopen it.

I'm not sure it's faster; it just looks different. It seems to take longer at the beginning now (loading the first 50 pages from Wiktionary takes 1 minute or so). It seems to be checking for the disambiguation page attribute individually when it fetches the page titles, instead of when it processes the page. That would be the same problem. But again, that's just my impression; I have not examined the code. Either way, it's still slow, just in a different phase of the process.