PageNotFound error while running replace.py with compat
Closed, Declined | Public

Description

When I run pywikibot's replace.py, I sometimes (roughly once every 6-7 replaced pages) get an error that stops the run.

Traceback (most recent call last):
File "replace.py", line 967, in <module>
  main()
File "replace.py", line 956, in main
  bot.run()
File "replace.py", line 542, in run
  page.put(new_text, self.editSummary)
File "C:\compat\compat\wikipedia.py", line 2112, in put
  sysop = sysop, botflag=botflag, maxTries=maxTries)
File "C:\compat\compat\wikipedia.py", line 2203, in _putPage
  response, data = query.GetData(params, self.site(), sysop=sysop, back_response = True)
File "C:\compat\compat\pywikibot\support.py", line 121, in wrapper
  return method(*__args, **__kw)
File "C:\compat\compat\query.py", line 135, in GetData
  res, jsontext = site.postForm(path, params, sysop, site.cookies(sysop = sysop) )
File "C:\compat\compat\wikipedia.py", line 6495, in postForm
  cookies=cookies)
File "C:\compat\compat\wikipedia.py", line 6549, in postData
  raise PageNotFound(u'Page %s could not be retrieved. Check your family file ?' % url)
pywikibot.exceptions.PageNotFound: Page https://commons.wikimedia.org/w/api.php could not be retrieved. Check your family file ?

The family is set as:
family = 'commons'
mylang = 'commons'

The command used is: python replace.py -namespace:6 -cat:RCE_suggested:_Centrum -summary:"Remove RCE-tag suggestion (this specific tag is not useful)" "{{RCE-subject|Centrum}}" ""

The issue also occurs with other commands, and when entering a wrong password in login.py (so far only on the first attempt).

version information:
Pywikibot: wikipedia.py (r-1 (unknown), ???????, 2013/10/23, 12:56:06, OUTDATED)

Release version: 1.0b1
Python: 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)]
config-settings:
use_api = True
use_api_login = True
unicode test: ok


Version: compat-(1.0)
Severity: major
Whiteboard: aklapper-moreinfo

Details

Reference
bz56042

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 2:26 AM
bzimport set Reference to bz56042.
bzimport added a subscriber: Unknown Object (????).

I noticed that the command line in the IRC pastebin was:

python replace.py -debug -lang:commons -family:commons -namespace:6 -cat:RCE_suggested:_Centrum -summary:"Remove RCE-tag suggestion (not useful)" -regex -dotall "(\{\{RCE\-subject\|Centrum\}\})" ""

'git' is not recognized as an internal or external command, operable program or batch file.

Getting [[Category:RCE suggested: Centrum]] list...
Getting 60 pages via API from commons:commons...

...

>>> File:Overzicht - Amsterdam - 20408351 - RCE.jpg <<<
- {{RCE-subject|Centrum}}{{RCE-subject|Verdedigingswerk}}
+ {{RCE-subject|Verdedigingswerk}}

Updating page [[File:Overzicht - Amsterdam - 20408351 - RCE.jpg]] via API
Traceback (most recent call last):
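
For reference, the substitution shown in the log above can be reproduced outside the bot. The following is a minimal sketch using only Python's re module. It assumes that -regex -dotall amounts to compiling the pattern with re.DOTALL, and the helper name remove_tag is made up for illustration; it is not part of replace.py.

```python
import re

# Pattern and replacement as given on the command line in this report.
PATTERN = r"(\{\{RCE\-subject\|Centrum\}\})"
REPLACEMENT = ""


def remove_tag(text):
    """Hypothetical helper: apply the substitution to one page's wikitext."""
    # Assumption: the -dotall option corresponds to re.DOTALL.
    return re.sub(PATTERN, REPLACEMENT, text, flags=re.DOTALL)


# Wikitext line taken from the diff printed by the bot.
old_text = "{{RCE-subject|Centrum}}{{RCE-subject|Verdedigingswerk}}"
new_text = remove_tag(old_text)
print(new_text)
```

Since the pattern and the resulting text behave as expected here, the failure is more likely in the save request than in the replacement itself.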

On the other hand, the bot worked:
https://commons.wikimedia.org/w/index.php?title=File%3AOverzicht_hoekpartij_met_koperen_koepel%2C_overdekte_winkelgalerij_-_Amsterdam_-_20408976_-_RCE.jpg&diff=107729373&oldid=101940554

https://commons.wikimedia.org/w/index.php?title=File%3AOverzicht_-_Amsterdam_-_20408351_-_RCE.jpg&diff=107728528&oldid=101940682

I tried it with multiple commands, and indeed most of the time about 5 files get updated (up to 40 on one occasion) before the error occurs.

The error you describe is raised by the following code:

if e.code in [401, 404]:
    raise PageNotFound(u'Page %s could not be retrieved. Check '
                       u'your family file ?' % url)

which implies the server returned either HTTP/401 Unauthorized or HTTP/404 Not Found.
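
To make the ambiguity concrete, here is a minimal sketch of such a check in which the status code is carried into the message, so that a 401 can be told apart from a 404. It is not the code from wikipedia.py or from any patch: the local PageNotFound class and the function check_status are stand-ins defined only for this example.

```python
class PageNotFound(Exception):
    """Local stand-in for pywikibot.exceptions.PageNotFound."""


# Human-readable names for the two codes the original check treats alike.
REASONS = {401: "Unauthorized", 404: "Not Found"}


def check_status(code, url):
    """Hypothetical helper: raise PageNotFound for 401/404, naming the code."""
    if code in REASONS:
        raise PageNotFound(
            u"Page %s could not be retrieved (HTTP %d %s). "
            u"Check your family file ?" % (url, code, REASONS[code])
        )


try:
    check_status(401, "https://commons.wikimedia.org/w/api.php")
except PageNotFound as error:
    message = str(error)
    print(message)
```

With the code in the message, a 401 would point towards a login or cookie problem and a 404 towards a wrong path in the family file.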

Unfortunately, I cannot reproduce it with the new to-be-removed tag basvb suggested on IRC...

In https://gerrit.wikimedia.org/r/92075, I have added debug output to show what the actual error is. Could you:

  1. make a backup of wikipedia.py
  2. download https://git.wikimedia.org/raw/pywikibot%2Fcompat/015e067e078bc7611f27c4450755897e1c1fd42f/wikipedia.py and place it where the original one was
  3. run again, with -debug
  4. post the new debug response here?
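
Step 1 can be done by hand or with a few lines of Python. The sketch below copies the file next to the original with a .bak suffix. The function name backup_file and the suffix are choices made for this example only.

```python
import os
import shutil


def backup_file(path, suffix=".bak"):
    """Copy path to path + suffix and return the backup's location."""
    backup_path = path + suffix
    # Refuse to overwrite an earlier backup, so the original stays recoverable.
    if os.path.exists(backup_path):
        raise IOError("backup already exists: %s" % backup_path)
    shutil.copy2(path, backup_path)  # copy2 also preserves timestamps
    return backup_path
```

For the installation in the traceback that would be backup_file(r"C:\compat\compat\wikipedia.py"), after which the downloaded file can be put in its place.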

Thanks!

The change has been merged, so instead of downloading the separate file, you can now just download the latest nightly: http://tools.wmflabs.org/pywikibot/compat.zip

@Basvb, would you please follow the steps suggested by Merlijn? Thanks.

I'm currently not working on Commons and am not really up to date on how this all works. I can't find the time to dive into this in the coming 1-2 months. I'm sorry.

It seems to work now: I no longer get the error (around 100 pages changed so far). Thanks for the help, and sorry for the delay.

(In reply to Basvb from comment #8)

It seems to work now

Closing as WORKSFORME then...