
archivebot problems on cswiki
Closed, Resolved (Public)

Description

Several errors prevent some pages from being archived correctly:

I:\py\rewrite>pwb.py archivebot archive -lang:cs
...

  1. Incorrect month name, although the string "06." appears on the page only in URLs:

Processing [[cs:Diskuse s wikipedistou:JAn Dudík]]
incorrect month name "06." in page in site wikipedia:cs
ERROR: Error occured while processing page [[cs:Diskuse s wikipedistou:JAn Dudík
]]
ERROR: KeyError:
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 614, in main
  archiver = PageArchiver(pg, a, salt, force)
File "I:\py\rewrite\scripts\archivebot.py", line 383, in __init__
  self.page = DiscussionPage(page, self)
File "I:\py\rewrite\scripts\archivebot.py", line 293, in __init__
  self.load_page()
File "I:\py\rewrite\scripts\archivebot.py", line 321, in load_page
  cur_thread.feed_line(line)
File "I:\py\rewrite\scripts\archivebot.py", line 238, in feed_line
  timestamp = self.ts.timestripper(line)
File "I:\py\rewrite\pywikibot\textlib.py", line 1321, in timestripper
  raise KeyError

KeyError

  2. AttributeError: 'NoneType' object has no attribute 'group'

Processing [[cs:Diskuse s wikipedistou:JeremySil]]
19 Threads found on [[cs:Diskuse s wikipedistou:JeremySil]]
Looking for: {{archivace}} in [[cs:Diskuse s wikipedistou:JeremySil]]
ERROR: Error occured while processing page [[cs:Diskuse s wikipedistou:JeremySil
]]
ERROR: AttributeError: 'NoneType' object has no attribute 'group'
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 493, in run
  whys = self.analyze_page()
File "I:\py\rewrite\scripts\archivebot.py", line 453, in analyze_page
  max_arch_size = str2size(self.get_attr('maxarchivesize'))
File "I:\py\rewrite\scripts\archivebot.py", line 173, in str2size
  val, unit = (int(r.group(1)), r.group(2))

AttributeError: 'NoneType' object has no attribute 'group'

  3. When the archive is under a different path, the bot fails:

Processing [[cs:Wikipedie:Byrokraté/Nástěnka]]
29 Threads found on [[cs:Wikipedie:Byrokraté/Nástěnka]]
Looking for: {{archivace}} in [[cs:Wikipedie:Byrokraté/Nástěnka]]
Processing 29 threads
ERROR: Error occured while processing page [[cs:Wikipedie:Byrokraté/Nástěnka]]
ERROR: ArchiveSecurityError: Archive page [[cs:Wikipedie:Byrokraté/Archiv1]] doe
s not start with page title (Wikipedie:Byrokraté/Nástěnka)!
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 493, in run
  whys = self.analyze_page()
File "I:\py\rewrite\scripts\archivebot.py", line 481, in analyze_page
  if self.feed_archive(archive, t, max_arch_size, params):
File "I:\py\rewrite\scripts\archivebot.py", line 447, in feed_archive
  % (archive, self.page.title()))

ArchiveSecurityError: Archive page [[cs:Wikipedie:Byrokraté/Archiv1]] does not
start with page title (Wikipedie:Byrokraté/Nástěnka)!

  4. Unknown interwiki prefixes c: and outreach:

Processing [[cs:Wikipedie:Nástěnka správců]]
52 Threads found on [[cs:Wikipedie:Nástěnka správců]]
Looking for: {{archivace}} in [[cs:Wikipedie:Nástěnka správců]]
Processing 52 threads
127 Threads found on [[cs:Wikipedie:Nástěnka správců/Archiv58]]
Archiving 23 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Nástěnka správců]]
ERROR: SiteDefinitionError: :c:User:Martinnovacek.cz is not a local page on wiki
pedia:cs, and the interwiki prefix c is not supported by PyWikiBot!
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
  self.archives[a].update(comment)
File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
  self.save(summary)
File "I:\py\rewrite\pywikibot\tools.py", line 516, in wrapper
  return obj(*__args, **__kw)
File "I:\py\rewrite\pywikibot\page.py", line 985, in save
  **kwargs)
File "I:\py\rewrite\pywikibot\page.py", line 993, in _save
  comment = self._cosmetic_changes_hook(comment) or comment
File "I:\py\rewrite\pywikibot\page.py", line 1040, in _cosmetic_changes_hook
  self.text = ccToolkit.change(old)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
  new_text = self._change(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
  text = self.safe_execute(method, text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
  result = method(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
  'startspace'])
File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
  replacement = new(match)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
  namespace = page.namespace()
File "I:\py\rewrite\pywikibot\page.py", line 157, in namespace
  return self._link.namespace
File "I:\py\rewrite\pywikibot\page.py", line 4153, in namespace
  self.parse()
File "I:\py\rewrite\pywikibot\page.py", line 4069, in parse
  self._text, self._site, prefix))

SiteDefinitionError: :c:User:Martinnovacek.cz is not a local page on wikipedia:c
s, and the interwiki prefix c is not supported by PyWikiBot!

Processing [[cs:Wikipedie:Pod lípou (návrhy)]]
12 Threads found on [[cs:Wikipedie:Pod lípou (návrhy)]]
Looking for: {{archivace}} in [[cs:Wikipedie:Pod lípou (návrhy)]]
Processing 12 threads
14 Threads found on [[cs:Wikipedie:Pod lípou (návrhy)/Archiv 2014-01]]
Archiving 2 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Pod lípou (návrhy)]]
ERROR: SiteDefinitionError: :outreach:Welcome to Wikipedia (Bookshelf)/2013 edit
ion/text is not a local page on wikipedia:cs, and the interwiki prefix outreach
is not supported by PyWikiBot!
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
  self.archives[a].update(comment)
File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
  self.save(summary)
File "I:\py\rewrite\pywikibot\tools.py", line 516, in wrapper
  return obj(*__args, **__kw)
File "I:\py\rewrite\pywikibot\page.py", line 985, in save
  **kwargs)
File "I:\py\rewrite\pywikibot\page.py", line 993, in _save
  comment = self._cosmetic_changes_hook(comment) or comment
File "I:\py\rewrite\pywikibot\page.py", line 1040, in _cosmetic_changes_hook
  self.text = ccToolkit.change(old)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
  new_text = self._change(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
  text = self.safe_execute(method, text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
  result = method(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
  'startspace'])
File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
  replacement = new(match)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
  namespace = page.namespace()
File "I:\py\rewrite\pywikibot\page.py", line 157, in namespace
  return self._link.namespace
File "I:\py\rewrite\pywikibot\page.py", line 4153, in namespace
  self.parse()
File "I:\py\rewrite\pywikibot\page.py", line 4069, in parse
  self._text, self._site, prefix))

SiteDefinitionError: :outreach:Welcome to Wikipedia (Bookshelf)/2013 edition/tex
t is not a local page on wikipedia:cs, and the interwiki prefix outreach is not
supported by PyWikiBot!


Version: core-(2.0)
Severity: major

Details

Reference
bz72047

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:53 AM
bzimport set Reference to bz72047.

d: is also an unknown interwiki prefix.

I think you should split this into several bugs.

Hmmm that is strange. Can you test it without an apicache? For me all three prefixes are known:

>>> import pywikibot
>>> s = pywikibot.Site("cs", "wikipedia")
>>> s.interwiki("c")
Site("commons", "commons")
>>> s.interwiki("d")
Site("wikidata", "wikidata")
>>> s.interwiki("outreach")
Site("outreach", "outreach")

Regarding point 1)

In textlib.py, in timestripper(self, line), the line should first be stripped of every element that cannot contain a valid date (comments, links, external links, etc.).

External links in particular can be deceptive; I saw some like this on cswiki: [https://lists.wikimedia.org/pipermail/mobile-l/2014-August/007927.html]

The best way to reuse the existing textlib functions with minimal code duplication is still to be decided. Maybe some of the regexes could become global constants?
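
A rough sketch of that idea, not the uploaded patch and not how timestripper actually works: blank out the spans that cannot hold a date before looking for a timestamp. The pattern list, the helper name and the sample signature date are all hypothetical, and only HTML comments and external links are covered here.

```python
import re

# Hypothetical patterns (not taken from pywikibot): spans that cannot
# contain a signature timestamp.
NON_DATE_PATTERNS = [
    re.compile(r'<!--.*?-->', re.DOTALL),    # HTML comments
    re.compile(r'\[https?://[^\]]*\]'),      # bracketed external links
    re.compile(r'https?://\S+'),             # bare URLs
]


def strip_non_date_spans(line):
    """Blank out spans in which a valid date cannot occur."""
    for pattern in NON_DATE_PATTERNS:
        # Pad with spaces so positions of the remaining text stay stable.
        line = pattern.sub(lambda m: ' ' * len(m.group()), line)
    return line


line = ('See [https://lists.wikimedia.org/pipermail/mobile-l/'
        '2014-August/007927.html] for details. 12:00, 6. 6. 2014 (UTC)')
cleaned = strip_non_date_spans(line)
```

Replacing with spaces instead of deleting keeps character offsets unchanged, which matters if later code relies on match positions; whether that is needed here is an assumption.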

I may be able to fix points 1, 2 and 3, but the other issues are not something I can look into.

Okay, even parsing a link works as expected:

>>> s = pywikibot.Site("cs", "wikipedia")
>>> pywikibot.Link(":outreach:site", s)
pywikibot.page.Link('Site', Site("outreach", "outreach"))
>>> pywikibot.Link(":c:site", s)
pywikibot.page.Link('Site', Site("commons", "commons"))
>>> pywikibot.Link(":d:site", s)
pywikibot.page.Link('Site', Site("wikidata", "wikidata"))

Uploaded patch https://gerrit.wikimedia.org/r/#/c/167406/

Please report if errors are still present after this is approved.

(In reply to Mpaa from comment #8)

Uploaded patch https://gerrit.wikimedia.org/r/#/c/167406/

Please report if errors are still present after this is approved.

The prefixes c:, outreach: and d: are still problematic.
Here are the errors from the log:

Processing [[cs:Diskuse s wikipedistou:JeremySil]]
19 Threads found on [[cs:Diskuse s wikipedistou:JeremySil]]
Looking for: {{archivace}} in [[cs:Diskuse s wikipedistou:JeremySil]]
ERROR: Error occured while processing page [[cs:Diskuse s wikipedistou:JeremySil
]]
ERROR: ValueError: invalid literal for int() with base 10: ''
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 493, in run
  whys = self.analyze_page()
File "I:\py\rewrite\scripts\archivebot.py", line 454, in analyze_page
  arch_counter = int(self.get_attr('counter', '1'))

ValueError: invalid literal for int() with base 10: ''

Processing [[cs:Wikipedie:Nástěnka správců]]
55 Threads found on [[cs:Wikipedie:Nástěnka správců]]
Looking for: {{archivace}} in [[cs:Wikipedie:Nástěnka správců]]
Processing 55 threads
127 Threads found on [[cs:Wikipedie:Nástěnka správců/Archiv58]]
Archiving 29 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Nástěnka správců]]
ERROR: SiteDefinitionError: :c:User:Martinnovacek.cz is not a local page on wiki
pedia:cs, and the interwiki prefix c is not supported by PyWikiBot!
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
  self.archives[a].update(comment)
File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
  self.save(summary)
File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
  return obj(*__args, **__kw)
File "I:\py\rewrite\pywikibot\page.py", line 982, in save
  **kwargs)
File "I:\py\rewrite\pywikibot\page.py", line 990, in _save
  comment = self._cosmetic_changes_hook(comment) or comment
File "I:\py\rewrite\pywikibot\page.py", line 1037, in _cosmetic_changes_hook
  self.text = ccToolkit.change(old)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
  new_text = self._change(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
  text = self.safe_execute(method, text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
  result = method(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
  'startspace'])
File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
  replacement = new(match)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
  namespace = page.namespace()
File "I:\py\rewrite\pywikibot\page.py", line 158, in namespace
  return self._link.namespace
File "I:\py\rewrite\pywikibot\page.py", line 4141, in namespace
  self.parse()
File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
  self._text, self._site, prefix))

SiteDefinitionError: :c:User:Martinnovacek.cz is not a local page on wikipedia:c
s, and the interwiki prefix c is not supported by PyWikiBot!

Processing [[cs:Wikipedie:Pod lípou]]
32 Threads found on [[cs:Wikipedie:Pod lípou]]
Looking for: {{archivace}} in [[cs:Wikipedie:Pod lípou]]
Processing 32 threads
63 Threads found on [[cs:Wikipedie:Pod lípou/Archiv 2014/02]]
13 Threads found on [[cs:Wikipedie:Pod lípou/Archiv 2014/03]]
Archiving 22 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Pod lípou]]
ERROR: SiteDefinitionError: :c:Category:2014 events in Brno is not a local page
on wikipedia:cs, and the interwiki prefix c is not supported by PyWikiBot!
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
  self.archives[a].update(comment)
File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
  self.save(summary)
File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
  return obj(*__args, **__kw)
File "I:\py\rewrite\pywikibot\page.py", line 982, in save
  **kwargs)
File "I:\py\rewrite\pywikibot\page.py", line 990, in _save
  comment = self._cosmetic_changes_hook(comment) or comment
File "I:\py\rewrite\pywikibot\page.py", line 1037, in _cosmetic_changes_hook
  self.text = ccToolkit.change(old)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
  new_text = self._change(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
  text = self.safe_execute(method, text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
  result = method(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
  'startspace'])
File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
  replacement = new(match)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
  namespace = page.namespace()
File "I:\py\rewrite\pywikibot\page.py", line 158, in namespace
  return self._link.namespace
File "I:\py\rewrite\pywikibot\page.py", line 4141, in namespace
  self.parse()
File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
  self._text, self._site, prefix))

SiteDefinitionError: :c:Category:2014 events in Brno is not a local page on wiki
pedia:cs, and the interwiki prefix c is not supported by PyWikiBot!

Processing [[cs:Wikipedie:Pod lípou (návrhy)]]
15 Threads found on [[cs:Wikipedie:Pod lípou (návrhy)]]
Looking for: {{archivace}} in [[cs:Wikipedie:Pod lípou (návrhy)]]
Processing 15 threads
14 Threads found on [[cs:Wikipedie:Pod lípou (návrhy)/Archiv 2014-01]]
Archiving 5 thread(s).
ERROR: Error occured while processing page [[cs:Wikipedie:Pod lípou (návrhy)]]
ERROR: SiteDefinitionError: :outreach:Welcome to Wikipedia (Bookshelf)/2013 edit
ion/text is not a local page on wikipedia:cs, and the interwiki prefix outreach
is not supported by PyWikiBot!
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
  self.archives[a].update(comment)
File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
  self.save(summary)
File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
  return obj(*__args, **__kw)
File "I:\py\rewrite\pywikibot\page.py", line 982, in save
  **kwargs)
File "I:\py\rewrite\pywikibot\page.py", line 990, in _save
  comment = self._cosmetic_changes_hook(comment) or comment
File "I:\py\rewrite\pywikibot\page.py", line 1037, in _cosmetic_changes_hook
  self.text = ccToolkit.change(old)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
  new_text = self._change(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
  text = self.safe_execute(method, text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
  result = method(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
  'startspace'])
File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
  replacement = new(match)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
  namespace = page.namespace()
File "I:\py\rewrite\pywikibot\page.py", line 158, in namespace
  return self._link.namespace
File "I:\py\rewrite\pywikibot\page.py", line 4141, in namespace
  self.parse()
File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
  self._text, self._site, prefix))

SiteDefinitionError: :outreach:Welcome to Wikipedia (Bookshelf)/2013 edition/tex
t is not a local page on wikipedia:cs, and the interwiki prefix outreach is not
supported by PyWikiBot!

Processing [[cs:Wikipedie:Pod lípou (technika)]]
61 Threads found on [[cs:Wikipedie:Pod lípou (technika)]]
Looking for: {{archivace}} in [[cs:Wikipedie:Pod lípou (technika)]]
Processing 61 threads
ERROR: Error occured while processing page [[cs:Wikipedie:Pod lípou (technika)]]

ERROR: IsRedirectPage: Page [[cs:Wikipedie:Pod lípou (technika)/Archiv 2014-01]]
is a redirect page.
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 493, in run
  whys = self.analyze_page()
File "I:\py\rewrite\scripts\archivebot.py", line 481, in analyze_page
  if self.feed_archive(archive, t, max_arch_size, params):
File "I:\py\rewrite\scripts\archivebot.py", line 449, in feed_archive
  self.archives[title] = DiscussionPage(archive, self, params)
File "I:\py\rewrite\scripts\archivebot.py", line 293, in __init__
  self.load_page()
File "I:\py\rewrite\scripts\archivebot.py", line 308, in load_page
  lines = self.get().split('\n')
File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
  return obj(*__args, **__kw)
File "I:\py\rewrite\pywikibot\page.py", line 332, in get
  self._getInternals(sysop)
File "I:\py\rewrite\pywikibot\page.py", line 364, in _getInternals
  raise self._getexception

IsRedirectPage: Page [[cs:Wikipedie:Pod lípou (technika)/Archiv 2014-01]] is a
redirect page.

Processing [[cs:Wikipedie:Potřebuji pomoc]]
44 Threads found on [[cs:Wikipedie:Potřebuji pomoc]]
Looking for: {{archivace}} in [[cs:Wikipedie:Potřebuji pomoc]]
Processing 44 threads
7 Threads found on [[cs:Wikipedie:Potřebuji pomoc/Archiv14]]
Archiving 20 thread(s).

ERROR: Error occured while processing page [[cs:Wikipedie:Potřebuji pomoc]]
ERROR: SiteDefinitionError: d:Q252189 is not a local page on wikipedia:cs, and t
he interwiki prefix d is not supported by PyWikiBot!
Traceback (most recent call last):

File "I:\py\rewrite\scripts\archivebot.py", line 615, in main
  archiver.run()
File "I:\py\rewrite\scripts\archivebot.py", line 509, in run
  self.archives[a].update(comment)
File "I:\py\rewrite\scripts\archivebot.py", line 358, in update
  self.save(summary)
File "I:\py\rewrite\pywikibot\tools.py", line 529, in wrapper
  return obj(*__args, **__kw)
File "I:\py\rewrite\pywikibot\page.py", line 982, in save
  **kwargs)
File "I:\py\rewrite\pywikibot\page.py", line 990, in _save
  comment = self._cosmetic_changes_hook(comment) or comment
File "I:\py\rewrite\pywikibot\page.py", line 1037, in _cosmetic_changes_hook
  self.text = ccToolkit.change(old)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 228, in change
  new_text = self._change(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 221, in _change
  text = self.safe_execute(method, text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 199, in safe_execute
  result = method(text)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 548, in cleanUpLinks
  'startspace'])
File "I:\py\rewrite\pywikibot\textlib.py", line 224, in replaceExcept
  replacement = new(match)
File "I:\py\rewrite\scripts\cosmetic_changes.py", line 442, in handleOneLink
  namespace = page.namespace()
File "I:\py\rewrite\pywikibot\page.py", line 158, in namespace
  return self._link.namespace
File "I:\py\rewrite\pywikibot\page.py", line 4141, in namespace
  self.parse()
File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
  self._text, self._site, prefix))

SiteDefinitionError: d:Q252189 is not a local page on wikipedia:cs, and the inte
rwiki prefix d is not supported by PyWikiBot!

Can you run "python pwb.py shell", then in the Python console "import pywikibot" and "s = pywikibot.Site('cs', 'wikipedia')"? Then try one of these three:

  • pywikibot.Link(":outreach:site", s)
  • pywikibot.Link(":c:site", s)
  • pywikibot.Link(":d:site", s)

I personally have no problem executing this:

$ python pwb.py shell
Welcome to the Pywikibot interactive shell!

>>> import pywikibot
>>> s = pywikibot.Site("cs", "wikipedia")
>>> pywikibot.Link(":outreach:site", s)
pywikibot.page.Link('Site', Site("outreach", "outreach"))

You could also call "s.interwiki('outreach')" to see what it is returning.

Processing [[cs:Diskuse s wikipedistou:JeremySil]]
...

arch_counter = int(self.get_attr('counter', '1'))

ValueError: invalid literal for int() with base 10: ''

The counter parameter is malformed: it is empty, but an int is expected.
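
For illustration only, a tolerant way to read such a parameter might look like the sketch below. parse_counter is a hypothetical helper, not archivebot's code. Whether an empty value should silently fall back to the default or be reported as a template error is a design choice left open here.

```python
def parse_counter(raw, default=1):
    """Return the archive counter as an int, falling back to a default.

    `raw` is the template parameter value; it may be None or an empty
    string when the template has `counter=` with nothing after it.
    """
    if raw is None or not raw.strip():
        return default
    try:
        return int(raw.strip())
    except ValueError:
        # Non-numeric text is a real configuration error, so report it.
        raise ValueError('counter must be an integer, got %r' % raw)
```

With this, an empty counter yields 1 instead of raising, while a value such as "abc" still raises a ValueError that names the bad input.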

(In reply to Fabian from comment #10)
This might be the problem:

I:\py\rewrite>pwb.py shell
Welcome to the Pywikibot interactive shell!

>>> import pywikibot
>>> s = pywikibot.Site('cs', 'wikipedia')
>>> pywikibot.Link(":outreach:site", s)

Traceback (most recent call last):

File "<console>", line 1, in <module>
File "I:\py\rewrite\pywikibot\page.py", line 3975, in __repr__
  return "pywikibot.page.Link(%r, %r)" % (self.title, self.site)
File "I:\py\rewrite\pywikibot\page.py", line 4151, in title
  self.parse()
File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
  self._text, self._site, prefix))

SiteDefinitionError: :outreach:site is not a local page on wikipedia:cs, and the
interwiki prefix outreach is not supported by PyWikiBot!

I:\py\rewrite>pwb.py version
Pywikibot: pywikibot-core (74beb9e, s5354, 2014/10/20, 13:36:31, OUTDATED)
Release version: 2.0b2
Python: 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)]
unicode test: ok
httplib2 version: 0.8

Okay, maybe the families are not loaded properly? Have you changed the "installed" families? The following line of code (in the shell, as before) lists all families:

print("\n".join("{}: {}".format(*i) for i in sorted(pywikibot.config2.family_files.items())))

The default installation should contain at least outreach, wikidata and commons.

(In reply to Fabian from comment #13)

Okay maybe the families are not loaded properly? Have you changed the
"installed" families? With the following code line (in the shell like
before) it'll list all families:

print("\n".join("{}: {}".format(*i) for i in
sorted(pywikibot.config2.family_files.items())))

The default installation should contain at least outreach, wikidata and
commons.

I have all the files that are in the nightly build; I didn't change anything.

I:\py\rewrite>pwb.py shell
Welcome to the Pywikibot interactive shell!

>>> import pywikibot
>>> s = pywikibot.Site('cs', 'wikipedia')
>>> pywikibot.Link(":outreach:site", s)

Traceback (most recent call last):

File "<console>", line 1, in <module>
File "I:\py\rewrite\pywikibot\page.py", line 3975, in __repr__
  return "pywikibot.page.Link(%r, %r)" % (self.title, self.site)
File "I:\py\rewrite\pywikibot\page.py", line 4151, in title
  self.parse()
File "I:\py\rewrite\pywikibot\page.py", line 4057, in parse
  self._text, self._site, prefix))

SiteDefinitionError: :outreach:site is not a local page on wikipedia:cs, and the
interwiki prefix outreach is not supported by PyWikiBot!

>>> print("\n".join("{}: {}".format(*i) for i in sorted(pywikibot.config2.family_files.items())))
anarchopedia: I:\py\rewrite\pywikibot\families\anarchopedia_family.py
battlestarwiki: I:\py\rewrite\pywikibot\families\battlestarwiki_family.py
commons: I:\py\rewrite\pywikibot\families\commons_family.py
fon: I:\py\rewrite\pywikibot\families\fon_family.py
gentoo: I:\py\rewrite\pywikibot\families\gentoo_family.py
i18n: I:\py\rewrite\pywikibot\families\i18n_family.py
incubator: I:\py\rewrite\pywikibot\families\incubator_family.py
lockwiki: I:\py\rewrite\pywikibot\families\lockwiki_family.py
lyricwiki: I:\py\rewrite\pywikibot\families\lyricwiki_family.py
mediawiki: I:\py\rewrite\pywikibot\families\mediawiki_family.py
meta: I:\py\rewrite\pywikibot\families\meta_family.py
oldwikivoyage: I:\py\rewrite\pywikibot\families\oldwikivoyage_family.py
omegawiki: I:\py\rewrite\pywikibot\families\omegawiki_family.py
osm: I:\py\rewrite\pywikibot\families\osm_family.py
outreach: I:\py\rewrite\pywikibot\families\outreach_family.py
southernapproach: I:\py\rewrite\pywikibot\families\southernapproach_family.py
species: I:\py\rewrite\pywikibot\families\species_family.py
strategy: I:\py\rewrite\pywikibot\families\strategy_family.py
test: I:\py\rewrite\pywikibot\families\test_family.py
vikidia: I:\py\rewrite\pywikibot\families\vikidia_family.py
wikia: I:\py\rewrite\pywikibot\families\wikia_family.py
wikibooks: I:\py\rewrite\pywikibot\families\wikibooks_family.py
wikidata: I:\py\rewrite\pywikibot\families\wikidata_family.py
wikimedia: I:\py\rewrite\pywikibot\families\wikimedia_family.py
wikinews: I:\py\rewrite\pywikibot\families\wikinews_family.py
wikipedia: I:\py\rewrite\pywikibot\families\wikipedia_family.py
wikiquote: I:\py\rewrite\pywikibot\families\wikiquote_family.py
wikisource: I:\py\rewrite\pywikibot\families\wikisource_family.py
wikitech: I:\py\rewrite\pywikibot\families\wikitech_family.py
wikiversity: I:\py\rewrite\pywikibot\families\wikiversity_family.py
wikivoyage: I:\py\rewrite\pywikibot\families\wikivoyage_family.py
wiktionary: I:\py\rewrite\pywikibot\families\wiktionary_family.py
wowwiki: I:\py\rewrite\pywikibot\families\wowwiki_family.py

The file
i:\py\rewrite\pywikibot\families\outreach_family.py
is the only one containing 'outreach' in the whole core package.

But I don't understand why archivebot needs to know about interwiki links; it should only take part of the text and move it to another page.

Works for me.

Pywikibot: pywikibot-core.git (bc34a4a, g4328, 2014/10/20, 18:56:56, OUTDATED)
Release version: 2.0b2
Python: 2.7.6 (default, Mar 22 2014, 22:59:38)
unicode test: ok
httplib2 version: 0.9

The only difference I can spot is httplib2. I have no idea whether that could explain it.
It might be worthwhile to update it and retry.

I found that the problem is somewhere in cosmetic_changes.py.
After disabling it, the bot archives fine.
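
For anyone else hitting this: one way to keep cosmetic changes off is the cosmetic_changes setting in user-config.py. The fragment below assumes that setting is what triggers the hook in page.py shown in the tracebacks. How it was actually disabled in this case is not stated.

```python
# user-config.py fragment: keep cosmetic changes off so that saving an
# archive page does not run cosmetic_changes.py over the text.
cosmetic_changes = False
```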

(In reply to Mpaa from comment #8)

Uploaded patch https://gerrit.wikimedia.org/r/#/c/167406/

Please report if errors are still present after this is approved.

BTW, patch https://gerrit.wikimedia.org/r/#/c/167406/ is merged.

Hmm, about the interwiki problem: it doesn't really matter why it also parses interwiki links (*); if we simply skipped that, we would just hide a potential bug.

Could you execute the following line in the shell:

pywikibot.Site('cs', 'wikipedia').interwiki('outreach')

This returns for me 'Site("outreach", "outreach")'.

*: There is no reliable way to know whether a link is an interwiki link without looking at the prefix. The script just goes the extra mile and also assigns a Site object to it. That could be ignored, but then the script has to do so itself by catching the exception.
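
To make the footnote concrete, here is a self-contained sketch of "catch the exception and leave the link alone". It does not use pywikibot: UnknownPrefixError stands in for SiteDefinitionError, and the two lookup tables, resolve_site and clean_link are hypothetical names made up for this example.

```python
class UnknownPrefixError(Exception):
    """Stand-in for the SiteDefinitionError seen in the tracebacks."""


# Hypothetical tables; a real bot would get these from the wiki itself.
KNOWN_INTERWIKI = {'c': 'commons', 'd': 'wikidata'}
LOCAL_NAMESPACES = {'user', 'wikipedie', 'category'}


def resolve_site(link_text):
    """Return the target site for an interwiki link, or None if local."""
    text = link_text.strip().lstrip(':')
    if ':' not in text:
        return None  # no prefix at all
    prefix = text.split(':', 1)[0].strip().lower()
    if prefix in LOCAL_NAMESPACES:
        return None  # a namespace, not an interwiki prefix
    if prefix not in KNOWN_INTERWIKI:
        raise UnknownPrefixError(prefix)
    return KNOWN_INTERWIKI[prefix]


def clean_link(link_text):
    """Tidy a link, but leave it untouched if its prefix is unknown."""
    try:
        resolve_site(link_text)
    except UnknownPrefixError:
        return link_text  # cannot classify it, so do not rewrite it
    return link_text.strip()
```

Here "outreach" is deliberately missing from the table, so a link such as ":outreach:Welcome" is returned unchanged instead of aborting the whole save. As noted above, this would only mask the underlying lookup failure, not fix it.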

It looks like this is fixed.