Existing pages reported as "does not exist"
Closed, Resolved, Public

Description

I ran interwiki.py.

Many existing categories and their interwiki links are reported as missing:
https://cs.wikinews.org/w/index.php?title=Kategorie:Srpen_2013&curid=7375&diff=34600&oldid=34561

https://cs.wikinews.org/w/index.php?title=Kategorie:21._%C4%8Dervenec_2013&curid=7378&diff=34599&oldid=34505

and many others


Version: compat-(1.0)
Severity: critical

Details

Reference
bz55414

Event Timeline

bzimport raised the priority of this task from to High. Nov 22 2014, 2:12 AM
bzimport set Reference to bz55414.

Could you give any hints, tracebacks, or messages produced while running interwiki.py?

Created attachment 13448
log from interwiki py

See attached log.
But now it seems to work correctly.

try
interwiki.py -lang:tr -family:wikinews -subcatsr:2013

Although all categories have all their members, some February categories are reported as "does not exist".

When I run with -subcatsr:2013/02, they all exist.

It seems that the bot takes only the first 50 pages from some languages, because when running
interwiki.py -lang:cs -family:wikinews -new -namespace:14
it deleted some links, and those languages now have only 50 pages to work with.
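
To make the suspected failure mode concrete, here is a minimal sketch in Python. It is not pywikibot code: the 50-title cap, fake_server_lookup and naive_exists are all hypothetical names and assumptions, chosen only to show how a client that asks for more titles than the server answers could report existing pages as missing.

```python
# Hypothetical illustration only; none of these names come from pywikibot.
SERVER_LIMIT = 50  # assumed per-request cap on the server side


def fake_server_lookup(titles, existing):
    """Answer only for the first SERVER_LIMIT titles and ignore the rest."""
    answered = titles[:SERVER_LIMIT]
    return {title: (title in existing) for title in answered}


def naive_exists(titles, existing):
    """Treat every title absent from the response as a nonexistent page."""
    response = fake_server_lookup(titles, existing)
    return {title: response.get(title, False) for title in titles}


# All 60 pages exist, but only the first 50 are answered for.
titles = [f"Page {i}" for i in range(60)]
result = naive_exists(titles, existing=set(titles))
wrongly_missing = [title for title, found in result.items() if not found]
print(len(wrongly_missing), "existing pages reported as missing")
```

Under these assumptions, every title past the cap is reported as "does not exist" even though the page is there.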

Additionally, there is this bug:
https://bugzilla.wikimedia.org/show_bug.cgi?id=55374
The run is very slow, loading about 1 page per second, with error messages every few minutes.


interwiki.py -lang:tr -family:wikinews -subcatsr:2013
...

NOTE: [[tr:Kategori:2013/02/18]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/19]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/20]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/21]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/22]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/23]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/24]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/25]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/26]] does not exist. Skipping.
NOTE: [[tr:Kategori:2013/02/27]] does not exist. Skipping.
...
NOTE: The first unfinished subject is [[tr:Kategori:2013/01]]
NOTE: Number of pages queued is 50, trying to add 60 more.
Getting [[Kategori:2013/02/27]] list...
Getting [[Kategori:2013/02/28]] list...
Getting [[Kategori:2013/03]] list...
Getting [[Kategori:2013/03/01]] list...
Getting [[Kategori:2013/03/02]] list...
Getting [[Kategori:2013/03/03]] list...
Getting [[Kategori:2013/03/04]] list...
Getting [[Kategori:2013/03/05]] list...
Getting [[Kategori:2013/03/06]] list...
Getting [[Kategori:2013/03/07]] list...
Getting [[Kategori:2013/03/08]] list...

*** Bug 55655 has been marked as a duplicate of this bug. ***

I can add that the interwikis removed are somewhat random. In two consecutive runs, interwiki.py re-adds the interwikis removed in the previous run, and sometimes also removes others it didn't remove in the previous run.
The problem is definitely related to the "NOTE: [[:*]] does not exist. Skipping." message, which is sometimes incorrect.

Change 89500 had a related patch set uploaded by Xqt:
(Bug 55414) Initial bugfix for non existing pages

https://gerrit.wikimedia.org/r/89500

Change 89500 merged by Xqt:
(Bug 55414) Initial bugfix for non existing pages

https://gerrit.wikimedia.org/r/89500

Decreased maxquerysize to 50, which is the same value as in core.
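
As a rough illustration of what capping the query size at 50 means in practice, the sketch below splits a list of titles into batches of at most 50 and merges the per-batch answers. It is not the patch from change 89500: batched, exists_in_batches and the stand-in lookup are hypothetical names, and only the value 50 comes from the comment above.

```python
# Hypothetical sketch; not the code merged in change 89500.
MAX_QUERY_SIZE = 50  # the value mentioned above


def batched(titles, size=MAX_QUERY_SIZE):
    """Yield consecutive slices of titles, each at most size long."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def exists_in_batches(titles, lookup, size=MAX_QUERY_SIZE):
    """Call lookup once per batch and merge the per-batch answers."""
    result = {}
    for batch in batched(titles, size):
        result.update(lookup(batch))
    return result


def stand_in_lookup(batch):
    """Stand-in for a real request: refuses oversized batches."""
    assert len(batch) <= MAX_QUERY_SIZE
    return {title: True for title in batch}


titles = [f"Page {i}" for i in range(120)]
batch_sizes = [len(batch) for batch in batched(titles)]
result = exists_in_batches(titles, stand_in_lookup)
print(batch_sizes, len(result))
```

With no request ever exceeding the cap, no title is left unanswered, so none is reported as missing merely because it came late in the list.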