Large category not registering all files
Closed, Resolved (Public)

Description

This category:
http://commons.wikimedia.org/wiki/Category:Images_from_the_State_Library_of_Queensland

...is supposed to contain ~13,000 images. As of this writing, it's only registering 264 images.

Examples of files in the category that don't register as being in the category:

Seems likely related to category changes, but worth rechecking after category work is done.


Version: unspecified
Severity: major

Details

Reference
bz27956

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 11:27 PM)
bzimport set Reference to bz27956.
bzimport added a subscriber: Unknown Object (MLST).

This is part of a bigger problem:
http://commons.wikimedia.org/wiki/Commons:Village_pump#Strange_system_behaviour

It appears to be a temporary side effect of the new category sorting deployment. A long-running script is still in progress on the largest wikis (including Commons). Once the script completes (hopefully later today), much of the weirdness should clear up.

Looks like the script finished on the S3 cluster:

12:31 RoanKattouw: updateCollation.php finished on s3

Still, Category:Wikimania_submissions on wikimania2011 holds only six pages, while there seem to be more:

http://wikimania2011.wikimedia.org/wiki/Special:PrefixIndex/Submissions/

[wikimania2010wiki]> select count(*) from categorylinks where cl_to='Wikimania_submissions' \G
*************************** 1. row ***************************
count(*): 179
1 row in set (0.00 sec)

Sorry, wrong database name, the count is actually six.

[wikimania2011wiki]> select count(*) from categorylinks where cl_to='Wikimania_submissions' \G
*************************** 1. row ***************************
count(*): 6
1 row in set (0.00 sec)

(In reply to comment #3)

http://wikimania2011.wikimedia.org/wiki/Category:Wikimania_submissions seems fine now. It showed only six category members until I logged in. Now it's showing 52. I imagine the category description page was simply cached.

Regarding Commons, these issues should mostly be resolved once the category maintenance script finishes on s4. The category member counts (stored in the category table) are going to be wrong until another maintenance script is written/run to fix those counts, as far as I'm aware. I spoke with Roan about this earlier.
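To illustrate the mismatch described above, here is a minimal sketch in Python that compares a stored per-category count against the number of rows actually present in categorylinks. It uses an in-memory SQLite database with a heavily simplified stand-in schema (only cat_title, cat_pages, cl_from, and cl_to); the category names and numbers are invented for the example, and this is not the maintenance script mentioned in the thread.

```python
import sqlite3


def find_stale_counts(conn):
    """Return (title, stored_count, actual_count) for categories whose
    stored count differs from the number of categorylinks rows."""
    return conn.execute(
        """
        SELECT c.cat_title, c.cat_pages, COUNT(cl.cl_from) AS actual
        FROM category AS c
        LEFT JOIN categorylinks AS cl ON cl.cl_to = c.cat_title
        GROUP BY c.cat_title, c.cat_pages
        HAVING c.cat_pages != COUNT(cl.cl_from)
        ORDER BY c.cat_title
        """
    ).fetchall()


def demo():
    # Simplified stand-in for the real tables (assumption: far fewer columns).
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE category (cat_title TEXT PRIMARY KEY, cat_pages INTEGER);
        CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
        """
    )
    # Invented data: one category with a stale count, one that is accurate.
    conn.execute("INSERT INTO category VALUES ('Example_category', 3)")
    conn.execute("INSERT INTO category VALUES ('Accurate_category', 2)")
    links = [(i, "Example_category") for i in range(1, 6)]
    links += [(100, "Accurate_category"), (101, "Accurate_category")]
    conn.executemany("INSERT INTO categorylinks VALUES (?, ?)", links)
    return find_stale_counts(conn)
```

Calling demo() lists only the category whose stored count disagrees with its links. A real fix would also have to update the stored counts, which this sketch does not attempt.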

We have to purge the Squid caches for category pages :(

I wrote a script (r83404) to purge all pages of a given namespace.

Example usage:
maintenance/purgeNamespace.php wikimania2011wiki --namespace 14

You probably do not want to run it on enwiki NS_MAIN though :)
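For readers without shell access, a roughly similar effect can be approximated through the MediaWiki action API, which accepts action=purge combined with the allpages generator. The sketch below is not purgeNamespace.php and says nothing about how that script works internally; it only builds the POST body for one batch. The API URL in the usage note, the batch size, and the function name are assumptions made for illustration.

```python
from urllib.parse import urlencode

NS_CATEGORY = 14  # namespace number for Category pages, as in the example above


def build_purge_request(api_url, namespace=NS_CATEGORY, batch_size=50,
                        continue_from=None):
    """Return (url, body) for a POST that purges one batch of pages
    in the given namespace. No network request is made here."""
    params = {
        "action": "purge",
        "generator": "allpages",
        "gapnamespace": namespace,
        "gaplimit": batch_size,
        "format": "json",
    }
    if continue_from is not None:
        # Resume from the continuation value returned by the previous batch.
        params["gapcontinue"] = continue_from
    return api_url, urlencode(params)
```

A caller would POST the returned body to something like https://wikimania2011.wikimedia.org/w/api.php (assumed URL) and repeat with the continuation value until none is returned. As with the script, running this against a very large namespace is probably unwise.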

Is this fixed? The category is reporting 14,000 images now.

I have not run the script, since it has not been reviewed and I avoid "playing" with cache systems.