In short: The updateArticleCount.php script is not counting articles correctly.
The evidence:
See the table I'm still filling out at [[m:User:Dcljr/Article counts]], which collects (way too many) statistics based on the official database dumps. (In particular, see the columns highlighted in pink, which show how far off the "on-wiki" article counts were from the actual dump-based article counts, both before and after the script was run.)
The longer version:
Ever since the resolution of bug 33253, which led to several wikis "losing" or "gaining" huge numbers of articles (according to their {{NUMBEROFARTICLES}} count), I've strongly suspected that the updateArticleCount.php script is not counting articles correctly. Now I have firm evidence.
I wrote a Perl script to download and parse the relevant dumps from <dumps.wikimedia.org>, thereby counting articles "from scratch" based on the current "non-redirect with at least one wikilink" criterion (as well as some more generous and some less generous criteria that I'm trying out for comparison). The results are being collected at the Meta page above.
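For anyone who wants to reproduce the count independently, here is a minimal Python sketch of the "non-redirect with at least one wikilink" rule. It is not the Perl script mentioned above. It assumes the dump is a MediaWiki XML export with <page>, <ns>, <redirect> and <text> elements, that "article" means namespace 0 (wikis with extra content namespaces would need more), and that a plain "[[" substring test is a good enough stand-in for "has a wikilink". The function name count_articles and the sample data are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace prefix, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def count_articles(source):
    """Count pages that are in namespace 0, are not redirects,
    and whose text contains at least one '[[' (assumed wikilink test)."""
    count = 0
    # Stream the dump so large files need not fit in memory.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local(elem.tag) != "page":
            continue
        ns = None
        is_redirect = False
        text = ""
        for child in elem.iter():
            name = local(child.tag)
            if name == "ns":
                ns = (child.text or "").strip()
            elif name == "redirect":
                is_redirect = True
            elif name == "text":
                text = child.text or ""
        if ns == "0" and not is_redirect and "[[" in text:
            count += 1
        elem.clear()  # free the finished page
    return count


# Tiny hand-made sample (hypothetical data, not taken from a real dump).
SAMPLE = """<mediawiki>
  <page><title>A</title><ns>0</ns><revision><text>See [[B]].</text></revision></page>
  <page><title>B</title><ns>0</ns><revision><text>No links here.</text></revision></page>
  <page><title>C</title><ns>0</ns><redirect title="A" /><revision><text>#REDIRECT [[A]]</text></revision></page>
  <page><title>Talk:A</title><ns>1</ns><revision><text>[[A]] discussion</text></revision></page>
</mediawiki>"""

print(count_articles(io.BytesIO(SAMPLE.encode("utf-8"))))
```

In the sample only page A qualifies: B has no link, C is a redirect, and Talk:A is outside namespace 0. A count produced this way can differ from the on-wiki figure for legitimate reasons (for example, links that only appear after template expansion), so small discrepancies are expected; the large ones are what this report is about.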
I've started with the Wiktionaries whose article counts dropped the most (in terms of percentage), so the table is currently showing huge undercounts. I originally suspected that the wikis whose article counts gained the most would show significant overcounts, but the handful of checks I've made of such wikis (which haven't been added to the table yet) haven't shown this to be the case.
We Shall See...
Punchline: Someone needs to check the updateArticleCount.php script to see why it's undercounting articles.
Version: unspecified
Severity: normal
URL: http://meta.wikimedia.org/wiki/User:Dcljr/Article_counts