Italian Wikivoyage page count in Wikistats seems too low
Closed, ResolvedPublic

Description

See discussion at https://en.wikipedia.org/wiki/User_talk:Erik_Zachte#Wikivoyage_Stats


Version: unspecified
Severity: normal

Details

Reference
bz55927

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 2:18 AM
bzimport set Reference to bz55927.
bzimport added a subscriber: Unknown Object (MLST).

MariaDB [itwikivoyage_p]> SELECT
    ->   COUNT(*)
    -> FROM page
    -> WHERE page.page_namespace = 0
    -> AND page.page_is_redirect = 0;
+----------+
| COUNT(*) |
+----------+
|     3246 |
+----------+
1 row in set (0.03 sec)

MariaDB [itwikivoyage_p]> SELECT
    ->   page_namespace,
    ->   page_title
    -> FROM page
    -> LEFT JOIN pagelinks
    -> ON pl_from = page_id
    -> WHERE pl_namespace IS NULL
    -> LIMIT 1;
+----------------+------------+
| page_namespace | page_title |
+----------------+------------+
|              0 | Bug_54831  |
+----------------+------------+
1 row in set (0.10 sec)


[[voy:it:Special:Statistics]] says 3485, which is surely wrong: there can be at most 3246 countable articles. However, as far as I can see, only 1 page (which I just created) has no links, so the article count should not be lower than 3200. As mentioned at the URL, Special:Statistics is wrong due to bug 40009; why Wikistats is wrong, I don't know. I'm going to file a site request to fix the former.

Actually, I doubt the second query does what it's advertised for... no time to fix it now though.
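
One visible limitation of the second query is that it ends in LIMIT 1 and does not filter on namespace or redirects, so it returns at most one row whatever the data looks like and cannot show how many articles lack links. A minimal sketch of a counting variant follows, run against a toy SQLite database rather than the real replica. The sample rows are invented for illustration, and the reading that a "countable" article is a non-redirect namespace-0 page with at least one outgoing link is an assumption taken from this thread.

```python
import sqlite3

# Toy stand-in for the replica tables; only the columns used here are modelled,
# and every row below is made up for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (
        page_id INTEGER PRIMARY KEY,
        page_namespace INTEGER,
        page_title TEXT,
        page_is_redirect INTEGER
    );
    CREATE TABLE pagelinks (
        pl_from INTEGER,
        pl_namespace INTEGER,
        pl_title TEXT
    );
    INSERT INTO page VALUES
        (1, 0, 'Roma', 0),
        (2, 0, 'Bug_54831', 0),      -- article with no outgoing links
        (3, 0, 'Rome', 1),           -- redirect, must not be counted
        (4, 4, 'Project_page', 0);   -- outside namespace 0, no links
    INSERT INTO pagelinks VALUES
        (1, 0, 'Italia'),
        (3, 0, 'Roma');
""")

# Count non-redirect namespace-0 pages that have no pagelinks row at all.
# No LIMIT, so the result is the full count rather than one sample row.
UNLINKED = """
    SELECT COUNT(*)
    FROM page
    LEFT JOIN pagelinks ON pl_from = page_id
    WHERE page_namespace = 0
      AND page_is_redirect = 0
      AND pl_from IS NULL
"""
unlinked_count = conn.execute(UNLINKED).fetchone()[0]
print(unlinked_count)
```

Subtracting this count from the 3246 figure above would give the link-based article count under the same assumption.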

(In reply to comment #4)

Prioritization and scheduling of this bug is tracked on Mingle card
https://mingle.corp.wikimedia.org/projects/analytics/cards/1248

Is this really you, Diederik, or Bingle? :) You created a duplicate.

Just one note on the above query. The article count should be based not only on NS:0 but also on "Portale" and "Tematica".
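
As a rough illustration of that note, the count can be taken over a set of content namespaces instead of namespace 0 alone. This is only a sketch against a toy SQLite table: the IDs 100 and 104 are placeholders standing in for "Portale" and "Tematica" and have not been checked against the real it.wikivoyage configuration, and the rows are invented.

```python
import sqlite3

# Placeholder IDs: 0 is the main namespace; 100 and 104 merely stand in for
# "Portale" and "Tematica" and are NOT verified against it.wikivoyage.
CONTENT_NAMESPACES = (0, 100, 104)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page ("
    "page_id INTEGER PRIMARY KEY, page_namespace INTEGER, "
    "page_title TEXT, page_is_redirect INTEGER)"
)
# Invented sample rows.
conn.executemany(
    "INSERT INTO page VALUES (?, ?, ?, ?)",
    [
        (1, 0, "Roma", 0),
        (2, 0, "Rome", 1),            # redirect, excluded
        (3, 100, "Italia", 0),        # placeholder "Portale" page
        (4, 104, "Gastronomia", 0),   # placeholder "Tematica" page
        (5, 4, "Bar", 0),             # project namespace, excluded
    ],
)

# Build one "?" per namespace so the IDs are passed as bound parameters.
placeholders = ", ".join("?" for _ in CONTENT_NAMESPACES)
query = (
    "SELECT COUNT(*) FROM page "
    "WHERE page_is_redirect = 0 "
    f"AND page_namespace IN ({placeholders})"
)
content_pages = conn.execute(query, CONTENT_NAMESPACES).fetchone()[0]
print(content_pages)
```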

Right, so Special:Statistics is now correct (thanks to Reedy): it shows exactly 3400 as of now, so it was off by only 85.

There was indeed an undercount, as follows:

The Wikistats script did not expect pages with a missing sha1. It looked ahead to the next <sha1>..</sha1> and missed pages with <sha1 />.
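
The failure mode described above can be reproduced in miniature. The following is not the Wikistats code, only a hedged Python sketch of the same idea: a scanner that, for each page title, looks ahead to the next sha1 element. The dump fragment, the function name `count_pages` and the two patterns are all hypothetical.

```python
import re

# Hypothetical miniature dump, one element per line; the second page
# carries an empty <sha1 /> element instead of <sha1>..</sha1>.
LINES = [
    "<title>Roma</title>",
    "<sha1>aaa</sha1>",
    "<title>Milano</title>",
    "<sha1 />",
    "<title>Napoli</title>",
    "<sha1>ccc</sha1>",
]

FULL_ONLY = re.compile(r"<sha1>.*</sha1>")                 # misses <sha1 />
FULL_OR_EMPTY = re.compile(r"<sha1>.*</sha1>|<sha1\s*/>")  # accepts both forms

def count_pages(lines, sha1_pattern):
    """Count pages by scanning from each <title> to the next sha1 match."""
    count = 0
    i = 0
    while i < len(lines):
        if "<title>" not in lines[i]:
            i += 1
            continue
        # Look ahead for the element that closes this page record.
        j = i + 1
        while j < len(lines) and not sha1_pattern.search(lines[j]):
            j += 1
        if j < len(lines):
            count += 1
        # Resume after the match; any titles skipped on the way are lost.
        i = j + 1
    return count

undercount = count_pages(LINES, FULL_ONLY)
fixed_count = count_pages(LINES, FULL_OR_EMPTY)
print(undercount, fixed_count)
```

With the stricter pattern, the lookahead for the page with `<sha1 />` runs on into the next page's sha1, so two pages collapse into one; accepting the self-closing form restores the full count.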

Compare numbers
old http://stats.wikimedia.org/wikivoyage/WikivoyageItalianPrev.htm
new http://stats.wikimedia.org/wikivoyage/EN/draft/TablesWikipediaIT.htm

This affects other wikis as well. I will reprocess all dumps in the coming week and, when the new numbers are in, report on the impact.

All wikis were reprocessed in Dec 2013, after the issue had been fixed.

Thanks. One question.
The various https://en.wikivoyage.org/wiki/Special:Statistics pages (I mean the one for each language) do not reflect the real number of articles, images, etc.

Can all these numbers be "reprocessed" too?

@Andyrom75 That is outside the realm of Wikistats, which this bug is about. Sorry I can't give you more specific pointers.