Large drop in historical total active editors numbers
Closed, Resolved, Public

Description

The "Total" active editor numbers for past months on the report card [1] ("Active Wikimedia Editors for All Wikimedia Projects (5+ edits per month)") appear to have changed drastically at some point during the last few months.

Compare the table of the same numbers on stats.wikimedia.org [2] with the version archived by the Internet Archive in May [3]. For example:

March 2013: 79346 (current) vs. 82105 (archived, also matches the number given in the monthly WMF report at the time [4])

February 2013: 75140 (current) vs. 77701 (archived)

...

June 2012: 77363 (current) vs. 78320 (archived)

...

Note that some shrinkage is expected because of page deletions, but these decreases are far too large to be explained by that effect alone.
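To put the size of the drop in numbers, here is a minimal sketch that uses only the three month pairs quoted above. The names `totals` and `drop` are illustrative and not part of Wikistats or the report card.

```python
# (current, archived) "Edits >= 5" totals, copied from the figures quoted in this report.
totals = {
    "March 2013": (79346, 82105),
    "February 2013": (75140, 77701),
    "June 2012": (77363, 78320),
}


def drop(current, archived):
    """Return the absolute and relative decrease versus the archived value."""
    diff = archived - current
    return diff, diff / archived


for month, (current, archived) in totals.items():
    diff, rel = drop(current, archived)
    print(f"{month}: -{diff} editors ({rel:.1%} below the archived number)")
```

For the three months listed this works out to decreases of 2759, 2561 and 957 editors, i.e. roughly 3.4%, 3.3% and 1.2%.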

Also, two smaller issues that may or may not be related:

  • For some reason, the source CSV file linked in the report card [5] contains empty columns for several other language projects (e.g. Chinese, Portuguese, and Polish Wikipedia, and Meta-Wiki), apart from those included in the charts.
  • The June 2013 number for the Italian Wikipedia is currently missing from the report card chart ("NaN").
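Both of these smaller issues could probably be caught with a quick check on the CSV before it is published. The following is only a sketch: the column layout (first column is the month, one column per project) and the sample values are assumptions for illustration, not the actual contents of [5].

```python
import csv
import io


def find_gaps(csv_text):
    """Return (empty_columns, missing_cells) for a CSV that has a header row.

    A cell counts as missing if it is blank or the literal string "NaN".
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []

    def is_missing(value):
        text = (value or "").strip()
        return text == "" or text.lower() == "nan"

    # Columns with no data in any row (e.g. projects not shown in the charts).
    empty_columns = [c for c in columns if all(is_missing(r[c]) for r in rows)]
    # Isolated gaps in otherwise filled columns, reported as (month, column).
    missing_cells = [
        (r[columns[0]], c)
        for r in rows
        for c in columns[1:]
        if c not in empty_columns and is_missing(r[c])
    ]
    return empty_columns, missing_cells


# Hypothetical sample data, NOT taken from the real file.
sample = """month,English,Italian,Chinese
2013/05,100,20,
2013/06,110,NaN,
"""

print(find_gaps(sample))
```

On the made-up sample this reports `Chinese` as an entirely empty column and the June value for `Italian` as a single missing cell.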

[1] http://reportcard.wmflabs.org/graphs/active_editors

[2] http://stats.wikimedia.org/EN/TablesWikimediaAllProjects.htm (Edits ≥ 5 column)

[3] http://web.archive.org/web/20130521103812/http://stats.wikimedia.org/EN/TablesWikimediaAllProjects.htm (Edits ≥ 5 column)

[4] https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Report,_April_2013#Data_and_Trends

[5] http://reportcard.wmflabs.org/data/datafiles/rc/rc_active_editors_count.csv


Version: unspecified
Severity: major

Details

Reference
bz53118

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:04 AM

Wikistats related:

  1. Mismatch between [2] and [3] is under investigation.

Limn related:

  1. The incomplete CSV file [5] is a Limn issue, probably related to the missing Italian data (Italian is the last column filled in the Limn CSV file). All data for Italian except the last month were available in the Wikistats output file 'wikilytics_in_wikistats_core_metrics.csv'.
  2. Also, I just noticed that the data in [5] are in the wrong columns. See e.g. Wikidata: it should be empty until Oct 2012.
  3. Limn shows a notice: "Jun 2013: Data for Italian Wiki is not yet available. This affects the overall totals and percentages."

It would have been clearer to blank the total editors and the MoM and YoY percentages, but I understand that wasn't so easy on such short notice. Maybe tweak Limn for the next time this occurs?
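A rough sketch of the blanking idea, to make the suggestion concrete. This is not Limn code; the function names and the numbers are made up for illustration, and `None` stands in for "not yet available".

```python
def total_or_blank(per_project):
    """Sum per-project counts, or return None if any project is missing."""
    values = list(per_project.values())
    if any(v is None for v in values):
        return None  # a partial total would understate the real number
    return sum(values)


def pct_change(new, old):
    """Relative change (usable for MoM or YoY), or None if it cannot be computed."""
    if new is None or old is None or old == 0:
        return None
    return (new - old) / old


# Hypothetical counts; None marks a project whose data has not arrived.
may = {"English": 100, "Italian": 20}
june = {"English": 110, "Italian": None}

june_total = total_or_blank(june)
print(june_total)                                    # blank instead of a partial sum
print(pct_change(june_total, total_or_blank(may)))   # blank as well
```

With this approach a missing project blanks the total and every percentage derived from it, instead of showing a total that silently excludes that project.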

Process flow related:

  1. The Italian wiki dump was not available until July 29 (the dump server was migrated to a new data center at what was, as far as Wikistats is concerned, an ill-timed moment), and last-minute processing failed (manual error), so that's why the data were missing. I sent an updated data file on Aug 8, but Limn was not updated, probably due to vacations.

I was indeed on vacation, but I've updated the data now so Italian numbers show up. MoM and YoY percentages should also be correct.

Hey Dan/Erik -- can we close this?

thanks,

-Toby

It's fine by me as far as I know