Dump stats: automated validation of new monthly dump stats
Closed, Declined, Public

Description

21 Sep, 2012 Erik Zachte:
New script to compare CSV output from different months (and halt publishing if the differences exceed a threshold).

What I did since 21 Sep is this:
New reports are first published in a draft folder.
After manual vetting, reports are generated in 26 languages and published in the final folder.
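
For illustration only, the requested check could look roughly like the sketch below. It is not the Wikistats implementation: the two-column `metric,value` CSV layout without a header row, the function names and the 10% default threshold are all assumptions made for this example.

```python
import csv
import io


def load_metrics(csv_text):
    """Parse 'metric,value' rows into a dict.

    Assumed layout (hypothetical): no header row, one metric per line.
    """
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0]: float(row[1]) for row in reader if len(row) >= 2}


def find_outliers(previous, current, threshold=0.10):
    """Return the metrics whose relative change exceeds the threshold.

    Maps metric name to relative change, or to None if the metric
    is missing from the current month.
    """
    outliers = {}
    for metric, old in previous.items():
        if metric not in current:
            outliers[metric] = None  # metric disappeared entirely
            continue
        new = current[metric]
        if old == 0:
            # Avoid division by zero: any move away from zero is flagged.
            change = 0.0 if new == 0 else float("inf")
        else:
            change = abs(new - old) / abs(old)
        if change > threshold:
            outliers[metric] = change
    return outliers


def may_publish(previous, current, threshold=0.10):
    """True if no metric changed by more than the threshold."""
    return not find_outliers(previous, current, threshold)
```

A caller would keep the reports in the draft folder whenever `may_publish` returns False, so that the flagged metrics can be vetted manually before anything reaches the final folder.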


Version: unspecified
Severity: normal

Details

Reference
bz46197

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:21 AM
bzimport set Reference to bz46197.
bzimport added a subscriber: Unknown Object (MLST).

This would be nice, but it would be incomplete at best and possibly a major task, depending on the ambition level: there are many metrics, and the plausible rate of change could differ per project. It could also create lots of false positives, depending on the thresholds chosen.
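
One way to handle rates of change that differ per project is a threshold table with fallbacks. The sketch below is purely hypothetical: the project codes, metric names and threshold values are invented for the example and do not come from Wikistats.

```python
DEFAULT_THRESHOLD = 0.10

# Hypothetical overrides, keyed by (project, metric).
# "*" acts as a wildcard for either part of the key.
THRESHOLDS = {
    ("enwiki", "articles"): 0.02,   # large, slow-moving project
    ("*", "new_editors"): 0.30,     # volatile metric on every project
}


def threshold_for(project, metric, overrides=THRESHOLDS,
                  default=DEFAULT_THRESHOLD):
    """Pick the most specific threshold that applies.

    Lookup order: exact (project, metric), then any project for this
    metric, then any metric for this project, then the default.
    """
    for key in ((project, metric), ("*", metric), (project, "*")):
        if key in overrides:
            return overrides[key]
    return default
```

Tighter thresholds reduce the risk of missing a real problem but raise the number of false positives, so any such table would need tuning against past months.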

BTW, manual vetting is mostly a quick comparison of key metrics in old and new reports for some wikis (mostly English Wikipedia) to see whether these metrics are roughly within the expected range. Other than that, many eyeballs keep Wikistats under scrutiny.

Background: Several years ago a major bug caused all article counts to be twice as high (redirects were not recognized, or something of that nature). Because Wikistats regenerates all historic months on every run, this gave the impression of a complete overhaul of the Wikistats methodology and caused some turmoil. Given the high stability of the Wikistats scripts (after 10 years of operation, few operational errors have skewed stats), the fact that any software can fail in changing circumstances came as a surprise to some users.

Aklapper edited subscribers, added: Aklapper; removed: ezachte, wikibugs-l-list.

Closing this ticket as Wikistats version 1 is dead per https://stats.wikimedia.org/Wikistats_1_announcements.htm . If this ticket is still a valid bug report or feature request for Wikistats 2, please reopen it. Thanks a lot!