
Dump stats: collect monthly totals more efficiently
Closed, Declined
Public

Description

On the largest dumps this routine can take hours. It is not trivial to rework, though.


Version: unspecified
Severity: minor

Details

Reference
bz46208

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:22 AM
bzimport set Reference to bz46208.
bzimport added a subscriber: Unknown Object (MLST).

One of the design anomalies that dates back to an era when the English dump could be parsed in minutes rather than days ;-)

The current implementation is really inefficient (badly coded): for some metrics in WikiCountsOutput.pm, data are established month by month, and every iteration triggers a recursive lookup. The inefficiency grows month by month as the history gets longer, but speed improvements in hardware make it less urgent.
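
To illustrate the kind of cost described above, here is a minimal sketch in Python, not the actual Perl from WikiCountsOutput.pm. The function names and sample counts are hypothetical, and the "slow" variant only approximates the pattern of re-deriving each month's total from all earlier months. The "fast" variant carries a running total forward in a single pass.

```python
from itertools import accumulate


def cumulative_totals_slow(monthly_counts):
    """Recompute every month's total from scratch.

    Each month re-reads all earlier months, so the work grows
    quadratically with the number of months (hypothetical stand-in
    for a per-iteration lookup over the full history).
    """
    return [sum(monthly_counts[: i + 1]) for i in range(len(monthly_counts))]


def cumulative_totals_fast(monthly_counts):
    """Carry a running total forward: one pass over the months."""
    return list(accumulate(monthly_counts))


if __name__ == "__main__":
    counts = [120, 80, 95, 140]  # made-up per-month counts, oldest first
    assert cumulative_totals_slow(counts) == cumulative_totals_fast(counts)
    print(cumulative_totals_fast(counts))  # prints [120, 200, 295, 435]
```

Both functions return the same totals; only the amount of repeated work differs, which is why the slow pattern gets worse with every month added to the history.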

If Wikistats changed to incremental updates (https://bugzilla.wikimedia.org/show_bug.cgi?id=46198), the issue would be moot.

Aklapper edited subscribers, added: Aklapper; removed: ezachte, wikibugs-l-list.

Closing this ticket as Wikistats version 1 is dead per https://stats.wikimedia.org/Wikistats_1_announcements.htm . In case this ticket is still a valid bug report or feature request for Wikistats 2, then please reopen. Thanks a lot!