On the largest dumps this routine can take hours; reworking it is not trivial, though.
Version: unspecified
Severity: minor
Status | Subtype | Assigned | Task
---|---|---|---
Declined | | None | T48208 Dump stats: collect monthly totals more efficiently
Declined | | None | T48198 Dump stats: switch to persistent stats rather than monthly regenerated stats
One of the design anomalies that dates back to an era when the English dump could be parsed in minutes rather than days ;-)
The current implementation is really inefficient (badly coded): for some metrics in WikiCountsOutput.pm the data are established month by month, and every iteration triggers a recursive lookup. The inefficiency grows with every added month, but speed improvements in hardware make it less urgent.
If Wikistats changed to incremental updates (https://bugzilla.wikimedia.org/show_bug.cgi?id=46198), the issue would be moot.
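To illustrate why the cost grows month by month, here is a minimal sketch in Python (not the actual Perl code in WikiCountsOutput.pm). It assumes a metric that is a running total over monthly counts; the function names and sample numbers are hypothetical. The first version recomputes each month's total from scratch, so the work grows quadratically with the number of months; the second carries the previous total forward, which is roughly what an incremental update would do.

```python
from itertools import accumulate


def cumulative_totals_recomputed(monthly_counts):
    """Recompute every month's running total from the start of history.

    Month i re-reads months 0..i, so total work is O(n^2) in the
    number of months. This mirrors the pattern described in the ticket,
    not the real implementation.
    """
    return [sum(monthly_counts[: i + 1]) for i in range(len(monthly_counts))]


def cumulative_totals_incremental(monthly_counts):
    """Carry the previous month's total forward: a single O(n) pass."""
    return list(accumulate(monthly_counts))


# Hypothetical monthly counts, for illustration only.
monthly_new_articles = [120, 95, 143, 110]

recomputed = cumulative_totals_recomputed(monthly_new_articles)
incremental = cumulative_totals_incremental(monthly_new_articles)

print(incremental)  # [120, 215, 358, 468]
```

Both functions return the same totals; only the amount of repeated work differs, which is why the gap widens as more months accumulate.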
Closing this ticket as Wikistats version 1 is dead per https://stats.wikimedia.org/Wikistats_1_announcements.htm . In case this ticket is still a valid bug report or feature request for Wikistats 2, then please reopen. Thanks a lot!