Page MenuHomePhabricator

ContributionReporting extension (Special:FundraiserStatistics) has been disabled
Closed, ResolvedPublic

Description

The full details are a bit unclear to me, but as far as I understand it, https://wikimediafoundation.org/wiki/Special:FundraiserStatistics was linked from reddit and due to the feature's implementation, it caused excessive database load through long-running queries. This excessive load eventually took down the main wiki sites and the extension was consequently disabled: http://wikitech.wikimedia.org/index.php?title=Server_admin_log&diff=40583&oldid=40582.

The extension should be optimized and re-enabled. There are already questions about the page's disappearance (cf. https://meta.wikimedia.org/w/index.php?title=Foundation_wiki_feedback&oldid=3109621#Why.2C_why.2C_why).

Until the extension is re-enabled, it would be nice if the link didn't display such a terrible error message. "You have requested an invalid special page." isn't accurate; the user has requested a disabled special page due to a special set of circumstances.

Related mailing list posts:


Version: unspecified
Severity: major

Details

Reference
bz32679

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 12:02 AM
bzimport set Reference to bz32679.

A simple stopgap solution would be to introduce a new right 'viewfrstats' and enable it only for registered users - this would prevent slashdotting while keeping the page working. While people are working on a real fix of performance issues, of course.

(In reply to comment #0)

The extension should be optimized and re-enabled. There are already questions
about the page's disappearance (cf.
https://meta.wikimedia.org/w/index.php?title=Foundation_wiki_feedback&oldid=3109621#Why.2C_why.2C_why).

Until the extension is re-enabled, it would be nice if the link didn't display
such a terrible error message. "You have requested an invalid special page."
isn't accurate; the user has requested a disabled special page due to a special
set of circumstances.

Hmmm, this part could be good for a separate bug report. Some sort of $wg variable to ACTUALLY 'disable' certain special pages by config, instead of resorting to uninstalling extensions, etc...

Indeed. From a european point of view (myself, Mark and Ariel), it was getting late, and for the Americans it was a holiday weekend.

Hence, disabling the extension outright was the simplest fix until a more appropriate time to evaluate and sort the extension, and then we knew it wouldn't do any more harm till it was completely assessed.

On the optimisation note, Domas gave Arthur and Ryan (Kaldari) reasonable instructions on how to fix it up (hopefully in the near term) so it would work a lot better under load. It has been known for a long time that the underlying database is poorly indexed and the alike.

It might just be mainly one path through the code, but Mark did see a lot of slow queries that seem to have originated from the "maximumsMax" code. Looking at the surrounding code all the other code paths seem to do similarly bad queries, so I'm not holding out much hope

From the top of my head, we don't have a "nice" way to mark special pages as disabled, so it's likely going to need a hack on that special page itself.

The rest of the extension may be fine to re-enable (the other special pages)

(In reply to comment #2)

Hmmm, this part could be good for a separate bug report. Some sort of $wg
variable to ACTUALLY 'disable' certain special pages by config, instead of
resorting to uninstalling extensions, etc...

Hah. Thanks for confirming that.

Doing it how the API can do it (replacing the module with a defined "ApiDisabled" module) seems sensible rather than $wgDisableSpecialPages

e.g

public function execute() {

		$this->dieUsage( "The ``{$this->getModuleName()}'' module has been disabled.", 'moduledisabled' );

}

Which then means we'd only need something like

$wgSpecialPages['ContributionHistory'] = 'DisabledSpecialPage';

I would love the ability to specify text that is shown to the user when a specific extension is disabled (so some sort of array of pages disabled vs additional messages for display).

https://wikimediafoundation.org/wiki/Special:FundraiserStatistics now returns the following error message:

This page is down for maintenance. Please check back later.

This is an improvement.

Which components are involved in the ContributionReporting extension and of those, can any be re-enabled immediately? Of those components with issues, what are the exact issues?

Roan: Can you post what you thought might reduce the number of queries here?

I miss the graphs. :-(

mladen.sablic wrote:

Second, Max Semenik proposal seems a good solution - just have the graph for registered users.

(In reply to comment #7)

Roan: Can you post what you thought might reduce the number of queries here?

For the public record:
(1) the cache timeout is 15 minutes for all fundraising campaigns, so we're recomputing the numbers for the 2007-2010 campaigns every 15 minutes. Since no one is gonna go back in time to donate to those campaigns, that's just a waste of resources. The cache timeout should be configurable per campaign and set to a high value (maybe not quite infinity, but 168h or something?) for the 2007-2010 campaigns and to 15 mins for the 2011 campaign
(2) the current code uses one query ('dailyTotals') to get all the data that goes into the bars, then uses another 6 (!) queries ('dailyTotalMax', 'yearlyTotalMax', etc.) to find the maximum for each column. These maxima are needed to scale the graphs correctly, but they can be computed much more easily from the data that's already been fetched from the first query (use max() or, if the array structure is too weird, roll your own loop that goes through all data points and computes the maxima)
(3) remove the CONVERT_TZ() stuff from the queries; the timezone stuff is broken anyway and can't easily be reenabled without breaking caching
(3b) stop using the timezone query parameter. The form element is disabled, but the query parameter still allows cache pollution to occur

In the current code, there are 7 queries per campaign (one to get the data, six to get the maxima) and 5 campaigns, so 7*5=35 queries are done per recache (i.e. every 15 minutes when the memcached entry expires). If #1 and #2 are both implemented, this would be down to one query.

The number of queries matters here, because every single query that we're talking about here is profoundly evil. I've included two samples below for shock value. Unfortunately, using these queries is necessary right now because we don't have a summary table. What Domas suggested we use in the longer term (and I support this suggestion) is either a statistics table that's updated by a cron job that runs the slow query hourly or daily (not ideal but not hard to do), or introduce a summary table with counts that CiviCRM would update along with the main public_reporting table (this is harder, but provides cheap live data).

The base query (to get the data) is:

SELECT
DATE_FORMAT(CONVERT_TZ(FROM_UNIXTIME(received), '+00:00', '$timezone'), '%Y-%m-%d'),
sum(converted_amount),
count(*),
avg(converted_amount),
max(converted_amount)
FROM public_reporting
WHERE received >= 'Nov 16 2011'
AND received <= 'Jan 15 2012'
AND converted_amount >= 1
AND converted_amount <= 10000
ORDER BY received
GROUP BY DATE_FORMAT(FROM_UNIXTIME(received), '+00:00', '$timezone'), '%Y-%m%d');

The queries to get the maxima look like (this one is for the highest daily average):

SELECT avg(converted_amount) as sum
FROM public_reporting
WHERE received >= 'Nov 16 2011'
AND received <= 'Jan 15 2012'
AND converted_amount >= 1
AND converted_amount <= 10000
ORDER BY sum DESC
GROUP BY DATE_FORMAT(FROM_UNIXTIME(received), '+00:00', '$timezone'), '%Y-%m%d');
LIMIT 1;

(In these queries, $timezone is the timezone requested by the user. This feature is currently semi-disabled: there is no form control for setting the timezone, so $timezone is '+0:00' unless the user sets a query parameter by hand (and pollutes the cache that way, oops!).)

(In reply to comment #9)

In the current code, there are 7 queries per campaign (one to get the data, six
to get the maxima)

Clearly, I suck at counting. There are 6 queries per campaign, one to get the data and five to get the maxima.

Bumping this for a status update.

I've re-enabled the FundraiserStats page on wikimediafoundation.org. Kaldari implemented a temporary solution which reduces the # of queries used down to one per campaign and are now caching data for past years for up to 1 week. This year's data is still cached for 15 minutes.

We're still working on a long-term solution which makes use of a maintenance script which runs periodically to populate summary tables with the necessary data. We will hopefully be releasing that sometime next week.

The long term solution was finished last year, but I forgot to close the bug.