
MediaWiki needs better throttling for expensive queries
Closed, Resolved · Public

Description

Currently, I believe the API has little throttling built in (if any). This can cause problems if people or organizations are hitting the API with expensive queries at a fast rate.

Bug filed in response to ExternalStorage issues from &rvprop=content|timestamp being batch queried from the API.

Some sensible throttling options should be implemented for expensive queries, with configuration variables to override the defaults for very robust or very weak servers.


Version: unspecified
Severity: minor

Details

Reference
bz18489

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 10:31 PM
bzimport set Reference to bz18489.
bzimport added a subscriber: Unknown Object (MLST).

I don't believe this is a general API issue, but rather a MediaWiki or Wikimedia issue. Batch-querying rvprop=content is one way to cause an ES hiccup, but there are others, like hitting Special:Export at the same rate; the latter would actually be much more effective at killing ES servers, because it'll happily export an unlimited number of revisions of a single page (and an unlimited number of pages too, IIRC, so you could potentially ask Special:Export to try to export the entire wiki at once), whereas the API is limited to 50 content fetches per request (500 for bots/sysops).

Bryan.TongMinh wrote:

INVALID, not an API bug.

Re-opening with an expanded scope. Changed summary from "API needs better throttling for expensive queries" to "MediaWiki needs better throttling for expensive queries." This is definitely an issue that needs to be addressed at some point.

Bringing in performance-savvy contributors to weigh in on this old request.

Marking as Lowest, since nobody seems to be working or planning to work on this currently.

MediaWiki has various rate limits (so-called "ping" limits). These are applied, for example, to write actions (like edits), but also to some other expensive actions, such as generating a thumbnail on a cache miss.

If there are specific endpoints that are known to be expensive and that we want to limit for load reasons (the latter isn't applicable to all slow things), then this mechanism could be used there as needed.
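
For reference, the ping limiter is configured through the $wgRateLimits setting in LocalSettings.php. The sketch below shows its general shape; the action key and the numbers are illustrative assumptions, not recommended values.

```php
<?php
// Minimal sketch of ping-limit configuration in LocalSettings.php.
// Each action key maps user groups to [ maximum actions, window in seconds ].
// The 'edit' figures below are illustrative, not recommended values.
$wgRateLimits = [
	'edit' => [
		'ip'   => [ 8, 60 ],  // anonymous editors: 8 edits per minute per IP
		'user' => [ 90, 60 ], // logged-in users: 90 edits per minute
	],
];
```

Code paths that want to enforce such a limit check it against the user's ping limiter (historically User::pingLimiter( 'edit' )), which returns true once the limit is exceeded.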

For the concern of general load and concurrency (not individually slow actions), there is also room for higher-level throttling at the traffic layer. At WMF we do this in one or more of the Nginx/ATS/Varnish layers (I forget which one).

This is not just an optimisation for WMF. The general concern of a single user occupying many workers (even with relatively simple requests like page views) is not something that can be solved well at the application layer: by the time you've dedicated an Apache worker and started up a PHP process (even a well-optimised one like MW), and then let it do rate limiting by tracking a counter e.g. in Memcached, you've essentially already incurred the cost you wanted to prevent.
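
To make that concrete, here is a rough sketch of such an application-level counter (plain PHP with the Memcached extension; the key scheme and threshold are made up). Everything in it runs only after an Apache worker and a PHP process have already been committed to the request, which is exactly the cost described above.

```php
<?php
// Naive per-IP counter in Memcached. By the time this executes, an Apache
// worker and a PHP interpreter are already tied up serving the request,
// so most of the cost we wanted to avoid has already been paid.
$memc = new Memcached();
$memc->addServer( '127.0.0.1', 11211 );

$key = 'ratelimit:' . $_SERVER['REMOTE_ADDR']; // hypothetical key scheme
$memc->add( $key, 0, 60 );                     // open a 60-second window if absent
$count = $memc->increment( $key );

if ( $count !== false && $count > 30 ) {       // 30 requests/minute: arbitrary
	http_response_code( 429 );
	exit( 'Too many requests' );
}
```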

I'm closing this, as the load concern is already solved outside MediaWiki for WMF, and for third parties it should be solved there as well, e.g. with a basic IP-based throttle in Nginx or similar. For the expensive backend cost (past Apache, or for long-running web requests), we have the existing ping limiting and PoolCounter systems, which seem to work well. If there are specific incidents that call for such a measure to be added somewhere, a specific task should be filed instead.
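
For third-party installs, the concurrency side of this is what PoolCounter covers. Below is a minimal sketch of a LocalSettings.php configuration, assuming the PoolCounter extension and its poolcounterd daemon are installed; the pool name is the standard ArticleView pool, but the numbers and the client settings are illustrative and should be checked against the extension's documentation.

```php
<?php
// Sketch of a PoolCounter configuration in LocalSettings.php, assuming the
// PoolCounter extension is installed and poolcounterd is running locally.
// 'ArticleView' limits how many workers may re-parse the same page at once.
$wgPoolCounterConf = [
	'ArticleView' => [
		'class'    => 'PoolCounter_Client', // class provided by the extension
		'timeout'  => 15, // seconds a request waits for a work slot
		'workers'  => 2,  // concurrent workers allowed per key
		'maxqueue' => 50, // requests queued beyond this are rejected outright
	],
];
$wgPoolCountClientConf = [
	'servers' => [ '127.0.0.1' ], // assumed daemon address
	'timeout' => 0.5,
];
```

The IP-based throttle mentioned above (e.g. Nginx's limit_req module) would sit in front of all of this, so that excess requests are rejected before they ever reach Apache.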