
max_user_connections too low on Wikimedia Labs
Closed, Resolved (Public)

Description

max_user_connections for tools on labs is 10 by default. This is too low for the RENDER tools. Please increase the limit to something like 512 or remove it for SQL users 50380g50613 and p50380g50454, i.e. render and render-tests.

We need to publish the tools for the evaluation phase as soon as possible, so this is urgent.

A limit of 10 will be too low for many projects. It should be increased for all tools. On the toolserver, there was no limit for MMPs.


Version: unspecified
Severity: blocker

Details

Reference
bz49872

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 1:41 AM
bzimport added a project: Toolforge.
bzimport set Reference to bz49872.

That is hardcoded in puppet:

modules/mysql_multi_instance/manifests/instance.pp

'max_user_connections'        => 10,

Adding in CC a few ops people that edited that file (Peter, Asher and Marc-André).

The role::db::labsdb class calls:

mysql_multi_instance::instance { $instances_keys :
  instances => $instances
}

mysql_multi_instance::instance should have a new 'max_user_connections' parameter defaulting to '10' that would let one override it in role::db::labsdb.

Permanent fix is in puppet; a runtime directive is also in place (it will last until the next DB restart).

afeldman wrote:

For the record, I think the new limit of 500 is unwise, and will likely impact performance and availability at times. 50 could be ok, instead of the 10 I originally chose. Any labs project that needs more connections to a single db should probably use mysql proxy with a limited connection pool or something similar.
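For illustration, the "limited connection pool" idea could look roughly like the sketch below. It is not mysql proxy; it only shows the principle of capping concurrent connections inside a tool. The class name `BoundedConnectionPool` and the `connect` factory are hypothetical, and a real tool would also need to discard broken connections rather than reuse them.

```python
import threading
from contextlib import contextmanager


class BoundedConnectionPool:
    """Caps how many connections one tool holds at once (hypothetical helper)."""

    def __init__(self, connect, max_connections=10):
        self._connect = connect  # assumed factory that returns a new connection
        self._slots = threading.BoundedSemaphore(max_connections)
        self._idle = []          # connections waiting to be reused
        self._lock = threading.Lock()

    @contextmanager
    def connection(self, timeout=None):
        # Wait for a free slot instead of opening a connection beyond the cap.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("no free connection slot")
        try:
            with self._lock:
                conn = self._idle.pop() if self._idle else None
            if conn is None:
                conn = self._connect()
            try:
                yield conn
            finally:
                # Keep the connection for reuse; health checks are omitted here.
                with self._lock:
                    self._idle.append(conn)
        finally:
            self._slots.release()
```

With `max_connections` set below the server-side `max_user_connections`, bursts of web requests would queue inside the tool instead of failing at the database.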

You need to take into account that the connection limit also affects web requests, and several users might do requests at the same time. The Article List Generator opens several connections, and several programs in the Toolkit as well. We need to be able to have several people use our tools, so the raised limit for render and render-tests isn't really optional. And I am pretty sure it is the same for several other tools.

If you don't want to raise the limit for all tools, I suggest you set a default value of, say, 20, and create a simple process to raise the limit per-tool if necessary.
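A minimal sketch of what such a per-tool process might generate, assuming the limit is applied per account with MySQL's `GRANT ... WITH MAX_USER_CONNECTIONS` resource option (a nonzero per-account value takes precedence over the global variable). The override table, the `'%'` host and the function names are assumptions for illustration; only the account name and the values 20 and 512 come from this task.

```python
DEFAULT_LIMIT = 20  # default value suggested in the comment above

# Hypothetical override table: accounts that were granted a higher limit.
OVERRIDES = {"p50380g50454": 512}


def limit_for(user, overrides=OVERRIDES, default=DEFAULT_LIMIT):
    """Return the per-account connection limit, falling back to the default."""
    return overrides.get(user, default)


def grant_statement(user, host="%"):
    """Render the SQL that would apply the limit to one account."""
    # The name is interpolated into SQL text, so accept only plain identifiers.
    if not user.isalnum():
        raise ValueError("unexpected characters in account name: %r" % user)
    return "GRANT USAGE ON *.* TO '{}'@'{}' WITH MAX_USER_CONNECTIONS {};".format(
        user, host, limit_for(user)
    )


print(grant_statement("p50380g50454"))
```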

In any case, I never requested to raise the limit for all accounts by default, so don't blame us if performance is affected by somebody abusing the new default limit!

In any case, I never requested to raise the limit for all accounts by default

  • to something as high as 512

so don't blame us if performance is affected by somebody abusing the new default limit!

I put in an arbitrarily high limit so that (a) users are generally not limited in the number of connections but (b) no runaway user can consume all of them.

I believe that tools using too many connections is a human problem, not a technical one. :-)

Something must have reset the variable to its default value. But only for some wikis:

local-render@tools-login:~$ sql enwiki_p 'show variables like "max_user_connections"'
Variable_name Value
max_user_connections 512
local-render@tools-login:~$ sql dewiki_p 'show variables like "max_user_connections"'
Variable_name Value
max_user_connections 10
local-render@tools-login:~$ sql frwiki_p 'show variables like "max_user_connections"'
Variable_name Value
max_user_connections 10

Can somebody please set it to 512 for all wikis permanently?

Sean, I agree with the request in principle; given the use patterns it'd be reasonable to up the limit by a few orders of magnitude (just low enough that no single tool can use up everything) so that bursts of activity don't hit the limit.

It was set (by hand) to 512 in the past with no issues.

This has been un-fixed for more than a month now. We are getting angry emails from users because the article list generator doesn't work for anything except enwiki. Please fix it.

Should now be re-fixed back to 512. https://gerrit.wikimedia.org/r/#/c/106254/

Need to monitor for a while to see peak connection usage per user. From that we can pick a sane cut-off.

I wish mariadb had a "95th percentile max_connections". /That/ would be useful. :-)
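MariaDB has no such setting, but the same idea can be applied offline to whatever peak-connection samples the monitoring collects. The sketch below uses the nearest-rank percentile; the function name and the sample values are made up for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical per-user peak connection counts, one per sampling interval.
peaks = [3, 4, 4, 5, 6, 6, 7, 8, 9, 40]
print(percentile(peaks, 95))
```

A cut-off derived this way would still need some headroom, since the point of the limit is to tolerate bursts while stopping runaway users.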

(In reply to comment #14)

[...]
Need to monitor for a while to see peak connection usage per user. From that we can pick a sane cut-off.

Do we now have some data to close this bug?

Johannes: Any recent complaints from users because the article list generator doesn't work?

Is there any work left to do here or can this be closed (see comment 14)?

Regarding monitoring:

We have userstat=ON on the labsdb instances to maintain user_statistics and client_statistics tables in information_schema. The data is also being logged on db1044 for time-series reporting, but isn't exposed anywhere yet.

http://www.percona.com/doc/percona-server/5.5/diagnostics/user_stats.html

Unfortunately, the user_statistics.concurrent_connections field isn't updated due to a bug. However, by logging user_statistics.total_connections we can still identify spikes.
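Because total_connections is a cumulative counter, spikes show up as large differences between consecutive log entries. A rough sketch of that calculation follows, assuming the logged data is available as time-ordered (label, total_connections) pairs for one user; the function name, the threshold and the sample data are hypothetical.

```python
def connection_spikes(samples, threshold):
    """Return (label, new_connections) for intervals that exceed the threshold.

    samples: time-ordered (label, total_connections) pairs for one user,
    where total_connections is a cumulative counter.
    """
    spikes = []
    for (_, prev), (label, cur) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. a restart); count from zero.
        delta = cur - prev if cur >= prev else cur
        if delta > threshold:
            spikes.append((label, delta))
    return spikes


# Made-up log entries for illustration.
log = [("t0", 100), ("t1", 110), ("t2", 700), ("t3", 705), ("t4", 3)]
print(connection_spikes(log, threshold=100))
```

This measures new connections per interval, not concurrent connections, so it only approximates how close a user comes to max_user_connections.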

So far nobody has abused max_user_connections=512, and, as Coren implied earlier, we can continue to be lenient until forced to do otherwise. Closing.