Separate replica datacenter causes high query latency
Closed, Resolved (Public)

Description

The database replicas are in a different datacenter than Tools Labs. The resulting latency (26ms per query) has a major performance impact on tools that need to make many queries. This prevents the migration of many Toolserver tools (notably crosswiki tools).

This is a tracking ticket for plans to combine the datacenters to address this, mentioned here: http://lists.wikimedia.org/pipermail/labs-l/2013-September/001574.html

Repro

The following scripts reproduce the latency issue. They open a connection to each database server, then execute a sample query for each wiki ("use xxwiki").

Toolserver: https://toolserver.org/~pathoschild/accounteligibility/test.php | source: http://pastebin.com/ctUQsXZm

Tools Labs: http://tools.wmflabs.org/pathoschild-contrib/accounteligibility/test.php | source: http://pastebin.com/Xi3abGFP

This runs in 0.3 seconds on the Toolserver, but takes a whopping 23 seconds on Tools Labs. The scripts show the profiling results at the bottom of the page (click the [+] next to "Page generated in XX seconds").
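
As a rough sanity check on these numbers, the sketch below models total time as the number of sequential round trips multiplied by the per-query latency. It assumes server-side execution time is negligible and that queries are issued one after another; the function names are illustrative and are not taken from the test scripts.

```python
def estimated_total_seconds(num_queries, round_trip_ms):
    """Estimate wall-clock time when every query waits for one full round trip."""
    return num_queries * round_trip_ms / 1000.0


def implied_round_trips(total_seconds, round_trip_ms):
    """Invert the model: how many sequential round trips would explain a total time?"""
    return round(total_seconds * 1000.0 / round_trip_ms)


if __name__ == "__main__":
    # Figures quoted in this ticket: 26 ms per query, 23 seconds in total.
    print(implied_round_trips(23, 26))
    print(estimated_total_seconds(implied_round_trips(23, 26), 26))
```

Under this simple model, 23 seconds at 26 ms per query corresponds to roughly 885 sequential round trips.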


Version: unspecified
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=58890

Details

Reference
bz55929

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:19 AM
bzimport added a project: Cloud-VPS.
bzimport set Reference to bz55929.

As a note, most tools can be tweaked to make that issue mostly irrelevant by batching row I/O (fetching or inserting more than one row at a time) since the lag only affects roundtrips.
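
A minimal sketch of the batching idea, assuming a DB-API style connection. It uses an in-memory SQLite database as a stand-in for the replicas, so there is no real network latency here; the `CountingConnection` wrapper, the `users` table, and the function names are hypothetical and only serve to count how many statements (round trips) each approach issues.

```python
import sqlite3


class CountingConnection:
    """Wraps a connection and counts execute() calls as a proxy for round trips."""

    def __init__(self, conn):
        self.conn = conn
        self.round_trips = 0

    def execute(self, sql, params=()):
        self.round_trips += 1
        return self.conn.execute(sql, params)


def fetch_one_by_one(db, ids):
    """One query per row: the round-trip count grows with len(ids)."""
    rows = []
    for row_id in ids:
        cursor = db.execute("SELECT id, name FROM users WHERE id = ?", (row_id,))
        rows.extend(cursor.fetchall())
    return rows


def fetch_batched(db, ids, batch_size=100):
    """One query per chunk of ids: the round-trip count is ceil(len(ids) / batch_size)."""
    rows = []
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        placeholders = ", ".join("?" for _ in chunk)
        sql = f"SELECT id, name FROM users WHERE id IN ({placeholders}) ORDER BY id"
        rows.extend(db.execute(sql, chunk).fetchall())
    return rows


def make_demo_db(num_rows=250):
    """Build a throwaway in-memory table (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    # executemany inserts all rows through a single call rather than one call per row.
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(i, f"user{i}") for i in range(1, num_rows + 1)],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    ids = list(range(1, 251))
    slow, fast = CountingConnection(conn), CountingConnection(conn)
    fetch_one_by_one(slow, ids)
    fetch_batched(fast, ids)
    print(slow.round_trips, fast.round_trips)
```

Both functions return the same rows. Only the number of statements sent differs, which is the quantity the cross-datacenter latency multiplies.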

The latest timeline for this is mid-January[1], but all Toolserver accounts have been set to expire in early January[2]. Tool authors waiting for this ticket should make sure to renew their accounts before then.

[1]: http://lists.wikimedia.org/pipermail/labs-l/2013-November/001845.html
[2]: http://lists.wikimedia.org/pipermail/toolserver-l/2013-November/006379.html

What has become of this issue since you moved Tool Labs in December? Is it still valid?

(In reply to comment #3)

What has become of this issue since you moved Tool Labs in December? Is it still valid?

Yes, except that it has become 50% slower in the meantime:

Page generated in 31.614 seconds. [+]

fetch wikis: 0.183 (0.58%)
connect to each slice: 0.756 (2.39%)
switch to each wiki DB: 30.669 (97.01%)
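
For tool authors who want a comparable per-phase breakdown in their own code, here is a minimal sketch of a phase timer that produces a report in the same shape. It is not the code used by the test script; the `PhaseProfiler` class and its method names are hypothetical.

```python
import time
from contextlib import contextmanager


class PhaseProfiler:
    """Records how long each named phase takes and reports its share of the total."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock
        self.phases = []  # list of (name, seconds) in execution order

    @contextmanager
    def phase(self, name):
        start = self.clock()
        try:
            yield
        finally:
            self.phases.append((name, self.clock() - start))

    def report(self, total_seconds):
        lines = [f"Page generated in {total_seconds:.3f} seconds."]
        for name, seconds in self.phases:
            share = seconds / total_seconds * 100
            lines.append(f"{name}: {seconds:.3f} ({share:.2f}%)")
        return "\n".join(lines)


if __name__ == "__main__":
    profiler = PhaseProfiler()
    started = time.perf_counter()
    with profiler.phase("switch to each wiki DB"):
        time.sleep(0.05)  # placeholder for the real work being measured
    print(profiler.report(time.perf_counter() - started))
```

Feeding the three timings quoted above into `report` with a total of 31.614 seconds gives the same percentages as the listing.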

The latest timeline is now mid-March; the last blocker is the OpenStack setup by a contractor. See discussion at https://meta.wikimedia.org/wiki/IRC_office_hours/Office_hours_2014-01-23

Tool authors can now migrate to the new infrastructure[1], which resolves this issue.

[1]: http://lists.wikimedia.org/pipermail/labs-l/2014-March/002160.html

Okay, closing then. (Well, FIXED doesn't quite fit it, but I think it's appropriate :-).)

Pathos, please mark this VERIFIED when you've migrated your tools and confirmed they run as fast as they should.

(In reply to Tim Landscheidt from comment #7)

Okay, closing then. (Well, FIXED doesn't quite fit it, but I think it's appropriate :-).)

Well, actually it is fixed, as the tool has been optimized, by me, to perform quickly on pmtpa, and should run even faster once I migrate it to eqiad.

(In reply to Cyberpower678 from comment #9)

(In reply to Tim Landscheidt from comment #7)

Okay, closing then. (Well, FIXED doesn't quite fit it, but I think it's appropriate :-).)

Well, actually it is fixed, as the tool has been optimized, by me, to perform quickly on pmtpa, and should run even faster once I migrate it to eqiad.

And I just realized this went in the wrong bug. Whoops.