Page MenuHomePhabricator

Provide replication lag as a database function
Closed, DeclinedPublic

Description

There should be a function tools.replag() (can MySQL tie functions to a database?) that provides the replication lag in seconds, and maybe another function tools.lastrepupdate() that returns "NOW() - replag()".

It should be a "SECURITY DEFINER" function that queries "SHOW SLAVE STATUS.Seconds_Behind_Master" (cf. mediawiki/core:includes/db/DatabaseMysql.php). This is preferable to the Toolserver method of querying recentchanges as it gives a more accurate picture of the actual replication lag. http://stackoverflow.com/questions/1570776/how-to-access-seconds-behind-master-from-sql has some information that this query should be possible, though not trivial.


Version: unspecified
Severity: normal

Details

Reference
bz48628

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 1:18 AM
bzimport added a project: Toolforge.
bzimport set Reference to bz48628.

I wondered what effect multi-hop replication like in LabsDB (production DBs => sanitizer => LabsDB) has on Seconds_Behind_Master, and Sean said that for this reason WMF uses http://www.percona.com/doc/percona-toolkit/2.2/pt-heartbeat.html. So a database function as proposed in this bug should probably query the heartbeat record instead.

coren removed coren as the assignee of this task.Mar 25 2015, 6:45 PM
coren triaged this task as Low priority.
coren set Security to None.
Restricted Application added a subscriber: Aklapper. · View Herald Transcript

This will be possible very soon due to: T111266, and that may even be a duplicate of this.

jcrespo claimed this task.

Already provided by T71463 (both on labs and on production). If that needs an web API/programing binding, please have a look at https://tools.wmflabs.org/replag/ and T106151.

scfc changed the task status from Resolved to Declined.Apr 24 2016, 3:42 AM

@scfc Why declined? the functionality has been implemented on the database. Do you really need a "MySQL function" instead of a table?

@jcrespo: Just trying to be accurate. If someone looks at this task "Provide replication lag as a database function" with the status "Resolved", they expect such a database function to exist, and the task description would suggest that it directly reflects internal MySQL status. T71463 is much more useful for real-life tools, but is something different than this task.

The important thing that worries me, @scfc, are you satisfied with that implementation (even if it does not exactly follow this ticket)?

Yes, very much, it provides an easy, standardized and documented way to query the replication lag. ¡Muchas gracias!