
Add support for deploying per-datacenter config variances
Closed, ResolvedPublic

Description

Author: afeldman

Description:
Once apaches are up in eqiad, we need the ability to deploy different versions of certain MediaWiki config files to hosts in each datacenter.

For example, files such as db.php or mc.php could be split into db.php-eqiad, db.php-pmtpa, mc.php-eqiad, and mc.php-pmtpa.

When running a scap, the right version should be copied to hosts in the intended datacenter. The same support should be in scripts such as sync-file, where "sync-file wmf-config/mc.php" would recognize mc.php has dc variants, and deploy both versions of the file.
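The variant lookup described above could be sketched roughly as follows, assuming the `<file>-<datacenter>` naming proposed in this task. The function name `find_dc_variants` and the `DATACENTERS` tuple are hypothetical, and this is not how sync-file is actually implemented; Python is used only for illustration.

```python
from pathlib import Path

# Assumed datacenter names, taken from the examples in this task.
DATACENTERS = ("eqiad", "pmtpa")


def find_dc_variants(path, datacenters=DATACENTERS):
    """Map each datacenter to the file that should be deployed there.

    If "<path>-<dc>" exists for a datacenter, that variant is chosen;
    otherwise the plain file is used as the fallback.
    """
    base = Path(path)
    plan = {}
    for dc in datacenters:
        variant = base.with_name(f"{base.name}-{dc}")
        plan[dc] = variant if variant.exists() else base
    return plan
```

With this shape, a sync script would call `find_dc_variants("wmf-config/mc.php")` once and then copy each entry to the hosts in the matching datacenter, so files without variants keep working unchanged.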

An alternate option would be to make MediaWiki itself datacenter-aware, and have arrays like $wgMemCachedServers = array( 0 => '10.xxx', .. ) become $wgMemCachedServers = array( 'pmtpa' => array( 0 => '10.xxx', .. ), 'eqiad' => array( 0 => '10.xxx', .. ) ).
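To illustrate the datacenter-keyed structure, here is a minimal sketch in Python rather than PHP. The host names are placeholders (the real addresses are elided in this task), and `MEMCACHED_SERVERS_BY_DC` and `servers_for` are hypothetical names, not MediaWiki settings or functions.

```python
# Placeholder hosts only; the real server lists are not given in this task.
MEMCACHED_SERVERS_BY_DC = {
    "pmtpa": {0: "pmtpa-host-0.example", 1: "pmtpa-host-1.example"},
    "eqiad": {0: "eqiad-host-0.example", 1: "eqiad-host-1.example"},
}


def servers_for(datacenter, servers_by_dc=MEMCACHED_SERVERS_BY_DC):
    """Return the server map for one datacenter, failing loudly if unknown."""
    try:
        return servers_by_dc[datacenter]
    except KeyError:
        raise ValueError(
            f"no memcached servers configured for {datacenter!r}"
        ) from None
```

Raising on an unknown datacenter, instead of silently falling back to another site's servers, is a design choice of this sketch; a host with a misconfigured datacenter name would then fail visibly.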


Version: unspecified
Severity: normal

Details

Reference
bz39082

Event Timeline

bzimport raised the priority of this task to Unbreak Now! (Nov 22 2014, 12:59 AM)
bzimport set Reference to bz39082.

Way back when, we varied config based on $cluster, but that's kind of a PITA...so I like the idea of doing cluster-specific config instead.

Yeah, this was back in the yaseo days ;-)

This is already done, isn't it? Antoine set it up to support labs, using the same method we used for yaseo. We have db-wmflabs.php, mc-wmflabs.php, CommonSettings-wmflabs.php, etc.

The new configuration setup has two axes: realm and cluster. Realm is for labs/production, cluster is for pmtpa/eqiad. The realm is configured with /etc/wikimedia-realm. When we had yaseo, the cluster was configured with /etc/wikimedia-cluster, and I assume we'll be doing that again once we have more than one cluster.
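As a rough sketch of the two-axis lookup, the following reads a marker file for each axis and then picks the most specific config file that exists. The helper names, the default values, the exact contents of the marker files, and the candidate order are all assumptions made for illustration; this is Python, not the actual wmf-config code.

```python
from pathlib import Path


def read_marker(path, default):
    """Return the stripped contents of a marker file, or default if it is
    missing, unreadable, or empty."""
    try:
        value = Path(path).read_text().strip()
    except OSError:
        return default
    return value or default


def realm_specific_filename(filename, realm, cluster):
    """Return the most specific existing variant of filename.

    Candidates are tried from most to least specific (an assumed order):
    base-realm-cluster.ext, base-realm.ext, base-cluster.ext, base.ext.
    """
    p = Path(filename)
    candidates = [
        p.with_name(f"{p.stem}-{realm}-{cluster}{p.suffix}"),
        p.with_name(f"{p.stem}-{realm}{p.suffix}"),
        p.with_name(f"{p.stem}-{cluster}{p.suffix}"),
    ]
    for candidate in candidates:
        if candidate.exists():
            return candidate
    return p
```

A caller might use `read_marker("/etc/wikimedia-realm", "production")` and `read_marker("/etc/wikimedia-cluster", "pmtpa")` to obtain the two values, so that hosts without the marker files keep loading the plain production files.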

Chad / Asher: What is the actual status on this, as Tim implies that this might be already done? Do you agree?
(Setting tentative "High" priority as it's blocking bug 39106 which also has high priority.)

afeldman wrote:

It isn't clear to me that it is, as I don't think the current production deploy scripts actually support the new configuration setup. We now have mw servers up in eqiad, and I think actively deploying an eqiad-specific config to them is the measure of completion.

https://gerrit.wikimedia.org/r/30784 brings /etc/wikimedia-cluster back to life; the file had been dropped from the Debian package wikimedia-base. That would let us detect which cluster we are running on.

File /etc/wikimedia-cluster has been renamed to /etc/wikimedia-site. Gerrit change #30784 has been merged, so the file should be available soon on all servers.

Gerrit change #30792 adjusts wmf-config to read the datacenter from /etc/wikimedia-site, and changes the various switches to either vary based on realm (labs/production) or add a case for eqiad. It also renames the variable $cluster to $datacenter, based on IRC discussion.

After looking at bug 41133, it looks like it's best to fix things for both bugs all at once. So the change above is now abandoned, and the new fix for this bug is spread across two changesets:

Gerrit change #32167 has the changes to operations/mediawiki-config
Gerrit change #32168 has the changes to operations/mediawiki-multiversion

I'm bumping this to highest priority, as bug 39082 (blocked by this) also has highest priority.

I believe this is waiting for review by Antoine, so assigning to him.

I forgot to update. This is still a work in progress; a few changes have been deployed, such as updating the WikimediaMaintenance extension and getting rid of the legacy $cluster variable.

Tim has merged the remaining changes, so I guess it is more or less fixed.

I have checked with Brad: there is no point in keeping this around. The remaining changes have been deployed in production and beta. Both sites are still up, so that works for me.

There must be some minor glitches remaining; we will open bugs as we encounter them.

afeldman wrote:

How do I use the new system to specify pmtpa- and eqiad-specific configuration, such as for memcached? Is there documentation anywhere?

I don't see anything cluster-specific in the prod wmf-config files, so while the code may be in prod, I don't think it has actually been tested there yet. I'm reopening this until it has been demonstrated in production.

The doc change has been merged.