
Provide dumps of wikitech.wikimedia.org
Closed, ResolvedPublic

Description

I looked at http://dumps.wikimedia.org/backup-index.html for "labs" or "wikitech", but didn't find anything. The wiki at wikitech.wikimedia.org should have occasional dumps/backups.

I believe as wikitech is outside of the cluster, this wouldn't fall under dumps.wikimedia.org, per se, but it should be simple enough to use the same scripts and host the dumps somewhere.

Maybe do a dump every six months or something?


Version: unspecified
Severity: enhancement

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 22 2014, 1:56 AM
bzimport set Reference to bz52170.

There are daily dumps:

https://wikitech.wikimedia.org/dumps/

The directory doesn't currently allow listing, but the file format is:

labswiki-<year><month><day>.xml.gz
labswiki-<year><month><day>-images.tar.gz

Here the year is four digits, and the month and day are two digits each.
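
For anyone scripting against this, the naming scheme can be sketched in Python as below. This is only an illustration, not an existing tool. It assumes the date parts are zero-padded and concatenated with no separators, which is how the pattern above reads. The function names are made up for this example.

```python
from datetime import date

# Base URL as given in this comment.
BASE_URL = "https://wikitech.wikimedia.org/dumps/"


def dump_filenames(day):
    """Return the (XML dump, image tarball) filenames for one day.

    Assumption: <year><month><day> means a zero-padded YYYYMMDD stamp.
    """
    stamp = day.strftime("%Y%m%d")
    return (
        "labswiki-{}.xml.gz".format(stamp),
        "labswiki-{}-images.tar.gz".format(stamp),
    )


def dump_urls(day):
    """Return the full URLs for both files of one day's dump."""
    return tuple(BASE_URL + name for name in dump_filenames(day))


if __name__ == "__main__":
    for url in dump_urls(date.today()):
        print(url)
```

Whether a given day's files actually exist or are downloadable is a separate question; the sketch only builds the names.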

I may be able to just transfer them to dumps occasionally via cron.

(In reply to comment #1)

There are daily dumps:

https://wikitech.wikimedia.org/dumps/

Oh, awesome! :-)

I may be able to just transfer them to dumps occasionally via cron.

Or even just a simple explanation (basically duplicating comment 1 of this bug) somewhere on dumps.wikimedia.org would be sufficient.

Unfortunately those files (silver.wikimedia.org:/a/backup/public/labswiki-*.gz) appear to be owned by root, which I'm guessing is why I can't download them from the provided URL.
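
One way to confirm or rule out the ownership guess is to check whether the files are readable by "other" users, since root ownership alone does not block a download if the mode allows reading. The following is a minimal sketch, assuming it is run on the host that holds the files. The helper name is hypothetical, and it does not cover access rules enforced by the web server configuration.

```python
import glob
import os
import stat


def is_world_readable(path):
    """Return True if the file's mode grants read access to 'other' users."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IROTH)


if __name__ == "__main__":
    # Path pattern taken from the comment above.
    for path in sorted(glob.glob("/a/backup/public/labswiki-*.gz")):
        status = "readable" if is_world_readable(path) else "NOT readable"
        print("{}: {} by others".format(path, status))
```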

Also, wikitech's /dumps directory is protected by Require host wikitech-static.wikimedia.org: https://github.com/wikimedia/operations-puppet/blob/acacf97e2df962fef83487a461f3559fa07e4d6f/templates/apache/sites/wikitech.wikimedia.org.erb#L97

Removing that restriction would resolve this task and T101803.

Should we host the latest dump on dumps.wm.org?

Restricted Application added a subscriber: Matanya.

Maybe. We might need to keep in mind that you can only log in to the database as wikiadmin from silver (i.e. it won't work from tin or other appservers).

Change 274405 had a related patch set uploaded (by Dzahn):
wikitech: open up dumps directory to public

https://gerrit.wikimedia.org/r/274405

Dzahn raised the priority of this task from Lowest to Low. Mar 2 2016, 3:18 PM

Should we host the latest dump on dumps.wm.org?

We can open up the dumps directory directly on silver; see the patch above. We can also set up an rsync over to dataset1001/dumps if we want them on dumps.wm.org. Why not both? *shrug*
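
As a rough illustration of the rsync idea, a cron-driven script could pick the newest XML dump by its date stamp and build the rsync invocation. This is a hedged sketch only. The source directory comes from an earlier comment, the destination string is a placeholder rather than a real rsync module or path, and the function names are hypothetical.

```python
import re

# Matches the XML dump name only; the image tarball is deliberately skipped.
DUMP_RE = re.compile(r"^labswiki-(\d{8})\.xml\.gz$")


def latest_dump(filenames):
    """Return the newest labswiki XML dump name, or None if there is none."""
    dated = []
    for name in filenames:
        match = DUMP_RE.match(name)
        if match:
            # YYYYMMDD stamps sort correctly as plain strings.
            dated.append((match.group(1), name))
    return max(dated)[1] if dated else None


def rsync_command(source_dir, filename, destination):
    """Build an rsync argument list (archive mode) for a single dump file."""
    source = "{}/{}".format(source_dir.rstrip("/"), filename)
    return ["rsync", "-a", source, destination]


if __name__ == "__main__":
    names = ["labswiki-20160301.xml.gz", "labswiki-20160302.xml.gz"]
    newest = latest_dump(names)
    # The destination below is a placeholder, not a known target.
    print(rsync_command("/a/backup/public", newest, "DESTINATION_PLACEHOLDER"))
```

The argument list can be passed to subprocess.run once a real destination is agreed on.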

@Andrew @yuvipanda Any thoughts on the restriction on /a/backup/public? Please see https://gerrit.wikimedia.org/r/#/c/274405/1 and let me know whether you like https://gerrit.wikimedia.org/r/#/c/274402/ too; it is a dependency right now.

and also, wikitech's /dumps directory is protected by Require host wikitech-static.wikimedia.org

So does anyone know why that protection was added, when the comments from 2013 are all about public dumps that already existed?

Change 274405 merged by Dzahn:
wikitech: open up dumps directory to public

https://gerrit.wikimedia.org/r/274405

Mentioned in SAL [2016-03-03T00:19:54Z] <mutante> making wikitech dumps available for T54170

Change 274609 had a related patch set uploaded (by Dzahn):
wikitech-apache: enable (fancy) indexing for dumps dir

https://gerrit.wikimedia.org/r/274609

Change 274609 merged by Dzahn:
wikitech-apache: enable (fancy) indexing for dumps dir

https://gerrit.wikimedia.org/r/274609