Import old revisions from dump to restore missing revisions
Closed, Duplicate (Public)

Description

Restore revisions lost in T39591.


Version: unspecified
Severity: major

Details

Reference
bz39008

Event Timeline

bzimport raised the priority of this task to Medium (Nov 22 2014, 12:55 AM).
bzimport set Reference to bz39008.
bzimport added a subscriber: Unknown Object (MLST).

I made an XML file to import (extracted from zhwiki-20120613-pages-meta-history.xml), available at http://toolserver.org/~liangent/-/FAC.xml.gz

Bumping importance, as this is data loss.

I don't see why we need 2 bugs for the same issue..

Wondering who could take a look at this, CC'ing some folks.

(In reply to comment #3)
> I don't see why we need 2 bugs for the same issue..

I think we want to find the cause in bug 37591, but since that doesn't happen very often, it doesn't need high priority. This bug is resolved once the dump is successfully imported. In the meantime, users can't view the missing revisions, so this bug should have the higher priority.

Is there any reason we can't just use that XML with importDump?

Ughh... so the page ids are different now than they were at the time of the bug report (I checked this on the production db). The old text entries are indeed still there, and are obviously orphaned now.

If we use the XML file, new text entries will be created. This isn't awesome, but at least the missing revisions will be back in there. I have no idea how importDump responds to importing revisions for a page when a page with that title already exists; it will, however, silently skip existing revisions, and that's good for us.

If we wanted to try to construct the revision rows with the original text ids, there's a stub dump from around the beginning of June 2012 which would give us the info, and a check of the SQL query on bug 37591 shows that there was nothing new since then. Is it worth the work?
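
To give a rough idea of the work involved, here is a minimal sketch of pulling the revision id to text id mapping out of a stub dump. It assumes the stub dump carries the text id as an `id` attribute on each revision's `<text>` element; the function name `text_ids_from_stub` and all sample values are made up for illustration and are not taken from the real zhwiki data.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Drop the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def text_ids_from_stub(source):
    """Yield (page_title, rev_id, text_id) for every revision in a stub dump.

    Hypothetical helper: assumes <text id="..."/> holds the original text id.
    """
    title = None
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            rev_id = text_id = None
            # Only look at direct children, so <contributor><id> is ignored.
            for child in elem:
                child_name = local(child.tag)
                if child_name == "id":
                    rev_id = int(child.text)
                elif child_name == "text" and child.get("id") is not None:
                    text_id = int(child.get("id"))
            yield title, rev_id, text_id
            elem.clear()  # keep memory flat on large dumps
        elif name == "page":
            elem.clear()


# Invented sample data shaped like a stub dump, for illustration only.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.6/">
  <page>
    <title>Example</title>
    <id>10</id>
    <revision>
      <id>100</id>
      <timestamp>2012-06-01T00:00:00Z</timestamp>
      <contributor><username>A</username><id>7</id></contributor>
      <text id="5000" bytes="12" />
    </revision>
    <revision>
      <id>101</id>
      <timestamp>2012-06-02T00:00:00Z</timestamp>
      <contributor><username>B</username><id>8</id></contributor>
      <text id="5001" bytes="34" />
    </revision>
  </page>
</mediawiki>"""

rows = list(text_ids_from_stub(io.BytesIO(SAMPLE.encode("utf-8"))))

# Hypothetical set of revision ids known to be missing from the wiki.
missing_rev_ids = {101}
to_restore = [row for row in rows if row[1] in missing_rev_ids]
```

For a real compressed dump, the file object from `gzip.open(path, "rb")` could be passed in place of the `BytesIO` sample. Turning the resulting tuples into revision rows would still need the current page ids, since those have changed since the bug report.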

tomasz set Security to None.

See T39591 for recovering the loss in history on zh.wikipedia.org at WMF.

See T41007 for the underlying issue in MediaWiki core.