
Revisions existing in both archive and revision tables
Closed, Duplicate | Public | Bug Report

Description

Author: overlordq

Description:
Example dupe

While doing some investigating for FT2 for bug 18104, I took a dump of rev_id values from revision and ar_rev_id values from archive, sorted them, and then for giggles ran uniq -d on the result to find duplicates. Out of my ~17M row sample (~16M from revision, ~550k from archive), I came across about 2000 instances where a given revision ID was found in both rev_id and ar_rev_id.

From the example, it seems that either the row wasn't cleared from revision on delete or it wasn't removed from archive on restore, since the information in both rows is the same.
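For anyone who wants to repeat the check on their own dumps, here is a minimal sketch of the same idea in Python. It assumes the two ID lists have already been exported from the revision and archive tables; the function name `ids_in_both` and the sample IDs are hypothetical and not part of the original analysis. Unlike `uniq -d` on a combined dump, a set intersection only reports IDs that occur in both tables, not IDs repeated within a single table.

```python
from typing import Iterable, List


def ids_in_both(rev_ids: Iterable[int], ar_rev_ids: Iterable[int]) -> List[int]:
    """Return the IDs that appear in both inputs, sorted ascending.

    rev_ids    -- values dumped from revision.rev_id
    ar_rev_ids -- values dumped from archive.ar_rev_id
    """
    # Set intersection keeps only IDs present on both sides.
    return sorted(set(rev_ids) & set(ar_rev_ids))


if __name__ == "__main__":
    # Made-up sample IDs, for illustration only.
    live = [101, 102, 103, 105]
    archived = [103, 104, 105]
    print(ids_in_both(live, archived))
```

For a sample of ~17M IDs this holds both sets in memory, which is likely acceptable on a workstation but is an assumption worth checking before running it against a full dump.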


Version: unspecified
Severity: trivial

Attached:

Details

Reference
bz22392

Related Objects

Status | Subtype | Assigned | Task
Open | Feature | None |
Duplicate | BUG REPORT | None |

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 10:51 PM
bzimport set Reference to bz22392.
bzimport added a subscriber: Unknown Object (MLST).

overlordq wrote:

I forgot to include the link to the duplicate rev_ids in my original post. This output[1] was only generated up to revision ID ~17.7M and change. From a cursory glance, the duplicates appear to be mostly attributable to [[Wikipedia:Village_pump_(assistance)]].

1 - http://toolserver.org/~overlordq/duplicate_revisions.txt

Yeah, the Village pump (assistance) problem should be cleaned up at some point. See section 16 of:
http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/IncidentArchive327

happy.melon.wiki wrote:

I managed to restore the old revisions... ish. They now appear in the live history of the page (and I used RevDelete to clear up the privacy issue that prompted the incident in the first place); but the revisions haven't been deleted from the archive table, and I didn't get a deletion log entry. So in principle, there are now 10,000 duplicates from VPA, not 2000... However, I notice that the old revisions universally have a different ar_page_id to the current VPA page (1099394 instead of 14276593), which may be why they've become disconnected from the live page. Every deleted revision of VPA now has a corresponding live revision; the deleted duplicates can be safely deleted.

The page history for Wikipedia:Village pump (assistance) doesn't show any duplicate revisions. Were they deleted by the sysadmins, or by some other method? Are they automatically filtered out by MediaWiki, or did something else happen to them?

happy.melon.wiki wrote:

Yes, they seem to have gone (from the toolserver db as well as live server). Interesting.

Are there any remaining examples of this bug?

happy.melon.wiki wrote:

(In reply to comment #6)

Are there any remaining examples of this bug?

OverlordQ, do you mind running your original analysis again?

mysql:wikiadmin@db1057 [enwiki]> select count(*) from revision, archive where rev_id = ar_rev_id;
+----------+
| count(*) |
+----------+
|       26 |
+----------+
1 row in set (30 min 46.92 sec)
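The count alone doesn't say which revisions are affected. As a rough sketch, the same join can return the IDs together with the page IDs on each side, which would also show the kind of ar_page_id mismatch mentioned earlier in this thread. The example below runs against an in-memory SQLite database with a drastically reduced stand-in schema (only rev_id, rev_page, ar_rev_id and ar_page_id); the helper names and all row values are made up for illustration, and it is not the query that was run on db1057.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a reduced stand-in schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER NOT NULL);
        CREATE TABLE archive (ar_rev_id INTEGER, ar_page_id INTEGER);
        """
    )
    # Made-up rows: revision 102 exists in both tables.
    conn.executemany(
        "INSERT INTO revision (rev_id, rev_page) VALUES (?, ?)",
        [(101, 20), (102, 20), (103, 21)],
    )
    conn.executemany(
        "INSERT INTO archive (ar_rev_id, ar_page_id) VALUES (?, ?)",
        [(102, 10), (104, 21)],
    )
    return conn


def find_duplicate_revisions(conn: sqlite3.Connection) -> list:
    """List revisions present in both tables, with the page ID from each side."""
    query = """
        SELECT rev_id, rev_page, ar_page_id
        FROM revision
        JOIN archive ON rev_id = ar_rev_id
        ORDER BY rev_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for rev_id, rev_page, ar_page_id in find_duplicate_revisions(build_demo_db()):
        print(rev_id, rev_page, ar_page_id)
```

On the real tables, a listing like this would presumably take about as long as the count above, so it may be worth running against a replica.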
Aklapper changed the subtype of this task from "Task" to "Bug Report". Feb 5 2022, 2:33 PM