Page MenuHomePhabricator

Edit conflict detection by timestamp should be deprecated
Open, MediumPublic

Description

In theory, we perform an inexpensive 1-second granularity timestamp check to detect conflicts, then a second test is performed atomically during updateRevisionOn, which uses the actual revision ID of the article content that was used as the edit base. However, this atomic test is broken because getLatest is returning the *new* latest id rather than the oldid stored at the time the user opened the article edit page.

This can be easily confirmed by disabling the timestamp test at EditPage.php line 1613, opening two tabs, and editing then submitting in both. This allows you to simulate two edits occurring within one second of each other. The outcome will be a silently ignored edit conflict, and the second edit will overwrite the first.

One issue that jumps out immediately is that we were relying on the "oldid" hidden form parameter, but this always has a value of "0".

The edittime method has been deprecated by editRevId, so we can remove the parameter.


See Also:

Related Objects

StatusSubtypeAssignedTask
OpenNone
OpenNone

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

Change 94584 had a related patch set uploaded (by Awight):
WIP (bug 56849) Pass true oldid for atomic page save

https://gerrit.wikimedia.org/r/94584

Change 201928 had a related patch set uploaded (by Awight):
Accessor to get EditPage parent revision ID

https://gerrit.wikimedia.org/r/201928

Change 201929 had a related patch set uploaded (by Awight):
Helper to get a page's revision ID at a given time

https://gerrit.wikimedia.org/r/201929

Change 203632 had a related patch set uploaded (by Awight):
Use parentRevId for three-way merge

https://gerrit.wikimedia.org/r/203632

Change 204713 had a related patch set uploaded (by Awight):
Introduce parentrevid parameter in the API

https://gerrit.wikimedia.org/r/204713

Change 204714 had a related patch set uploaded (by Awight):
WIP Compatibility to guess the parentRevId from legacy timestamp

https://gerrit.wikimedia.org/r/204714

Change 201928 merged by jenkins-bot:
Accessor to get EditPage parent revision ID

https://gerrit.wikimedia.org/r/201928

Change 255278 had a related patch set uploaded (by Aaron Schulz):
[WIP] Switch EditForm to using editRevId in place of edittimestamp

https://gerrit.wikimedia.org/r/255278

@awight @aaron are you still working on this? Since edit conflicts are a big topic on both the interational wishlist and the technical wishes of the German-speaking community, we are thinking to further work on your patches at the Wikimania hackathon. Would that be ok for you? And if yes, what would be the next steps to do?

I can probably rebase that one patch and remove the WIP tag. Aside from that, I probably won't make more of these patches soon.

I reviewed the latest patch addressing the edit conflict race condition with timestamps by @aaron ( https://gerrit.wikimedia.org/r/#/c/255278/ )

Apart from minor general code improvements, for me some questions remain and I added comments in the patch.

As I understand it, this patch adds checks to compare the revision ids of the the expected base and the actual ( potentially updated ) base. Prior to the patch theses checks relied on the timestamp instead. The patch keeps the timestamp checks for cases, where the expected base id is not set.

Q:

  • In which cases can the base rev id be null?
  • Do we still need the additional timestamp checks?
  • If we still need the timestamp check, is there still the possibility that we have problems there?
  • Do we still need a timestamp fallback when loading the content for the resolve conflict page?

Answers to my questions were given in the patch. I will sum them up in this comment:

Timestamp checks are kept to be backwards compatible, bots or other scripts might not use the new fields and therefore it can be null in some cases. The original problem should be solved with the patch.

@WMDE-Fisch Thank you for throwing new energy at this problem! I'll experiment with Aaron's patch, and I can give lots of time during the Hackathon. This is still high on my list of public and personal vendettas in MediaWiki.

Change 255278 merged by jenkins-bot:
Switch EditForm to using editRevId in place of edittimestamp

https://gerrit.wikimedia.org/r/255278

Woohoo! That patch should do it...

The next step is to write some tests to confirm the fix and prevent regressions. FWIW, this draft patch might have some useful bits that can be reused when writing the tests: https://gerrit.wikimedia.org/r/#/c/94584/

It doesn't look like we ever solved the race condition for ApiEditPage, which doesn't use editRevId. I'd like to require this param everywhere, it's quite risky to edit a page without knowing what revision we're basing changes on.

It doesn't look like we ever solved the race condition for ApiEditPage, which doesn't use editRevId. I'd like to require this param everywhere, it's quite risky to edit a page without knowing what revision we're basing changes on.

Yes, please !

Now I'm thinking that we shouldn't make the editRevId parameter mandatory, and can point to the other ApiEditPage params being optional as precedent. The default behavior of guessing editRevid = page.latest is sane and is usually correct as well, so requiring clients to track page.latest before making an edit adds complexity with no gain.

But I'll go ahead and pass the parameter through ApiEditPage, and recommend adding it for any integration experiencing edit conflicts.

@awight Passing editRevId is important if and only if the new content is based on some old content. And if that is the case, the old revision is known to the client.

So, editRevId needs to be included if the edit was, e.g., a search and replace operation, or inserting categories into an existing page text.

Since blindly replacing the entire page without looking at its current content should be rare, supplying editRevId should be the norm. It probably doesn't need to be a hard requirement, but I'd support issuing a warning of page content is overwritten without editRevId being present. We could support "editRevId=blind" to suppress the warning.

The append/prepend modes are a bit of a special case. Passing editRevId doesn't make sense for them, they are always "blind". The same is true for section=new.

When creatign a new page, clients should send editRevId=0 to indicate that they intended to create a page. That would be equivalent to the EDIT_NEW flag used internally.

Other section edits however should be assumed to be based on that section's previous content.

awight renamed this task from Edit conflict detection suffers a race condition to Edit conflict by epoch seconds has a race condition.Jul 5 2018, 12:23 PM
awight lowered the priority of this task from High to Medium.
awight updated the task description. (Show Details)

Lowered the priority because the epoch seconds check is a fallback to more accurate editRevId, the latest revision ID. The remaining work is to remove the fallback.

Change 443963 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/core@master] Drop "edittime" parameter

https://gerrit.wikimedia.org/r/443963

awight renamed this task from Edit conflict by epoch seconds has a race condition to Edit conflict detection by timestamp should be deprecated.Jul 5 2018, 4:28 PM

Change 445230 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/FlaggedRevs@master] Use editRevId rather than edittime

https://gerrit.wikimedia.org/r/445230

Change 445236 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/core@master] Make $editRevId public for naughty extensions to access

https://gerrit.wikimedia.org/r/445236

Change 445237 had a related patch set uploaded (by Legoktm; owner: Awight):
[mediawiki/extensions/TemplateSandbox@master] Stop relying on deprecated edittime

https://gerrit.wikimedia.org/r/445237

Change 445240 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Commentbox@master] Update to use editRevId

https://gerrit.wikimedia.org/r/445240

Change 445245 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/BlogPage@master] Remove deprecated wpEdittime reference

https://gerrit.wikimedia.org/r/445245

Change 445246 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/DynamicPageList@master] Use editRevId for compatibility with newer MediaWikis

https://gerrit.wikimedia.org/r/445246

Change 445249 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/CreateRedirect@master] Use newer "editRevId" conflict detection

https://gerrit.wikimedia.org/r/445249

Change 445250 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/PageForms@master] Use editRevId for better edit conflict detection

https://gerrit.wikimedia.org/r/445250

Change 445253 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Drafts@master] Update to use editRevId conflict detection

https://gerrit.wikimedia.org/r/445253

Change 204714 abandoned by Awight:
WIP Compatibility to guess the parentRevId from legacy timestamp

https://gerrit.wikimedia.org/r/204714

Change 204713 abandoned by Awight:
[WIP] Introduce parentrevid parameter in the API

https://gerrit.wikimedia.org/r/204713

Change 201929 abandoned by Awight:
[WIP] Helper to get a page's revision ID at a given time

https://gerrit.wikimedia.org/r/201929

Change 94584 abandoned by Awight:
WIP: Use the parentRevId for more atomic editing

https://gerrit.wikimedia.org/r/94584

Change 445245 merged by jenkins-bot:
[mediawiki/extensions/BlogPage@master] Remove deprecated wpEdittime reference

https://gerrit.wikimedia.org/r/445245

Change 445237 abandoned by Awight:
Stop relying on deprecated edittime

https://gerrit.wikimedia.org/r/445237

Change 443963 abandoned by Awight:
Drop "edittime" parameter

https://gerrit.wikimedia.org/r/443963

Change 445249 abandoned by Edward Chernenko:
Use newer "editRevId" conflict detection

https://gerrit.wikimedia.org/r/445249

Change 443963 restored by Awight:
Drop "edittime" parameter

Reason:
Working on this again.

https://gerrit.wikimedia.org/r/443963

Change 445237 restored by Awight:
Stop relying on deprecated edittime

Reason:
I'm sure this can be accomplished.

https://gerrit.wikimedia.org/r/445237

Platform Engineering is interested in this in the context of the REST/CRUD interface, see T230843.

I found something surprising while playing with a proof-of-concept for T245523: Investigation: Permanent or long-term storage of edit conflict text. It seems that wpEdittime is required and without it, EditPage will ingore edit conflicts based on editRevId. So far, I have no reason to think this causes MediaWiki to ignore real conflicts in the wild, since the server should in theory calculate this timestamp correctly, but it's certainly fragile and will have to be fixed before we can fully deprecate.

I found something surprising while playing with a proof-of-concept for T245523: Investigation: Permanent or long-term storage of edit conflict text. It seems that wpEdittime is required and without it, EditPage will ingore edit conflicts based on editRevId.

Please see if this patch fixes that issue for you:
https://gerrit.wikimedia.org/r/c/mediawiki/core/+/577345

So far, I have no reason to think this causes MediaWiki to ignore real conflicts in the wild, since the server should in theory calculate this timestamp correctly, but it's certainly fragile and will have to be fixed before we can fully deprecate.

Well, edits in the wild come either from the UI, which provides wpEdittime in addition to the revision ID, or from the API, which so far supports conflict detection only based on the basetimestamp parameter. So the only way to get into this situation in the wild would be to fake a POST request from the edit form, providing a revision ID but no timestamp. I think doing so would currently indeed cause the conflict to be ignored.

@aweight please also note my comment on T34037#5948430 about the fact that conflict detection based on timestamp will ignore self-conflicts, while detection based on revision id will not.

awight removed awight as the assignee of this task.EditedMay 26 2021, 2:30 PM
awight added a subscriber: Daniel.Cardenas.

Unassigning myself. I think the next step would be to deprecate edittime, but this is a big undertaking. Lots of external code relies on the parameter, and there's the self-conflict question brought up by @daniel.

Change 445250 merged by jenkins-bot:

[mediawiki/extensions/PageForms@master] Use editRevId for better edit conflict detection

https://gerrit.wikimedia.org/r/445250

Change 445246 merged by jenkins-bot:

[mediawiki/extensions/DynamicPageList@master] Use editRevId for compatibility with newer MediaWikis

https://gerrit.wikimedia.org/r/445246

Change 445240 merged by jenkins-bot:

[mediawiki/extensions/Commentbox@master] Update to use editRevId

https://gerrit.wikimedia.org/r/445240

Change 445236 abandoned by Thiemo Kreuz (WMDE):

[mediawiki/core@master] Make $editRevId public for naughty extensions to access

Reason:

Unclear which exact code needs this. Can easily be restored/redone if needed.

https://gerrit.wikimedia.org/r/445236