
Selser corruption?
Closed, Declined (Public)

Description

https://pl.wikipedia.org/w/index.php?title=Squat&curid=6048&diff=36327967&oldid=36176788 was a VE edit.

It looks like some parts were swapped with other parts in German (and this is the Polish Wikipedia).

The user says this wasn't visible in the "Review and save" diff.

(reported by MatmaRex in #mediawiki-parsoid.)


Version: unspecified
Severity: normal

Details

Reference
bz47998

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:32 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz47998.

Looks like a selser problem to me; the corrupted regions correspond to <a> and <b> tags in the HTML DOM.

Interesting. So I suppose the wikitext source fetched for this page came from the German Wikipedia rather than the Polish Wikipedia. I wonder whether the wiki prefix was incorrect, or whether Parsoid served this from a cache whose key didn't include the wiki prefix.
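To make the cache hypothesis concrete, here is a minimal sketch of how a cache keyed only on the revision id could hand back another wiki's wikitext. This is not Parsoid's actual cache code; `make_cached_fetch`, `fake_fetch`, and the `"pl"`/`"de"` prefix strings are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, Hashable


def make_cached_fetch(
    fetch: Callable[[str, int], str],
    key_fn: Callable[[str, int], Hashable],
) -> Callable[[str, int], str]:
    """Wrap `fetch` in a cache whose key is computed by `key_fn`."""
    cache: Dict[Hashable, str] = {}

    def cached(prefix: str, oldid: int) -> str:
        key = key_fn(prefix, oldid)
        if key not in cache:
            cache[key] = fetch(prefix, oldid)
        return cache[key]

    return cached


def fake_fetch(prefix: str, oldid: int) -> str:
    # Stand-in for the real API call; returns a string naming the wiki it "hit".
    return f"wikitext of revision {oldid} on {prefix}"


# Hypothetical buggy key: the wiki prefix is missing, so equal oldids collide.
buggy = make_cached_fetch(fake_fetch, lambda prefix, oldid: oldid)
# Key that includes the prefix: the same oldid on two wikis stays separate.
safe = make_cached_fetch(fake_fetch, lambda prefix, oldid: (prefix, oldid))

buggy("de", 36176788)                  # warms the cache with the German revision
buggy_result = buggy("pl", 36176788)   # served from cache: German content
safe("de", 36176788)
safe_result = safe("pl", 36176788)     # fetched separately: Polish content
```

With the prefix-less key, a request for the Polish revision is answered with whatever was cached first for the same numeric oldid, which would match the symptom described here if such a cache existed.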

The Polish Wikipedia wikitext is: https://pl.wikipedia.org/w/index.php?title=Benutzer_Diskussion:EvaK&action=edit&oldid=36176788

The German Wikipedia wikitext is:
https://de.wikipedia.org/w/index.php?title=Benutzer_Diskussion:EvaK&action=edit&oldid=36176788

From the pl page diff, it is clear that the incorrect text came from the German wikitext above.

So, I suspect either a caching bug or incorrect params received from VE.

Note that the issue reportedly did not show up in the diff.

I think we are all assuming that this actually did show up in the diff, as otherwise the corruption would have had to happen while VE sent *wikitext* to the API to save it, without any Parsoid involvement.

I have traced the code from the POST endpoint to the TemplateRequest for selser, and did not find any issues. The language is determined by making the API request to the right wiki's apiURI, which is set in env.conf.parsoid and, as far as I know, never written to.
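As a rough illustration of that lookup, the sketch below maps a wiki prefix to an API URI and builds the revision request from it. It is not the Parsoid code path itself: the `API_URIS` table, the `plwiki`/`dewiki` prefix strings, and `revision_request_url` are assumptions made for the example. Only the query parameters (`action=query`, `prop=revisions`, `revids`, `rvprop=content`) are standard MediaWiki API parameters.

```python
from urllib.parse import urlencode

# Hypothetical prefix -> API URI table, standing in for the per-wiki apiURI
# configuration. The prefix strings are illustrative, not confirmed values.
API_URIS = {
    "plwiki": "https://pl.wikipedia.org/w/api.php",
    "dewiki": "https://de.wikipedia.org/w/api.php",
}


def revision_request_url(prefix: str, oldid: int) -> str:
    """Build the URL that would fetch the wikitext of `oldid` on wiki `prefix`."""
    api_uri = API_URIS[prefix]  # raises KeyError for an unknown prefix
    query = urlencode(
        {
            "action": "query",
            "prop": "revisions",
            "revids": oldid,
            "rvprop": "content",
            "format": "json",
        }
    )
    return f"{api_uri}?{query}"


pl_url = revision_request_url("plwiki", 36176788)
de_url = revision_request_url("dewiki", 36176788)
```

The two URLs differ only in the host, so the same oldid resolves to unrelated pages depending on which prefix arrives with the request.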

If there was a bug in this area we would see a lot of issues, not just one.

So my guess is that we got an incorrect prefix passed in.

The VE API module uses $wgVisualEditorParsoidPrefix to construct the prefix, which is likely correct for pl, as the GET uses it too and succeeded. An incorrect prefix could be passed in if VE posted to the de.wikipedia.org API instead of the pl.wikipedia.org one. The wiki is selected by the Host header, so DNS would not be the issue. The incorrect URL would need to be constructed client-side.
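The reasoning above can be sketched as follows, assuming each wiki carries its own prefix setting and the Host header alone selects which wiki's settings apply. `WIKI_CONFIG`, the prefix values, and `prefix_for_request` are hypothetical stand-ins, not the VE extension's actual code.

```python
# Hypothetical per-wiki settings, standing in for each wiki's
# $wgVisualEditorParsoidPrefix value. Prefix strings are illustrative.
WIKI_CONFIG = {
    "pl.wikipedia.org": {"parsoid_prefix": "plwiki"},
    "de.wikipedia.org": {"parsoid_prefix": "dewiki"},
}


def prefix_for_request(host_header: str) -> str:
    """Return the Parsoid prefix of the wiki selected by the Host header."""
    return WIKI_CONFIG[host_header.lower()]["parsoid_prefix"]


# A save posted to the intended wiki gets the Polish prefix ...
intended = prefix_for_request("pl.wikipedia.org")
# ... while a client that built its API URL against the wrong host gets the
# German prefix, even though the server-side configuration is correct.
misdirected = prefix_for_request("de.wikipedia.org")
```

Under these assumptions the server never produces a wrong prefix on its own; only a request sent to the wrong host does.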

Lowering priority for now as the report seems to be less than reliable and we have not been able to reproduce this.

Closing for now; please reopen if this happens again.

[Parsoid component reorg by merging JS/General and General. See bug 50685 for more information. Filter bugmail on this comment. parsoidreorg20130704]