beta: Parsoid is returning the wrong articles
Closed, ResolvedPublic

Description

  • go to any article
  • edit it with visual editor
  • instead of the article's text, the text of another article appears in the visual editor

Version: unspecified
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=57233

Details

Reference
bz57926

Event Timeline

bzimport raised the priority of this task to Unbreak Now! (Nov 22 2014, 2:43 AM)
bzimport set Reference to bz57926.
bzimport added a subscriber: Unknown Object (MLST).

This very much looks like an artefact of the changes for bug 57233 - marking as such.

I have updated the Parsoid code on deployment-parsoid2 in /srv/deployment/parsoid/Parsoid.

I forgot to run npm install to update the node modules, which might be related :/

The articles returned do not seem to originate from beta enwiki or production enwiki

I upgraded Varnish on the parsoidcache instance while I was investigating.

Now it seems VE is querying Parsoid with an invalid URL, and a 404 error pops up :-/ Nothing obvious in the logs though; maybe VE has a wgDebugLogGroup we should enable to get more logs.

There are no more 404 errors, but some articles still return content that does not match the page being requested :/

Parsoid queries the MediaWiki API to get the content it renders. One of the requests is logged in api.log as:

format=json action=visualeditor page=Marching_band paction=parse

I used ApiSandbox to reproduce that query:

http://en.wikipedia.beta.wmflabs.org/wiki/Special:ApiSandbox#action=visualeditor&format=json&page=Marching_band&paction=parse
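
The same query can also be issued from a shell rather than ApiSandbox. A minimal sketch, assuming the API endpoint is reachable at /w/api.php on the beta host (the thread only shows the Special:ApiSandbox URL, so that path is an assumption) and that the page title needs no URL-encoding. The helper name build_api_url is hypothetical.

```shell
#!/bin/sh
# Build an api.php URL from the parameters seen in api.log.
# ASSUMPTION: the endpoint path (/w/api.php) is not given in the thread.
# ASSUMPTION: the page title contains no characters that need URL-encoding.
build_api_url() {
    base=$1   # API endpoint, e.g. http://en.wikipedia.beta.wmflabs.org/w/api.php (assumed)
    page=$2   # page title as logged, e.g. Marching_band
    printf '%s?action=visualeditor&format=json&page=%s&paction=parse\n' "$base" "$page"
}

# Usage (needs network access, so it is left commented out):
# curl -s "$(build_api_url http://en.wikipedia.beta.wmflabs.org/w/api.php Marching_band)"
```

Comparing the "oldid" in the JSON response against the revision history of the requested page on the same wiki is then enough to tell whether the answer came from the wrong backend.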

The response contains among other things:

http://en.wikipedia.org/wiki/Special:Redirect/revision/51845
<title>Marching_band</title>
Describe the new page here.

"basetimestamp": "20120103034343",
"starttimestamp": "20131203212539",
"oldid": 51845

The redirection to old id 51845 yields page https://en.wikipedia.org/w/index.php?oldid=51845 which is the 'Pérez Prado' article.

[13:41] <gwicke> parsoid is not using action=visualeditor
[13:42] <gwicke> I'd double-check that VE is actually using parsoidcache3
[13:42] <gwicke> and that the request goes to the right parsoid backend

On deployment-parsoid2, I have edited /usr/bin/parsoid to enable some logging. The launch command is now something like:

sudo -E -u parsoid nohup node \
    /var/lib/parsoid/Parsoid/js/api/server.js \
    >/data/project/logs/parsoid-stdout.log \
    2>/data/project/logs/parsoid-error.log &

/data/project is the shared project directory, so the logs there might give some clues.

And it is fixed now!

Gabriel kindly answered all my newbie questions about the Parsoid architecture. We ended up confirming that the Parsoid running on deployment-parsoid2 was querying the production infrastructure, despite a localsettings.js claiming otherwise.

Roan showed up and noticed that /var/lib/parsoid/Parsoid points to the shared NFS directory (/data/project/apache/common-local/php-master/extensions/Parsoid/). That directory is auto-updated and did NOT contain the localsettings.js file.

Roan copied the settings file, I restarted the server and now it seems to be serving the proper pages.
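
A check along these lines would have exposed the problem earlier. This is only a sketch: check_settings is a hypothetical helper, the location of localsettings.js relative to the checkout is not stated in this thread and must be supplied by the caller, and readlink -f assumes GNU coreutils (or a compatible readlink).

```shell
#!/bin/sh
# Report where a deployment path really points and whether a settings
# file exists under the resolved directory.
# ASSUMPTION: the caller knows the settings file's path relative to the
# checkout; the thread does not say where localsettings.js lives.
check_settings() {
    dir=$1        # deployment path, e.g. /var/lib/parsoid/Parsoid
    relpath=$2    # settings file relative to it (hypothetical), e.g. localsettings.js
    # Follow symlinks so a link into a shared directory becomes visible.
    real=$(readlink -f "$dir") || return 2
    echo "resolved: $real"
    if [ -f "$real/$relpath" ]; then
        echo "found: $real/$relpath"
    else
        echo "missing: $real/$relpath"
        return 1
    fi
}

# Usage (run on the Parsoid host):
# check_settings /var/lib/parsoid/Parsoid localsettings.js
```

A non-zero exit status means the server would start without the local settings in place, which is the situation described above.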

The root cause is https://gerrit.wikimedia.org/r/98014 from last Friday which I did not bother to verify :-(

Hopefully cleared the Parsoid cache using:

deployment-parsoidcache3$ sudo varnishadm ban.url .