Page MenuHomePhabricator

MessageCaches Parser clone is easily messing with original
Closed, ResolvedPublic

Description

I just had the weirdest problems when fixing Extension:Variables where one variables store should exist per Parser object.
I did this by using a new property from the outside (which is not bad practice for this kind of thing) $parser->mExtVariables = new ExtVariables()

Certain hooks will affect all instances of that store, for example ParserClearState will reset it entirely, practically something like $parser->mExtVariables->mVariables = array()

I just wasn't aware of the fact that there is a cloned Parser instance for the message cache which might call ParserClearState before the original Parser objecdt where the clone comes from.
This means all Changes I do in ParserClearState or any other hook for the cloned parser in $parser->mExtVariables->mVariables (where mExtVariables is an object as you can see) will affect the original Parser where the clone came from as well, since the clone is not a deep clone where all internal objects will be cloned as well.
So by resetting the variables store for the cloned object, I unwillingly do the same for the original.

Conclusion: The clone should be a new instance of the parser with same parser options or a deep clone perhaps?
Or does the clone actually rely on keeping track on object member variables changed in the original parser?

The cloning is done in MessageCache::getParser() since 1.18


Version: 1.20.x
Severity: normal

Details

Reference
bz32368

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 12:01 AM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz32368.

This problem also affects wfMsg, wfMsgForContent, wfMsgHtml, wfMsgWikiHtml, wfMsgReal and wfMsgGetKey when $transform is true, and wfMsgExt when parse or parsemag or parseinline is passed, when the value of the message contains a template. We recently had a problem on enwiki where someone added a template to [[MediaWiki:Pfunc time error]] leading to Cite.php losing references if there was an error in a #time parser function.

The basic problem seems to be that MessageCache grabs the global parser object and then calls functions like parse and transformMsg on it rather than recursiveTagParse and recursivePreprocess. This particular problem seems like it could be avoided by having MessageCache gets its own private parser object, or by deep-cloning the global parser as Daniel Werner mentioned, or by having MessageCache call recursive-safe functions on the parser object. But I don't know if any of that would break uses of MessageCache outside of a page parse.

Patch for core to add a ParserCloned hook is in
Gerrit change 33506.

Patch for Cite to make use of this is in
Gerrit change 33508.

Patch for Scribunto (which throws an exception if the clone gets its state cleared) is in
Gerrit change 33507.

All three of those are merged. Is this bug fixed now?

Should be, yes.

Maybe bug 34589 too, if someone can reproduce that and test it.