Page MenuHomePhabricator

New parser breaks some template expansion in Semantic MediaWiki
Closed, DeclinedPublic

Description

A parser function demonstrating the bug

SVN r30106 breaks a part of the {{#ask}} function of SMW.
I will first describe what this function does and how it breaks, then I will present a simple test-case.

The #ask function has a mode where it takes a template, and a query for certain properties. For each page that matches the given property, it fills out the template with some parameters relating to the page and the query. It then lets the parser preprocess the constructed string and returns it. This worked fine until r30105, under r30106 it breaks in some cases, namely when the template which is to be expanded contains tags like <nowiki>.

The relevant code is in extensions/SemanticMediaWiki/includes/SMW_QP_Template.php::getResultText() and does basically:

$myparser = clone $wgParser;
$parser_options = new ParserOptions();
$result = $myparser->preprocess($text, $wgTitle, $parser_options);

where $text is a string with many templates concatenated (e.g. $text='{{t|1=foo}}{{t|1=bar}}{{t|1=baz}}')

The testcase:
A page Template:Bugtest containing the following line:

<nowiki>Bug</nowiki>

An extension (also attached):
<?php

$wgExtensionFunctions[] = 'efTest_Setup';
$wgHooks['LanguageGetMagic'][] = 'efTest_Magic';

function efTest_Setup() {

global $wgParser;
$wgParser->setFunctionHook( 'construct', 'efTest_Construct' );

}

function efTest_Magic( &$magicWords, $langCode ) {

$magicWords['construct'] = array( 0, 'construct' );
return true;

}

function efTest_Construct( &$parser, $template = '' ) {

global $wgTitle, $wgParser;
$myparser = clone $wgParser;
$parser_options = new ParserOptions();
$text = '{{'.$template.'}}';
$result = $myparser->preprocess($text, $wgTitle, $parser_options);
return $result;

}

I found that the function works if the preprocess call is replaced by

$result = $myparser->preprocessToDom($text, $wgTitle, $parser_options);
return array( $result, 'isLocalObj' => true );

For SMW, this is not so easy, however, as it currently depends on the preprocessed value being a string (it concatenates it with other strings, for example). Additionally, changing to preprocessToDom would mean that the function would need to check for old vs new parser.

Summary: There should be a way for extensions to properly expand templates, or a flag by which they can tell the parser: "The returned text contains templates, please expand them".


Version: 1.12.x
Severity: normal
Whiteboard: aklapper-moreinfo

Attached:

Details

Reference
bz12906

Event Timeline

bzimport raised the priority of this task from to Low.Nov 21 2014, 10:05 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz12906.
bzimport added a subscriber: Unknown Object (MLST).

Just as an additional small data point: I retested today using the new preprocessor (Preprocessor_Hash) and up-to-date svn (r30692); the result is the same, i.e. it does not depend on the choice of preprocessor (DOM or Hash).

OK, changing

$result = $myparser->preprocess($text, $wgTitle, $parser_options);

to

$frame = $parser->getPreprocessor()->newFrame();
$dom = $parser->preprocessToDom( $text );
$result = $frame->expand( $dom );

fixes my issues. The issue seems to be that Parser::preprocess() calls

$text = $this->mStripState->unstripBoth( $text );

immediately after replaceVariables() (which essentially does the preprocessToDom() call). By omitting this, the unstrip happens later (I guess somewhere in parse(), but I haven't checked). mStripState is apparently shared between the original and the cloned parser, so this the unstrip works at the later state.

The question now: Should extensions be changed to the code above, or should preprocess() be changed?

I have now implemented the above workaround for Semantic MediaWiki, using PHP's method_exists() to preserve compatibility with MW1.11.

JFYI: The #lsth parser function (optionally included in Labeled Section Transclusion) also breaks when including templates. The following patch fixes it for me, though I don't know if this is really correct:

  • a/extensions/LabeledSectionTransclusion/lsth.php

+++ b/extensions/LabeledSectionTransclusion/lsth.php
@@ -92,7 +92,11 @@ function wfLstIncludeHeading($parser, $page='', $sec='', $to='')

  $result = substr($text, $begin_off, $end_off - $begin_off);
else
  $result = substr($text, $begin_off);

+
+ $frame = $parser->getPreprocessor()->newFrame();
+ $dom = $parser->preprocessToDom( $result );
+ $result = $frame->expand( $dom );
+

return LabeledSectionTransclusion::parse_($parser,$title,$result, "#lsth:${page}|${sec}", $nhead);

}

A generic fix would be very nice.

Many thanks Thomas! It works for me. I didn't know how to make it work again!

ssanbeg wrote:

(In reply to comment #4)

JFYI: The #lsth parser function (optionally included in Labeled Section
Transclusion) also breaks when including templates. The following patch fixes
it for me, though I don't know if this is really correct:

  • a/extensions/LabeledSectionTransclusion/lsth.php

+++ b/extensions/LabeledSectionTransclusion/lsth.php
@@ -92,7 +92,11 @@ function wfLstIncludeHeading($parser, $page='', $sec='',
$to='')

  $result = substr($text, $begin_off, $end_off - $begin_off);
else
  $result = substr($text, $begin_off);

+
+ $frame = $parser->getPreprocessor()->newFrame();
+ $dom = $parser->preprocessToDom( $result );
+ $result = $frame->expand( $dom );
+

return LabeledSectionTransclusion::parse_($parser,$title,$result,

"#lsth:${page}|${sec}", $nhead);
}

A generic fix would be very nice.

OK, I added that patch in r32963, with the method_exists wrapper.

Thomas Bleher: Is this still something you'd like to see fixed after all those year (and with the workaround from comment 3 in place in SMW), or can this report be closed nowadays?

I don't think I care anymore. After all, any code using the parser will have been updated by now :)