Cite: Templated syntax for attribute values for <ref> and <references> should be parsed as plain text
Closed, ResolvedPublic

Description

On WP:fr, contributors doing wikification sometimes modify the name of a ref:
<ref name="Air Service Journal 1917. p.413"> becomes <ref name="Air Service Journal 1917. {{p.|413}}">
Parsoid then parses the name of the ref as wikitext, which causes big trouble: https://fr.wikipedia.org/w/index.php?title=Georges_Guynemer&diff=prev&oldid=98198224

Parsoid should not parse the name parameter of a ref tag.


Version: unspecified
Severity: normal

Details

Reference
bz56916

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 2:29 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz56916.

We didn't think extension attributes would come from templates, so hadn't built in support for it, but I guess we've been proved wrong as in this example.

Looks like <ref> tag is special and is the only extension of the 3 I tested that supports this.

https://en.wikipedia.org/w/index.php?title=User:Ssastry/sandbox&oldid=582603346

This bug might occur a little more often than I first thought. A Kiwix user has reported it and managed to give three examples from the English Wikipedia.

error_messages.png (1×1 px, 474 KB)

ssastry added subscribers: matmarex, Jackmcbarn, Anomie.

As per the discussion on https://phabricator.wikimedia.org/T75450, it seems <ref> is an exception among extensions in that it uses arbitrary wikitext in its attributes. As suggested in https://phabricator.wikimedia.org/T75450#775009, perhaps using {{#tag:..}} might be one way to handle this.

Is there a reason not to remove this special case from the cite extension?

The problematic template is "template:sfn" in "Klondike_Gold_Rush". Maybe the best approach is simply to fix it and see if someone complains.

I'm not following what you're getting at here. In the PHP parser, <ref name="{{echo|a}}"> generates a reference with the name being literally "{{echo|a}}", the template is not expanded. I'm not seeing how you're saying that Cite is an "exception" or a "special case".
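To make the behaviour described above concrete, here is a minimal Python sketch of "attribute values as plain text". It is only an illustration, not the code used by Cite or Parsoid; the regex and the `ref_names` helper are hypothetical names introduced for this example, and the pattern only handles double-quoted `name` attributes.

```python
import re

# Hypothetical, simplified matcher for an opening <ref name="..."> tag.
# Assumption: the name is double-quoted and is the only attribute.
REF_OPEN = re.compile(r'<ref\s+name\s*=\s*"([^"]*)"\s*/?>')


def ref_names(wikitext: str) -> list[str]:
    """Return ref names exactly as written, without expanding templates."""
    return REF_OPEN.findall(wikitext)


if __name__ == "__main__":
    sample = '<ref name="{{echo|a}}">Some source</ref>'
    # The braces and pipe are kept literally; nothing is treated as a template.
    print(ref_names(sample))
```

Under these assumptions the name stays the literal string `{{echo|a}}`, which matches what the PHP parser is reported to do in the comment above.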

Nor am I seeing how Template:sfn is a problem here; it's been implemented as a module since 2013, and before that it did use {{#tag}} syntax. What "fix" is being proposed?

Good to know. I am not able to test the wikitext right now, but in that case I'm sure I goofed up my wikitext on the page in https://phabricator.wikimedia.org/T58916#624765; I didn't go back and verify it. I will do so tomorrow and retitle this bug to remove all vestiges of extension attribute templating from Parsoid.

Okay, verified. I had misinterpreted the output on that test page.

ssastry renamed this task from "Cite: Add support for templated <ref> attributes" to "Cite: Templated syntax for attribute values for <ref> and <references> should be parsed as plain text". Dec 9 2014, 7:26 PM

Change 181118 had a related patch set uploaded (by Marcoil):
T58916: Parse extension parameters as plain text

https://gerrit.wikimedia.org/r/181118

Patch-For-Review

Change 181118 merged by jenkins-bot:
T58916: Parse extension parameters as plain text

https://gerrit.wikimedia.org/r/181118

I'm reopening the bug, as it seems to me that Marcoil's patch did not fix all use cases.

Have a look here:
http://parsoid-lb.eqiad.wikimedia.org/enwiki/Klondike_Gold_Rush?oldid=651957368

In "Beginning of the stampede", we still have a reference error, although everything is OK with en.wikipedia.org.

This issue is unrelated to extension parameters being parsed as wikitext, which was the original issue here. I see two problems with that section in http://parsoid-lb.eqiad.wikimedia.org/enwiki/Klondike_Gold_Rush?oldid=651957368, though, so I've opened two new bugs for them:

  • T93580: The <ref> in the image's caption isn't detected.
  • T93578: The <ref group=n> inside the <gallery> produces an error from Cite.php.

As they are unrelated issues, I'm closing this one again.

T176170 (HTML5 ID attributes) seems to reintroduce the issue -- the wikitext is now being expanded again on the PHP side. Looking into this.

More specifically, it seems if you use core master and your Cite patch and set $wgFragmentMode = [ 'html5' ], Cite internally uses {{echo|a}} as the name but the parser later expands that template when producing the output HTML. If you have $wgFragmentMode as the default [ 'legacy' ] it doesn't occur, presumably because in that case it replaces the braces in the ID with .7B and .7D.
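The difference between the two modes can be sketched roughly as follows. This is a simplified Python approximation, not MediaWiki's actual escaping code: it assumes that legacy mode percent-encodes the ID and then writes "." instead of "%", and that html5 mode leaves most characters alone apart from turning spaces into underscores. Both function names are hypothetical.

```python
from urllib.parse import quote


def legacy_fragment(name: str) -> str:
    """Approximate legacy IDs: percent-encode, then use '.' in place of '%'."""
    # Assumption: spaces become underscores before encoding.
    encoded = quote(name.replace(" ", "_"), safe="")
    # Assumption: ':' is kept readable, every other '%XX' becomes '.XX'.
    return encoded.replace("%3A", ":").replace("%", ".")


def html5_fragment(name: str) -> str:
    """Approximate html5 IDs: only spaces are replaced, braces survive."""
    return name.replace(" ", "_")


if __name__ == "__main__":
    name = "{{echo|a}}"
    print(legacy_fragment(name))  # braces appear as .7B and .7D
    print(html5_fragment(name))   # braces are still literal {{ and }}
```

If the approximation holds, the legacy form contains no literal braces for a later parsing pass to pick up, while the html5 form still looks like a template call.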

@Anomie Yeah, I've got a partial patch for Cite.

But it gets better: if you set $wgFragmentMode = ['html5']....

$ echo '<sup id="cite_ref-&#123;&#123;echo&#124;a&#125;&#125;_1-0" class="reference">[[#cite_note-&#123;&#123;echo&#124;a&#125;&#125;-1|&#91;1&#93;]]</sup>' | php maintenance/parse.php 

<div class="mw-parser-output"><p><sup id="cite_ref-&#123;&#123;echo&#124;a&#125;&#125;_1-0" class="reference"><a href="#cite_notea}1">&#91;1&#93;</a></sup>
</p></div>

Look at the href attribute: something in the core parser is double-expanding the attribute value, but it is usually protected (when wgFragmentMode = legacy) by the URL escaping. Or something...
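For readers decoding the character references in the command above by eye, a short Python check shows what the entity-encoded ID stands for. It uses only the standard library's `html.unescape`; the variable names are illustrative and nothing here reproduces the parser's own logic.

```python
import html

# The id value as it appears in the parse.php output above.
encoded_id = "cite_ref-&#123;&#123;echo&#124;a&#125;&#125;_1-0"

# Decode the numeric character references (&#123; is "{", &#124; is "|",
# &#125; is "}").
decoded_id = html.unescape(encoded_id)

if __name__ == "__main__":
    print(decoded_id)
```

Once decoded, the ID contains a literal `{{echo|a}}`, which is the text that would look like a template call to any later expansion step.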