
Unclosed strikethrough tag makes strikethrough formatting continue throughout page
Closed, Declined (Public)

Description

On this page [1], the "Open questions" section ends with a line that starts with <s> but later has another <s> instead of </s>. When the page is viewed normally, the strikethrough ends at the second tag anyway. In VisualEditor, however, the strikethrough formatting continues to the end of the page.

[1] = https://www.mediawiki.org/w/index.php?title=Wikipedia_Education_Program&oldid=821826


Version: unspecified
Severity: normal

Details

Reference
bz57196

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 2:19 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz57196.

Also, when I removed the entire section that included the offending strikethrough tags, the VE still showed strikethrough on the remainder of the page.

When I open it again in VE after the edit [1], it appears as expected.

[1] = https://www.mediawiki.org/w/index.php?title=Wikipedia_Education_Program&diff=821838&oldid=821826

This is a difference between tidy used by MediaWiki in production and the HTML5 spec. Tidy closes unclosed formatting elements eagerly, while the HTML5 spec demands that formatting elements are restored in most cases.
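To illustrate the two behaviours described above, here is a toy model in Python. It is not Tidy, Parsoid, or a real HTML5 tree builder: the `render` function, the `FORMATTING` set, and the idea of treating each string as one paragraph are all assumptions made for this sketch. It only shows why an unclosed `<s>` stays inside its paragraph under eager closing but bleeds into later paragraphs when open formatting elements are restored.

```python
import re

# Hypothetical subset of inline formatting tags, chosen for this sketch.
FORMATTING = {"b", "i", "s", "tt", "u", "em", "strong"}
# Matches either a simple tag like <s> or </s>, or a run of plain text.
TOKEN = re.compile(r"<(/?)([a-zA-Z]+)>|([^<]+)")


def render(blocks, restore):
    """Return, per block, a list of (text, active formatting tags) pairs.

    restore=False: formatting still open at the end of a block is dropped
                   (eager closing, roughly the Tidy-like behaviour).
    restore=True:  formatting still open is reopened in the next block
                   (a rough stand-in for the HTML5 behaviour).
    """
    carried = []
    result = []
    for block in blocks:
        open_tags = list(carried) if restore else []
        runs = []
        for closing, name, text in TOKEN.findall(block):
            if text:
                runs.append((text, tuple(open_tags)))
                continue
            name = name.lower()
            if name not in FORMATTING:
                continue  # non-formatting tags are ignored in this toy model
            if not closing:
                open_tags.append(name)
            elif name in open_tags:
                # Close the most recently opened element with this name.
                idx = len(open_tags) - 1 - open_tags[::-1].index(name)
                del open_tags[idx]
        carried = open_tags
        result.append(runs)
    return result


page = ["<s>old question<s>", "next paragraph"]
eager = render(page, restore=False)
restored = render(page, restore=True)
```

With the two-paragraph input above, `eager[1]` reports no formatting on "next paragraph", while `restored[1]` reports it as still struck through, which matches the difference seen between the normal page view and VisualEditor.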

We are considering preventing formatting elements from bleeding out of transclusions (see bug 50536) for reasons of correctness and performance, but so far we have no plans to move away in general from HTML5-compliant behavior (which is also how MediaWiki behaves without tidy) for this.

We can detect unclosed tags and provide information for fix-ups in the wikitext though. That is partly tracked in bug 50547. I thought that we had another bug for this, but can't find it at the moment. Will add a link to bug 50547 when I find it.
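As a rough idea of what detecting unclosed tags could look like, here is a minimal sketch using Python's standard `html.parser`. It is not how Parsoid does it: the `UnclosedFormattingFinder` class, the `find_unclosed` helper, and the `FORMATTING` set are hypothetical names for this example, and it treats the wikitext as plain HTML, so constructs such as `<nowiki>` or templates are not handled.

```python
from html.parser import HTMLParser

# Hypothetical subset of inline formatting tags, chosen for this sketch.
FORMATTING = {"b", "i", "s", "tt", "u", "em", "strong"}


class UnclosedFormattingFinder(HTMLParser):
    """Collect formatting tags that are opened but never closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # list of (tag, (line, column)) still open

    def handle_starttag(self, tag, attrs):
        if tag in FORMATTING:
            self.open_tags.append((tag, self.getpos()))

    def handle_endtag(self, tag):
        if tag not in FORMATTING:
            return
        # Close the most recently opened element with this name, if any.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                del self.open_tags[i]
                break


def find_unclosed(text):
    """Return (tag, position) pairs for formatting tags left open."""
    finder = UnclosedFormattingFinder()
    finder.feed(text)
    finder.close()
    return finder.open_tags


for tag, (line, col) in find_unclosed("An <s>old question<s>\nnext paragraph"):
    print(f"unclosed <{tag}> opened on line {line}")
```

The reported positions point at where each element was opened, which is the place an editor would most likely need to look when fixing the markup.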

This comment is basically a +1/me too. The same issue shows up on https://en.wikipedia.org/w/index.php?title=Actor_model&oldid=602988700 but with a <tt> rather than a <s>. After some discussion with thedj on IRC, he suggested I file here anyway, if only to indicate it might be somewhat prevalent in the wild.

Personally, I think this is a bug in tidy rather than in VE, and the most reasonable behaviour would be for tidy to conform to prevalent browser behaviour/the HTML spec, and to emit a warning about the likely location of the breakage (where tidy now closes the element).

(Tracking bug for Tidy being a piece of junk is bug 2542, by the way.)

Arlolra set Security to None.

Tidy is going away and RemexHTML will be used on all wikis sometime mid next year. In that context, all these pages will look broken in RemexHTML as well and the relevant broken markup on these pages has to be fixed. We have added a special linter category to aid that effort.