Page MenuHomePhabricator

Regression in HTML normalization in 1.6 (unclosed <li>)
Closed, ResolvedPublic

Description

Pre-XHTML HTML allows a number of elements to be implicitly closed,
such as <li> within a <ul> or <ol>.

In 1.5, MediaWiki allowed this code through:

<ul>
<li>One
<li>Two
</ul>

While invalid XHTML due to not being well-formed, this did render
in browsers the way people would expect from traditional HTML.

In 1.6 and current trunk, this is smashed up in output as:

<ul>
<li>One
&lt;li&gt;Two
&lt;/ul&gt;

</li>
</ul>

The second <li> and the closing </ul> are rejected and escaped, then closing
</li> and </ul> get added at the end of the document.

Proper application of nesting rules *should* normalize it to
something like this:

<ul>
<li>One
</li><li>Two
</li></ul>


Version: 1.7.x
Severity: normal
URL: http://mail.wikipedia.org/pipermail/mediawiki-l/2006-April/011215.html

Details

Reference
bz5497

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 9:10 PM
bzimport set Reference to bz5497.

gangleri wrote:

*note* because of question at
http://mail.wikipedia.org/pipermail/mediawiki-l/2006-April/011228.html

(Are <ul> and <li> even allowed on a default install?)

This bug can not be reproduced at Wikimedia Foundation wikies:
http://en.wikipedia.org/wiki/User:Gangleri/tests/bugzilla/05497
http://test.wikipedia.org/wiki/User:Gangleri/tests/bugzilla/05497

It can be seen at
http://test.leuksman.com/view/User:Gangleri/tests/bugzilla/05497

best regards reinhardt [[user:gangleri]]

Fixed in trunk and REL1_6. We forgot to handle $htmlsingle
in Sanitizer.php :(

list-mediawiki2 wrote:

I'm not sure if this is the result of fixing this bug or a completely separate bug, but there's
another regression on both the latest 1.6 and trunk with either of:
<ul>
<li>One</li>
</ul>
or

blah

The closing tags are shown (e.g. as &lt;li) in the browser.

I'm reverting the bogus patch from REL1_6.

A parser change like this absolutely should NOT go into a release branch
without strict testing. There's not even a parser test case to go with it!

I've added 8 parser test cases. Currently only 1 passes, both in REL1_6
and trunk (which has the above and another patch hacked in).

At the moment trunk "looks" better for some cases but produces invalid
output, as 1.5 did.

sspecter wrote:

I have a related problem (i believe it is related) with list parsing. Im using
wiki 1.6.3 but I believe it must be on 1.7. Sorry but i cant test 1.7 with
extensions here.

I have an extension to create MathML tags. It is called <asciimath>. when I do:

  • item <asciimath>x</asciimath>

I get:

<ul><li>item <math ...(mathML)...></li> ... (math commands)... <math>

Or, for simplification:

...<li><math></li>...</math>...

As I need XHTML to get MathML working, my wiki page crash and burn.

Solution proposed:

1- it appears <li> closes after the 1st tag he encounter. it should at least
close before. It is not a very good solution.

2- <li> must allow tags inside him, SPECIALLY tags from extensions. It should
only look for:

  • </ul> of the level hes in, not the children lists, or
  • <li>

...and place before it. Or, if he dont find it, destroy itself (by putting
<li></li> or removing the <li> unclosed.

  • Bug 5977 has been marked as a duplicate of this bug. ***

dov.grobgeld wrote:

My bug report 5977 was marked a duplicate of this bug, which might be. But note
that even if the <li> tags are closed is the rendering not correct:

<ol>
  <li> Something
    <ol>
      <li> Deeper</li>
    </ol>
  </li>
  <li> Something else</li>
</ol>

This is rendered as follows:

<p><ol>

</p>

<li> Something
  <ol>
    &lt;li&gt; Deeper&lt;/li&gt;
  </ol>
</li>
<li> Something else</li>

<p></ol>
</p>

Created attachment 1885
New removeHtmlTags function which is mroe accurate and complete

Handles implied end tags properly, and obeys more nesting prohibitions.
Want to test this a little more before applying to trunk.

attachment work.diff ignored as obsolete

Created attachment 1891
With fix for crash on </body>

Attached:

  • Bug 4373 has been marked as a duplicate of this bug. ***

axelseaa wrote:

This appears to still be an issues in 1.7.1 with definition lists. Any ETA on a
permanent fix?

This appears to have been fixed long ago.