Page MenuHomePhabricator

TOC causes w3c html validation to fail (if Tidy not enabled)
Closed, ResolvedPublic

Description

Author: chrislale

Description:
Wiki markup for a page that shows html validation failure.

The presence of a TOC in combination with headings in a particular sequence of levels causes w3c html validation to fail. This happens both on a live 1.11.0 wiki and on a freshly installed vanilla copy of mediawiki 1.11.0.

<example>

TOC

First heading

subheading

Second heading

</example>

This gives an HTML validation error.

If you remove the Second heading, the page validates OK.

If you remove the subheading instead, the page validates OK.

If you remove the TOC, or replace it with NOTOC in articles with more headings, the page validates OK.

The attached file contains wiki markup of the above example ready for pasting into a new page.


Version: 1.11.x
Severity: trivial

Attached:

Details

Reference
bz12077

Related Objects

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 9:59 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz12077.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

HTML output (indented here for better readability):

<table id="toc" class="toc" summary="Contents"><tr><td><div id="toctitle"><h2>Contents</h2></div>
<ul>

<li class="toclevel-1">
  <a href="#First_heading"><span class="tocnumber">1</span> <span class="toctext">First heading</span></a>
  <ul>
    <li class="toclevel-2">
      <a href="#subheading"><span class="tocnumber">1.1</span> <span class="toctext">subheading</span></a>
    </li>
  </ul>
</li>
<li class="toclevel-1">
  <a href="#Second_heading"><span class="tocnumber">2</span> <span class="toctext">Second heading</span></a>
</li>

</ul>
</li>
</ul>
</td></tr></table>

Note the stray </li></ul>. Probably a stack maintained incorrectly or something. Tidy fixes this on Wikimedia sites, of course.

ayg wrote:

The fault lies somewhere in Parser::formatHeadings, which gives me a headache, so don't rely on me to fix it. :) If anyone wants to find the problem, or maybe rewrite that whole little mess, feel free to attach a patch and I'll look at it.

The validator reports no problems with my test (despite admittedly ugly HTML output). Could you link to a page that the validator fails?

ayg wrote:

As I say, Tidy will fix it if enabled, as it is on Wikimedia wikis. This is only an issue for third parties that do not use Tidy.

william.tucker wrote:

I have found the cause of this problem and have fixed it my own copy of 1.11.0.

The problem seems to be that the $prevtoclevel gets messed up right after an unindent. The solution was to add a line to set it properly after unindenting.

Add a line after the first call to tocUnindent in Parser.php. The new line should look like: $prevtoclevel = $toclevel;

So, before the change, the code looks like:

3597 if( $toclevel<$wgMaxTocLevel ) {
3598 if($prevtoclevel < $wgMaxTocLevel) {
3599 # Unindent only if the previous toc level was shown :p
3600 $toc .= $sk->tocUnindent( $prevtoclevel - $toclevel );
3601 } else {
3602 $toc .= $sk->tocLineEnd();
3603 }
3604 }

After the change, it looks like:

3597 if( $toclevel<$wgMaxTocLevel ) {
3598 if($prevtoclevel < $wgMaxTocLevel) {
3599 # Unindent only if the previous toc level was shown :p
3600 $toc .= $sk->tocUnindent( $prevtoclevel - $toclevel );
3601 $prevtoclevel = $toclevel;
3602 } else {
3603 $toc .= $sk->tocLineEnd();
3604 }
3605 }

Applied in r32387 with a parser test case. Seems good. :)

  • Bug 13623 has been marked as a duplicate of this bug. ***

Went ahead and stuck this on 1.12 branch, r32998 for 1.12.1.