Page MenuHomePhabricator

<bdo>, <q> and other elements are accepted in wikitext, but their attributes are not
Closed, ResolvedPublic

Description

includes/Sanitizer.php currently includes, in lines 1520/1521:

  1. 8.2.4
  2. bdo

Maybe the intention was to not accept the <bdo> element? But then, in lines 399-403 there is:

$htmlnest = array( [...], 'bdo' );

The result is that the <bdo> element is accepted, but its "dir" attribute is dropped. This is not logical at all because the whole point of the <bdo> element is to specify a "dir" attribute. It is actually required in HTML5: http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-bdo-element


Version: 1.22.0
Severity: normal

Details

Reference
bz55582

Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 2:25 AM
bzimport set Reference to bz55582.

Suggested solution: Change lines 1520/1521 to

8.2.4

'bdo' => $common,

There is more weird stuff happening. I hope the following is complete:

  • <bdo> accepts no attributes; should accept $common
  • <br> accepts just "id", "class", "title", "style" and "clear"; should accept $common and "clear" (not allowing some of these is an ancient HTML 4 restriction)
  • <q> accepts no attributes; should accept $common and "cite"
  • <span> accepts $block; should accept just $common (the difference is <span align="...">, which was never legal and doesn't work)
  • <wbr> accepts just "id", "class", "title" and "style"; should accept $common (restriction was apparently copied from <br>)

Patches accepted. :-) You can read [[mw:Gerrit]] for help.

Created attachment 13477
Proposed additions to and removals from the attribute whitelist

This is completely untested. I have never worked with the PHP code.

Attached:

Bugzilla doesn't show the prose at the beginning of the .diff file, so here it is:

Bug 55582: Put the HTML attribute whitelist closer to HTML5

  • Add the global attributes to <bdo> and <q> and add "cite" to <q>. This is to make these elements actually usable: <bdo> needs a "dir" attribute to be useful for anything, and the whole point of <q> compared to hard-coded quotation marks is its support for the "lang" and "cite" attributes.
  • Drop the "align" attribute from <span> because it was never standards-compliant and does not work in browsers either, unless one constructs such unlikely things as <span align="center" style="display:block;">.
  • Drop the obsolete "char" and "charoff" attributes from <tr>, <td>, <th>. These have not been implemented in browsers anyway.
  • Drop the obsolete presentational attributes "align", "valign" and "width" from <colgroup>, <col>, <thead>, <tfoot> and <tbody>. These elements are currently not accepted in wikitext anyway, but removing these attributes from the whitelist ensures that they are not accidentally enabled in the future.
  • Drop the obsolete presentational attributes "noshade" and "size" from <hr>. They have been overridden by skin-specific CSS for a long time anyway.
  • Allow all global attributes on <br> and <wbr>. Not allowing "dir" and "lang" on <br> was a restriction in HTML 4.01, presumably copied to <wbr>, that has been lifted in HTML5. Allowing these may not be particularly useful, but simplifies the code.

Change 89384 had a related patch set uploaded by PleaseStand:
Put the HTML attribute whitelist closer to HTML5

https://gerrit.wikimedia.org/r/89384

Please sign up for developer access on Wikitech:

https://wikitech.wikimedia.org/wiki/Special:UserLogin/signup

You cannot leave comments on Gerrit (our code review system) without first creating an account on that wiki. Signing up for an account will also allow
you to upload new patch sets there.

Change 89384 merged by jenkins-bot:
Put the HTML attribute whitelist closer to HTML5

https://gerrit.wikimedia.org/r/89384

I merged the patch into master. Please, get a developer account to make everybody's lives easier :)