Sanitizer::removeHTMLtags fails for <img src=> tag when enclosed in <a> link
Closed, Resolved (Public)

Description

Scenario:

If you want to strip all insane tags but allow "a" and "img" tags, you would use this:

$string = Sanitizer::removeHTMLtags( $string, null, array(), array( "a", "img" ) );

This leaves single "a" and "img" tags, but I noticed that the Sanitizer function does not work correctly for a string such as:

<a href='http://link-url'><img src='http://image-url'></a>

Because this is a widely used construct, I suggest fixing removeHTMLtags so that it works for this case, too.

I also noticed that the function fails in the constructed case where the image tag is intentionally incorrectly written as a closed tag <img src='http://image-url' />


Version: 1.20.x
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=46443

Details

Reference
bz35002

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 12:14 AM
bzimport set Reference to bz35002.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)

> Scenario:
>
> If you want to strip all insane tags but allow "a" and "img" tags, you would use this:
>
> $string = Sanitizer::removeHTMLtags( $string, null, array(), array( "a", "img" ) );
>
> This leaves single "a" and "img" tags, but I noticed that the Sanitizer function does not work correctly for a string such as:
>
> <a href='http://link-url'><img src='http://image-url'></a>
>
> Because this is a widely used construct, I suggest fixing removeHTMLtags so that it works for this case, too.

Works fine if you have $wgAllowImageTag = true; set in your LocalSettings.php. I suppose the fourth parameter of that function is not meant for self-closing tags.
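For reference, the setup described in the comment above can be sketched as follows. This is a minimal illustration assuming a MediaWiki 1.20-era installation. The sample input is the string from the report, and the behaviour noted in the comments is what is reported in this thread, not something verified here.

```php
<?php
// In LocalSettings.php: allow <img> tags to pass through the Sanitizer.
// Per the comment above, the nested <a><img></a> case works once this is set.
$wgAllowImageTag = true;

// In the calling code: the fourth argument lists extra tags to allow.
// Sample input taken from the bug description.
$string = "<a href='http://link-url'><img src='http://image-url'></a>";
$string = Sanitizer::removeHTMLtags( $string, null, array(), array( "a", "img" ) );
```

The self-closing form <img src='http://image-url' /> is reported later in this thread as still failing even with this setting.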

> I also noticed that the function fails in the constructed case where the image tag is intentionally incorrectly written as a closed tag <img src='http://image-url' />

Valid in XHTML! (although we don't use xhtml any more...)

(In reply to comment #1)

> (In reply to comment #0)
>
> > Scenario:
> >
> > If you want to strip all insane tags but allow "a" and "img" tags, you would use this
>
> > I also noticed that the function fails in the constructed case where the image tag is intentionally incorrectly written as a closed tag <img src='http://image-url' />
>
> Valid in XHTML! (although we don't use xhtml any more...)

I already had $wgAllowImageTag = true; in place, but forgot to uncomment it! Thanks for pointing me to that self-made problem.

My RSS HTML problems are now almost solved, but the <img ... /> problem remains. I will file a separate bug for it.

Regarding the last sentence of the description of this bug:

"I also noticed that the function fails in the constructed case where the image
tag is intentionally incorrectly written as a closed tag <img
src='http://image-url' />"

This is an ongoing problem and has now been filed separately as bug 46443.