Page MenuHomePhabricator

Upload should disallow SVGs with third-party content in style elements (not just style attributes)
Closed, ResolvedPublic

Description

Krinkle reported http://upload.wikimedia.org/wikipedia/test/e/e3/Webplatform.svg last night. It passes our filter fine because the url is in the element text, and not an attribute.

For reference:

<svg xmlns="http://www.w3.org/2000/svg" viewBox="6 3 177 153"

   xmlns:xlink="http://www.w3.org/1999/xlink">
<style>@import url("https://fonts.googleapis.com/css?family=Bitter:700&amp;text=WebPlatform.org");</style>
<defs>
  <path id="l" d="m-15.5,0v85a15.5,15.5 0 0 0 31,0v-85a15.5,15.5 0 0 0 -31,0zm9,0a6.5,6.5 0 0,0 13,0a6.5,6.5 0 0,0 -13,0zm0,85a6.5,6.5 0 0,0 13,0a6.5,6.5 0 0,0 -13,0z" fill-rule="evenodd"></path>
  <path id="i" d="m-15.5,0v60.81118318204309a15.5,15.5 0 0 0 31,0v-60.81118318204309a15.5,15.5 0 0 0 -31,0zm9,0a6.5,6.5 0 0,0 13,0a6.5,6.5 0 0,0 -13,0zm0,60.81118318204309a6.5,6.5 0 0,0 13,0a6.5,6.5 0 0,0 -13,0z" fill-rule="evenodd"></path>
  <clipPath id="o"><use xlink:href="#l" transform="translate(52,19)"/></clipPath>
  <clipPath id="r"><use xlink:href="#i" transform="translate(52,104)rotate(-135)"/></clipPath>
  <clipPath id="p"><use xlink:href="#l" transform="translate(138,19)"/></clipPath>
</defs>
<g transform="translate(-.5,-.5)">
  <use fill="#f89c20" xlink:href="#l" transform="translate(52,19)"/>
  <use fill="#e54e26" xlink:href="#i" transform="translate(52,104)rotate(-135)"/>
  <use fill="#2eb3c4" xlink:href="#l" transform="translate(138,19)"/>
  <use fill="#694d9f" xlink:href="#i" transform="translate(138,104)rotate(135)"/>
  <g clip-path="url(#o)"><use fill="#d02e27" xlink:href="#i" transform="translate(52,104)rotate(-135)"/></g>
  <g clip-path="url(#r)"><use fill="#7f1333" xlink:href="#i" transform="translate(138,104)rotate(135)"/></g>
  <g clip-path="url(#p)"><use fill="#263c81" xlink:href="#i" transform="translate(138,104)rotate(135)"/></g>
  <text fill="#474747" x="95" y="150" text-anchor="middle" font-family="Bitter" font-size="20" font-weight="bold">WebPlatform<tspan fill="#a3a3a3">.org</tspan></text>
</g>

</svg>


Version: unspecified
Severity: normal

Details

Reference
bz69008

Event Timeline

bzimport raised the priority of this task from to Unbreak Now!.Nov 22 2014, 3:28 AM
bzimport set Reference to bz69008.

I was going to file a bug but forgot. Thanks!

Rephrasing slightly.

I assume we're fine allowing <style> in general. Just as we style="" attributes in general and only disallow ones that make use of third-party content like background image urls. It's actually quite useful (and underused) for SVGs to not have to repeat and flatten all styles in inline attributes.

I found a few other issues while I was in working on this, and there were several bits that I've been putting off that we probably just need to do now.

  • I refactored XmlTypeCheck to also filter the data, if any. This requires one extra call to the callback for nodes with data, since we call the callback on element open (like always), concat all the chunks of element data, and then call the callback on the element with data when we hit the end of the element.
  • I refactored the normalization out of Sanitizer::checkCss, and run the svg style elements and attributes through that before checking if they include anything bad. I'd been meaning to do this, and just need to get it out.
  • I broadened the regex looking for non-local urls to include checks for any css elements, instead of the small, previous list. This caught things like inline style="background-image:url(http://www.google.com);", which at least FF will attempt to download.
  • All of these changes with only manual regression testing is error prone, so I'm adding unit tests that runs all of my hostile files through.

I'm testing all of the changes against my local corpus, and finding a lot of new hits-- so especially now looking at the <style> tags is probably going to impact more end users. Once I have the patch ready, I'll pull in some common's admins again so they can be prepared for the impact.

I've been running my patched version against several thousand svgs from commons. I'm seeing several instances of <style> elements with font-face directives that have urls, such as:

@font-face{font-family:'ArialMT';src:url("data:;base64,\
T1RUTwACACAAAQAAQ0ZGINlfD...")}

Should we allow embedding fonts with data: urls, if they are not svg or xml? We don't want to enable http://html5sec.org/#43, but since it was one, non-current browser we might be able to risk it.

I'm also not sure if there are any legal issues with distributing an svg where the font is embedded and distributed. Adding Luis in case he knows the right person to ask about that?

I'm the right person there. This is publicly-uploaded SVGs, right?

(In reply to Luis Villa (WMF Legal) from comment #5)

I'm the right person there. This is publicly-uploaded SVGs, right?

Yep, these are user uploaded and publicly available.

To the extent fonts are protectable, it is through copyright, and many are openly available these days, so the same DMCA+Commons rules that typically apply to art would apply here: we'll accept the upload; Commons (or other wikis) should be vigilant about not using non-free fonts; and if there is a problem, we can deal with it through typical mechanisms (like DMCA notices). So no legal blocker on accepting/displaying rendered fonts.

Created attachment 16146
Initial patch

When I use this and check about 15k commons images, I'm not finding any additional files that would be filtered out. So it doesn't seem like anyone is using the features that this filters out. Other opinions would be welcome.

attachment bug69008.patch ignored as obsolete

Created attachment 16179
Filter updates and tests

Added more tests

attachment bug69008.patch ignored as obsolete

Adding Bawolff since he's helped out a lot with these in the past. Tim or Brian, would you be able to review this patch?

It's kind of strange to call xml_set_character_data_handler() from the elementOpen() handler. There's an excuse for setting the element handler from the root element handler, although maybe not a very good one, but I think xml_set_character_data_handler() could just be done on creation of the parser.

I think the code you have here will be insecure if the element to be checked contains another element, for example:

<style>@import ...<foo/></style>

The <foo/> will clear the malicious text in $this->elementData. You need a stack.

Is there really a reason to call the callback both on element open and on element close? It seems like it would make more sense to call it only on close. In any case, the doc comments for $filterCallback need to be updated, specifically "This gives you access to the element namespace, name, and attributes, but not to text contents".

Created attachment 16288
Filter updates and tests

Updated for Tim's comments

attachment bug69008b.patch ignored as obsolete

Looks good to me. The increased memory usage is unfortunate, but we can always open a bug and fix it later. We need to get this out the door.

Created attachment 16377
Filter updates and tests - 1.24wmf19+

Context updated so git am will apply it without complaining.

Attached:

Deployed:
21:26 csteipp: deployed fixes for bugs 70620, 69008

Markus, this should be included in the next release.

Created attachment 16558
Backport for REL1_19

Tested upload of corrupted svg which is now rejected. Can anyone confirm the backport?

Attached:

Created attachment 16559
Backport for REL1_22

Tested upload of corrupted svg which is now rejected in REL1_22. Can anyone confirm the backport?

Attached:

Created attachment 16560
Backport for REL1_23

Tested upload of corrupted svg, which is now rejected in REL1_23. Can anyone confirm the backport?

Attached:

Tested the 1.19 backport. The unit tests in the 1.19 patch don't work as is (the XmlTypeCheck class doesn't have an $isFile option in the constructor and is missing a validate from string method). But with a bit of modifying to make the tests run, they all pass, and some manual testing looks good as well.

Hi Grunny, can you paste your changes to the tests here? Thx!

Created attachment 16575
Rough 1.19 unit test fix

Hi Markus. I've attached what I did to run the unit tests. It's just backporting the method to verify from a string from current master without modifying anything else. It's rough, but was just to get them running. :)

Attached:

Published the security fix in the tarballs. As of now, this bug can be publicly accessible.

Change 162777 merged by jenkins-bot:
SECURITY: Enhance CSS filtering in SVG files

https://gerrit.wikimedia.org/r/162777