Forbidding TITLE elements in uploaded SVG files is overkill
Closed, Resolved (Public)

Description

Author: anonmoos

Description:
If you attempt to upload an SVG file with a <TITLE>...</TITLE> element in it, the software currently
gives you a cryptic error message (and aborts the upload process in such a manner that the
information previously entered into the upload form is irretrievably lost, so that you have to start
from scratch in order to try again).

Unfortunately, the formal SVG standard _recommends_ that every SVG document should have a TITLE
element, which "serves the purposes of identifying the content of the given SVG document fragment" -
and in fact the TITLE element is the quickest and easiest way of including document metadata (for
example, the contents of the overall TITLE element will be displayed in the window title bar when
using the Adobe SVG viewer plugin).

Furthermore, according to the SVG grammar, an SVG file can contain multiple <TITLE>...</TITLE> elements
annotating each subsection, and this is an essential part of the "SVG Content Accessibility
Guidelines" devised for making the content of SVG files partially accessible to users who can't view
them in the ordinary way.

So forbidding all <TITLE>...</TITLE> elements from SVG files is not particularly desirable (and I find
it annoying, since all my SVG files are annotated with TITLEs).


Version: unspecified
Severity: normal

Details

Reference
bz4388

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 8:59 PM)
bzimport set Reference to bz4388.
bzimport added a subscriber: Unknown Object (MLST).

avarab wrote:

Are you sure that it's the title element that's causing your MediaWiki
installation to reject the file?

anonmoos wrote:

I'm not running it; it's Commons.wikimedia.org. A number of filters were placed on SVG files accepted
for upload, to filter out unwanted script elements etc., and one of them is blocking TITLE elements.

fleming wrote:

See Talk:Graphviz ..

On en.wikipedia.org, raw SVG output from Graphviz is rejected. The only way to
get the file accepted is to eliminate nested "<g>" elements *and* "<title>"
elements (at least those inside "<g>" elements).

mediawikibug wrote:

Yes it does reject <title> tags.

There's some more information and reasoning given in the code, in SpecialUpload.php,
line 792, function detectScript:

"Internet Explorer for Windows performs some really stupid file type
autodetection which can cause it to interpret valid image files as HTML and
potentially execute JavaScript, creating a cross-site scripting attack vectors.

Apple's Safari browser also performs some unsafe file type autodetection which
can cause legitimate files to be interpreted as HTML if the web server is not
correctly configured to send the right content-type (or if you're really
uploading plain text and octet streams!)

Returns true if IE is likely to mistake the given file for HTML. Also returns
true if Safari would mistake the given file for HTML when served with a generic
content-type"

This detectScript function picks up on any of
'<body', '<head', '<html', '<img', '<pre', '<script', '<table', '<title' and returns
true, resulting in a red error message: "This file contains HTML or script code
that my be erroneously be interpreted by a web browser." (complete with typo)

So W3C standard SVG files are getting rejected. Nice simple triangle example:
http://www.w3.org/TR/SVG/images/paths/triangle01.svg ...has <title> so it won't
work. Whether the detectScript function should change I dunno, but I commented
it out on my intranet installation.
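
To make the failure concrete, here is a rough Python sketch of the kind of substring check described above. It is an illustration only, not MediaWiki's PHP code: the function name `looks_like_html`, the 1024-character sniff window, and the two sample SVG strings are assumptions made for this example.

```python
# Illustrative sketch only: a Python re-creation of the substring check
# described in this comment, NOT MediaWiki's actual detectScript code.
SUSPECT_TAGS = (
    "<body", "<head", "<html", "<img",
    "<pre", "<script", "<table", "<title",
)


def looks_like_html(data: str, sniff_chars: int = 1024) -> bool:
    """Return True if any listed tag prefix occurs near the start of the data.

    The size of the sniff window is an assumption for this sketch.
    """
    chunk = data[:sniff_chars].lower()
    return any(tag in chunk for tag in SUSPECT_TAGS)


# Two hypothetical uploads that differ only in the <title> element.
svg_with_title = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    "<title>Triangle</title>"
    '<path d="M 100 100 L 300 100 L 200 300 z"/>'
    "</svg>"
)
svg_without_title = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path d="M 100 100 L 300 100 L 200 300 z"/>'
    "</svg>"
)

print(looks_like_html(svg_with_title))     # flagged because of "<title"
print(looks_like_html(svg_without_title))  # not flagged
```

A check like this knows nothing about the file type, so a perfectly valid SVG is treated the same as an HTML page that happens to contain a title.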

anonmoos wrote:

(In reply to comment #4)

SpecialUpload.php line 792 function detectScript: "Returns true if IE is likely to mistake the given
file for HTML. Also returns true if Safari would mistake the given file for HTML when served with a
generic content-type" This detectScript function will pick up on
'<body', '<head', '<html', '<img', '<pre', '<script', '<table', '<title'. So w3c standard SVG files are
getting rejected.

Well BODY, HEAD, HTML, IMG, PRE, and TABLE elements have no place in an SVG file, while SCRIPT is out
of line with the intended use of SVG files on Wikimedia Commons. But TITLE is a recommended part of
every SVG file, and an essential part of W3C standard "Content Accessibility Guidelines", as mentioned
before (see Appendix H to the SVG 1.1 standard) -- in fact, a well-commented SVG file should often
contain multiple TITLE elements.

I don't think that the SVG file upload code should be defensively checking against things that might
speculatively happen if an SVG file is hypothetically delivered with an incorrect MIME type -- rather,
it should be the job of the file serving code to make sure that SVG files are always delivered with
the correct MIME type. (In most cases, SVG files will actually be viewed in rendered raster image
form, in any case.) Be as zealous in rejecting scripting code as you want, but don't reject a
legitimate part of the SVG file format definition which performs a valid function.

Tuukka wrote:

I uploaded an SVG file created with Graphviz to commons.wikimedia.org now.
Nested g tags passed but I still needed to remove all the good title tags.

Please solve this somehow. May I suggest that you accept uploads with title tags
but strip them at download time, for as long as you consider the security risk important?
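
As a rough idea of what stripping at serve time could look like, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The function name `strip_titles` is hypothetical, and this is not how MediaWiki handles SVG files; it only shows that title elements can be removed without touching the drawing itself.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def strip_titles(svg_text: str) -> str:
    """Return the SVG document with every <title> element removed.

    Hypothetical helper for illustration. ElementTree is not hardened
    against malicious XML, so a real service would need a safer parser.
    """
    # Serialize SVG elements without an "ns0:" prefix.
    ET.register_namespace("", SVG_NS)
    root = ET.fromstring(svg_text)
    title_tag = f"{{{SVG_NS}}}title"
    # Snapshot the element list first so removal does not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == title_tag:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


sample = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    "<title>Graph</title>"
    '<g><title>node1</title><path d="M0 0"/></g>'
    "</svg>"
)
print(strip_titles(sample))
```

The trade-off is that readers downloading the file would lose exactly the accessibility annotations the titles were meant to provide.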

robchur wrote:

It might make sense to whitelist the <title> element in SVG images (after the
MIME detection stuff).
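
A minimal Python sketch of that idea follows, assuming the MIME type has already been detected by earlier code. The function name `contains_suspect_markup`, the `mime_type` parameter, and the 1024-character window are assumptions for illustration and are not taken from MediaWiki.

```python
# Illustrative sketch: tag prefixes that are suspicious in any upload.
BASE_TAGS = ("<body", "<head", "<html", "<img", "<pre", "<script", "<table")


def contains_suspect_markup(data: str, mime_type: str) -> bool:
    """Flag HTML-looking content, but tolerate <title> in SVG uploads.

    `mime_type` is assumed to come from an earlier, trusted detection step.
    """
    tags = list(BASE_TAGS)
    if mime_type != "image/svg+xml":
        # Outside SVG, a <title> tag still suggests an HTML document.
        tags.append("<title")
    chunk = data[:1024].lower()
    return any(tag in chunk for tag in tags)


svg = '<svg xmlns="http://www.w3.org/2000/svg"><title>Triangle</title></svg>'
print(contains_suspect_markup(svg, "image/svg+xml"))  # title tolerated
print(contains_suspect_markup(svg, "image/png"))      # title still flagged
```

Script elements remain rejected for SVG in this sketch, which matches the request to stay strict about scripting while accepting titles.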

avarab wrote:

RESOLVED FIXED: enabled $wgAllowTitlesInSVG, which allows SVG files with a title
element to be uploaded.