
Many SVG files without namespace (xmlns) attribute have stopped rendering
Closed, Declined · Public

Description

Some SVG files have become broken, according to http://commons.wikimedia.org/wiki/Commons:Village_pump#Software_update_leading_to_image_non-display, due to a recent change enforcing xmlns="http://www.w3.org/2000/svg".


Version: wmf-deployment
Severity: major

Details

Reference
bz41174

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 12:59 AM
bzimport set Reference to bz41174.
bzimport added a subscriber: Unknown Object (MLST).

http://upload.wikimedia.org/wikipedia/commons/archive/5/5c/20121018143017%21Electronic_linear_filters.svg views as XML. This isn't likely to be MediaWiki's fault; it seems to be a browser-related issue. More information would be very useful.

*** This bug has been marked as a duplicate of bug 41113 ***

Reopening; this is unlikely to be a duplicate, since thumbnails are not the only problem.

librsvg upgrade in https://gerrit.wikimedia.org/r/#/c/28495/ ?
CC'ing Tim.

(In reply to comment #0)

http://commons.wikimedia.org/wiki/Commons:Village_pump#Software_update_leading_to_image_non-display

I've added {{tracked|41174}} to that entry (recommended way to do it).
Sorry for the wrong duping of this ticket.

I'm still confused about this.

When serving originals, e.g. https://upload.wikimedia.org/wikipedia/commons/archive/5/5c/20060426191653!Electronic_linear_filters.svg, MediaWiki does absolutely nothing to them. MediaWiki doesn't modify previously uploaded files (or even uploaded ones, for that matter).

Also, we have no idea when this actually broke, which really doesn't help matters.

The recent change is the move to Swift...

Testing locally, ImageMagick won't thumbnail it either, and the file is also served to me as XML.

anonmoos wrote:

For a very simple case, look at http://commons.wikimedia.org/wiki/File:Urantia_three-concentric-blue-circles-on-white_symbol.svg

During 2009, 2010, 2011, and much of 2012, the simple 241 byte version of this file produced PNG thumbnails within librsvg / Wikimedia Commons. At an unknown time in 2012 (but probably fairly recently), the PNG thumbnails stopped being produced, so that it became necessary to add the xmlns="http://www.w3.org/2000/svg" text within the enclosing SVG element.

If it can't be fixed, we'd at least like a list of files which need to be fixed due to lacking xmlns="http://www.w3.org/2000/svg" text...

(In reply to comment #7)

For a very simple case, look at
http://commons.wikimedia.org/wiki/File:Urantia_three-concentric-blue-circles-on-white_symbol.svg

During 2009, 2010, 2011, and much of 2012, the simple 241 byte version of this
file produced PNG thumbnails within librsvg / Wikimedia Commons. At an unknown
time in 2012 (but probably fairly recently), the PNG thumbnails stopped being
produced, so that it became necessary to add the
xmlns="http://www.w3.org/2000/svg" text within the enclosing SVG element.

If it can't be fixed, we'd at least like a list of files which need to be fixed
due to lacking xmlns="http://www.w3.org/2000/svg" text...

And as I've said, repeatedly, visiting http://upload.wikimedia.org/wikipedia/commons/archive/6/61/20121018045628%21Urantia_three-concentric-blue-circles-on-white_symbol.svg directly, which is NOT a thumbnail, doesn't work either...

Something must be common to rsvg, Firefox and ImageMagick, and it's probably the librsvg component. Firefox is updated every now and then, the Wikimedia servers updated librsvg (see comment 5), and the same probably holds for Reedy's test environment.

SVG files without the xmlns are malformed according to the specification, so we can't do anything about it except help find which files are missing it.

Running find + grep on the filesystem where all Wikimedia Commons images reside would be a fast way to get all of them. If that's not possible, someone should query and download all SVG images to check whether they're missing the xmlns, or just test whether the image fails to render at all.

anonmoos wrote:

Sam Reed wrote:

AnonMoos wrote:

For a very simple case, look at
http://commons.wikimedia.org/wiki/File:Urantia_three-concentric-blue-circles-on-white_symbol.svg

And as I've said, repeatedly, visiting
http://upload.wikimedia.org/wikipedia/commons/archive/6/61/20121018045628%21Urantia_three-concentric-blue-circles-on-white_symbol.svg
directly, which is NOT a thumbnail, doesn't work either...

What does that have to do with anything??

First, it's simply not true -- the file on that link won't display in Firefox, but it WILL in fact display in Inkscape or the Adobe SVG plugin.

Second, regardless of whether it will display in Firefox, or is theoretically strictly 100% "correct" under some formal theoretical definition, it nevertheless remains true that such SVG files generated PNG thumbnails on Wikimedia Commons continuously from 2005 to some time relatively recently in 2012 -- and that's a problem (a bug, you might even say)...

anonmoos wrote:

P.S. For the record -- the presence of xmlns="http://www.w3.org/2000/svg" is only mandatory in an SVG file if you take an XML-only point of view. From the perspective of traditional SGML rules, other methods (such as including a <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.0//EN" "http://www.w3.org/TR/2001/REC-SVG-20010904/DTD/svg10.dtd"> header) were fully sufficient to identify an SVG file. As far as I'm aware, Firefox has been consistent since around version 0.9 in refusing to display SVG files without an xmlns="http://www.w3.org/2000/svg" header, but Firefox does not define SVG correctness...

(In reply to comment #6)

I'm still confused about this.

When serving originals e.g.
https://upload.wikimedia.org/wikipedia/commons/archive/5/5c/20060426191653!Electronic_linear_filters.svg
MediaWiki does absolutely nothing to it. MediaWiki doesn't modify previously
uploaded files (or even uploaded ones, for that matter)

Also, we have no idea when this has actually been broken from. Which really
doesn't help matters.

The recent change is the move to swift...

Presumably the move to Swift just invalidated a whole whack of cached thumbnails that went back several years. I believe the change that made non-xmlns files fail to render goes back a couple of years. (I think it was the change to SVGMetadataHandler that made it namespace-aware. If MediaWiki cannot determine the width of an image, it doesn't render it.) Firefox only treats the image as an SVG image if it has the proper xmlns.

I support Bawolff's view on this. Files without xmlns have not rendered for quite a few years already. So if we are seeing more reports now, then it's just the cache invalidation that threw out old thumbnails that had simply not been regenerated for ages.

Safari doesn't render the file either btw, nor does Chrome. (just for the record)

And we serve .svg with the "image/svg+xml" content type, so it's sort of logical in that sense that they don't.

anonmoos wrote:

Derk-Jan Hartman wrote:

Files without xmlns have not rendered for quite a few years already.

Unfortunately, that's simply not the case; http://commons.wikimedia.org/wiki/File:POL_Opalenica_COA.svg was allowed to be uploaded without any xmlns="http://www.w3.org/2000/svg" as late as 12:59, 26 April 2011, and generated a thumbnail then (as it still does)...

anonmoos wrote:

Derk-Jan Hartman wrote:

And we serve .svg with the "image/svg+xml" content type, so it's sort of
logical in that sense that they don't.

I don't understand why we can't upload SVG files which are valid SGML files, since the W3C defines the SVG filetype with respect to SGML DTD's. However, if we aren't to be allowed to have valid SGML files which are not also valid XML files on Wikimedia Commons, then we need some plan to deal with the files that were working when they were originally uploaded, but now no longer work, and I haven't seen one yet...

(In reply to comment #17)

I don't understand why we can't upload SVG files which are valid SGML files,
since the W3C defines the SVG filetype with respect to SGML DTD's.

SVG needs the xmlns. SVG may use a DTD, but it's not required, and SVG files are XML documents, not necessarily SGML files. Please see this document, which defines what a valid SVG file is: http://www.w3.org/TR/SVG/conform.html

...and the document fragment is determined to be valid as follows: (...)
 8. Set the attributes xmlns="http://www.w3.org/2000/svg" and
    xmlns:xlink="http://www.w3.org/1999/xlink" on D's document element
    and remove any other attributes in the Namespaces in XML namespace
    from D.

The fact is that every SVG file which doesn't have the xmlns declaration needs to be fixed and reuploaded. But yes, some help as I suggested in comment 9 would be much appreciated.
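For anyone fixing files by hand, the repair amounts to adding one attribute to the document element. The following Python sketch shows the idea. It is an illustration only: the helper name `add_svg_xmlns` is hypothetical, and the text-based approach assumes the first `<svg` in the file is the root start tag and that no attribute value before its end contains a `>` character.

```python
import re

SVG_NS = "http://www.w3.org/2000/svg"

# Start of an unprefixed <svg ...> tag; the lookahead avoids matching e.g. <svgfoo>.
ROOT_RE = re.compile(r"<svg(?=[\s>/])")


def add_svg_xmlns(svg_text):
    """Return svg_text with the SVG xmlns added to the root tag if it is missing."""
    match = ROOT_RE.search(svg_text)
    if match is None:
        return svg_text  # no unprefixed <svg> start tag found; leave untouched
    end = svg_text.find(">", match.end())
    if end == -1:
        return svg_text  # truncated start tag; leave untouched
    start_tag = svg_text[match.start():end]
    # A default namespace declaration is already present (xmlns:xlink does not count).
    if re.search(r"\sxmlns\s*=", start_tag):
        return svg_text
    insert_at = match.end()
    return svg_text[:insert_at] + ' xmlns="%s"' % SVG_NS + svg_text[insert_at:]
```

Working on the text rather than re-serializing through an XML library leaves the rest of the file byte-for-byte unchanged, which keeps the reuploaded revision easy to compare with the original.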

(In reply to comment #18)

The fact is that every SVG file which doesn't have the xmlns declaration need
to be fixed and reuploaded.

(In reply to comment #9)

Running find + grep on the filesystem where all Wikimedia Commons images reside
would be a fast way to get all of them.

Adding "shell" keyword.
CC'ing Aaron.

(In reply to comment #18)

The fact is that every SVG file which doesn't have the xmlns declaration need
to be fixed and reuploaded. But yes, some help as I suggest in Comment 9 would
be much appreciated.

That's only if we really cannot possibly render such SVGs, but comment 12 seems to suggest that this may be fixable. Let's not get ahead of ourselves here. Brian and/or Derk-Jan, could you elaborate on your comments and on how hard it would be to get a workaround in place for SVG files without an xmlns?

(In reply to comment #9)

Running find + grep on the filesystem where all Wikimedia Commons images reside
would be a fast way to get all of them.

Running find and grep on a 20 TB file repository isn't exactly fast. This is also complicated by the fact that the images now live in Swift rather than on a single file system. Tracking down all the affected SVG files is probably doable, but not trivial.

Aaron, could you weigh in on the practicality of finding all xmlns-less SVGs, and on whether that's even needed? If there's a way we can make these files play nice without having to track them down and reupload them, that would be nice.

From my point of view, SVG files can only be hosted on the web if they are valid XML files. This is because there is only one valid Content-Type for these files, and that is "image/svg+xml". This declares that the 'downloaded file' is an XML file, which is also a requirement of http://www.w3.org/TR/SVG/conform.html.

Files without xmlns are valid XML files. I think they are even well-formed XML. Per G.2 of the SVG conformance specification, they are however not valid SVG files, because it specifies as a requirement for 'valid': "Set the attributes xmlns="http://www.w3.org/2000/svg" and xmlns:xlink="http://www.w3.org/1999/xlink" on D's document element", which these files do not have.

I note however, that for our parser, probably only xmlns is required, not xmlns:xlink...
Our parser also doesn't trip over files with missing DTD information and that is a requirement for valid SVG files.

These are all details. In theory we could have our parser retry the file without namespace information if the root matches exactly <svg. However, considering that neither Firefox, Chrome, Safari nor Mac OS X itself is able to display this file, I'm not sure that is a good idea.

Are these files currently accepted for upload?
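To make the retry idea above concrete, here is a small Python sketch of a strict check with an optional fallback. This is not MediaWiki's parser; `is_svg_root` and its `allow_bare_root` flag are invented for illustration. In strict mode the root must be `svg` in the SVG namespace. With the fallback enabled, a root named exactly `svg` with no namespace is accepted as well.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def is_svg_root(svg_text, allow_bare_root=False):
    """Report whether the document element counts as an SVG root."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError:
        return False  # not well-formed XML, so no fallback can help
    if root.tag == "{%s}svg" % SVG_NS:
        return True  # strict, namespace-aware match
    # Hypothetical fallback: accept an un-namespaced root named exactly "svg".
    return allow_bare_root and root.tag == "svg"
```

A file accepted only through the fallback would still be served as image/svg+xml and still fail in the browsers listed above, which is the trade-off being weighed here.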

anonmoos wrote:

Derk-Jan Hartman wrote:

Files without xmlns are valid XML files. I think they are even well formed xml.
Per the specification in G2 of the SVG conformance specification, they are
however not valid SVG files because it specifies as a requirement for 'valid':
"Set the attributes xmlns="http://www.w3.org/2000/svg"
Our parser also doesn't trip over files with missing DTD information and that
is a requirement for valid SVG files.

If xmlns="http://www.w3.org/2000/svg" is not actually formally required by either the XML or SGML standards, then what does it practically accomplish to be ultra-persnickety about it? I'm with Roan Kattouw on this. If you're implementing such a requirement for reasons of Firefox compatibility, then you should be clear about why you're doing it (and there would appear to be no imminently urgent reason to disable the PNG display of a significant number of SVG files right now in the service of an ultimate goal of Firefox compatibility).

P.S. A number of programs require the xmlns:xlink="http://www.w3.org/1999/xlink" header for those SVG files which have links in the graphic code (not just in the metadata), but not for other SVG files...

Innocenti.Maresin wrote:

Actually, MediaWiki has little respect for Content-Type. It is dropped on upload, and generated from the file name suffix (extension) on download.
I do not advocate this approach, but an argument from Content-Type (which is next to ignored in MW) looks bizarre.

(In reply to comment #23)

Actually, MediaWiki has little respect to Content-Type. It is dropped on
upload, and generated from a file name suffix (extension) on download.

In general MediaWiki doesn't rely on the suffix either. For the major filetypes that it supports it looks at the content to determine content-type. The suffix is only used for consistency with expectations of both humans and 'dumb' clients, because most of the rest of the world uses suffix and content-type with hardly any double checking.
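The difference between trusting the suffix and looking at the content can be sketched as follows. This is a simplified Python illustration, not MediaWiki's actual detection code; `sniff_mime` and the short signature table are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Leading byte signatures for a few common raster formats.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]


def sniff_mime(data):
    """Guess a MIME type from file content (bytes), ignoring the file name."""
    for prefix, mime in MAGIC:
        if data.startswith(prefix):
            return mime
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return "application/octet-stream"
    # Only a root in the SVG namespace is reported as SVG.
    if root.tag == "{http://www.w3.org/2000/svg}svg":
        return "image/svg+xml"
    return "application/xml"
```

Under this kind of check, an xmlns-less file named something.svg is recognized as XML but not as SVG, whatever its suffix says.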

For MediaWiki it matters what is INSIDE the file, not how that file is advertised. But as I stated earlier, it seems that XML files without xmlns are valid XML files, so I'm not so worried about the content-type issue anymore.

My main concern now is that these are not valid SVG files, and what it would take to accept non-valid files without impacting the functionality of other SVG files (or security).

This might work:

https://gerrit.wikimedia.org/r/29968 but I will have to do a bit more verification to be sure.

SVG files must include the SVG namespace per spec, and browsers don't render them without it. At some point we'll probably start pushing at least some SVG files directly to browsers, so we don't want broken files in the system that won't work.

If there are broken SVG files, we should make sure they get fixed.

I'm recommending against the patch in https://gerrit.wikimedia.org/r/29968

I think we can close this as wontfix. It doesn't seem there are developers that think we should accept files like this.

I've compiled a list of [[https://commons.wikimedia.org/w/index.php?oldid=283987037|SVGs missing the xmlns tag]] for anyone who wants to fix them.