Page MenuHomePhabricator

SVG thumbnail generation fails due to bad namespace handling in SVGMetadataExtractor
Closed, ResolvedPublic

Description

Author: wilfredor

Description:
There are problems in generating thumbnails for svg files, please see "Archivo:QA icon.svg", instead, should have an image


Version: unspecified
Severity: major
URL: http://es.wikipedia.org/wiki/Wikipedia:Portada

Details

Reference
bz27465

Event Timeline

bzimport raised the priority of this task from to High.Nov 21 2014, 11:20 PM
bzimport set Reference to bz27465.
bzimport added a subscriber: Unknown Object (MLST).

Bryan.TongMinh wrote:

cc thedj, he did something with SVG metadata, I believe.

wilfredor wrote:

In commons "This image rendered as PNG in other sizes: 200px, 500px, 1000px, 2000px." not work

Several SVGs at Commons report invalid tags while rendering (i.e. <a:midPointStop
in http://upload.wikimedia.org/wikipedia/commons/b/b9/Gtk-media-play-ltr.svg ). Many of these are untouched for some years already.

that doesn't mean that the SVGs are necessarily ok. just that we were more lenient in the past.

For the first one:

SvgHandler::getMetadata: Expected <svg> tag, got svg:svg
File::getPropsFromPath: /Users/hartman/Development/phase3/images/4/44/QA_icon_own_version.svg loaded, 11021 bytes, image/svg+xml.
LocalFile::upgradeRow: upgrading QA_icon_own_version.svg to the current schema
DatabaseBase::query: Writes done: UPDATE image SET img_width = '0',img_height = '0',img_bits = '0',img_media_type = 'DRAWING',img_major_mime = 'image',img_minor_mime = 'svg+xml',img_metadata = '0',img_sha1 = 'dr2uqjc96l4dzt8jkgw5997pa9098jj' WHERE img_name = 'QA_icon_own_version.svg'
Class SkinVector not found; skipped loading

That seems like a fixable bug. i'll make a patch.

  • Bug 27359 has been marked as a duplicate of this bug. ***

wilfredor wrote:

The problem is not fixed. See "Archivo:QA icon.svg" in http://es.wikipedia.org/wiki/Wikipedia:Portada

bug is fixed in r82307, and es.wiki has r82223 now - that's the reason, I think.

wilfredor wrote:

This needs to go to the production server as soon as possible. You have a date?

Probably whenever this is pushed it out. Since we just completed the push of 117wmf1, I'm not sure when that will be. I'll make sure this is done then, though.

(Adding Roan, who knows more about pushes than I do and may have something to add.)

Probably whenever this is pushed it out. Since we just completed the push of 117wmf1, I'm not sure when that will be. I'll make sure this is done then, though.

(Adding Roan, who knows more about pushes than I do and may have something to add.)

wilfredor wrote:

Problem fixed. Thanks

r82307 does not fix the problem correctly, and still breaks on files such as https://secure.wikimedia.org/wikipedia/en/wiki/File:US_states_by_total_state_tax_revenue.svg

The fix appeared to simply make the assumption that the SVG namespace will either be the unlabeled root namespace or will be given the symbolic name 'svg'; there is absolutely no such guarantee. The above file defines it as 'ns0', which is 100% legit to do, but it gets rejected.

Proper fix is probably to simply use a namespace-aware XML parsing mode. Use the namespace URL to check against, and let the XML tools worry about matching up prefixes.

Added test cases in r88865, including the file from comment 14 which currently fails, and the files from comment 1 and comment 4 which do work with the current code.

$ php phpunit.php includes/media/SVGMetadataExtractorTest.php
PHPUnit 3.5.13 by Sebastian Bergmann.

..E

Time: 1 second, Memory: 14.00Mb

There was 1 error:

  1. SVGMetadataExtractorTest::testGetMetadata with data set #2 ('/var/www/trunk/tests/phpunit/includes/media/US_states_by_total_state_tax_revenue.svg', array(593, 959))

MWException: Expected <svg> tag, got ns0:svg

/var/www/trunk/includes/media/SVGMetadataExtractor.php:105
/var/www/trunk/includes/media/SVGMetadataExtractor.php:78
/var/www/trunk/includes/media/SVGMetadataExtractor.php:30
/var/www/trunk/tests/phpunit/includes/media/SVGMetadataExtractorTest.php:14
/var/www/trunk/tests/phpunit/MediaWikiTestCase.php:63
/var/www/trunk/tests/phpunit/MediaWikiPHPUnitCommand.php:20
/var/www/trunk/tests/phpunit/phpunit.php:60

FAILURES!
Tests: 3, Assertions: 2, Errors: 1.

Trunk r88870 reverts the previous fix for a cleaner diff; r88871 commits a new fix that checks the localName & namespaceURI properties instead of checking for particular prefixes in the combined name.

Seems to work ok with the test cases I added earlier and the Wikimedia logo SVG which doesn't use a namespace; also made a slight tweak to metadata extraction (eg RDF) to trim whitespace.

We should add test cases for animation and any other property that can be extracted as well, but that's for another story.

Will need merge to 1.17, 1.17-wmf & 1.18.

Gilles raised the priority of this task from High to Unbreak Now!.Dec 4 2014, 10:21 AM
Gilles moved this task from Untriaged to Done on the Multimedia board.
Gilles lowered the priority of this task from Unbreak Now! to High.Dec 4 2014, 11:20 AM