Page MenuHomePhabricator

The language used to render SVGs should be definable
Closed, ResolvedPublic

Description

Because we convert our SVGs to PNGs before display, we effectively "lose" the translations that be encoded as part of the SVG1.1 spec in the process -- the server just renders everything as though the systemLanguage is English.

We should therefore look to support a "lang=" parameter, for example:

[[File:Foo.svg|thumb|lang=en|English]]
[[File:Foo.svg|thumb|lang=fr|French]]

We can then implement pretty tools for encoding the translations.


Version: unspecified
Severity: enhancement

Details

Reference
bz32987

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 12:02 AM
bzimport set Reference to bz32987.

Functional patch

Functional patch, in the style of "page=" support.

attachment 32987.patch ignored as obsolete

Took a quick peek, haven't tested yet.

A few offhand notes:

  • I'm not sure I like adding more parameters to ThumbnailImage's constructor, it feels like the whole thing should be refactored to take an array with a map of params
  • if no language is specified, language should probably default to the current parsing context's content language -- this would allow eg an image from Commons to simply be included as [[File:whatever.svg]] on fr.wikipedia.org without manually adding |lang=fr. However this may add default params or otherwise be funky...
  • we're going to want ways to serve SVG files directly in which case we may need to actually modify the file when serving a localized version; this'll probably want some future infrastructure though so don't worry about it too much yet.
  • we may want to turn a lot of the magic words on files from handler-specific things into generic items -- in which case we should make sure 'lang' is available
  • need some way to pass this lang parameter over the API for ForeignApiRepo ("InstantCommons"), so third-party sites using images from Commons can also use the localization.
  • thumbnail 404 handling _may_ need adjustment to match the extra lang parameter. Need to check on status of this; we were talking about moving some of the logic around.

Some specific notes:

  • SvgHandler::parseParamString looks wrong; it assumes language codes are in form ([a-z]{2,}). They may be more like ([a-z]+(?:-[a-z]+)*) -- extensions for script or language variants (pt-br, zh-hans, etc).
  • I'd recommend against having two variants of the magic word; stick with just "lang" for simplicity

Patch v2

Updated patch to incorporate Brion's specific notes.

I'm not sure I like adding more parameters to ThumbnailImage's constructor,

it feels like the whole thing should be refactored to take an array with a map
of params

Agreed, will have a crack at this.

if no language is specified, language should probably default to the current

parsing context's content language -- this would allow eg an image from Commons
to simply be included as [[File:whatever.svg]] on fr.wikipedia.org without
manually adding |lang=fr. However this may add default params or otherwise be
funky...

Others have raised this idea. The problem I saw is that it makes the deployment very heavy on the servers as they would immediately go and re-render every SVG to a filename that includes "langfr" (or whatever the default was set to) -- even SVGs that had no translations (99.99% of them). You could, perhaps, mass-change the filenames of SVG-originally PNGs to include "langfr-" but... yeah...

we're going to want ways to serve SVG files directly in which case we may

need to actually modify the file when serving a localized version; this'll
probably want some future infrastructure though so don't worry about it too
much yet.

This is a tricky one. I mean, you can actually just serve the SVG with translations: so we have to work out if the majority of users visitors browsing "foreign" Wikipedias wants to see the SVGs in their language or the one the caption writer had in mind.

Not intelligent enough to comment on your other points :)

attachment 32987.patch ignored as obsolete

Patch that also includes rewrite of parameters of ThumbnailImage constructor

Had a crack at it.

Of those extensions in SVN,

./OggHandler/OggHandler_body.php
./FlvHandler/FlvImageHandler.php
./PagedTiffHandler/PagedTiffHandler_body.php
./PdfHandler/PdfHandler_body.php
./VipsScaler/VipsScaler_body.php

would also need to be updated, but it doesn't take a minute to do so.

Attached:

Brion: what's the way forward on this one? What should I be doing?

Okay, so neither patch now applies cleanly against trunk... let me know when someone wants to test it and I can regenerate.

Patch v2.1 (fix bitrot)

Attached:

sumanah wrote:

Jarry1250, what's the status with this? Is this something you've already submitted into Gerrit?

(In reply to comment #8)

Jarry1250, what's the status with this? Is this something you've already
submitted into Gerrit?

Basically, the bigger patch rots so quickly (or at least it did under SVN, not sure about Git) that I need to get it through on a concerted effort (both by me and a reviewer). I haven't fixed it since last time, but I will: since this bug is a prerequisite for my GSoC project, that will probably be sometime in July-August.

So, broken this down into some different patches:

And another one later today.

I'm a bit wary of this change for two reasons:

  • extending file inclusion syntax further (to include |lang=) is a bit frightening; and
  • given recent thumbnail purging troubles on Wikimedia wikis, I'm concerned about additional code complexity in this area (adding thumbs that are in the form of "/images/thumb/f/ff/Foobar.svg/langXX-180px-Foobar.svg.png").

(In reply to comment #12)

I'm a bit wary of this change for two reasons:

  • extending file inclusion syntax further (to include |lang=) is a bit

frightening; and

  • given recent thumbnail purging troubles on Wikimedia wikis, I'm concerned

about additional code complexity in this area (adding thumbs that are in the
form of "/images/thumb/f/ff/Foobar.svg/langXX-180px-Foobar.svg.png").

This isnt really different. PagedTiffhandler adds stuff to the file link syntax as well as adding stuff before the width specifier in the thumb url

"In theory" at least, this shouldn't be any different to the multipage support and the time parameter for video thumbnails. If there are issues with having differences between what derivative files MediaWiki sees (and can thus purge) and those that are in the caches, then that's not going to be a new problem here...

Should this already work on WMF Wikis? There was an announcement and the patch is merged, also the inclusion sytax is working, but the thumbnailer doesn't seem to be updated yet?

E.g. http://upload.wikimedia.org/wikipedia/commons/thumb/4/45/Gerrit_patchset_25838_test.svg/langde-220px-Gerrit_patchset_25838_test.svg.png is not yet working which should display the German version of https://commons.wikimedia.org/wiki/File:Gerrit_patchset_25838_test.svg.

(In reply to comment #16)

Its pending one of either https://gerrit.wikimedia.org/r/#/c/69027/ or
https://gerrit.wikimedia.org/r/#/c/69019/ being merged

First is merged, second abandoned.

Gilles raised the priority of this task from Medium to Unbreak Now!.Dec 4 2014, 10:24 AM
Gilles moved this task from Untriaged to Done on the Multimedia board.
Gilles lowered the priority of this task from Unbreak Now! to Medium.Dec 4 2014, 11:20 AM