
Author and license extraction fails on the German Wikipedia for local files
Closed, Resolved (Public)

Description

Reported at https://commons.wikimedia.org/w/index.php?title=Commons:Forum&oldid=131486934#Wieso_gibt_es_keinen_lizenzkonformen_Einbettungscode.3F

I have verified this. There is indeed a problem.

https://de.wikipedia.org/wiki/Datei:2014_-_Olympic_Stadium_%28Athens%29.JPG#mediaviewer/Datei:2014_-_Olympic_Stadium_%28Athens%29.JPG

Author and license are not shown, and the embeddable HTML generated by the "Use this" function doesn't include them either. As a result, we propose reuse that does not conform to the license.

It appears that MediaViewer relies on metadata embedded in the license templates (the licensetpl_link etc. spans). This metadata does not exist in the license templates at the German Wikipedia. Their templates do have some metadata, but in the form of RDF inside an HTML comment in a noinclude block.
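To illustrate the difference, here is a minimal sketch of span-based extraction. It is not MediaViewer's or CommonsMetadata's actual code; the class `LicenseSpanParser`, the function `extract_license_fields`, and the sample class name `licensetpl_short` are assumptions made for the example (only `licensetpl_link` is named above). It shows why RDF hidden in an HTML comment yields nothing: a parser that looks for spans never sees comment content.

```python
from html.parser import HTMLParser


class LicenseSpanParser(HTMLParser):
    """Collect the text of <span class="licensetpl_*"> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.fields = {}   # e.g. {"licensetpl_link": "https://..."}
        self._stack = []   # one entry per open <span>: field name or None

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        classes = (dict(attrs).get("class") or "").split()
        name = next((c for c in classes if c.startswith("licensetpl_")), None)
        self._stack.append(name)
        if name is not None:
            self.fields.setdefault(name, "")

    def handle_endtag(self, tag):
        if tag == "span" and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Attribute text to the innermost enclosing metadata span, if any.
        for name in reversed(self._stack):
            if name is not None:
                self.fields[name] += data
                break


def extract_license_fields(html):
    """Return {class name: text} for every licensetpl_* span found in html."""
    parser = LicenseSpanParser()
    parser.feed(html)
    parser.close()
    return {key: value.strip() for key, value in parser.fields.items()}
```

Fed a template rendering that contains `<span class="licensetpl_link">…</span>`, this returns the link; fed a rendering whose only metadata is RDF inside `<!-- … -->`, it returns an empty dict, which matches the symptom reported here.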

It looks to me as though, wherever MediaViewer is deployed, one first needs to make sure that the license templates (and the information template, for author extraction) have been adapted.

Although this could be fixed by the community, I have reported this here because I presume the MediaViewer team knows best what metadata needs to be in which templates.


Version: unspecified
Severity: normal

Details

Reference
bz69496

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:45 AM
bzimport added a project: CommonsMetadata.
bzimport set Reference to bz69496.
bzimport added a subscriber: Unknown Object (MLST).

Just found the related bug 57248 (metadata missing in templates on beta cluster). So that's not a new or unknown issue.

MediaViewer relies on [[commons:COM:MRD]]-style notation to parse template metadata. [[mw:Multimedia/Media_Viewer/Template_compatibility]] has some more information on how to make templates compatible.
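A quick way to see whether a rendered file page carries the markers is to list which expected ids/classes are absent. The sketch below is only an illustration: the marker names in `EXPECTED_MARKERS` are written from memory of COM:MRD and should be checked against that page, and `MarkerCollector` / `missing_markers` are hypothetical helper names.

```python
from html.parser import HTMLParser

# Assumed marker names; COM:MRD is the authoritative list.
EXPECTED_MARKERS = ("fileinfotpl_aut", "licensetpl_short", "licensetpl_link")


class MarkerCollector(HTMLParser):
    """Record every id and class token that appears in the markup."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key in ("id", "class") and value:
                self.seen.update(value.split())


def missing_markers(html, expected=EXPECTED_MARKERS):
    """Return the expected markers that do not occur in html, in order."""
    collector = MarkerCollector()
    collector.feed(html)
    collector.close()
    return [marker for marker in expected if marker not in collector.seen]
```

An empty result means all listed markers are present; anything else names what the local templates would still need to emit.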

(In reply to Lupo from comment #1)

Just found the related bug 57248 (metadata missing in templates on beta
cluster). So that's not a new or unknown issue.

That's an unrelated issue, probably caused by a failure to replicate remote file descriptions on the beta cluster. Also, it seems to have been fixed by now.

*** Bug 69545 has been marked as a duplicate of this bug. ***

Yeah it was one of my suggestions that there should be a planned campaign to stimulate the local communities to fix this before a full launch...

Apparently, it did not come to fruition, and we launched with what users interpret as broken views on top of the data. I don't think that was a good idea.

(In reply to Derk-Jan Hartman from comment #4)

Yeah it was one of my suggestions that there should be a planned campaign to
stimulate the local communities to fix this before a full launch...

Apparently, it did not come to fruition, and we launched with what users
interpret as broken views on top of the data. I don't think that was a good
idea.

That's a pretty major goof.

Lesson to learn: do *not* deploy MediaViewer on a wiki unless the templates there have been updated accordingly!

Frankly, no wonder many people in the de-WP community don't like this feature! (Well, it's probably only a small detail in the whole story, but details matter. And proposing re-use that violates the conditions of the license is really bad.)

In the future, engage the community early on (before deployment!) to get the templates updated. And if they don't co-operate, you'll have no choice but to try to update them yourselves. Which, I fear, may now be the only way forward at de-WP. I imagine you might have a hard time finding admins there willing to do the work after that superprotect nonsense.

On the software side, make MediaViewer not propose any re-use unless the license and author information is available. If MediaViewer doesn't have this information, it must not propose reuse outside of WMF projects. (I.e., no embeddable HTML etc.)
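As a rough sketch of that rule (not MediaViewer code; `FileMetadata`, `can_offer_external_reuse` and `build_embed_html` are made-up names, and the exact set of required fields is an assumption), the embed snippet would simply not be generated when author or license is missing:

```python
from dataclasses import dataclass
from html import escape
from typing import Optional


@dataclass
class FileMetadata:
    """Hypothetical container for what was extracted from the file page."""
    author: Optional[str] = None
    license_name: Optional[str] = None


def can_offer_external_reuse(meta: FileMetadata) -> bool:
    """True only if both author and license are known and non-blank."""
    return bool((meta.author or "").strip()) and bool((meta.license_name or "").strip())


def build_embed_html(meta: FileMetadata, image_url: str, page_url: str) -> Optional[str]:
    """Return embeddable HTML with attribution, or None if reuse must not be offered."""
    if not can_offer_external_reuse(meta):
        return None
    credit = f"{escape(meta.author.strip())}, {escape(meta.license_name.strip())}"
    return (
        f'<a href="{escape(page_url)}"><img src="{escape(image_url)}" alt="{credit}"></a>'
        f"<p>{credit}</p>"
    )
```

The caller would hide the "Use this" embed option whenever `build_embed_html` returns `None`, so a snippet without attribution is never offered.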

For no reuse without info see bug 69557.

(In reply to Erik Moeller from comment #7)

Suggested fix posted here:
https://de.wikipedia.org/w/index.php?title=Wikipedia_Diskussion:Lizenzvorlagen_f%C3%BCr_Bilder&diff=133308165&oldid=132577906

Erik: don't forget the license templates!

https://de.wikipedia.org/w/index.php?title=Wikipedia_Diskussion:Lizenzvorlagen_f%C3%BCr_Bilder&diff=133319428&oldid=133316289

Adapting the information-template is only a tiny part of fixing this.

Yep, I know (thanks for the prod, though); I sent Gergo the link to https://de.wikipedia.org/wiki/Kategorie:Vorlage:Lizenz_f%C3%BCr_Bilder earlier and he was going to take a look.

I'm building a JS gadget that will SHOW people what data is there, and can report errors if stuff is broken:

https://commons.wikimedia.org/wiki/User:TheDJ/datacheck.js

It's an early start, but hopefully it will make the lack of machine-readable data more visible to maintainers of files.

This is an excellent tool, Derk-Jan, thank you :-)

Some of the most common licensing templates on de.wp have been adapted (thank you to User:Umherirrender), so https://de.wikipedia.org/wiki/Olympiastadion_Athen#mediaviewer/Datei:2014_-_Olympic_Stadium_(Athens).JPG renders with full licensing visibility now.

Gilles triaged this task as Medium priority. Nov 24 2014, 2:07 PM
Gilles subscribed.
Tgr claimed this task.

Closing this; the example works now, and in general these issues are being taken care of as part of the metadata cleanup drive.