
New extension: Image metadata parser functions
Open, Low, Public, Feature

Description

Add a {{#EXIF}} parser function to use in File: namespace, which can extract metadata from EXIF tags.

This would allow information from EXIF tags to be added to file descriptions, which would help automate some Commons description templates.

[ Adding esby in CC, who wanted such a means to print EXIF data in image descriptions ]


Version: unspecified
Severity: enhancement
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=43436
https://bugzilla.wikimedia.org/show_bug.cgi?id=52522

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 12:52 AM
bzimport set Reference to bz41498.
bzimport added a subscriber: Unknown Object (MLST).

I think you'd have to specify the internal key of the data you want to extract, so we'll need to come up with some way of exposing it to the user... Maybe in the table on the image page, add another column which is shown to users with a certain preference, or just in the HTML source?

Do we assume the file is the current page? Or does the file name need to be specified in an argument?

Also are we sure this should be in an extension, rather than just another core parser function?

Note: the mediaFunctions extension already supports this. However, that extension is very old, so we probably want something new.

(In reply to comment #0)

Add a {{#EXIF}} parser function to use in File: namespace, which can extract
metadata from EXIF tags.

Let's not call it EXIF. We extract all sorts of metadata; Exif is just one type.

Do we assume the file is the current page? Or does the file name need to be
specified in an argument?

That's kind of a minor detail, but I would say default to current page, and allow override in an argument.
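
A minimal sketch of that argument handling, written in Python purely for illustration (an actual parser function would be PHP inside MediaWiki). The function name `resolve_target_file`, the argument order, and the rule of adding a missing `File:` prefix are all assumptions made for this example, not something taken from a patch.

```python
FILE_PREFIX = "File:"  # assumed namespace prefix for this sketch


def resolve_target_file(current_page, file_arg=None):
    """Pick the file whose metadata should be read.

    An explicit argument wins; otherwise fall back to the current page,
    but only if the current page is itself in the File: namespace.
    Returns None when no file can be determined.
    """
    explicit = (file_arg or "").strip()
    if explicit:
        # Accept both "Foo.jpg" and "File:Foo.jpg" in the argument.
        if explicit.startswith(FILE_PREFIX):
            return explicit
        return FILE_PREFIX + explicit
    if current_page.startswith(FILE_PREFIX):
        return current_page
    return None
```

With this shape, a call with no file argument on a File: page reads that page's own metadata, and the same call on any other page yields nothing rather than guessing.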


The main issue with this is that the way metadata is stored is kind of messy.
*Different file handlers store the metadata in different ways
**Which raises the question of how much knowledge we want to put in the extension about how metadata is stored.

Most bitmap images store metadata in a somewhat similar way, but still use different keys. Some formats use a totally different scheme (Ogg files, for example). Some metadata fields can have multiple values. How should we expose that to the user: do we show all of them, or can they select which one? Certain values can also have language alternatives (certain XMP properties and PNG iTXt fields).
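
To make the multi-value and language-alternative question concrete, here is one possible answer sketched in Python (illustration only, not how any handler actually stores things). It assumes a field value arrives as a plain string, a list of values, or a dict mapping language codes to text with an `x-default` entry as the fallback. The function name `pick_value` and its `index` parameter are hypothetical.

```python
def pick_value(value, user_lang="en", index=None):
    """Reduce one metadata field to a single display string.

    - dict: treated as language alternatives; prefer the user's language,
      then "x-default", then whatever entry comes first.
    - list/tuple: return the item at `index` if given (empty string when
      out of range), otherwise join all items with commas.
    - anything else: return it as a string.
    """
    if isinstance(value, dict):
        for key in (user_lang, "x-default"):
            if key in value:
                return value[key]
        return next(iter(value.values()), "")
    if isinstance(value, (list, tuple)):
        if index is not None:
            return str(value[index]) if 0 <= index < len(value) else ""
        return ", ".join(str(v) for v in value)
    return str(value)
```

Showing everything by default and letting the caller pass an index covers both of the options raised above, at the cost of one extra argument.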

Related URL: https://gerrit.wikimedia.org/r/67047 (Gerrit Change I49b7d8a05173090f8173e19f8c0d9a878fa346f9)

Wouldn't it make more sense to do this with a Lua module instead of adding more parser functions?

(In reply to comment #5)

Wouldn't it make more sense to do this with a Lua module instead of adding
more parser functions?

Perhaps. OTOH, Lua is not guaranteed to be installed on all third-party wikis.

(In reply to comment #6)

(In reply to comment #5)

Wouldn't it make more sense to do this with a Lua module instead of adding
more parser functions?

Perhaps. OTOH, Lua is not guaranteed to be installed on all third-party wikis.

Actually, personally I would like to have both, with more advanced functionality available to Lua (such as getting a table of all the metadata).

(In reply to comment #7)

(In reply to comment #6)

(In reply to comment #5)

Wouldn't it make more sense to do this with a Lua module instead of adding
more parser functions?

Perhaps. OTOH, Lua is not guaranteed to be installed on all third-party wikis.

Actually, personally I would like to have both, with more advanced
functionality available to Lua (such as getting a table of all the metadata).

Ok, I tried my hand at adding this to Lua as well - https://gerrit.wikimedia.org/r/67588

For the Lua part, I just added a method to the title class that can retrieve a table of all the (standard) file metadata. This is the raw, unformatted data.

I imagine that people using Lua for this are more interested in the raw data, and will probably want to format it themselves. If they don't, there's still the parser function (Lua, after all, outputs wikitext).

One use case I imagine for the parser function version is that someone might want to put the description from the metadata on their image's description page. This is a simple use case, and I think it is well suited to a parser function.

One point that I'm worried might be sticky in review is that the parser function, when in formatted output mode, formats according to user language. There are a number of reasons for this:
*Output then looks exactly like the formatted output in the metadata box on the image page, which varies by user language.
*Commons people (who realistically are probably going to be the main people using this) tend to like everything changing with user language.
*Image description pages on Commons effectively already always vary by user language. (Almost every template on Commons has an {{int:...}} in it), so it's not like we're splitting the parser cache; it's already split.
*And the bad reason: the formatMetadata class already assumes using user language via a bunch of global state. (This is really a non-reason. It can be changed if people don't like the user language thing. The main reasons are the first 3.)
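
The "already split" argument can be shown with a deliberately simplified toy model in Python. This is not MediaWiki's actual parser cache; the key layout, `cache_key`, `count_cache_entries`, and the feature names are all assumptions invented for the example.

```python
def cache_key(page, varies_by_lang, user_lang):
    """Toy cache key: the language is part of the key only if the page's
    rendered output depends on it."""
    return (page, user_lang) if varies_by_lang else (page, None)


def count_cache_entries(page, lang_dependent_features, viewer_langs):
    """Count distinct cached renderings of `page` across the given viewers."""
    varies = len(lang_dependent_features) > 0
    return len({cache_key(page, varies, lang) for lang in viewer_langs})


viewers = ["en", "fr", "de", "en"]
# No language-dependent content: one shared rendering.
plain = count_cache_entries("File:Foo.jpg", [], viewers)
# Page already uses {{int:...}}: one rendering per distinct language.
with_int = count_cache_entries("File:Foo.jpg", ["int"], viewers)
# Adding a language-dependent metadata function on top changes nothing.
with_both = count_cache_entries("File:Foo.jpg", ["int", "metadata"], viewers)
```

In this model the cost of language variance is paid once per page, so a second language-dependent feature on a page that already uses {{int:...}} adds no further entries.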

The idea of adding yet another parser function, particularly a complex one, gives me weird feelings.

I haven't looked at any of the code here, to be clear, and I'm sure it's fine. I'm only speaking as to whether the overall idea here is sound. It'd be helpful to have a better idea of what this parser function will be used for. If it's only going to add more magic hackery to Commons, I'd prefer that we ... not. :-)

(In reply to comment #9)

The idea of adding yet another parser function, particularly a complex one,
gives me weird feelings.

I personally like parser functions for simple, quick jobs, reserving Lua for complex templates, but I may just be weird.

Change 67588 had a related patch set uploaded (by Ricordisamoa):
Add a Lua interface for getting file metadata

https://gerrit.wikimedia.org/r/67588

Change 67588 abandoned by Brian Wolff:
Add a Lua interface for getting file metadata

Reason:
This isn't going anywhere

https://gerrit.wikimedia.org/r/67588

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:14 AM
Aklapper removed a subscriber: wikibugs-l-list.