Request for access to media commons items using the API
Closed, Invalid (Public)

Description

Author: burobjorn

Description:
Request:
In order to request photos or other media through the MediaWiki API, we would like to be able to search for media items in, for instance, Wikimedia Commons.
The results would preferably be returned as JSON, with metadata such as the author, dimensions, file size and, of course, the URI of the media file itself.
Preferably, the results could be filtered by metadata and limited in number. Similar APIs already exist for services such as Yahoo's Flickr. Perhaps that API can be used as an example.

Why?
Extending the API in this manner allows other parties to benefit from the media available on Wikimedia Commons. In our case, we would like to offer a WordPress plugin that lets WordPress users search for and download Wikimedia Commons media files for their WordPress installation and insert them in their blog posts, including important metadata such as the author and license.


Version: 1.16.x
Severity: enhancement

Details

Reference
bz25590

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:23 PM
bzimport set Reference to bz25590.

Michael, how does Add Media Wizard handle this?

mdale wrote:

The API does allow searching. The Add Media Wizard uses it: http://en.wikipedia.org/wiki/Wikipedia:Multimedia_beta

Adding support for it in a WordPress plugin is on my to-do list. The AMW works standalone: http://www.kaltura.org/apis/html5lib/kplayer-examples/Add_Media_Wizard.html

There are some things that would be nice to add to the index, such as content type and license, so you could search for just images or just video, etc. But that's probably a separate bug.

Thanks for the update, Michael! I'll look and see what I can do with your current code. I've started a GitHub repo for the plugin, which is called 'photocommons' for now:

http://github.com/hay/photocommons

Author and license will be problematic, since they are not yet standardized. On Commons, however, we are now introducing classes that contain that type of information, and these are used by the Stockphoto gadget.

http://commons.wikimedia.org/wiki/MediaWiki:Stockphoto.js

For now, however, such information will have to continue to be scraped from the HTML descriptions.

AMW does not scrape. It makes best guesses based on the categories that images are in, but that is not a foolproof system. In fact, judging by the feedback Stockphoto has received so far, it might even draw heavy objections from the Commons community. What AMW does, however, is rather simple:

http://commons.wikimedia.org/w/api.php?callback=jsonp1287747638197&action=query&generator=search&gsrsearch=Gondizalves&gsrnamespace=6&gsrwhat=text&gsrlimit=30&gsroffset=0&prop=imageinfo%7Crevisions%7Ccategories&iiprop=url%7Cmime%7Csize%7Cmetadata&iiurlwidth=80&rvprop=content&format=json

This runs a JSONP callback with the results of a search for "Gondizalves" in the File namespace, limited to 30 results. For each result it returns the imageinfo, the revisions (the wikitext of the description page), and the categories the result belongs to.
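
For anyone who wants to build the same request outside the browser, here is a minimal Python sketch. The parameter values are copied from the URL above; the function name `build_search_url` and its arguments are hypothetical, chosen only for illustration. The `callback` parameter is left out by default on the assumption that JSONP is only needed for cross-domain calls from a browser.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in this thread.
API_ENDPOINT = "http://commons.wikimedia.org/w/api.php"


def build_search_url(term, limit=30, offset=0, thumb_width=80, callback=None):
    """Build a Commons file-search URL (hypothetical helper, not part of AMW)."""
    params = {
        "action": "query",
        "generator": "search",     # use search results as the page set
        "gsrsearch": term,
        "gsrnamespace": 6,         # 6 = File namespace
        "gsrwhat": "text",
        "gsrlimit": limit,
        "gsroffset": offset,
        "prop": "imageinfo|revisions|categories",
        "iiprop": "url|mime|size|metadata",
        "iiurlwidth": thumb_width, # ask for thumbnails of this width
        "rvprop": "content",       # wikitext of the description page
        "format": "json",
    }
    if callback is not None:
        # Only needed when the response must be wrapped for JSONP.
        params["callback"] = callback
    # urlencode percent-encodes "|" as %7C, as in the URL above.
    return API_ENDPOINT + "?" + urlencode(params)


url = build_search_url("Gondizalves")
print(url)
```

Fetching the resulting URL (for example with `urllib.request.urlopen`) should return plain JSON rather than a JSONP-wrapped response.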

Of the imageinfo, the description URL, file URL, MIME type, size, dimensions and metadata are provided, as well as links to 80px-wide thumbnails (with dimension info). As you can see, that part is the easy part. Licenses and authors are the hard part :D
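
To show what a plugin would do with the "easy part", here is a rough Python sketch that pulls those fields out of a decoded JSON response. The sample data is hand-written for illustration (the example.org URLs and file name are made up, not real API output), and the key names (`query`, `pages`, `imageinfo`, `thumburl` and so on) reflect my understanding of the API's JSON layout, so check them against a live response. The helper name `summarize_pages` is hypothetical.

```python
# Hand-written stand-in for a decoded API response; NOT real API output.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "12345": {
                "pageid": 12345,
                "ns": 6,
                "title": "File:Example.jpg",
                "imageinfo": [{
                    "size": 204800,
                    "width": 800,
                    "height": 600,
                    "url": "http://example.org/full/Example.jpg",
                    "descriptionurl": "http://example.org/wiki/File:Example.jpg",
                    "thumburl": "http://example.org/thumb/80px-Example.jpg",
                    "thumbwidth": 80,
                    "thumbheight": 60,
                    "mime": "image/jpeg",
                }],
                "categories": [{"ns": 14, "title": "Category:Examples"}],
            }
        }
    }
}


def summarize_pages(response):
    """Collect the imageinfo fields and categories for each result page."""
    pages = response.get("query", {}).get("pages", {})
    summaries = []
    for page in pages.values():
        # imageinfo is a list; take the first (current) entry if present.
        info = (page.get("imageinfo") or [{}])[0]
        summaries.append({
            "title": page.get("title"),
            "file_url": info.get("url"),
            "description_url": info.get("descriptionurl"),
            "mime": info.get("mime"),
            "size_bytes": info.get("size"),
            "width": info.get("width"),
            "height": info.get("height"),
            "thumb_url": info.get("thumburl"),
            "categories": [c.get("title") for c in page.get("categories", [])],
        })
    return summaries


for item in summarize_pages(SAMPLE_RESPONSE):
    print(item["title"], item["mime"], item["width"], item["height"])
```

Note that nothing here yields an author or a license: those would still have to come from the description page wikitext or the categories, which is exactly the hard part.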