
[Epic] Provide a plain linked data interface for accessing entities
Open, Medium, Public

Description

In the spirit of the linked data web, entity URIs should resolve to a machine readable representation of the respective entity. While the API and Special:Export already provide access to this data, their output contains wrapper structures and meta-information that does not belong to the entity denoted by the URI.

To provide a machine readable representation of the entity, a special page should be created that generates the canonical JSON representation for a given entity. URL rewriting can then be used to make this special page accessible via a canonical URI/URL schema, as described by https://meta.wikimedia.org/wiki/Wikidata/Notes/URI_scheme#Machine-readable_access. The representation returned by the data endpoint should be consistent with the representation used in the web API (not necessarily with the internal representation, which is exposed by Special:Export and does not even need to be JSON).
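
As a rough illustration of the mapping from an entity ID to a data URL, here is a minimal Python sketch. The base URL and the ".ttl" suffix form are taken from the Special:EntityData example that appears later in this task; the function name entity_data_url and its parameters are hypothetical, and the sketch does not attempt to reproduce the rewriting rules of the linked URI scheme page.

```python
from urllib.parse import quote

# Assumption: base URL copied from the Special:EntityData example in this task.
ENTITY_DATA_BASE = "https://www.wikidata.org/wiki/Special:EntityData"


def entity_data_url(entity_id, fmt=None):
    """Build a Special:EntityData URL for an entity ID and optional format suffix.

    Hypothetical helper for illustration only; it is not part of Wikibase.
    """
    # Percent-encode both parts so unexpected characters cannot alter the path.
    path = quote(entity_id, safe="")
    if fmt:
        path += "." + quote(fmt, safe="")
    return f"{ENTITY_DATA_BASE}/{path}"


if __name__ == "__main__":
    print(entity_data_url("Q142", "ttl"))
```

Leaving out the format suffix yields a format-neutral URL, which is where a server could apply content negotiation or a default format.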

Care should be taken to allow caching of this data on the HTTP level, making use of the appropriate Cache-Control, ETag and If-Modified-Since headers.
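
The caching behaviour described above could look roughly like the following Python sketch. It is not the Wikibase implementation: the function conditional_response, the max_age default, and the "rev-1" tag are assumptions made for illustration. Per the HTTP specification, an ETag validator is checked against the request's If-None-Match header, while If-Modified-Since is compared with the entity's last modification time. The ETag comparison here is a plain string match and ignores weak validators.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_response(request_headers, etag, last_modified, max_age=3600):
    """Return (status, headers) for a cacheable entity-data response.

    Hypothetical helper: `last_modified` must be a timezone-aware UTC datetime.
    """
    headers = {
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
        "Last-Modified": format_datetime(last_modified, usegmt=True),
    }

    # If-None-Match takes precedence over If-Modified-Since when both are sent.
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        tags = [tag.strip() for tag in if_none_match.split(",")]
        status = 304 if ("*" in tags or etag in tags) else 200
        return status, headers

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200, headers  # Unparseable date: ignore the header.
        if since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= since:
            return 304, headers

    return 200, headers


if __name__ == "__main__":
    modified = datetime(2015, 8, 13, 20, 12, tzinfo=timezone.utc)
    print(conditional_response({"If-None-Match": '"rev-1"'}, '"rev-1"', modified))
```

A 304 result means the client's cached copy is still valid and no body needs to be sent.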

Event Timeline

Lydia_Pintscher renamed this task from Provide a plain linked data interface for accessing entities (story) to [Epic] Provide a plain linked data interface for accessing entities. Aug 13 2015, 8:12 PM
Lydia_Pintscher edited projects, added Epic; removed Tracking-Neverending.
Lydia_Pintscher set Security to None.
Lydia_Pintscher removed a subscriber: Wikidata-bugs.

[adding the Tracking-Neverending project to tasks blocking (now deprecated) T4007 as part of T93366]

I'm investigating making https://www.mediawiki.org/wiki/Extension:FileAnnotations implement the W3C standard for annotations ( https://www.w3.org/TR/annotation-model/ ) and REST API ( https://www.w3.org/TR/annotation-protocol/ ). These are based on the LinkedData standard, so it seems the functionality in this bug would be a prerequisite.

At the moment, I'm stuck on supplying a proper LD context for Wikidata properties. As I understand it, https://www.wikidata.org/wiki/Special:EntityData/Q142.ttl uses wdt:P18 and wdt is http://www.wikidata.org/prop/direct/ so in theory an application/ld+json request to http://www.wikidata.org/prop/direct/ ought to return a proper context which would let me use "@type": "P18" in the LD json of my annotation.

Does that make sense? I'm still very new to this linked data stuff.

See also T164655: Store and serve annotations in W3C standard format where I try to flesh out the JSON-LD for annotations.
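
To make the prefix question above concrete, here is a small Python sketch of how a JSON-LD context maps short names to full IRIs. The wdt namespace is the one quoted above; the function expand_term and the use of "@vocab" to resolve a bare "P18" are illustrative assumptions, and the logic is a heavily simplified stand-in for the JSON-LD expansion algorithm, not a substitute for a real processor.

```python
# Namespace quoted in the comment above.
WDT = "http://www.wikidata.org/prop/direct/"

# Assumed context: "wdt" as a prefix, plus "@vocab" so bare terms also resolve.
context = {"wdt": WDT, "@vocab": WDT}


def expand_term(term, ctx):
    """Expand a compact IRI ("wdt:P18") or a bare term ("P18") to a full IRI.

    Hypothetical, simplified helper; real JSON-LD expansion handles many more cases.
    """
    prefix, sep, suffix = term.partition(":")
    if sep:
        # "http://..." style values are already absolute IRIs.
        if suffix.startswith("//") or prefix not in ctx:
            return term
        return ctx[prefix] + suffix
    # No colon: fall back to the default vocabulary, if one is declared.
    vocab = ctx.get("@vocab")
    return vocab + term if vocab else term


if __name__ == "__main__":
    print(expand_term("wdt:P18", context))
    print(expand_term("P18", context))
```

Both calls print the same IRI, which is the effect a published context for the wdt namespace would have for a consuming application.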

Unfortunately neither Daniel nor I quite understand how those two things go together. I suggest we talk about it at the hackathon if that works for you.

As Lydia says, it's a bit unclear to us what FileAnnotations has to do with Wikidata RDF. But here are a few bits of information that could be relevant:

It appears that we could support the JSON-LD format without much trouble, which is the basis of the W3C Linked Data platform. In fact, there already exist JSON-LD processors for PHP, like https://github.com/digitalbazaar/php-json-ld and https://github.com/lanthaler/JsonLD -- the latter of which is available via Composer and already offers a conversion from N-Quads to JSON-LD.

It is possible our "native JSON representation" could be transformed into compliant JSON-LD by the simple addition of a @context property and an appropriate MIME type. I'd have to look into that.

This project relates to https://commons.wikimedia.org/wiki/Commons:Structured_data insofar as we eventually want to be able to annotate images with Wikidata relations. See https://www.mediawiki.org/wiki/Extension:FileAnnotations/Design#Research_and_User_testing and T146397: Edit interface should be nicer for mockups. These would be of the form "M18 is-image-of Q42" if I understand the current Commons structured data proposal correctly.

@cscott making our JSON LD-compatible by just adding @context sounds tempting. But if I understand correctly, this would mean we now have two incompatible RDF representations of our data. That sounds like asking for trouble.

@daniel I don't think they are incompatible. They are both W3C standards; I believe JSON-LD is just an alternate representation of the RDF data. In fact, it ought to be "more compatible" than publishing an alternate JSON representation which is *not* grounded semantically. But I'm quickly leaving my depth here: if you think it would be helpful I can connect you to the editor of the JSON-LD spec (who I met at the I Annotate conference at the beginning of the month) who'd be better placed to answer.

@cscott you said "I believe JSON-LD is just an alternate representation of the RDF data" - that's exactly the problem. If we base the JSON-LD on our native JSON, the resulting RDF would not be compatible with our manually defined RDF mapping. It would have a different semantic grounding. We would end up with two incompatible RDF models.

To avoid this, our JSON-LD would have to be based on our RDF binding, not our native JSON. So it would look very different from our native JSON (and would probably be much more complex).

Change 371446 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[purtle@master] Add JSON-LD support.

https://gerrit.wikimedia.org/r/371446

Change 379509 had a related patch set uploaded (by Thiemo Mättig (WMDE); owner: Thiemo Mättig (WMDE)):
[purtle@master] Minor code structure clean ups in JsonLdRdfWriter

https://gerrit.wikimedia.org/r/379509

Change 371446 merged by jenkins-bot:
[purtle@master] Add JSON-LD support.

https://gerrit.wikimedia.org/r/371446

Change 379509 merged by jenkins-bot:
[purtle@master] Minor code structure clean ups in JsonLdRdfWriter

https://gerrit.wikimedia.org/r/379509

Change 379669 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[purtle@master] Tweak JSON-LD support to generate more idiomatic JSON-LD

https://gerrit.wikimedia.org/r/379669

The patch has been merged, but https://www.wikidata.org/wiki/Special:EntityData/Q142.jsonld doesn't work yet. I assume a new version of purtle needs to be released, then wikidata's composer.json patched to require it? (https://gerrit.wikimedia.org/r/379669 is still waiting for review as well.)

@cscott Yes, this needs a release, and a version bump in composer.json. Also, jsonld needs to be added to the default of the entityDataFormats setting in Wikibase.default.php (and possibly to production config). JSON-LD also needs to be added to RdfWriterFactory. The code should be self-explanatory.

I can do the release, but you'll probably want to wait until the "tweak" patch is in. But I can't review it, I have no clue if it's doing the right thing. A +1 from someone who knows more about JSON-LD would help.

@daniel RdfWriterFactory comes from purtle, and has already been patched to add .jsonld. The patch to Wikibase should look something like https://gerrit.wikimedia.org/r/384050 but I can't test this locally.

Change 379669 merged by jenkins-bot:
[purtle@master] Tweak JSON-LD support to generate more idiomatic JSON-LD

https://gerrit.wikimedia.org/r/379669

I removed this ticket from our sprint boards for the moment. The only remaining patch https://gerrit.wikimedia.org/r/384050 is work in progress, but it is not obvious to me what exactly is needed to get it out of that state. I suggest focusing on the not yet closed sub-tickets instead, and possibly creating more sub-tickets if needed.

I also see https://gerrit.wikimedia.org/r/390321 waiting for an agreement. I'm willing to continue working on this patch, but it should be linked to a sub-ticket so we can properly track it.

Change 384050 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/extensions/Wikibase@master] Enable JSON-LD entity export

https://gerrit.wikimedia.org/r/384050

Pinging @Lydia_Pintscher and @Lea_Lacroix_WMDE about whether we want to merge JSON-LD support with https://gerrit.wikimedia.org/r/384050 or defer it to 1.33. More details in the gerrit link.

Change 384050 merged by jenkins-bot:
[mediawiki/extensions/Wikibase@master] Enable JSON-LD entity export

https://gerrit.wikimedia.org/r/384050

This task has been assigned to the same task owner for more than two years. Resetting task assignee due to inactivity, to decrease task cookie-licking and to get a slightly more realistic overview of plans. Please feel free to assign this task to yourself again if you still realistically work or plan to work on this task - it would be welcome!

For tips on how to manage individual work in Phabricator (noisy notifications, lists of tasks, etc.), see https://phabricator.wikimedia.org/T228575#6237124 for available options.
(For the record, two emails were sent to assignee addresses before resetting assignees. See T228575 for more info and for potential feedback. Thanks!)