ItemPage.get() chokes on labels with a 'removed' value
Closed, Resolved, Public

Description

Example item: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q3595042&format=jsonfm

"labels": {
    "tt": {
        "language": "tt",
        "removed": ""
    },

I'm not sure what 'removed' means, but at the very least the bot shouldn't die on it.


Version: unspecified
Severity: normal
URL: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q3595042&format=jsonfm

Details

Reference
bz54767

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:26 AM
bzimport set Reference to bz54767.
bzimport added a subscriber: Unknown Object (????).

Change 86442 had a related patch set uploaded by Legoktm:
Don't choke on 'removed' labels.

https://gerrit.wikimedia.org/r/86442

Change 86442 merged by jenkins-bot:
Don't choke on 'removed' labels.

https://gerrit.wikimedia.org/r/86442

Ricordisamoa set Security to None.
Ricordisamoa removed subscribers: MediaWiki-extensions-WikibaseRepository, Unknown Object (MLST).

What does "removed" mean?

Probably need to create a #documentation task for that query :P

What does "removed" mean?

From the commit message:

A label will have 'removed' if the serializer received bad input like an empty string

@daniel can you confirm? Aren't such labels supposed to be removed from the serialization altogether?

"removed" should only occur as a marker in an API response to a request that removed a label. A Deserializer can/should ignore such labels, unless it is particularly interested in labels that got removed by an API request.

The Serializer should not mark empty labels (encountered in faulty data retrieved from the database) as "removed", but drop them from the output completely. The "removed" marker for removed labels should be injected by the ResultBuilder, not the Serializer, since it's not part of the data model as such.

CCing Addshore, since he is currently working on the serializers for API output.
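
The client-side advice above (ignore labels that carry a "removed" marker) can be sketched roughly as follows. This is an illustration only, not Pywikibot's actual code: the function name `extract_labels` and the "en" sample entry are hypothetical, and it assumes that a marked entry has a "removed" key and no "value" key, as in the Q3595042 output quoted in the description.

```python
def extract_labels(entity):
    """Return {language: label text}, skipping entries marked as removed."""
    labels = {}
    for lang, entry in entity.get("labels", {}).items():
        if "removed" in entry:
            # Marker entry: it has no "value" key, so reading
            # entry["value"] would raise KeyError. Skip it instead.
            continue
        labels[lang] = entry["value"]
    return labels


# Sample shaped like the wbgetentities output in the description.
# The "en" entry is made up for illustration.
entity = {
    "labels": {
        "tt": {"language": "tt", "removed": ""},
        "en": {"language": "en", "value": "Example label"},
    }
}

print(extract_labels(entity))
```

Checking for the marker before reading "value" keeps the parser from failing on such entries, while still surfacing a KeyError for entries that are malformed in some other way.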

"removed" should only occur as a marker in an API response to a request that removed a label.

The Serializer should not mark empty labels (encountered in faulty data retrieved from the database) as "removed", but drop them from the output completely.

Instead, as you can see, "removed" comes from wbgetentities:
https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q3595042

@Ricordisamoa yes, it's quite possible that the current serializer code does handle empty labels by marking them as "removed". I'm just saying that it shouldn't do that. But client code should still be able to deal with this kind of output (by ignoring such labels).

Is 'removed' in wbgetentities JSON only ever possible for labels, or could it also appear in aliases or descriptions, or (heaven forbid please) also in claims?

@jayvdb "removed" markers are possible for pretty much everything, including labels, descriptions, aliases, sitelinks, and I think even statements.
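
Since the marker may show up in more than just labels, a client could strip it generically before parsing. A rough sketch, assuming the marker is always a "removed" key on the entry's own dict (confirmed above only for labels; the alias and sitelink shapes in the sample are guesses, and `strip_removed` is a hypothetical helper):

```python
def strip_removed(node):
    """Recursively drop any dict entry or list item that has a "removed" key."""
    def is_marker(value):
        return isinstance(value, dict) and "removed" in value

    if isinstance(node, dict):
        return {key: strip_removed(value)
                for key, value in node.items()
                if not is_marker(value)}
    if isinstance(node, list):
        return [strip_removed(item) for item in node if not is_marker(item)]
    return node


# Hypothetical entity data; only the "tt" label mirrors the real example.
entity = {
    "labels": {
        "tt": {"language": "tt", "removed": ""},
        "en": {"language": "en", "value": "Example"},
    },
    "aliases": {
        "en": [
            {"language": "en", "value": "Sample"},
            {"language": "en", "removed": ""},
        ],
    },
    "sitelinks": {
        "ttwiki": {"site": "ttwiki", "removed": ""},
    },
}

cleaned = strip_removed(entity)
print(cleaned)
```

The trade-off is that this would also drop any legitimate structure that happens to contain a key named "removed", so a per-section filter may be safer if that is a concern.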

Nice. Can they be killed by the Wikibase API before returning them? Please and thanks.