Refine behavior of the wbsetitem API module
Closed, Resolved, Public

Description

The wbsetitem module seems to be underspecified and exhibits surprising behavior in some cases:

Labels, descriptions and aliases are given per language. For each language given, the labels, descriptions or list of aliases is set, replacing any previous value. If set to an empty value, the respective entry is removed. Language versions of labels, descriptions and aliases that are not present in the posted data structure remain untouched.

That is, it is not possible to completely replace the data of an item without specifying all values for all possible languages. It would be nice to have a flag for "strip other languages" or some such. Maybe that should even be the default.

For sitelinks, the situation is similar but not quite the same: sitelinks are set for the sites given in the posted data set, while links to other sites remain unchanged. Again, it would be nice to have a way to strip links to all sites not specified in the present call to wbsetitem.

But there's one major inconsistency: from looking at the code, it seems that setting the sitelink for a specific site to an empty string does not remove that sitelink, but causes an error. It would be nice if this changed, so that the behavior for sitelinks is consistent with how setting labels/descriptions/aliases works.

For aliases, it might also be nice to have a mode where the new aliases are just added to the existing list - but then, maybe not.


Version: master
Severity: major
Whiteboard: storypoints: 1

Details

Reference
bz38843

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:12 AM
bzimport set Reference to bz38843.
bzimport added a subscriber: Unknown Object (MLST).

In my opinion, whatever is given with wbsetitem completely replaces whatever was there before. Very simple and clear semantics. No other modes. This seems to need a bit of discussion, though.

It seems there are two options:

  1. the &data= argument overrides the whole document; what is passed to it becomes the new contents.
  2. the &data= argument extends the document; already present fields are not removed unless their content is set to an empty string. [Regardless, this does not work for sitelinks while it works for the others, and needs to be fixed].
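
The difference between the two options can be sketched as follows. This is only an illustration under simplifying assumptions: the document is modeled as a flat dictionary of language code to label, and the function names `override_document` and `extend_document` are hypothetical, not part of the module.

```python
def override_document(current, data):
    """Option 1: whatever is posted becomes the new content."""
    return dict(data)


def extend_document(current, data):
    """Option 2: posted fields are merged in; an empty string removes a field."""
    result = dict(current)
    for key, value in data.items():
        if value == "":
            result.pop(key, None)    # empty string removes the field
        else:
            result[key] = value      # non-empty value replaces or adds it
    return result                    # fields not posted stay untouched


current = {"en": "Berlin", "de": "Berlin", "fr": "Berlin"}

# Option 1 drops every language that is not posted again.
overridden = override_document(current, {"en": "Berlin, Germany"})

# Option 2 keeps "de", replaces "en" and removes "fr".
extended = extend_document(current, {"en": "Berlin, Germany", "fr": ""})
```

With option 1 a client has to resend everything it wants to keep; with option 2 it only sends what changes.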

I'd prefer the first option, but it's not my choice :). The second option could be useful for the last thing Daniel pointed out, or just to batch add something without fetching it before. Maybe both? wbsetitem and wbextenditem?

The wbsetitem module was initially a module for development support only, but has since become the module preferred by some of the bot programmers. As such, there is one mode of operation that is important: set the fields of the item that are specified and leave the rest of the fields unchanged.

The primary use case is to add or remove a number of sitelinks and leave the rest as they are. Replacing the whole list with a new one would make an additional query necessary and would also introduce a race condition.
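
The race condition mentioned above is the usual lost update of a read-modify-write cycle. A minimal sketch, assuming two bots that interleave their requests and an in-memory dictionary standing in for the stored sitelinks (all names here are made up for illustration):

```python
# Full replacement: each bot must first fetch the whole sitelink list.
store = {"enwiki": "Berlin", "dewiki": "Berlin"}

snapshot_a = dict(store)          # bot A reads
snapshot_b = dict(store)          # bot B reads before A has written

snapshot_a["frwiki"] = "Berlin"
store = dict(snapshot_a)          # bot A writes its complete list

snapshot_b["nlwiki"] = "Berlijn"
store = dict(snapshot_b)          # bot B overwrites A's write

lost_update = "frwiki" not in store   # A's sitelink is gone

# Partial update: each bot only posts the sitelinks it wants to change.
store_partial = {"enwiki": "Berlin", "dewiki": "Berlin"}
store_partial.update({"frwiki": "Berlin"})    # bot A
store_partial.update({"nlwiki": "Berlijn"})   # bot B
```

In the partial-update variant no prior query is needed, and both additions survive regardless of the order in which they arrive.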

To keep the operations as similar as possible, a set operation was used for aliases, which was also the simplest to implement; aliases should, however, operate on an add/remove basis rather than replacing the whole list. Labels and descriptions should replace what is presently set, including removing the value. Sitelinks should be set and removed.

A solution might be to give the new values as "operational objects" that carry an operation and additional parameters. This makes it possible to specify whether a value should be replaced, set, added or removed, or something else entirely.
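
One way to picture such "operational objects" is sketched below. Nothing here reflects an actual wire format: the `Operation` class, its field names and the operation names "set", "add" and "remove" are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Operation:
    op: str           # hypothetical operation name: "set", "add" or "remove"
    section: str      # e.g. "labels", "aliases", "sitelinks"
    key: str          # language code or site id
    value: Any = None


def apply_operations(item, operations):
    """Return a copy of item with the operations applied in order."""
    result = {section: dict(entries) for section, entries in item.items()}
    for o in operations:
        entries = result.setdefault(o.section, {})
        if o.op == "set":
            entries[o.key] = o.value               # replace the whole value
        elif o.op == "add":
            values = list(entries.get(o.key, []))  # list-valued entry
            if o.value not in values:
                values.append(o.value)
            entries[o.key] = values
        elif o.op == "remove":
            if o.value is None:
                entries.pop(o.key, None)           # drop the whole entry
            else:
                remaining = [v for v in entries.get(o.key, []) if v != o.value]
                if remaining:
                    entries[o.key] = remaining
                else:
                    entries.pop(o.key, None)
        else:
            raise ValueError(f"unknown operation: {o.op!r}")
    return result


item = {
    "aliases": {"en": ["Berlin, Germany"]},
    "sitelinks": {"enwiki": "Berlin", "dewiki": "Berlin"},
}
updated = apply_operations(item, [
    Operation("add", "aliases", "en", "Capital of Germany"),
    Operation("remove", "sitelinks", "dewiki"),
    Operation("set", "labels", "en", "Berlin"),
])
```

Carrying the operation explicitly removes the need to overload the empty string as a "remove" marker, at the cost of a more verbose request.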

I wonder if there are some kinds of more generic modules that are so alien to the core project scope that they really should be created and maintained outside the project, perhaps as separate extensions. For example, get-modules more in line with the ordinary query-modules, and probably also set-modules with more iterator-like behavior.

We now have bug 39142 for the "set item completely" behavior. So the standard behavior described here should be the updating behavior. It does not need to cover every possible update, as wbpatchitem will do that; see bug 39149.

So the behavior should be: what is given overwrites what is there. An empty string as a value deletes the given attribute.
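
Applied uniformly to every part of an item, including sitelinks, that rule might look like the sketch below. The nested dictionary layout and the name `update_item` are assumptions for illustration, not the real data structure; treating an empty list as "empty" for aliases is likewise an assumption.

```python
def update_item(item, data):
    """Overwrite what is given; an empty value deletes the attribute."""
    result = {section: dict(entries) for section, entries in item.items()}
    for section, entries in data.items():
        target = result.setdefault(section, {})
        for key, value in entries.items():
            if value == "" or value == []:   # empty list: assumed for aliases
                target.pop(key, None)
            else:
                target[key] = value
    return result


item = {
    "labels": {"en": "Berlin", "de": "Berlin"},
    "sitelinks": {"enwiki": "Berlin", "dewiki": "Berlin"},
}
data = {
    "labels": {"en": "Berlin (city)"},   # overwrite one label
    "sitelinks": {"dewiki": ""},         # empty string removes the sitelink
}
updated = update_item(item, data)
```

Everything not mentioned in `data` (here the German label and the enwiki sitelink) is left as it was.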

Verified in Wikidata demo time for sprint 13