
[Task] Ensure normalized ordering of claims and snaks in the API
Closed, Invalid · Public

Description

Claims of the same rank should be grouped together, and these groups should be ordered by rank (preferred first, then normal, then deprecated).

See also: T74297: [RFC] Have API return and accept lists instead of maps to maintain order

Details

Reference
bz57666

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 2:25 AM
bzimport set Reference to bz57666.
bzimport added a subscriber: Unknown Object (MLST).

Summary of feature discussion with Lydia, Henning, Tobi and me:

  • The order of claims (and qualifiers, and snaks-in-references) is not necessarily the one stored in the database, because:
    • The Serializer/UI groups claims (and qualifiers, and snaks-in-references) by property id. The order of the property ids is user-defined, though, and reflects the order in the database.
    • The Serializer/UI also groups statements by rank (after grouping by property). The order within a rank is, again, user-defined, and reflects the order in the database.
  • When changing the order of claims in the UI, the wbsetclaim module is used to change the claim's "index". That index refers to a position in the *sorted* (normalized) list of claims, not to the order of claims as stored in the database.
  • Thus, wbsetclaim needs to restore the normalized order, apply the change, and then save the entire list using the new order. This may cause "magic" changes that the user did not knowingly initiate. We agreed that this is regrettable but unavoidable (we could shift around when this happens, but it will have to happen sometimes).
  • To avoid inconsistencies, the order of claims (and qualifiers, and snaks-in-references) should always be normalized when generating public output (that is, it should be done by the Serializer hierarchy).
  • This means that a client that loads an entity, modifies it, and writes it back as a whole, might inadvertently change the order of things in the entity. This, we agreed, was regrettable but acceptable.
  • It would be useful to also enforce the canonical ordering when *un*serializing, so no "bad" ordering can be introduced via the API. This avoids subsequent edits "magically" restoring the "right" order. This would be done in the Unserializer hierarchy.
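To illustrate the "restore the order, apply the change, save" flow described above, here is a minimal Python sketch. It is not the actual wbsetclaim code: the dictionary layout, the function names `normalize` and `set_claim_index`, and the exact meaning of the index are assumptions made for the example. Normalization is done here with a stable sort rather than the bubble-based procedure, which should give the same grouping.

```python
# Hypothetical rank order: lower number sorts first.
RANK_ORDER = {"preferred": 0, "normal": 1, "deprecated": 2}


def normalize(claims):
    """Group claims by property (in first-seen order), then by rank.

    sorted() is stable, so the stored order is kept inside each
    property/rank group.
    """
    first_seen = {}
    for claim in claims:
        first_seen.setdefault(claim["property"], len(first_seen))
    return sorted(
        claims,
        key=lambda c: (first_seen[c["property"]], RANK_ORDER[c["rank"]]),
    )


def set_claim_index(stored, claim_id, index):
    """Move one claim to `index` in the *normalized* list (assumed semantics).

    Returns the full list that would then be saved.
    """
    ordered = normalize(stored)
    position = next(i for i, c in enumerate(ordered) if c["id"] == claim_id)
    claim = ordered.pop(position)
    ordered.insert(index, claim)
    return ordered


stored = [
    {"id": "q1", "property": "P1", "rank": "normal"},
    {"id": "q2", "property": "P2", "rank": "normal"},
    {"id": "q3", "property": "P1", "rank": "normal"},
]
saved = set_claim_index(stored, "q2", 0)
print([c["id"] for c in saved])
```

In this example the user only moved q2, yet q3 also ends up in a different place than in the stored list, because the list was normalized before the move was applied. That is the kind of "magic" change mentioned above.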

For reference, here is the algorithm for normalizing the order of claims. It is based on bubble sort, since that is simple, stable, works well on almost-sorted input, and can easily be adapted to partial sorting:

First pass, grouping by property id (done for claims, qualifiers, and snaks in references):

  • let seenProperties be an empty set of property IDs
  • for each claim, check whether we have "seen" the property ID of its main snak.
    • if we have not yet seen it, mark it as seen
    • if we have already seen it, let the claim "bubble up" until it is located just after another claim that uses that same property ID.

Second pass, grouping by rank (done only for Statements):

  • for each claim, let it bubble up until
    • either the claim before it uses a different property ID
    • or the claim before it has the same or a higher rank

Note how the original, user-defined order of properties, as well as the original ordering inside property/rank groups, is maintained.
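The two passes can be sketched in Python as follows. This is only an illustration under assumptions, not the Wikibase implementation: the `Claim` class, the `RANK_WEIGHT` table, and the function names are made up for the example. The second pass stops bubbling when the preceding claim has an equal or higher rank, since stopping only on a strictly higher rank would reorder claims of the same rank.

```python
from dataclasses import dataclass

# Hypothetical weights: a larger number means a "higher" rank.
RANK_WEIGHT = {"preferred": 2, "normal": 1, "deprecated": 0}


@dataclass(frozen=True)
class Claim:
    property_id: str  # property ID of the main snak
    rank: str
    value: str


def group_by_property(claims):
    """First pass: bubble each claim up to the end of its property group."""
    result = list(claims)
    seen_properties = set()
    for i in range(len(result)):
        pid = result[i].property_id
        if pid not in seen_properties:
            seen_properties.add(pid)
            continue
        # The property was seen before, so an earlier claim with the same
        # property exists and the loop stops before reaching index 0.
        j = i
        while j > 0 and result[j - 1].property_id != pid:
            result[j - 1], result[j] = result[j], result[j - 1]
            j -= 1
    return result


def group_by_rank(claims):
    """Second pass: within a property group, bubble higher ranks upward."""
    result = list(claims)
    for i in range(len(result)):
        j = i
        while j > 0:
            before, current = result[j - 1], result[j]
            if before.property_id != current.property_id:
                break  # reached the start of this property group
            if RANK_WEIGHT[before.rank] >= RANK_WEIGHT[current.rank]:
                break  # same or higher rank before us: keep stored order
            result[j - 1], result[j] = current, before
            j -= 1
    return result


def normalize_claims(claims):
    """Apply both passes; returns a new list, input is left untouched."""
    return group_by_rank(group_by_property(claims))


example = [
    Claim("P31", "normal", "a"),
    Claim("P21", "deprecated", "b"),
    Claim("P31", "preferred", "c"),
    Claim("P21", "normal", "d"),
    Claim("P31", "normal", "e"),
]
print([c.value for c in normalize_claims(example)])
# → ['c', 'a', 'e', 'd', 'b']
```

P31 stays ahead of P21 because it appears first in the input, and "a" stays ahead of "e" because both are normal-rank P31 claims. Running `normalize_claims` on already normalized input leaves it unchanged.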

*** Bug 68729 has been marked as a duplicate of this bug. ***
Lydia_Pintscher removed a subscriber: Unknown Object (MLST).
Lydia_Pintscher removed a subscriber: Unknown Object (MLST). Dec 1 2014, 2:31 PM
thiemowmde renamed this task from Ensure normalized ordering of claims and snaks in the API to [Task] Ensure normalized ordering of claims and snaks in the API. Aug 13 2015, 4:38 PM
thiemowmde lowered the priority of this task from High to Medium.
thiemowmde updated the task description.
thiemowmde set Security to None.
thiemowmde removed a subscriber: Wikidata-bugs.