
Ability to check for existence and uniqueness of a property
Closed, Resolved (Public)

Description

This is related to the branching forms report:

https://bugzilla.wikimedia.org/show_bug.cgi?id=30536

A way to check a property for existence and uniqueness is needed when one or more properties of a page are inherently unique to that page; checking them for uniqueness can prevent duplicate entries in a wiki.

For example, if a wiki is about things identified by unique numbers (cars, firearms, aerospace parts, social security numbers, national ID numbers, etc.), the form for entering them should check whether the value in the serial number field is unique on the wiki. If that value already exists, the form should take the user to the form for editing the existing entry.

Right now, Semantic Forms can only do this with article titles, not with properties. Article titles can vary and be unpredictable, but the serial number is always the same, which makes it ideal for uniqueness checking.
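
To make the requested behaviour concrete, here is a minimal Python sketch of the routing decision, not Semantic Forms code. The function name `route_submission`, the in-memory `serial_index` mapping, and the whitespace/case normalization are all assumptions made for illustration; a real implementation would query the wiki's stored property values instead.

```python
from typing import Dict, Optional, Tuple


def normalize(serial_number: str) -> str:
    """Hypothetical normalization: trim whitespace and ignore case."""
    return serial_number.strip().upper()


def route_submission(
    serial_index: Dict[str, str], serial_number: str
) -> Tuple[str, Optional[str]]:
    """Decide whether the form should create a new page or edit an existing one.

    serial_index maps normalized serial numbers to the title of the page
    that already holds them (an assumed stand-in for a property query).
    """
    existing_title = serial_index.get(normalize(serial_number))
    if existing_title is not None:
        # The property value already exists: send the user to that page's edit form.
        return ("edit", existing_title)
    # The property value is unique so far: continue with page creation.
    return ("create", None)
```

The lookup is keyed on the property value rather than on the page title, which is the distinction this request is about.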

Another more detailed example would be a university using a wiki to keep records for the employees and students. Many students have the same name, but they all have unique student ID numbers. All employees have a unique social security number. Most students have unique social security numbers too, but some do not.

Everyone on campus will have a social security number, a student ID number, or both, but it is not known in advance which. It is therefore necessary to create a special record ID rather than use a social security number or student ID number as the wiki page title (page titles are also harder to use in queries).

Each person on the campus could be automatically assigned a unique record ID to be used as the article title for their wiki page, including non-students who do not have a student ID.

Being able to check for uniqueness and existence of social security numbers and student ID numbers would help ensure that no person gets entered into the system twice, or enrolled under two different student ID numbers. It would also help keep records accurate for students who change their names, social security numbers, or student IDs, because it is unlikely that all three of those pieces of information would change at the same time.
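
The university example can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Person` record, the `register` and `find_existing` helpers, and the sequential record ID scheme are hypothetical and do not correspond to any existing Semantic Forms or SMW feature.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Person:
    record_id: int                      # used as the wiki page title
    name: str
    ssn: Optional[str] = None           # social security number, if any
    student_id: Optional[str] = None    # student ID number, if any


def find_existing(
    people: List[Person], ssn: Optional[str], student_id: Optional[str]
) -> Optional[Person]:
    """Return the first record that shares either identifier, if one exists."""
    for person in people:
        if ssn is not None and person.ssn == ssn:
            return person
        if student_id is not None and person.student_id == student_id:
            return person
    return None


def register(
    people: List[Person],
    name: str,
    ssn: Optional[str] = None,
    student_id: Optional[str] = None,
) -> Person:
    """Add a person unless one of their identifiers is already on record."""
    if ssn is None and student_id is None:
        raise ValueError("at least one identifier is required")
    existing = find_existing(people, ssn, student_id)
    if existing is not None:
        return existing  # duplicate: reuse the existing record
    # Assumed scheme: sequential record IDs, with no deletions.
    person = Person(len(people) + 1, name, ssn, student_id)
    people.append(person)
    return person
```

Because a match on either identifier counts, a student who changes their name is still recognized as long as one of the two numbers stays the same.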


Version: unspecified
Severity: enhancement

Details

Reference
bz30537

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 21 2014, 11:52 PM
bzimport set Reference to bz30537.

I have discovered a workaround for this bug, but the workaround also requires another workaround for the "Page creation date" portion of this bug:

https://bugzilla.wikimedia.org/show_bug.cgi?id=30610

By manually producing a creation date property that is close to the actual creation date, it becomes possible to use it in a duplicate checking scheme to determine whether a discovered duplicate is actually the original. This is important, because otherwise the duplicate checking scheme will mark the original as a duplicate too.

Basically, it checks whether the page's creation date is the earliest (the smallest Unix time) of all the duplicates found. If yes, the unique property (serial number, in this case) is stored as-is. If not, the error "DUPLICATE!" is stored instead. This method works better than previous methods that relied on caching to avoid marking the original entry as a duplicate.

[[Certification number::{{#if: {{#ask: [[Serial number::{{{Serial number|}}}]] [[Creation date::<{{{Creation date|}}}]] | ?Creation date }} | DUPLICATE! | {{{Certification number|}}} }}]]

For the above code, $smwStrictComparators = true is set, as described here:

http://semantic-mediawiki.org/wiki/Help:Strict_comparators
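
The logic of the template call above can be modelled in a few lines of Python. This is a sketch of the comparison only, not of how SMW evaluates #ask; the function name `stored_value` and the list of `(serial_number, creation_time)` pairs standing in for the wiki's property store are assumptions.

```python
from typing import Iterable, Tuple

DUPLICATE_MARKER = "DUPLICATE!"


def stored_value(
    value: str,
    serial_number: str,
    creation_time: int,
    pages: Iterable[Tuple[str, int]],
) -> str:
    """Return the value to store for a page, or the duplicate marker.

    pages holds (serial_number, creation_time) for every page on the wiki,
    including the page being saved. creation_time is a Unix timestamp.
    """
    # Strict "<" mirrors strict comparators: a page never matches itself,
    # so the earliest page with a given serial number is not flagged.
    has_earlier = any(
        other_serial == serial_number and other_time < creation_time
        for other_serial, other_time in pages
    )
    return DUPLICATE_MARKER if has_earlier else value
```

With a non-strict comparison (`<=`) the original page would find itself in the result set and be marked as a duplicate as well, which is why the strict setting matters here.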

This workaround is not a solution, since it requires that the duplicate be entered into the system before it is checked and marked as a duplicate. Branching forms can solve this problem completely by checking for duplicates of critical information in the first step, before sending the user on to invest their effort to fill out the remainder of the form(s).

However, branching forms (and any other possible technique for dupe checking) still need a page creation date property in order to know which page is the original. I have created a new bug requesting a page creation date property in SMW core:

https://bugzilla.wikimedia.org/show_bug.cgi?id=30784

I marked that bug as a blocker for this bug, since a creation date property is required to do dupe checking. I did not mark it as a blocker for branching forms, because branching forms can still be implemented without it, with the limitation that they will not be able to do dupe checking.

I forgot to mention that the workaround for this bug requires caching to be disabled on the wiki, at least for form pages. There's a beta-quality extension that allows this to be done on single pages (such as just the form page):

http://www.mediawiki.org/wiki/Extension:MagicNoCache

but the only way to do it without installing buggy extensions is to turn caching off for the entire wiki, which obviously is only going to work well on very small wikis.

That's another reason why the workaround is insufficient to solve this problem, and the reasoning behind creating bug 30784 and marking it as a blocker for this one.

Yaron_Koren claimed this task.
Yaron_Koren subscribed.

"unique", etc. parameters now exist.