Creation date special property.
Closed, ResolvedPublic

Description

Author: van.de.bugger

Description:
There is a `Modification date' special property. I would like to have a `Creation date' special property.


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=30610
https://bugzilla.wikimedia.org/show_bug.cgi?id=30628

Details

Reference
bz32165

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 11:58 PM)
bzimport set Reference to bz32165.
bzimport added a subscriber: Unknown Object (MLST).

van.de.bugger wrote:

Implementation of the `Creation date' property.

attachment SemanticMediaWiki-creation-date.patch ignored as obsolete

Patch looks valid at first glance. Not sure it's a good idea to add more special props into SMW itself.

van.de.bugger wrote:

Hmm... I want to get recently created pages, which also have to meet some other conditions, so I created a query:

{{ #ask: ...conditions...
| ?Creation date
| sort  = Creation date
| order = desc
}}

If there is no `Creation date' property, how can I get recently created pages then?

BTW, http://www.mediawiki.org/wiki/Extension:Semantic_MediaWiki/Manual says:

Needed semantic "meta" properties

  • Page name ...
  • Page modification date - already implemented
  • Page creation date

I just copied the implementation of modification date. If you think it should be implemented another way, then please show how.

Like I said, your patch looks valid. IIRC Markus did not want to add any such additional special properties though - Markus, can you comment on this?

That page was more of a personal wish/rant-list of Badon, disguised as a manual. I deleted it now, sorry for the confusion.

In general, having more special properties does not really hurt us. However, getting the values for such meta-properties can hurt for two reasons:

(1) More code is needed in some cases to get at the right values. This might even require new hooks.

(2) Setting the values can take time since you get additional database lookups on all edits, refreshes and maybe even previews.

Concern (1) does not seem to be an issue here since "creation date" can be set in the same places as "modification date". To tackle (2), it would be good to make it configurable which additional meta-properties should be stored by SMW. This could be combined with some better general architecture for adding such properties.

But "modification date" is probably not a good role model for new properties of this kind. Many properties could even be determined during parsing when all other property values are set (e.g., size, creation date, metadata about files). "Modification date" is special because during parsing you do not know if the date should be the current time (happens if a user just edited) or the time from the MediaWiki database (happens when the page data is refreshed by a job). This is why the code for _MDAT is more complicated and found in two places (one that sets the date when saving an edit, and a second one that retrieves the date from the database if it was not set already). For most such properties, it should be possible to get the right information in the second place (just before saving the data). The only exceptions are properties that change on edits but not on refreshes ("last editor" is such an example).

Overall, there are three places to set such property data:

(a) Right after parsing (happens on every preview)
(b) Right before saving an article change (happens only when pressing "save")
(c) Right before saving SMW data (happens on all changes and updates)

No property can rely on (b) only since this step is skipped when updating page data in a job. Relying on (a) would be good for properties that can easily be obtained from the page without extra database lookups (page size, maybe file properties, ...). Moreover, (a) is necessary if the property value should show up in the Factbox (data that is added in (c) will be too late for the Factbox in some cases). Finally, (c) is good for setting properties that are harder to obtain (DB lookup).

Modification date uses (b) and (c); I assume your code does the same. But "creation date" would only need (c) to work. Overall, we should probably have hooks in each of the places to separate such code for easier maintenance. Then one could have a separate file with simple hook implementations for a/b/c as needed for various special properties, and one could have a configuration to state which of them should be used.
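The three places (a), (b) and (c) and the idea of configurable per-property handlers could be sketched roughly as follows. This is only an illustration of the proposal, not SMW code: the stage constants, the `$enabledPageProperties` setting and the `propertiesForStage` function are all hypothetical names.

```php
<?php
// Hypothetical sketch: none of these names exist in SMW.
// Stages as described above.
const STAGE_AFTER_PARSE      = 'a'; // right after parsing (also on previews)
const STAGE_BEFORE_EDIT_SAVE = 'b'; // right before saving an article change
const STAGE_BEFORE_DATA_SAVE = 'c'; // right before saving SMW data

// Assumed setting: which optional page properties the admin enabled.
$enabledPageProperties = array( '_MDAT', '_CDAT' );

// Each property declares the stages in which its value is set.
$propertyStages = array(
    '_MDAT' => array( STAGE_BEFORE_EDIT_SAVE, STAGE_BEFORE_DATA_SAVE ),
    '_CDAT' => array( STAGE_BEFORE_DATA_SAVE ),
);

// Return the enabled properties that need to be handled in a given stage.
function propertiesForStage( $stage, array $enabled, array $propertyStages ) {
    $result = array();
    foreach ( $enabled as $id ) {
        if ( isset( $propertyStages[$id] )
            && in_array( $stage, $propertyStages[$id], true ) ) {
            $result[] = $id;
        }
    }
    return $result;
}
```

With such a table, a handler for stage (c) would only look at the properties returned for that stage, and disabling a property in the configuration removes it from every stage at once.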

reachouttothetruth wrote:

If a "Creation date" special property was added, what would happen to existing "Creation date" properties that might be in use on some wikis?

I think the user defined property will be ignored then - that's just a guess though.

van.de.bugger wrote:

OK, thanks for the responses, but let us work out a plan of action. I can prepare another patch (dropping the setting of `Creation date' in point (b); point (c) is the `storeData' function, I guess). Will you accept it? Or will you insist on adding hooks first? Who will add the hooks then?

van.de.bugger wrote:

Attempt #2.

attachment SemanticMediaWiki-creation-date.patch ignored as obsolete

Like Markus mentioned, it would be good if you could configure which special properties you want to have enabled. So we probably want to have a setting which is an array of enabled special properties for that.

I do think it's worth creating these hooks, if there are none that can be used for this function yet, so extensions can add their own special properties.

van.de.bugger wrote:

Do I understand correctly that you ask me to implement something like this:

$smwgSpecialProperties[ '_CDAT' ] = true;

so the SMW site admin can selectively enable/disable properties through LocalSettings.php?

I think this way of storing config is odd, even though it's used in some places. If you just have a list of values, why bother with the bools? I'd rather have a simple array whose values are the special props to enable, similar to the result formats setting, i.e. $smwgSpecialProperties = array( 'foo', 'bar', 'baz' );
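For comparison, here is a minimal sketch of how each of the two shapes would be checked. The helper function names are made up for this illustration, and the identifiers '_MDAT' and '_CDAT' are used only as sample values.

```php
<?php
// Form 1: property identifiers as keys with boolean values.
$asFlags = array( '_MDAT' => true, '_CDAT' => false );

// Form 2: a plain list of enabled properties.
$asList = array( '_MDAT' );

// Hypothetical helper: a property is enabled only if present and true.
function isEnabledInFlags( array $flags, $id ) {
    return isset( $flags[$id] ) && $flags[$id] === true;
}

// Hypothetical helper: a property is enabled if it appears in the list.
function isEnabledInList( array $list, $id ) {
    return in_array( $id, $list, true );
}
```

Both forms answer the same question; the list form is shorter to write in a settings file, while the flag form allows a direct key lookup.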

van.de.bugger wrote:

Attempt #3

`Creation date' is disabled by default and can be enabled by

$smwgSpecialProperties = array( 'Creation date' );

It is far from being ideal. Problems:

  1. Fragile. If the user makes a typo, e.g.

    $smwgSpecialProperties = array( 'Craetion date' );

the special property will not work, but the user will see no warnings or errors.

  2. There are many "mandatory" special properties enabled by default, like Allows value, Has type, etc. These special options are not affected by $smwgSpecialProperties. Should we name it $smwgExtraSpecialProperties?
  3. There is some redundancy:

    if ( in_array( 'Creation date', $smwgSpecialProperties ) ) {
        $pcdat = new SMWDIProperty( '_CDAT' );

The relation between 'Creation date' and '_CDAT' is hard-coded here, but this relationship is already coded in SMWLanguage. However, SMWLanguage does not provide a convenient way to map 'Creation date' to '_CDAT'.

Any thoughts how to improve it?

What about adding

define( 'SMW_CreationDate', '_CDAT' );

so user can write

$smwgSpecialProperties = array( SMW_CreationDate );

and we can write

if ( in_array( SMW_CreationDate, $smwgSpecialProperties ) ) {
    $pcdat = new SMWDIProperty( SMW_CreationDate );

?

attachment SemanticMediaWiki-creation-date.patch ignored as obsolete

> Fragile. If user makes a typo

I disagree this is an issue. This argument applies to all other settings as well. If an admin enters wrong stuff in the config file, you can't expect the software to magically guess what was intended. As long as it does not horribly break, it's good enough. Although a lot of stuff does break badly when entering crap.

> There are many "mandatory" special properties enabled by default, like
> Allows value, Has type, etc. These special options are not affected by
> $smwgSpecialProperties. Should we name it $smwgExtraSpecialProperties?

Good point. I think we can make a distinction between 2 types of special properties: those that hold info about pages and those that don't. Making the first ones configurable makes sense, but the second group you can't simply disable without breaking behavior of SMW or extensions. So I propose having a setting such as $smwgPageSpecialProperties, which holds modification date, creation date, and any other special props applying to pages.

> What about adding
> define( 'SMW_CreationDate', '_CDAT' );

I don't like the idea of introducing another list of constants. Furthermore, constants are not good for setting values. If you disable SMW, these will cause fatal errors if present in your config. You can just have the user enter '_CDAT' and the like in the setting. As long as this is properly documented, I don't see any issues with this.

van.de.bugger wrote:

> So I propose having a setting such as $smwgPageSpecialProperties,
> which holds modification date, creation date, and any other special props
> applying to pages.

So, what is the bottom line? Should I rename $smwgSpecialProperties to $smwgPageSpecialProperties and implement disabling "Modification date" in order to get the patch applied?

If you do these things, I'll be happy to apply it, yes.

*** Bug 30784 has been marked as a duplicate of this bug. ***

I marked bug 30784 as a duplicate of this one, even though it's older, because progress is being made on this one. However, the info in the older report is useful, so I'm copying it below. The older report was marked as a blocker for the reasons described, so I'm marking this one as a blocker too. I'm not sure it makes much difference either way, but feel free to change that status back to "enhancement" if that is preferred. Here's the text of the older report:

This bug report is an element of this one, for needed metaproperties:

https://bugzilla.wikimedia.org/show_bug.cgi?id=30610

and it is a blocker for this one, for existence and uniqueness validation:

https://bugzilla.wikimedia.org/show_bug.cgi?id=30537

Without a creation date property, it is not currently possible to use a
workaround to check for duplicates without also marking the original as a
duplicate. Being able to find the oldest page creation date makes it possible
to allow that original page to not be marked as a duplicate, while all
subsequent pages are marked as duplicates.

An inaccurate manually created page creation date property can be employed as a
workaround for the problem, but it is only able to record the date that the
form was loaded, not the actual creation date of the resulting page produced by
the form.

In order to do this for all pages, as reported in bug 30537, a creation date
metaproperty is required, and very probably should be in SMW core instead of an
extension, due to its "blocker" nature.

Once a creation date metaproperty becomes available in the core, bug 30537 can
be fixed.

Also, I forgot to mention that the workaround for this bug requires caching to
be disabled on the wiki, at least for form pages. That makes the workaround
very inadequate for most people, since there are not many ways to do that
reliably without causing other problems on the wiki.

van.de.bugger wrote:

Attempt #4.

  1. Global variable renamed to $smwgPageSpecialProperties.
  2. `Modification date' is enabled by default, but can be disabled.
  3. `Creation date' is disabled by default, but can be enabled.

Attached:

van.de.bugger wrote:

Alternative variant.

Compared to the previous patch:

  1. `getPropertyId' function added to `SMWLanguage' class. It returns the property id (`_MDAT') for a given property name (`Modification date'). Localized property names are accepted.
  2. `storeData' function is refactored to make adding more page properties easier. Implementing a new property will require 3-4 lines of code (in this function; it will also require additions to the language files and to the `SMW_SQLStore2.php' and `SMW_DI_Property.php' files).

Changes for SMW site admin: $smwgPageSpecialProperties can include localized property names.

Attached:

Patch applied in r102350 - great work :)

I'm not sure why, but on my local wiki, when I do a query for the creation date, I'm not getting any values for it, while modification date works as expected.

This is after adding $smwgPageSpecialProperties[] = 'Creation date'; to my LocalSettings file, of course.

Is it working for you?

Did some debugging, and it looks like the value is getting stored as it should.

Jeroen, your issue might be related to this bug I reported on 2011-07-15:

https://bugzilla.wikimedia.org/show_bug.cgi?id=29909

I don't think it's related.

Van de Bugger: it would be awesome if you could document this new functionality on the SMW wiki :)

van.de.bugger wrote:

> Is it working for you?

  1. In order to get `Creation date' working, you should go to `Special:SMWAdmin' and push `Initialize or update tables' (wording is not exact).
  2. I am not sure, but `Creation date' values do not appear instantly. You have to edit and save a page to get `Creation date' set.

> if you could document this new functionality on the SMW wiki :)

Sure.

That is the exact procedure to work around bug 29909, so it appears that it is not only related, but actually the same bug that Jeroen encountered.

Unknown Object (User) added a comment. (Nov 9 2011, 12:36 AM)

After adding $smwgPageSpecialProperties[] = 'Creation date'; to LocalSettings and running Special:SMWAdmin, everything works as advertised (Main namespace, File: namespace, custom-defined namespace).

Testing environment: 1.19alpha (r102469); Semantic MediaWiki (Version 1.7 alpha) (r102469);Semantic Result Formats (Version 1.7 alpha) (r102469)

Thanks

(In reply to comment #26)

> 1. In order to get `Creation date' working, you should go to `Special:SMWAdmin'
> and push `Initialize or update tables' (wording is not exact).

Ah! I didn't know this was needed. It works now that I did that.

> 2. I am not sure, but `Creation date' values do not appear instantly. You have
> to edit and save a page to get `Creation date' set.

This makes sense since this data gets saved on save, so it won't magically appear on all pages when enabling it. If you want it to show up everywhere immediately, then you can run the data update/refresh thing on SMWAdmin and then run maintenance/runJobs.php. If you can't do the latter, at least the data will be set eventually (depending on your wiki's size and jobQueue settings) without having to do more manual work.

(In reply to comment #27)

> That is the exact procedure to workaround bug 29909, so it appears that it is
> not only related, but actually the same bug that Jeroen encountered.

As you can deduce from what I wrote here, this is incorrect. However, the same solution can be used. It's not a bug as far as I'm concerned but a result of SMW's architecture.

I guess instead of a "bug" we can call it a "quirk" :) When I wrote the report for bug 29909, I was still pretty new to SMW. I can't remember if I tried the SMWadmin functions or not. If not, I'd be embarrassed, but I do feel better to know that even veterans like Jeroen can be temporarily confused by such a quirk.

I suppose as long as we note the quirk in the docs and remind people of how to deal with it in SMWadmin, it's OK.

van.de.bugger wrote:

> I suppose as long as we note the quirk in the docs and remind people of how to
> deal with it in SMWadmin, it's OK.

Updated: http://semantic-mediawiki.org/wiki/Property:Creation_date

I updated the docs to mention an issue that can occur if a server moves, and how to use UTC time to avoid the problem. Also, is the new Creation date special property in unix time format, with 1 second resolution? I think there should be some mention of time formatting. If it's unix time, I can write it up.

Replies on three issues:

  • Time format: times like these are stored internally with second precision. The formatting is not stored, just the actual time. It will be formatted during output depending on how the value is displayed. Typically, SMW uses a human readable, MW-like format. In query results, there are some options to get other formats (e.g., ISO dates). Some output formats such as Timeline have their own date rendering code that may differ from SMW.
  • Configuration: the setting "$smwgPageSpecialProperties[] = 'Creation date';" still seems to depend on the user language setting. It would be better to have "$smwgPageSpecialProperties[] = '_CDAT';" to keep LocalSettings.php language independent.
  • Quirk: nice name. I am open to any ideas to make such changes instantaneous. But as it is, all you can do to add the data is to run the relevant source code for all pages, which simply takes so long that a job-based method is the best we can do. This is not really an architectural issue but simply a problem of updating large databases in ways that are determined only in PHP code. The only way of getting the change propagated is to run this code, which will always be too slow for "instantaneous".
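To illustrate the point about time format: a value stored with second precision carries no formatting of its own, and each output chooses a rendering. The sketch below uses a Unix timestamp purely as a stand-in for the stored value (an assumption made for the example; the thread does not say which internal representation SMW uses) and two format strings picked for illustration.

```php
<?php
// Stand-in for a stored creation time, with one-second precision.
// Assumption for this example only: seconds since the Unix epoch, in UTC.
$stored = gmmktime( 0, 36, 0, 11, 9, 2011 ); // 9 Nov 2011, 00:36:00 UTC

// The same stored moment rendered in two different output formats.
$iso   = gmdate( 'Y-m-d\TH:i:s\Z', $stored ); // 2011-11-09T00:36:00Z
$human = gmdate( 'H:i, j F Y', $stored );     // 00:36, 9 November 2011
```

Because only the moment is stored, changing the display format never requires touching the stored data.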

I think the change IS instantaneous on a fresh MediaWiki install. That's why I described it as a quirk - it's not always mysteriously not working from the beginning - only if you install it on an already active wiki.

van.de.bugger wrote:

> Configuration: the setting "$smwgPageSpecialProperties[] = 'Creation date';"
> still seems to depend on the user language setting.

Err... Not exactly. English aliases are accepted if $m_useEnDefaultAliases is true. Which is true for all languages but English.

> It would be better to have "$smwgPageSpecialProperties[] = '_CDAT';" to keep
> LocalSettings.php language independent.

My original intention was similar:

$smwgPageSpecialProperties['_CDAT'] = true;

because

  1. It allows direct access:

    if ( $smwgPageSpecialProperties['_CDAT'] ) { ... }

  2. There is no need to take care of duplicates. An SMW administrator can mistakenly do

    $smwgPageSpecialProperties[] = '_MDAT';

for an already enabled property, so when looping through the properties, special care must be taken not to process the same property twice.

But Jeroen proposed current implementation.
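The duplicate concern with the list form can be handled in one line before looping. A minimal sketch, assuming the default list already contains '_MDAT' (the variable `$enabled` is a name chosen for this example):

```php
<?php
// Assumed default: modification date is already enabled.
$smwgPageSpecialProperties = array( '_MDAT' );

// The administrator adds it again by mistake, plus the creation date.
$smwgPageSpecialProperties[] = '_MDAT';
$smwgPageSpecialProperties[] = '_CDAT';

// array_unique keeps the first occurrence of each value;
// array_values renumbers the keys from 0.
$enabled = array_values( array_unique( $smwgPageSpecialProperties ) );

foreach ( $enabled as $id ) {
    // Each property identifier is visited exactly once here.
}
```

After this step `$enabled` contains '_MDAT' and '_CDAT' once each, so the loop body needs no duplicate handling of its own.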

"_CDAT" is not more clear than "Creation date", regardless of the language in use. Making it more difficult in English does not make it easier in Swahili. If anything, it just ensures that machine translation will fail, and everybody is equally obfuscated. If you have to pick a language, pick one, but don't obfuscate it. I don't speak Chinese, but I could at least look it up if it were in Chinese.

"_CDAT" is not an improvement over "Creation date", and I suggest it be left as "Creation date", in any language.

van.de.bugger wrote:

"_CDAT" is the internal name of the property; it is not visible to the wiki user. Only the wiki administrator sees it (probably once) when setting up LocalSettings.php. It is not very readable, but it simplifies the implementation. However, since the current implementation is already in place, I do not care too much.

BTW, I would like to draw your attention to bug 32351. It is about the #time and #timel ParserFunctions. These functions (if the bug is fixed) make it possible to represent either the creation or the modification date in UTC or the user-preferred timezone, in virtually any format, including the format preferred by the user. Please vote for the bug to show your interest in the topic (if you are interested, of course).

(In reply to comment #35)

> > Configuration: the setting "$smwgPageSpecialProperties[] = 'Creation date';"
> > still seems to depend on the user language setting.
>
> Err... Not exactly. English aliases are accepted if $m_useEnDefaultAliases is
> true. Which is true for all languages but English.
>
> > It would be better to have "$smwgPageSpecialProperties[] = '_CDAT';" to keep
> > LocalSettings.php language independent.
>
> My original intention was similar:
>
> $smwgPageSpecialProperties['_CDAT'] = true;
>
> because
>
> 1. It allows direct access:
>
>     if ( $smwgPageSpecialProperties['_CDAT'] ) { ... }
>
> 2. There is no need to take care about duplicates. SMW administrator can
> mistakenly do
>
> $smwgPageSpecialProperties[] ='_MDAT';
>
> for already enabled property, so when looping through properties special care
> should be done not to proceed the same property twice.
>
> But Jeroen proposed current implementation.

Not really. You are mixing up 2 things now. I proposed having the current array with values instead of having the props as keys with associated boolean values. I never argued for having the internationalized name of the prop be accepted in this setting. In fact, like Markus, I argued _against_ it.

Thanks to all for clarifying their positions ;-) As was already explained above, _CDAT is simply the internal name of the property, nothing a user will ever see. But if you use User Interface strings like "Creation date" in LocalSettings.php, it means that SMW first has to do a language (alias) lookup to understand its own configuration. This is not desirable: the configuration should be directly machine readable as it is. Otherwise, machine cycles will be wasted in each request just for evaluating the configuration (and this effort is not reduced by bytecode caching which otherwise reduces the effort of reading LocalSettings.php on each run).

We could define a PHP constant that is more readable, but as Jeroen said above, this would mean that the setting would cause errors if SMW is not included. Yet I wonder if this is a problem in practice. In particular, it seems to me that an assignment to "$variable[]" should also cause problems in PHP if $variable is not defined before (which would be the case if SMW was not loaded). So maybe we could have English based surface constants for the setting?
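On the PHP side of this question: appending with `[]` to a variable that was never defined creates the array without any error, whereas referring to an undefined constant by its bare name yields a notice or warning in older PHP versions (the name is then treated as a string) and an Error from PHP 8 on. A guard with `defined()` avoids the problem. The sketch below uses the constant name SMW_CreationDate from the earlier proposal; it is not defined in this example, and `$settings` is a made-up variable.

```php
<?php
// 1. Appending to a variable that was never defined: PHP creates the
//    array implicitly, so this line alone causes no error.
$notYetDefined[] = '_CDAT';

// 2. A constant must exist before it is used. Guarding with defined()
//    keeps the settings file safe when the extension is not loaded.
$settings = array();
if ( defined( 'SMW_CreationDate' ) ) { // proposed constant, not defined here
    $settings[] = constant( 'SMW_CreationDate' );
}
```

In this example `$notYetDefined` ends up holding '_CDAT', and `$settings` stays empty because the constant was never defined.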

The implementation details obviously should not require sacrificing performance for the sake of clarity, which I think everyone agrees with.

I'm not sure if errors caused by a PHP constant with SMW not being included is a problem in practice. Perhaps when moving a wiki, and setting up MediaWiki first, then SMW later. The error might be confusing then.

I'm not opposed to using "_CDAT" on technical grounds, I just didn't want it to be used for the purpose of clarity, which it does not achieve.

van.de.bugger wrote:

What is the bottom line? Who is the decision maker? I can fix the implementation one more time or leave it to someone else… Either way, I would like to close the problem and not worry about Creation date any more (since it is critical to my application).

"_CDAT" is OK for now because the proposed method of changing it can be done later, if needed. I think this bug is ready to close.

van.de.bugger wrote:

> "_CDAT" is OK…

But what about the rest:

$smwgPageSpecialProperties[] ='_MDAT';

or

$smwgPageSpecialProperties['_MDAT'] = true;

> …for now because the proposed method of changing it can be done later,
> if needed.

I want to close the issue now. Not just move this record to "CLOSED" status, but finalize the implementation, while the code is fresh in my memory. While SMW with the fix is not yet released. Later I will have to get myself familiar with the code. Later somebody will say "Oh, no, do not break backward compatibility". And so on…

I will defer to Markus on those questions, since I'm sure he knows what would be best. Which way will be better when there will someday be many special properties, like in the list here?:

https://www.mediawiki.org/wiki/User:Badon/Extension:Semantic_MediaWiki/Manual#Needed_special_properties

and here:

https://bugzilla.wikimedia.org/show_bug.cgi?id=30610

@Van de Bugger: My vote goes for the below form, since this is what is done for the result formats. Either way, I think both forms are acceptable. The main issue with the current version of the patch is that it does the language thing, while it's better to use the identifiers themselves. If that gets fixed, this can be closed.

$smwgPageSpecialProperties[] ='_MDAT';

van.de.bugger wrote:

Using '_MDAT' and '_CDAT' instead of 'Modification date' and 'Creation date'.

This patch does not obsolete the previous one. It should be applied to the current sources. `$smwgPageSpecialProperties' accepts internal property identifiers, like '_MDAT', not language-dependent strings like 'Modification date'.

One more minor change: the initialization of `$smwgPageSpecialProperties' is moved from `SMW_Setup.php' to `SMW_Settings.php'; the latter seems a more appropriate place to initialize the global variable.
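With this patch, enabling the property would look like the fragment below. The setting line itself is taken from the discussion above; placing it after the point where Semantic MediaWiki is loaded is an assumption of this sketch.

```php
// LocalSettings.php (assumed placement: after Semantic MediaWiki is loaded)

// Enable the optional "Creation date" property by its internal identifier.
$smwgPageSpecialProperties[] = '_CDAT';
```

As noted earlier in the thread, on an existing wiki the tables then need to be updated via Special:SMWAdmin, and values appear only once pages are saved or refreshed.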

Attached:

van.de.bugger wrote:

Reopened because the code is not yet finalized -- one more patch should be applied so that `_CDAT', not `Creation date', is recognized in the configuration file.

This bug is gaining some progress:

https://bugzilla.wikimedia.org/show_bug.cgi?id=30610

It may be a good idea for participants here to have a look at it, since things like naming conventions have already been decided here. If the work on that bug follows the conventions laid out here, it might be possible to integrate the code into SMW a little easier.

Applied in r105233 - thought I already did that one, but apparently not :)

van.de.bugger wrote:

Verified on v1.7.0.2.

It works very well! Thank you Van de Bugger!