
Extend ParserFunction {{#time:}} to work with all valid ISO 8601 calendar dates
Closed, ResolvedPublic

Description

The current version of {{#time:}}, as used on Wikimedia Commons, does not support many valid ISO 8601 dates. Dates in unsupported formats should produce an error instead of rendering as incorrect dates. Wikimedia Commons currently works around the problem with very complicated and computationally expensive templates (http://commons.wikimedia.org/wiki/Template:ISOdate and http://commons.wikimedia.org/wiki/Template:ISOyear).

For example:

  1. {{#time:Y|1945}} renders "2011" instead of expected "1945"
  2. {{#time:F Y| 0099-01}} renders "January 1999" instead of expected "January 0099" ({{#time:F Y| 0099-01-01}} works fine)
  3. {{#time:F Y| 0009-08 }} renders "August 2009" instead of expected "August 0009"
  4. {{#time:d F Y| 002008-01-01 }} renders "21 April 2011" instead of "01 January 2008" or "Error: invalid time" (since it is not a valid ISO 8601 date)
  5. {{#time:d F Y| 02008-01-32 }} renders "08 January 2032" instead of "Error: invalid time"

Ideally, the user could also specify the language used when rendering month names. This would allow us on Commons to retire or simplify a series of templates that duplicate {{#time:}} in order to display dates in "d F Y" or "F Y" format in a language specified by an outside parameter; see http://commons.wikimedia.org/wiki/Template:Date.

All those templates are heavily used:

  1. Template:Date > 7.1M transclusions
  2. Template:ISOdate > 9.3M
  3. Template:ISOyear ~.4M

The complexity of those templates frequently causes templates that use them to hit the template expansion limit.


Version: unspecified
Severity: normal

Details

Reference
bz28655

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:36 PM
bzimport set Reference to bz28655.

In case #1, the output seems to fail for all years where the last two digits are between 00 and 59 inclusive. Even years like 2010 fail.

The problem here is that PHP's DateTime constructor is interpreting '1945' as 7:45pm.

Regarding cases 2 and 3: These are also caused by PHP's DateTime constructor. From php.net: "...years below 100 are handled in a special way when the y or yy symbol is used. If the year falls in the range 0 (inclusive) to 69 (inclusive), 2000 is added. If the year falls in the range 70 (inclusive) to 99 (inclusive) then 1900 is added."
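The rule quoted above is small enough to sketch directly. The PHP below only illustrates the documented rule and is not PHP's actual parser; the function name expandShortYear is made up for this example.

```php
<?php
// Illustration only: mirrors the php.net rule quoted above, not PHP's real parser.
// expandShortYear() is a hypothetical helper name.
function expandShortYear(int $year): int
{
    if ($year >= 0 && $year <= 69) {
        return $year + 2000; // 0..69 -> 2000..2069
    }
    if ($year >= 70 && $year <= 99) {
        return $year + 1900; // 70..99 -> 1970..1999
    }
    return $year; // years of 100 and above are left unchanged
}

echo expandShortYear(99), "\n"; // 1999, matching case 2
echo expandShortYear(9), "\n";  // 2009, matching case 3
```

This matches what cases 2 and 3 show: 0099 ends up as 1999 and 0009 as 2009.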

Regarding cases 4 and 5: This is just due to how PHP's DateTime constructor parses the dates. Why would we need to support those formats anyway? They seem pretty nonsensical to me.

Regarding languages, unfortunately, PHP's date functions only support English, so we would probably have to write our own functions from scratch to fix this.

Regarding cases 2 and 3: Can anything be done for those? PHP's DateTime constructor might have made nonsensical (for us) design choices, but does it mean we are stuck with them forever?

Case 4 should either be handled or be labeled "Error: invalid time". Case 5 is an invalid date and should also return an error. Neither case should return a valid-looking nonsense date, since those are very hard to spot as errors and correct. So "support" in this case means recognizing the input as nonsensical and reporting that fact.
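As a rough sketch of what "recognize and report" could look like, the PHP below checks a string against the three calendar-date shapes discussed here (YYYY, YYYY-MM, YYYY-MM-DD) before anything is handed to DateTime. The function name isValidIsoCalendarDate is hypothetical and not part of the extension, and the check deliberately ignores other ISO 8601 forms such as week dates, ordinal dates and expanded years.

```php
<?php
// Hypothetical pre-check, not part of ParserFunctions.
// Accepts only YYYY, YYYY-MM or YYYY-MM-DD with a real month and day.
function isValidIsoCalendarDate(string $input): bool
{
    $input = trim($input);
    // Exactly four year digits, then optional -MM and optional -DD.
    if (!preg_match('/\A(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?\z/', $input, $m)) {
        return false;
    }
    $year  = (int)$m[1];
    $month = (int)($m[2] ?? 1); // missing month: validate as January
    $day   = (int)($m[3] ?? 1); // missing day: validate as the 1st
    // checkdate() knows month lengths and leap years. It requires year >= 1,
    // so year 0000 is rejected by this sketch.
    return checkdate($month, $day, $year);
}

var_dump(isValidIsoCalendarDate('0099-01'));      // bool(true)
var_dump(isValidIsoCalendarDate('002008-01-01')); // bool(false), six-digit year
var_dump(isValidIsoCalendarDate('02008-01-32'));  // bool(false), five-digit year and day 32
```

A caller could return "Error: invalid time" whenever the input looks like a numeric date but fails this check.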

As for language support, I thought (based on http://www.mediawiki.org/wiki/Help:Extension:ParserFunctions#.23time) that month names can be displayed in the "site language", so there is support for all the languages of the Wikipedias using this function. Under this assumption I thought that the capability to choose a language should not be hardwired but passed as an optional parameter. I might be building a house of cards here based on possibly incorrect documentation.

Regarding cases 2 and 3: Yes we could hack these to work by changing 00XX-XX to 00XX-XX-01 before it is passed to DateTime. However, that won't solve the similar cases:
{{#time:F Y|January 0099}} -> January 1999
{{#time:F d, Y|January 12, 0099}} -> January 12, 1999
Unfortunately, there doesn't seem to be any way to trick PHP into interpreting the years correctly in the above cases. Perhaps we should file a bug against PHP in this case, rather than creating a half-baked workaround.
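For reference, the 00XX-XX hack mentioned above could look roughly like the PHP below. This is a sketch under the assumption that the rewrite happens on the raw string before it reaches DateTime; padYearMonth is a made-up name, and as noted, textual forms such as "January 0099" pass through untouched.

```php
<?php
// Hypothetical helper, not part of ParserFunctions:
// turns a bare YYYY-MM string into YYYY-MM-01 and leaves everything else alone.
function padYearMonth(string $input): string
{
    $trimmed = trim($input);
    if (preg_match('/\A\d{4}-\d{2}\z/', $trimmed)) {
        return $trimmed . '-01';
    }
    return $input;
}

echo padYearMonth('0099-01'), "\n";      // 0099-01-01
echo padYearMonth('January 0099'), "\n"; // January 0099 (unchanged)
```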

Regarding case 4: Currently we're relying on DateTime's error handling, and since DateTime accepts it, #time accepts it. We could write some error condition exceptions into the extension, however, if there are some common error conditions we need to detect. Is XXXXXX-XX-XX actually common enough to warrant creating separate error handling for it?

Regarding case 5: This seems to be fixed on trunk, although I don't know if it's because of a change in PHP or the extension.

Regarding languages, perhaps you're right. I'll see if I can look more deeply into how this works.

Language support fixed in r86927.

You can now pass a language code or 'user' as a 2nd parameter:
{{#time:d F Y|1988-02-28|ja}} -> 28 2月 1988
{{#time:d F Y|1988-02-28|user}} -> outputs in user's interface language

Note that you cannot use:
{{#time:d F Y|ja}}
{{#time:d F Y|user}}
As the language code is interpreted as a date string if it is the first param.

The language support is great news, and hopefully the PHP folks can fix bugs 2 and 3. As for the issues with XXXXXX-XX-XX, I encountered them when trying to recognize dates where someone forgot to pad the year with zeros for years before 1000. As you can see in Template:ISOyear, I have a series of tests which most of the time can deduce the format without using any heavy string manipulation templates. Towards the end I test for dates like 9-08-18, which should have been 0009-08-18, by pasting zeros in front and then testing the result with {{#time}}; as a result I often pass dates in 00XXXX-XX-XX format to {{#time}}. However, once other improvements to {{#time}} are made I might be able to retire large portions of this template, including the padding with zeros.

The {{#time:Y|1945}} bug is now fixed on the live Commons.

For the other bugs (which are actually PHP bugs), please go to http://bugs.php.net/bug.php?id=54597 and vote for the bug there.

The language support should be deployed soon, although it will be slightly different than I describe above. I'll update the documentation at http://www.mediawiki.org/wiki/Help:Extension:ParserFunctions#.23time once it gets deployed.

I'm going to go ahead and close this bug now. Feel free to open a new bug for cases 4 and 5 specifically (for handling invalid ISO 8601 dates). They don't really fit the bug title anyway.

*Bulk BZ change: -bugsmash from closed bugs. (7 Bugs)*

Still misrendered: if a four-digit input is a plain year, its associated date should be 1 January instead of the current date.

{{#time: Y m d H:i:s | 1998 }}
should render → 1998 01 01 00:00:00 (given year, 1 January, hour 0)

instead of → 1998 08 19 00:00:00 (mixing the given year with the current date [but not the current hour])

See also this request and a hack to bypass it:
[http://www.mediawiki.org/wiki/Help_talk:Extension:ParserFunctions#Request:_new_flag_to_force_date_interpretation_in_.23time:]

That's also a bug on the PHP side:

$tz = new DateTimeZone("Europe/Amsterdam");
$dateObject = new DateTime( '1998', $tz );
echo $dateObject->format( 'Y m d H:i:s' );

outputs: 1998 08 19 20:14:31

Guess we should override it locally.
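A local override could be as simple as rewriting a bare four-digit year before DateTime sees it. The sketch below assumes that approach; normalizeBareYear is a hypothetical name and this is not necessarily how the extension ends up doing it.

```php
<?php
// Hypothetical pre-processing step, not part of ParserFunctions:
// a bare four-digit year is expanded to 1 January of that year.
function normalizeBareYear(string $input): string
{
    $trimmed = trim($input);
    if (preg_match('/\A\d{4}\z/', $trimmed)) {
        return $trimmed . '-01-01';
    }
    return $input;
}

$tz = new DateTimeZone('Europe/Amsterdam');
$dateObject = new DateTime(normalizeBareYear('1998'), $tz);
echo $dateObject->format('Y m d H:i:s'); // 1998 01 01 00:00:00
```

Because a date-only string resets the time to midnight, this also removes the current time from the output.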

Related bug 30472 opened regarding truncated timestamp as date/time object.

The string "user" as second parameter does not work. When I have set my user language to "nl" on mediawiki.org, I expect the last line in https://www.mediawiki.org/wiki/User:Siebrand/test to show me the date in Dutch (user language), but it is shown in English (content language).

If I do the same test on https://or.wikipedia.org/wiki/User:Siebrand/test, the "user" test is also output as English.

I've added a parser test for {{#time:d F Y|1988-02-28|nl}} in r106987, but cannot do that for "|user", as far as I know.

Per discussion with Niklas, we never implemented support for "|user". It's been a long time, so I don't remember the reasoning, but maybe Niklas does.

(In reply to comment #18)

Per discussion with Niklas, we never implemented support for "|user". It's been
a long time, so I don't remember the reasoning, but maybe Niklas does.

I would imagine cache pollution of anonymous caches with uselang and the like.

That would make sense. You can work around it on some wikis though (Commons), by using {{int:lang}} as the parameter.

happy.melon.wiki wrote:

(In reply to comment #20)

You can work around it

You mean, "you can use the *other* cache-polluting hack we haven't got round to fixing yet"... :p

I'm not advocating it, just stating that it's possible :) I'll be glad when we have Bug 32618 resolved so we can deal with this stuff sensibly.

BTW, PHP bug 54597 was finally fixed (http://bugs.php.net/bug.php?id=54597). No idea how long it will take to make it live here though.

The 'user' parameter is gone. Anything documenting it should be updated.

(In reply to comment #18)

Per discussion with Niklas, we never implemented support for "|user". It's been
a long time, so I don't remember the reasoning, but maybe Niklas does.

It may have been with me (or perhaps you had similar talks with both of us) https://www.mediawiki.org/wiki/Special:Code/MediaWiki/86927#c16501

(In reply to comment #21)

You mean, "you can use the *other* cache-polluting hack we haven't got round to
fixing yet"... :p

Yes. The goal was to avoid adding more cache-polluting hacks. "If you want this, you need to use int: Oh, btw int: is evil" :)

All of the cases originally listed in this bug are fixed now, except for #4 which isn't a valid ISO 8601 date. I'm going to close this bug, but feel free to open new ones for specific cases.