
Language::formatNum() should prefix negative values with − (minus sign U+2212)
Closed, Resolved · Public

Description

Author: robchur

Description:
Some users are complaining that negative values, when formatted, have a "-" prefix, as opposed to the "more correct −".

Event Timeline

bzimport raised the priority of this task to Lowest. (Nov 21 2014, 9:29 PM)
bzimport set Reference to bz8327.
bzimport added a subscriber: Unknown Object (MLST).

robchur wrote:

Proposed patch

This patch fixes the issue in a brute-force manner, but I didn't want to commit
it without some review, since I'm sure there's going to be a regression in
behaviour somewhere if I do.

Attached:

Er, sounds like bullshit to me?

christian.kaul wrote:

(In reply to comment #2)

> Er, sounds like bullshit to me?

Are you kidding?

No?

  1. What would be the benefit?
  2. With patch as written, negative numbers would be corrupted when output as plain text, certainly a regression in behavior.

(In reply to comment #0)

> Some users are complaining that negative values, when formatted, have a "-"
> prefix, as opposed to the "more correct −".

What is being done with these values? One would suspect that if fed to ParserFunctions
like {{#expr}} the latter would choke...

christian.kaul wrote:

Isn't it possible to show a minus but to calculate with the "-" form?

leon wrote:

As this bug has been reported due to complaints about the change size value on
recentchanges, I've committed r18468 as a workaround until this one's fixed.

ayg wrote:

(In reply to comment #4)

> 1. What would be the benefit?

More professional-looking output? (Admittedly, only to the 5–10% of people who
would notice.)

> 2. With patch as written, negative numbers would be corrupted when output as
> plain text, certainly a regression in behavior.

UTF-8 would be the more reasonable thing to output it as. The function should
certainly have a switch to determine whether it should output hyphens or
minuses, though, not always return minuses.
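
A minimal sketch of what such a switch could look like, written in Python rather than MediaWiki's PHP. The names `format_num` and `use_minus` are hypothetical and are not the actual signature of Language::formatNum(); comma grouping is assumed purely for illustration.

```python
MINUS_SIGN = "\u2212"    # proper minus sign
HYPHEN_MINUS = "\u002d"  # ASCII hyphen-minus


def format_num(value: int, use_minus: bool = False) -> str:
    """Group digits with commas; optionally replace a leading hyphen-minus with U+2212.

    Hypothetical helper: the default keeps the ASCII-compatible output,
    so existing callers would see no change unless they opt in.
    """
    text = f"{value:,}"
    if use_minus and text.startswith(HYPHEN_MINUS):
        text = MINUS_SIGN + text[1:]
    return text
```

With the flag off, the output stays ASCII-only; display code such as a recent-changes list could pass `use_minus=True` without affecting callers that expect a hyphen.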

robchur wrote:

I favour Leon's method, instead; this should be altered for display only, and
the output of Language::formatNum() left alone.

broken.arrow wrote:

(In reply to comment #7)

> As this bug has been reported due to complaints about the change size value on
> recentchanges, I've committed r18468 as a workaround until this one's fixed.

Assuming it's not intentional, r18468 has introduced a typo in the class name
"mw-pluminus-bold"; "mw-pluminus-neg" had the same typo from earlier on. This is
inconsistent with the names of the two other classes, "mw-plusminus-pos" and
"mw-plusminus-null".

ayg wrote:

(In reply to comment #10)

> (In reply to comment #7)
>
> > As this bug has been reported due to complaints about the change size value on
> > recentchanges, I've committed r18468 as a workaround until this one's fixed.
>
> Assuming it's not intentional, r18468 has introduced a typo in the class name
> "mw-pluminus-bold"; "mw-pluminus-neg" had the same typo from earlier on. This is
> inconsistent with the names of the two other classes, "mw-plusminus-pos" and
> "mw-plusminus-null".

Fixed in r18471, thanks for pointing it out (although a new bug report would
have been preferable).

Have reverted r18468, as it doesn't appear to accomplish anything at all and the
whole subject seems rather silly.

Not a single reason for fulfilling this request has been given, which makes me
consider it a huge waste of time.

Resolving WONTFIX.

christian.kaul wrote:

The God Emperor of MediaWiki has spoken.

ayg wrote:

*** Bug 12029 has been marked as a duplicate of this bug. ***

Should be easy to simulate with something like this:

{{#ifexpr:{{{1}}}<0|&minus;{{formatnum:{{#expr:-{{{1}}}}}}}|{{formatnum:{{{1}}}}}}}

*** Bug 35146 has been marked as a duplicate of this bug. ***

mr.heat wrote:

Reopening. Yes, the hyphen-minus (U+002D) is a common replacement for the minus sign (U+2212). But since we are using Unicode everywhere I don't see a reason why the {{formatnum:}} parser function should not do the same. Why does it need to be ASCII compatible? See http://en.wikipedia.org/wiki/Hyphen-minus for more details.

You cannot use the output of a {{formatnum:}} call in an {{#expr:}} anyway. Something like {{#expr: {{formatnum:12345}} / 1000}} does not work no matter which character is used. Depending on the language, it either causes an expression error (if your language uses commas for digit grouping) or prints "0,012345" instead of "12,345" (if your language uses dots for digit grouping).

That said, I don't see a reason why {{formatnum:-12345}} cannot use the proper minus sign. It can't break existing code because, as I said, {{formatnum:}} cannot be used inside other parser functions.

Currently we are using code like {{#ifexpr: number < 0 | &#x2212;{{formatnum: {{#expr: abs number }} }} | {{formatnum: number }} }}. This code will not break if {{formatnum:}} uses the proper minus sign.
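
The round-trip problem described in this comment can be illustrated with Python's own `float()` parser standing in for {{#expr:}}. This is only an analogy under that assumption: `try_parse` is a hypothetical helper, and the real ParserFunctions expression parser has its own rules.

```python
def try_parse(text: str):
    """Return float(text), or None when Python's parser rejects the string."""
    try:
        return float(text)
    except ValueError:
        return None


# Comma-grouped output is rejected outright (the "expression error" case).
comma_grouped = try_parse("12,345")

# Dot-grouped output is accepted, but the dot is read as a decimal point,
# so dividing by 1000 yields roughly 0.012345 instead of 12.345.
dot_grouped = try_parse("12.345")

# The sign character is just one more way for the round trip to fail.
unicode_minus = try_parse("\u221212345")
ascii_minus = try_parse("-12345")
```

Here `comma_grouped` and `unicode_minus` are both `None`, while `dot_grouped` is silently misread; formatted output was already unsafe to feed back into arithmetic before the sign character came into play.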

p.selitskas wrote:

(In reply to comment #17)

> Reopening. Yes, the hyphen-minus (U+002D) is a common replacement for the minus
> sign (U+2212). But since we are using Unicode everywhere I don't see a reason
> why the {{formatnum:}} parser function should not do the same. Why does it need
> to be ASCII compatible? See http://en.wikipedia.org/wiki/Hyphen-minus for more
> details.
>
> You cannot use the output of a {{formatnum:}} call in an {{#expr:}} anyway.
> Something like {{#expr: {{formatnum:12345}} / 1000}} does not work no matter
> which character is used. Depending on the language, it either causes an expression
> error (if your language uses commas for digit grouping) or prints "0,012345"
> instead of "12,345" (if your language uses dots for digit grouping).
>
> That said, I don't see a reason why {{formatnum:-12345}} cannot use the proper
> minus sign. It can't break existing code because, as I said, {{formatnum:}}
> cannot be used inside other parser functions.
>
> Currently we are using code like {{#ifexpr: number < 0 | &#x2212;{{formatnum:
> {{#expr: abs number }} }} | {{formatnum: number }} }}. This code will not break
> if {{formatnum:}} uses the proper minus sign.

Totally agree. {{formatnum}} (and the PHP method behind it) is a presentation-level function, and it should not be used in {{#expr}} or other calculations at all. It is in this function's nature to be applied last, once the final value has been computed.

I also tried pasting a negative value with the Unicode minus into Windows' Calculator, and it worked just fine. Perhaps the Gnome/KDE/... calculators handle it as well.

(In reply to comment #18)

> I also checked against pasting the negative value with the Unicode minus into
> Windows' Calculator, and it all worked just fine. Perhaps, Gnome/KDE/...
> calculators do so as well.

Gnome's probably does, while KCalc has problems even handling input with dots as thousands separators; but none of this should affect this feature request.

This is still a problem, at least with the population estimate in the infobox of https://en.wikipedia.org/wiki/Novgorod_Oblast .

To be clear, the infobox Doc just mentioned uses < - >, which is the hyphen-minus character: https://en.wikipedia.org/wiki/Hyphen-minus rather than < − >, the minus sign: https://en.wikipedia.org/wiki/Plus_and_minus_signs#Minus_sign

> To be clear, the infobox Doc just mentioned uses < - >, which is the hyphen-minus character: https://en.wikipedia.org/wiki/Hyphen-minus rather than < − >, the minus sign: https://en.wikipedia.org/wiki/Plus_and_minus_signs#Minus_sign

Oh, okay. (I'd still like the minus sign to be used.) (I was referred here by https://www.mediawiki.org/wiki/Help_talk:Extension:ParserFunctions/2014-2020#Request:_Implement_minus_sign_for_negative_numbers_(#expr,_etc.) , which popped up in my notices tonight.)

Agreed: minus should be used, not hyphen-minus. I was just clarifying for others who can't obviously tell which figure is in the infobox.

Well, I couldn't either. <chuckles>

Bump. I just ran into this again on another Wikipedia article, https://en.wikipedia.org/wiki/Novgorod_Oblast . Or given my editing history, probably the same article.

I'd support changing the output of {{formatnum}} to use U+2212 minus. This will probably break a bunch of pages which pass the output of formatnum back to expr but as T10327#126094 says this is broken anyway for (a) fractional values when user preference is for comma as decimal separator, (b) values greater than 100 when locale specifies two-digit grouping, (c) values greater than 1000. Adding "(d) when value is negative" may actually prompt needed cleanup.

Discussion arose in T237467: Invariant failed: Bad UTF-8 (full string verification) which added warnings for various bits of non-numeric input to {{formatnum}} in the process of un-breaking some non-US/non-Euro wikis which were passing garbage to commafy and ending up with bad UTF-8 as a result.

Change 635546 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/core@master] Use Unicode minus in output of {{formatnum}}

https://gerrit.wikimedia.org/r/635546

Daimona subscribed.

I believe this should be mentioned in Tech News once approved.

> I believe this should be mentioned in Tech News once approved.

I added a mention to this week's Tech News as a future change. There are other refactorings to {{formatnum}} rolling out on the next train, so I'd prefer not to complicate that deploy by merging this, but I'd consider this for 1.36.0-wmf.18 (Nov 17-19) if no one objects before then.

Please explain in simple language, or in German, why the hyphen-minus should be changed to the minus sign. What are the advantages and disadvantages?

The advantage is the formal correctness of the output. The disadvantage is it will break things in templates (well, might, but this being Wikimedia it's almost a certainty).

I do have some questions about the change though:

  1. Will formatnum accept hyphen as input? It seems so from the patch, I just want to make sure.
  2. Can this be made translatable for languages that might use a different character or just prefer ease of use over correctness?

I can tell you that it will cause massive problems with templates on fawiki which use formatnum to allow the arguments to be either in local digits (۱ ۲ ۳) or in Arabic digits (1 2 3). We will essentially have to wrap many of those use cases into another template that undoes the change proposed here, so math would continue to work (many of our uses of {{formatnum}} are inside a mathematical expression). That is expensive server time.

One could even think of this proposal as a trade-off between making a small group of nit-picky users happy at the expense of being harmful to the environment (yes, you read it right; extra processing time needed per above means a larger carbon footprint by the servers).

If it is decided to implement this feature, it should be optional (as in, maybe enwiki can turn it on but fawiki can keep it off). I would imagine that wikis that use an alphabet with Arabic numerals will be more likely to turn this on, and wikis in languages with other alphabets would want it off.

Oh, it's a great idea to implement it as an option. That would be best for all the wikis using different languages, as Huji said before.

Let's not make wikitext parsing even more unpredictable across languages. Also, in the longer term we are aiming to have global modules and templates, so we can't really rely on language-specific behavior.

In the short term, making #expr handle minus signs doesn't seem too hard, though there should be some kind of sunset period. As said above, using formatted numbers in expressions does not make any sense and is liable to break in any number of ways. Just use {{formatnum}} with the R switch to convert them back to mathematical numbers before using them in expressions.

{{formatnum}} will take hyphen as an input. {{formatnum:...|R}} will accept either hyphen or unicode-minus. As @Tgr says, giving formatted numbers to {{#expr}} and the like is already a source of bugs: it tends to work only for US-like wikis and break for everything else. The new warnings help you find these sorts of bugs.
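
The reverse direction described here, accepting either sign character and handing back a plain number, could be sketched as follows. This is a rough Python analogue for a single comma-grouping locale, not MediaWiki's implementation: `unformat_num` and `group_sep` are hypothetical names, and the real {{formatnum:...|R}} is locale-aware and does more than this.

```python
MINUS_SIGN = "\u2212"  # proper minus sign


def unformat_num(text: str, group_sep: str = ",") -> float:
    """Strip digit grouping and normalise U+2212 to an ASCII hyphen-minus.

    Hypothetical helper: assumes a single grouping separator and a
    '.' decimal point, which is only true for some locales.
    """
    cleaned = text.strip().replace(group_sep, "").replace(MINUS_SIGN, "-")
    return float(cleaned)
```

Normalising the sign before parsing means the same call works whether the input came from an editor typing a hyphen or from formatted output that already uses the minus sign.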

Change 635546 merged by jenkins-bot:
[mediawiki/core@master] Use Unicode minus in output of {{formatnum}}

https://gerrit.wikimedia.org/r/635546

cscott claimed this task.