#expr doesn't correctly handle minus or endash as minus sign
Closed, Resolved (Public)

Description

Author: doug.strain

Description:
Hi! You guys are doing a great job!

The MOS states that the Unicode minus sign (and, variously or not (!?), the en dash) is to be used to negate numbers and to represent subtraction. Sometimes these representations are used for both display and calculation in the same template call.

Minus and en dash are not evaluated correctly by the #expr parser function (see http://en.wikipedia.org/wiki/User:Saintrain/A/NegTest).

If we can't use the hyphen (as the Lords of FORTRAN intended) then the parser functions should handle the typographic characters.


Version: unspecified
Severity: normal

Details

Reference
bz15349

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:18 PM
bzimport set Reference to bz15349.
bzimport added a subscriber: Unknown Object (MLST).

ayg wrote:

I don't think en dash should be handled as a minus sign. That's just wrong. The minus sign should, though, for all us typography Nazis.

doug.strain wrote:

The en dash is no longer in the MOS as of a couple of weeks ago. (So Aryeh, that's a vote "for", yes? :-)

Looks like a simple "str_replace" in Expr.php.

ayg wrote:

If somebody were willing to write a patch, I'd be willing to test and commit it. Brion has expressed reservations about the necessity of typographic exactitude in MediaWiki numerical formatting before, though, over at bug 8327 comment 2. :) So I can't promise it will stick, but I don't see why not.

soxred93 wrote:

I am a little concerned about this, but I don't know why. I can't see any other sensible way to do this, though. (template, str_replace, they won't work on the wiki).

ayg wrote:

(In reply to comment #4)

(template, str_replace, they won't work on the wiki).

doug.strain wrote:

(In reply to comment #3)

Brion has expressed reservations about the necessity of typographic exactitude ...

And I agree with him; hyphen, minus, I don't care but Expr.php does. I just want to calculate with old numbers without getting reverted by people who do care about the typo-crap.

doug.strain wrote:

(In reply to comment #3)

If somebody were willing to write a patch,

Sorry, I don't have no steenking patches, but changing

  # Unescape inequality operators
  $expr = strtr( $expr, array( '&lt;' => '<', '&gt;' => '>' ) );

to

  # Unescape inequality operators. Change minuses to hyphens
  $expr = strtr( $expr, array( '&lt;' => '<', '&gt;' => '>', '&minus;' => '-' ) );

at about line 169 of Expr.php (near the top of doExpression()) should do it. (Looks like this issue has come up before, typographical-to-calculationical. :-)

Thanks.

ayg wrote:

Committed in r40762, and allowed to tolerate U+2212 "−" as well.
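
For readers following along, here is a minimal sketch of the kind of normalization discussed in this thread. It is not the actual code from r40762. The function name normalizeMinus() is hypothetical, and it assumes the expression arrives as a UTF-8 string. It maps both the &minus; entity and the literal U+2212 character to an ASCII hyphen before evaluation, and it deliberately leaves the en dash (U+2013) alone, in line with the objection raised earlier.

```php
<?php
// Hypothetical helper for illustration only; not the code committed in r40762.
// Assumes $expr is a UTF-8 encoded string.
function normalizeMinus( $expr ) {
	return strtr( $expr, array(
		'&lt;' => '<',
		'&gt;' => '>',
		'&minus;' => '-',        // HTML entity for the minus sign
		"\xE2\x88\x92" => '-',   // U+2212 MINUS SIGN as UTF-8 bytes
	) );
}

echo normalizeMinus( "3 \xE2\x88\x92 5" ), "\n";  // prints "3 - 5"
echo normalizeMinus( '&minus;2 &lt; 1' ), "\n";   // prints "-2 < 1"
```

With the array form, strtr() replaces whole keys in a single pass and never rescans its own output, so the multi-byte minus sign is matched as a unit and the entities cannot interfere with one another.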

doug.strain wrote:

(In reply to comment #8)

Committed in r40762, and allowed to tolerate U+2212 "−" as well.

Many thanks.
Doug

P.S. Very roughly, in round numbers, how long until this gets to en? That is, what's the usual time for such things?

ayg wrote:

Typically on the order of a week.