
formatnum works wrong in german
Closed, Invalid
Public

Description

Author: dasch

Description:
EN: As described at http://de.wikipedia.org/wiki/Schreibweise_von_Zahlen, in German thousands are delimited not by a point "." but by a space " ". This should be changed.

DE (translated): As described at http://de.wikipedia.org/wiki/Schreibweise_von_Zahlen, a space rather than a point is used as the thousands separator.


Version: unspecified
Severity: trivial

Details

Reference
bz18957

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:40 PM
bzimport set Reference to bz18957.
bzimport added a subscriber: Unknown Object (MLST).

dasch wrote:

You can read about this in English at http://en.wikipedia.org/wiki/Decimal_separator, even though it is not precise there.

dasch wrote:

Maybe the formatting for other languages should be checked also. ISO 31-0 says that a point or comma should never be used for grouping digits. Another point is that grouping should start only at numbers with five or more digits. So in German it should be 3456 and not 3.456 as it is now, and 23 567 and not 23.567 as it is now. I think this should be checked in all languages; I don't know what it looks like in the others.

(In reply to comment #2)

Maybe the formatting for other languages should be checked also.

Usually we change them on request. If you have a good and reliable source, feel free. Otherwise, no.

This matches function commafy($_) in LanguageUk.php exactly: Ukrainian numeric format is "12 345,67" but "1234,56". So we have the solution. Now we need the proper source.
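The rule stated above ("12 345,67" but "1234,56") can be sketched roughly as follows. This is an illustration in Python, not the PHP code in LanguageUk.php; the function name `commafy_min5` and its parameters are hypothetical, and the non-breaking space as group separator is an assumption.

```python
def commafy_min5(number: str, group_sep: str = "\u00a0", decimal_sep: str = ",") -> str:
    """Group the integer part in threes, but only if it has 5 or more digits.

    `number` is a raw decimal string such as "12345.67" (dot as decimal mark).
    """
    sign = ""
    if number.startswith("-"):
        sign, number = "-", number[1:]
    integer, _, fraction = number.partition(".")
    if len(integer) >= 5:
        groups = []
        while integer:
            # Peel off three digits at a time, starting from the right.
            groups.insert(0, integer[-3:])
            integer = integer[:-3]
        integer = group_sep.join(groups)
    result = sign + integer
    if fraction:
        result += decimal_sep + fraction
    return result
```

Under these assumptions a four-digit integer part is left ungrouped regardless of whether decimal places follow, which is the rule as stated in this comment rather than a description of what any wiki actually outputs.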

I'm asking you not to put this live for now.

  1. "DIN 1333 says a space should be used as thousands separator" - that's true, but it is not widely followed in practice. In practice, the traditional dot is still in use, except for Switzerland and Liechtenstein, where the dot is never used and an apostrophe may be used instead.
  2. "Grouping numbers should start at 5 digits" - DIN 1333 doesn't say that. What DIN 1333 really says is that a thousands separator may be used, but doesn't have to be. It doesn't say when it should be used. The idea of handling four-digit numbers specially is just a private mannerism of some Wikipedians. Maybe it can be found in old-fashioned typography textbooks, maybe even in some modern ones, but not in any standard. It is inappropriate for tables and at least questionable for the body of the text. Some typographers indeed propose to not group 4-digit numbers *unless* they are compared to a 5-or-more-digit number within the same sentence or paragraph. formatnum can't know the context.
  3. There is at least one more standard touching the problem that has been ignored so far: DIN 5008.

Also note the discrepancy between http://de.wikipedia.org/wiki/Schreibweise_von_Zahlen and http://de.wikipedia.org/wiki/Wikipedia:Schreibweise_von_Zahlen. There is no consensus to change anything at all yet. Should you decide to change it anyway, it would be cool if it didn't break existing formatnum:|R usages like {{formatnum:1.234,56|R}}, which should give 1234.56.
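To illustrate the |R concern, here is a minimal sketch of a reverse conversion that keeps accepting the dot as well as spaces as group separators, so that input like 1.234,56 still yields 1234.56. It is written in Python for illustration only; `parse_localized` and its parameters are hypothetical and do not reflect how MediaWiki implements formatnum:|R.

```python
def parse_localized(text: str, group_seps: str = ".\u00a0 ", decimal_sep: str = ",") -> str:
    """Turn a German-formatted number such as '1.234,56' back into '1234.56'.

    Every character in `group_seps` is treated as a group separator and dropped.
    """
    for sep in group_seps:
        text = text.replace(sep, "")
    # Map the typographic minus sign (U+2212) back to an ASCII hyphen-minus.
    text = text.replace("\u2212", "-")
    return text.replace(decimal_sep, ".")
```

Note that with these defaults an input that is already in raw form, such as 1234.56, would be misread as 123456, because the dot is treated as a group separator.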

dasch wrote:

Well, first: MediaWiki is not only Wikipedia Software.

So we have to use something that is written down and not any "widely followed practice". By the way, the international ISO standard says the same as DIN. Separation by dot should only be used with currencies. That most people use it wrong should not be an argument for everybody else to use it wrong.

DaSch, I'm not asking for this to be reverted in general, I'm asking for it not to be put live on Wikimedia sites, mainly Wikipedia.

Regarding "we have to use something that is written down": Yes, that's a valid point, but then let's please restrict ourselves to what is actually written down in the standards. None of the standards mentioned recommends omitting the thousands separator for 4-digit numbers. That idea is taken from different typographic literature (what you call "widely followed practice"). Typographic literature is more than DIN and ISO, and as you're searching for references to the 4-digit special handling thingy, you will find references to the dot thingy (and other thingies, such as the apostrophe in Switzerland and Liechtenstein) as well.

The primary reason I'm objecting is, it's unthinkable that it will be accepted. Wikipedia (at least the German-speaking community) simply doesn't work that way. All that putting this change live will achieve is users asking "Who decided that?" and boosting the popularity of circumvention templates like http://de.wikipedia.org/wiki/Vorlage:FormatZahl. If you really like to find out by experiment what the community thinks about this "most people use it wrong" attitude, propose it at http://de.wikipedia.org/wiki/Wikipedia:Fragen_zur_Wikipedia or maybe revive http://de.wikipedia.org/wiki/Wikipedia:Meinungsbilder/Typographie_(Zwischenräume). That's a poll (still in preparation) about whitespaces in general, it's supposed to cover number formatting as well.

Btw. as a (weak) supporter of the dot -> space switch, I really wonder about three things:

(1) How will you explain to our typographers that the usual non-breaking space is used? They will tell you that a narrower space must be used.
(2) How will you explain that {{formatnum:1234}} gives 1234, but {{formatnum:1234.5}} gives 1 234,5? That's the behaviour I found while testing on the Ukrainian Wikipedia. Is that a bug or a feature? Nobody can know because nobody discussed it.
(3) How will you explain that in table-row templates like http://de.wikipedia.org/wiki/Liste_der_Verkehrsflughäfen_in_Deutschland (last column), 4-digit numbers don't get a thousands separator, but 5-digit numbers do?

This needs to be discussed in advance, on a Wikipedia page where enough people see it for the result to be representative.

Please withdraw this from dewiki ASAP, the implementation that has been put live just now isn't DIN 5008 compliant (a correctly formatted number looks like 1 234,567 89 according to DIN 5008 - note the spaces even for numbers from 1000 to 9999 and even for decimal places) and there is nothing that even resembles a consensus. The discussion mentioned in comment 10 isn't serious and won't be until someone participates who actually read DIN 5008 (and maybe DIN 1333, if it matters).

dasch wrote:

Did you read it? Or does anybody else have the money to pay for it? Maybe you could send me a copy.

(In reply to comment #11)

Please withdraw this from dewiki ASAP
and there is nothing that even resembles a consensus.

We know it is affected, but MediaWiki isn't developed by consensus and this isn't just some configuration option that can be turned on or off. If dewiki still wants something different from what we will eventually settle on, that is another bug.

(In reply to comment #7)

it would be cool if it didn't break existing formatnum:|R usages like
{{formatnum:1.234,56|R}}, which should give 1234.56.

That might be possible, but in any case I recommend against using formatnum anywhere other than in interface messages.

dasch wrote:

The bug was discussed and announced, and there was no objection.

xeddvok7bd7464n2 wrote:

To sum up, the following errors should be fixed:

  • {{formatnum:1234}} → 1234 instead of 1 234 (international and ISO 31-0 compatible notation! Furthermore this is a must for tables, where 4-digit and 4+-digit numbers are listed one below the other. However, the error does not occur with negative numbers and numbers with a decimal separator.)
  • {{formatnum:-1234}} → -1 234 instead of −1 234 (minus sign! U+2212)
  • {{formatnum:1234.56789}} → 1 234,56789 instead of 1 234,567 89 (also negative numbers should get a thousands separator, see ISO 31-0)
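The three expected outputs in this list can be reproduced with a short sketch like the one below. It is an illustration in Python under stated assumptions: the names `group_threes` and `format_grouped` are hypothetical, a non-breaking space is assumed as the group separator, and nothing here is taken from the MediaWiki source.

```python
def group_threes(digits: str, sep: str, from_left: bool) -> str:
    """Split a digit string into groups of three joined by `sep`."""
    if from_left:
        # Decimal places: count groups starting at the decimal mark.
        chunks = [digits[i:i + 3] for i in range(0, len(digits), 3)]
    else:
        # Integer part: the leftmost group may be shorter than three digits.
        head = len(digits) % 3
        chunks = [digits[:head]] if head else []
        chunks += [digits[i:i + 3] for i in range(head, len(digits), 3)]
    return sep.join(chunks)


def format_grouped(number: str, sep: str = "\u00a0", decimal_sep: str = ",") -> str:
    """Format a raw decimal string, grouping digits on both sides of the mark."""
    negative = number.startswith("-")
    if negative:
        number = number[1:]
    integer, _, fraction = number.partition(".")
    out = group_threes(integer, sep, from_left=False)
    if fraction:
        out += decimal_sep + group_threes(fraction, sep, from_left=True)
    # Use the typographic minus sign U+2212 instead of the ASCII hyphen-minus.
    return ("\u2212" if negative else "") + out
```

Here four-digit numbers are always grouped and the same code path handles positive and negative input, so the inconsistencies listed above do not arise in this sketch.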

visi-on wrote:

(In reply to comment #15)
* The minus sign U+2212 is typographically correct, but make sure the reverse option R then inserts a normal "-"!
* Instead of "&nbsp;", please insert </span><span style="margin-left:0.167em">, and add <span> at the beginning of the number and </span> at the end. This will improve things for screen readers, make copy and paste easier, and be more acceptable to users because the gap is narrower. Finally, if you give the span tags CSS classes (for example "number" and "ngs" = number group separator), everybody can change the appearance in their private monobook.css file and no additional discussions are needed.

*maybe national standards need an optional parameter for currencies

(In reply to comment #16)

*please insert instead of "&nbsp;" → </span><span ...

No, that would restrict it to HTML or wikitext contexts only and would break where the result is escaped or displayed in a non-HTML context.

*maybe national standards need an optional parameter for currencies

MediaWiki doesn't do currency formatting.

visi-on wrote:

(In reply to comment #17)

(In reply to comment #16)

*maybe national standards need an optional parameter for currencies

MediaWiki doesn't do currency formatting.

The point is that there are different number formats for currencies; I didn't ask for currency formatting.

visi-on wrote:

For example:
* in Switzerland, currencies must have a decimal point and be grouped by a space
* in Germany, currencies must have a decimal comma and be grouped by a point
Which rule applies is decided by the WP community.

*** Bug 19255 has been marked as a duplicate of this bug. ***

Reverting r51209 in r52300.

Apparently there is wild disagreement on what the proper formatting is; since no one complained in the previous 8 years I'm returning it to the status quo.

Reverted and resolving INVALID based on unclear request.

dasch wrote:

The rules for German formatting are clear; there are just a bunch of people on German Wikipedia who have a problem with this. But because this is not only Wikipedia software, there should be a solution for this.

One problem with the solution that was implemented was that positive and negative numbers were not formatted equally.

dark.jedi wrote:

DaSch, please realize that your pushing of ISO or DIN norms has not been discussed widely and thus lacks consensus. Please also realize that these norms are almost never applied in electronic media, for a very simple reason: DIN recommends using a *small* space, not the usual standard space. But lots of programs have problems displaying the small space character defined in UTF-8 correctly. Due to this, in the de-DE cultural settings the dot is used as delimiter for thousands. Even Microsoft uses the point as delimiter for German regional settings in their latest OS, Windows Vista. So German Wikipedia advised using the point as delimiter due to the technical difficulties.

By implementing this format setting, you would all of a sudden create a massive inconsistency between the formatting applied by this template and the formatting used in the more than 920,000 articles created so far. Moreover, you would not solve the problem but would replace a valid alternative formatting (dot) with a completely wrong formatting (standard space). I consider this undesired behavior.

As I already recommended in bug 19857, which was marked as a duplicate, it may be a good idea to make the formatting rule configurable via an option in the config file or via a MediaWiki system message.
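What a configurable rule could look like is sketched below, purely as an illustration in Python. `NumberFormat`, `DEFAULT_FORMATS`, `format_number` and the `overrides` parameter are hypothetical names, not existing MediaWiki settings. The shipped default is assumed to be the dot-and-comma status quo for "de", and the example override uses the narrow no-break space U+202F.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class NumberFormat:
    group_sep: str
    decimal_sep: str
    min_group_digits: int = 4  # group only if the integer part is at least this long


# Assumed shipped default: the dot-and-comma status quo for "de".
DEFAULT_FORMATS: Dict[str, NumberFormat] = {"de": NumberFormat(".", ",")}


def format_number(number: str, lang: str,
                  overrides: Optional[Dict[str, NumberFormat]] = None) -> str:
    """Format a raw decimal string using per-language rules a site may override."""
    formats = {**DEFAULT_FORMATS, **(overrides or {})}
    fmt = formats[lang]
    sign = "-" if number.startswith("-") else ""
    integer, _, fraction = number.lstrip("-").partition(".")
    if len(integer) >= fmt.min_group_digits:
        head = len(integer) % 3
        chunks = [integer[:head]] if head else []
        chunks += [integer[i:i + 3] for i in range(head, len(integer), 3)]
        integer = fmt.group_sep.join(chunks)
    return sign + integer + (fmt.decimal_sep + fraction if fraction else "")


# A site that prefers a narrow space and no grouping below five digits:
site_overrides = {"de": NumberFormat("\u202f", ",", min_group_digits=5)}
```

With such a split, the default stays as it is and a single project can opt into a different separator without affecting anyone else.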

dasch wrote:

maybe it's widely used but it's not valid!

(In reply to comment #23)

DaSch, please realize your pushing of ISO or DIN norms has not been discussed
widely and thus lacking of consensus.

MediaWiki is not developed by consensus.

Please also realize, that these norms are
almost never applied in electronic media because it has a very simple reason:
DIN recommends using a *small* space, not the usual standard space. But lots of
programs have problems displaying the small space character defined in UTF8
correctly. Due to this, in the de-DE cultural settings the dot is used as
delimiter for thousands. Even Microsoft uses the point as delimiter for German
regional settings in their latest OS Windows Vista. So German Wikipedia advised
to use the point as delimiter due to the technical difficulties.

If Microsoft and CLDR use a dot, there is no reason for us to use something else. Germans have accepted it as status quo, like almost everybody else has done for "-quotes instead of the typographically correct ones.

By implementing this format setting, you would all of a sudden create a massive
inconsistency between the formatting applied by this template and the
formatting used in the more than 920,000 articles created so far. Moreover, you
would not solve the problem but even replace a valid alternative formatting
(dot) with a completely wrong formatting (standard space). I consider this as a
non desired behavior.

It's not a template. It is a parser function for simple formatting of numbers in the user interface.

As I already recommended in the marked as duplicate bug 19857, it may be a good
idea to make the formatting rule configurable via an option in the config file
or via mediawiki system message.

Not a system message at least. Let's see if the new localisation cache makes it easier to override.

As we cannot change number formatting for de without dewiki complaining, and the current version has been and is acceptable (for whom is it not?), there is nothing to do in this bug.

dark.jedi wrote:

What don't you guys understand? A standard-size space as used now is *NOT*!!! a valid formatting at all! You changed a valid alternate formatting (the dot) into a completely wrong one! How do you justify a formatting not defined in the DIN cited by DaSch?

And Niklas is again repeating his non-consensus phrase - but obviously doesn't get the point at all. Read what I've written, think about what's wrong now and then find a solution. Leaving the format as it is now is NOT a solution - moreover, you have now created a bug with this format. Maybe it's true that MediaWiki is not developed by consensus, but hey: I don't care in which way you implement stuff, let it be a parser function, a hook or whatever you like, but if you are not able to make such an elementary thing as number formatting customizable, there is something going horribly wrong! Even more so when you don't realize for yourselves that you can't change the formatting for a single project - ever thought about what customization options are for?

(In reply to comment #26)
Please read the whole story before commenting and try to be polite. The change was reverted and nothing will change. You are creating useless friction and making me unhappy.

(In reply to comment #7)

The idea of handling
four-digit numbers specially is just a private mannerism of some Wikipedians.

Well, this particular claim is blatantly false: the BIPM states that «however, when there are only four digits before or after the decimal marker, it is customary not to use a space to isolate a single digit» (http://www.bipm.org/en/si/si_brochure/chapter5/5-3-2.html#5-3-3).
But «For numbers in a table, the format used should not vary within one column», but

(In reply to comment #15)

formatnum can't know the context.

so we should always use the space (previously this wasn't the case, see bug 3408#c2).

(In reply to comment #15)

To sum up, there are following errors that should be fixed:

  • {{formatnum:1234}} → 1234 instead of 1&nbsp;234 (international and ISO 31-0 compatible notation! Furthermore this is a must for tables, where 4-digit and 4+-digit numbers are listed below. However the error does not occur with negative numbers and numbers with a decimal separator.)

I've tested it on fr.wiki and now it's correct (see also bug 13435 for fi.wiki).

  • {{formatnum:-1234}} → -1&nbsp;234 instead of −1&nbsp;234 (minus sign! U+2212)

This isn't: see bug 8327.

  • {{formatnum:1234.56789}} → 1&nbsp;234,56789 instead of 1&nbsp;234,567&nbsp;89 (also negative numbers should get a thousands separator, see ISO 31-0)

This isn't either: see bug 10454.

dasch wrote:

BTW: I built a parser function that lets users choose which format they would like to use.

Number formatting is a complex issue and PHP's number_format handles only a subset of it.

dasch wrote:

Sure, I'm not denying this. I just wanted to show one quick solution I made. Maybe there are some people who are interested in it.