Make #expr accept localized digits in expressions
Open, Medium, Public

Description

We support display of non-Arabic number systems. We should add support to parser functions so that "{{#expr: {{CURRENTYEAR}} + 10}}" (see example in Bug 31371) will work.

Ideally, wikitext like "{{#expr: ২ + ৩}}" would work as well.

https://bn.wikipedia.org/wiki/User:MarkAHershberger/sandbox


Known workaround: use {{formatnum:{{{1}}}|R}} in the template, e.g. {{#expr:{{formatnum:{{CURRENTYEAR}}|R}} + 10 }}

Details

Reference
bz34193

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 12:10 AM
bzimport set Reference to bz34193.
bzimport added a subscriber: Unknown Object (MLST).

*** This bug has been marked as a duplicate of bug 31371 ***

Let's live with this bug for some time. I'll work on it eventually.

This is not a duplicate of Bug 31371 -- it is a request to add functionality to MW that MW does not currently have. Bug 31371 was a request to disable numeric conversion on aswiki.

Let's live with this bug for some time. I'll work on it eventually.

*** This bug has been marked as a duplicate of bug 31371 ***

shijualex wrote:

As already mentioned by Mark A. Hershberger in Comment 3, this bug is not a duplicate of bug 31371.

Adding this functionality to MediaWiki is not only for the Assamese or Bengali languages. Many other languages use non-Arabic numerals. In India alone, at least the Kannada, Gujarati, Oriya, Punjabi, and Marathi Wikipedias use the numerals of their own native scripts. I am sure there are many other languages outside India that require similar support.

Prabhakar, please do not close bugs for the sake of pushing your POV. Mark created this bug to add a major piece of functionality to MediaWiki, and that functionality is very much required.

(In reply to comment #0)

We support display of non-Arabic number systems. We should add support to
parser functions so that "{{#expr: {{CURRENTYEAR}} + 10}}" (see example in
Bug 31371) will work.

Ideally, wikitext like "{{#expr: ২ + ৩}}" would work as well.

https://bn.wikipedia.org/wiki/User:MarkAHershberger/sandbox

For {{#expr: {{CURRENTYEAR}} + 10}},
we use {{#expr: {{#time:xnY}} + 10}} on the Bengali Wikipedia.

{{#expr: ২ + ৩}}, however, will not work. No Unicode digits other than 0-9 work in that position.

wikichaipau wrote:

*** Bug 34174 has been marked as a duplicate of this bug. ***

Okay, Shiju; I am sorry. However, this was not a POV comment. I speak and read both Assamese and Bengali well, I am at home in both cultures, and I feel as if both are my mother tongues. That is the only reason my post read like a POV comment.

(In reply to comment #6)

for this {{#expr: {{CURRENTYEAR}} + 10}}
we use {{#expr: {{#time:xnY}} + 10}} in bengali Wikipedia

This is helpful, thanks!

And for this {{#expr: ২ + ৩}} it will not work . even any unicode numeric
number will not work like that.

Since we only deal with base-10 systems (AFAICT), this is just a small matter of programming. We have the en->bn mapping for digits stored in a file:

https://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/messages/MessagesBn.php?revision=110165&view=markup

So changing "২ + ৩" into something that parserfunctions can understand ("2 + 3" in this case) is fairly simple. Here is an example of some code I just came up with:

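# Assumes $digitTransformTable (ASCII digit => Bengali digit) has already
# been loaded from MessagesBn.php.
$revert = array_flip($digitTransformTable);  # flipped table: Bengali digit => ASCII digit
$bn = "২ + ৩";
$bnTR = strtr($bn, $revert);                 # "২ + ৩" becomes "2 + 3"
echo "$bnTR\n";
echo eval("return $bnTR;"), "\n";            # eval is for demonstration only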

This code prints the following:

2 + 3
5
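The same reverse-mapping idea can be sketched outside PHP. The Python below is only an illustration, not the ParserFunctions patch: the Bengali table is written out by hand in the spirit of $digitTransformTable, and the names `evaluate_localized` and `_evaluate` are hypothetical. It also replaces `eval` with a small arithmetic evaluator, since evaluating raw user input is unsafe.

```python
import ast
import operator

# Hand-written stand-in for $digitTransformTable (ASCII digit -> Bengali digit).
DIGIT_TRANSFORM_BN = {
    "0": "০", "1": "১", "2": "২", "3": "৩", "4": "৪",
    "5": "৫", "6": "৬", "7": "৭", "8": "৮", "9": "৯",
}

# Translation tables in both directions.
TO_ASCII = str.maketrans({bn: en for en, bn in DIGIT_TRANSFORM_BN.items()})
TO_BENGALI = str.maketrans(DIGIT_TRANSFORM_BN)

_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate a tiny arithmetic subset: numbers, + - * / and unary signs."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, (ast.UAdd, ast.USub)):
        value = _evaluate(node.operand)
        return value if isinstance(node.op, ast.UAdd) else -value
    raise ValueError("unsupported expression")


def evaluate_localized(expression):
    """Map Bengali digits to ASCII, evaluate, and format the result in Bengali digits."""
    ascii_expr = expression.translate(TO_ASCII).strip()
    result = _evaluate(ast.parse(ascii_expr, mode="eval").body)
    return str(result).translate(TO_BENGALI)


print(evaluate_localized("২ + ৩"))
```

Mixed input is handled by the same translation step, because ASCII digits simply pass through the table unchanged.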

In fact, poking around a bit, most of this functionality is built into MW and just not used in ParserFunctions.

I'll attach a patch to fix this bug for ParserFunctions {{#expr}}, but, after applying it on my local wiki where $wgLanguageCode = "bn", I can create a page with the following:

{{#expr: ২ + ৩}} <br>
{{#expr: {{CURRENTYEAR}} + 10 + + ৩}}<br>
{{CURRENTYEAR}}

And it will display:

৫
২০২৫
২০১২

I *think* this would solve a great deal of the problem.

Created attachment 9958
Allow #expr to use non-arabic numerals

We're in the middle of a code slush and this needs review, but it is a start.

attachment ParserFunctions.diff ignored as obsolete

I've put this on http://winkyfrown.com/wiki/ so you can try it out and let me know what you think.

wikichaipau wrote:

(In reply to comment #11)

I've put this on http://winkyfrown.com/wiki/ so you can try it out and let me
know what you think.

That seems to work! But I want to be sure---you are seeking a solution for all languages and not only for Bengali, right?

(In reply to comment #12)

That seems to work! But I want to be sure---you are seeking a solution for all
languages and not only for Bengali, right?

The code there will work for any language that MW has a numeral mapping for. This includes Arabic, several Indic languages and more.
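For comparison, a language-independent variant can be sketched with the Unicode character database instead of per-language tables. This is an assumption-laden illustration in Python, not how the patch works (the patch relies on MW's own numeral mappings), and `to_ascii_digits` is a hypothetical name.

```python
import unicodedata


def to_ascii_digits(text):
    """Replace every Unicode decimal digit (category Nd) with its ASCII equivalent."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Nd":
            # unicodedata.decimal() gives the digit's value, 0 through 9.
            out.append(str(unicodedata.decimal(ch)))
        else:
            out.append(ch)
    return "".join(out)


for sample in ("২ + ৩", "۱+۱", "१ + १"):
    print(sample, "->", to_ascii_digits(sample))
```

Unlike a per-language table, this accepts digits from any script on any wiki, which may or may not be the desired behavior.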

Updated patch and testwiki with code for #time

Created attachment 9964
Allow #expr and #time to use non-latin numerals

update based on test wiki use

Attached:

note some use of the test wiki depends on templates that aren't there. I'll add those later today.

Added the templates. All that is needed now is a reverse conversion of month names.
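A reverse conversion of month names could look roughly like the Python sketch below. It is illustrative only: the month table is a hypothetical, deliberately partial stand-in containing just the one Bengali month name that appears in this report, and `parse_bn_date` is not a MediaWiki function.

```python
from datetime import date

# Bengali digits mapped back to ASCII digits.
BN_DIGITS = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")

# Hypothetical, partial table: localized month name -> month number.
# A real implementation would take all twelve names from the language's message file.
BN_MONTHS = {"ফেব্রুয়ারি": 2}


def parse_bn_date(text):
    """Parse a 'year month-name day' string written with Bengali digits and month names."""
    year, month_name, day = text.split()
    return date(
        int(year.translate(BN_DIGITS)),
        BN_MONTHS[month_name],
        int(day.translate(BN_DIGITS)),
    )
```

The digits and the month name are converted independently, so the same approach extends to other field orders once the input format is known.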

Yes, non-Arabic numbers work fine, so Bug 19412 needs to be fixed as well.

*** Bug 34335 has been marked as a duplicate of this bug. ***

At http://winkyfrown.com/wiki/:

{{#time:Y F j|{{{1|{{CURRENTYEAR}}}}}-{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}}}} outputs ২০১২ ফেব্রুয়ারি ১৪ (today, 2012 February 14), which is OK.

But {{#time:F j|{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}} -2 days}} gives Error:Invalid time, which is not OK.

If the first one works, why doesn't the second one?

Other errors are shown on the main page.

(In reply to comment #20)

But {{#time:F j|{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}} -2
days}}=Error:Invalid time (not OK).

If First one is working fine, why next one is not working?

If you look at the docs at https://www.mediawiki.org/wiki/Help:Extension:ParserFunctions#.23time you'll see that this is expected behavior, and why.

I've updated the code on my testwiki to parse month names .... w000! Now, safesubst

Created attachment 10008
Month name parsing for core

Attached:

After updating to 1.19, it doesn't have any effect on fa.wiki! In particular, {{#expr: ۱+۱}} shows an error.

(In reply to comment #26)

After updating to 1.19, it doesn't have any effect on fa.wiki! In particular,
{{#expr: ۱+۱}} shows an error.

Yes, sorry. It was too late to make it into 1.19.

The patch looks good to me, in the sense that it doesn't

[The previous comment was submitted by mistake in a very funny way.]

The patch looks good to me, in the sense that it doesn't seem to break anything major. I also tested in Devanagari on my local wiki and it worked. A test page in the live Hindi Wikipedia: https://hi.wikipedia.org/wiki/User:Amire80/native_numbers .

However:

  1. (Probably) Small problem: It always runs parseFormattedNumber, even when that is not needed. Maybe it should run it only on wikis that use such numbers.
  2. (Somewhat) Larger problem: This is correct if the requirement is to always present the result in the native numbers and if it is considered OK to mix native and non-native numbers. This may be fine, but is there a specification somewhere or is it just a random decision?
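The first point, converting only on wikis that use native digits, can be sketched as an early return. This Python is a hypothetical illustration, not MediaWiki's parseFormattedNumber: `parse_formatted_number` and the table layout are assumptions made for the example.

```python
# ASCII digit -> native digit, as a wiki with Devanagari digits might configure it.
DEVANAGARI = dict(zip("0123456789", "०१२३४५६७८९"))


def parse_formatted_number(text, digit_table):
    """Convert native digits to ASCII, skipping all work when the wiki has no table."""
    if not digit_table:
        # Wikis that use plain 0-9 pay no conversion cost.
        return text
    reverse = str.maketrans({native: en for en, native in digit_table.items()})
    return text.translate(reverse)


print(parse_formatted_number("१ + 1", DEVANAGARI))
print(parse_formatted_number("1 + 1", {}))
```

With a table configured, mixed native and ASCII input is accepted; whether the result should then be rendered in native digits is the policy question raised in the second point.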

Note this is a dupe of bug 30318 which is marked wontfixed. I personally think this will break stuff (The expr part. I have no opinion on the date stuff). It will cause expressions to be interpreted totally differently depending on language. Do we really want {{#expr: 10.1+1}} to be either 101 or 11.1 depending on wiki language? It would prevent people from copying templates from different languages.

(Note there is a work around of doing {{#expr: {{FORMATNUM:{{CURRENTYEAR}}|R}} + 10}} for the example use case presented in comment 0)
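The ambiguity described above can be made concrete with a short Python sketch. It assumes two conventions: "." as the decimal separator with "," for grouping (as in English), and "," as the decimal separator with "." for grouping (as in Dutch). `parse_localized_number` is a hypothetical helper, not a MediaWiki function.

```python
def parse_localized_number(text, decimal_sep, group_sep):
    """Parse a number string under the given separator convention."""
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# The same literal read under the two conventions.
as_english = parse_localized_number("10.1", decimal_sep=".", group_sep=",")
as_dutch = parse_localized_number("10.1", decimal_sep=",", group_sep=".")
print(as_english, as_dutch)
```

A constant such as 10.1 inside a template copied between wikis would therefore change value unless it is passed through formatnum or the parser keeps accepting the unlocalized form.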

*** Bug 32807 has been marked as a duplicate of this bug. ***

(In reply to comment #30)

Do we really want {{#expr: 10.1+1}} to be either 101 or 11.1
depending on wiki language? It would prevent people from copying templates from
different languages.

I don't understand how it would prevent copying templates, unless the templates deal with formatting numbers -- and this change would make formatting numbers clearly dependent on the language of the wiki.

I think this is most clearly useful for wikis that target people in India (http://en.wikipedia.org/wiki/Indian_numbering_system). Since that is an area the WMF is targeting, I think that should be considered.

We target language users by providing wikis in their language so that they're comfortable using them, but, then, when it comes to a basic part of their interaction with the wiki -- numbers and dates -- we require that they adapt themselves to Western conventions.

Sophisticated users are probably fine with the current situation, but from the brief look I've had at discussions on hiwiki and bnwiki, there are a significant number of people there who would like something they feel more comfortable with.

(In reply to comment #29)

  2. (Somewhat) Larger problem: This is correct if the requirement is to always
present the result in the native numbers and if it is considered OK to mix
native and non-native numbers. This may be fine, but is there a specification
somewhere or is it just a random decision?

Agreed, there should be broader discussion about this.

I don't understand how it would prevent copying templates, unless the templates
deal with formatting numbers -- and this change would make formatting numbers
clearly dependent on the language of the wiki.

Well all decimal numbers are "formatted". Some templates need constants in them.

Example: Put this patch on your test wiki. Set wiki language to nl, import [[template:Precision]] from en.wikipedia, watch it break (It gives wrong answers for non-formatted, and gives errors for formatted).

We target language users by providing wikis in their language so that they're
comfortable using them, but, then, when it comes to a basic part of their
interaction with the wiki -- numbers and dates -- we require that they adapt
themselves to Western conventions.

I wouldn't call using {{#expr}} a basic part of wiki editing. It's very easy to create templates using #expr that read formatted numbers, so that the average user doesn't have to deal with it. Yes, it would be nice if it all magically worked, but I'm worried this introduces further problems. (I suppose one could call {{formatnum:...}} on every constant in a template, and hence this would just shift responsibility for who has to call formatnum.)

Would the scope of this bug also include [[Chinese numerals]]?

(In reply to comment #34)

Would the scope of this bug also include [[Chinese numerals]]?

At the moment the Chinese language files are set to use plain old 0123456789 numerals. I believe this bug is about supporting whatever the default formatted number output for a language is, which does not include [[Chinese numerals]].

Ping: Bug 19412 must be fixed along with this bug.

(In reply to comment #34)

Would the scope of this bug also include [[Chinese numerals]]?

This would be a start to supporting [[Chinese numerals]], but AFAICT,
[[Indian numerals]] are more straight-forward.

That is, I don't think the scope of this bug would cover [[Chinese numerals]], but a bug to support [[Chinese numerals]] would imply that the more straight-forward [[Indian numerals]] are already supported.

I say this because the representation for 12,345,678,902,345 from
[[Chinese numerals]] is not a one-to-one mapping. Instead it is
十二兆三千四百五十六億七千八百九十萬兩千三百四十五 where in Devanagari it would be
१२३४५६७८९०२३४५
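The difference can be sketched in Python. Devanagari needs only a character-for-character table, while the Chinese form needs a positional parser. The parser below is a simplified illustration under stated assumptions: it reads 萬, 億 and 兆 as 10^4, 10^8 and 10^12 (the reading that matches the example above), handles only well-formed input, and `parse_chinese_numeral` is a hypothetical name.

```python
DEVANAGARI_TO_ASCII = str.maketrans("०१२३४५६७८९", "0123456789")

CN_DIGITS = {"零": 0, "一": 1, "二": 2, "兩": 2, "三": 3, "四": 4,
             "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
CN_SMALL = {"十": 10, "百": 100, "千": 1000}
CN_LARGE = {"萬": 10**4, "億": 10**8, "兆": 10**12}  # assumed values


def parse_chinese_numeral(text):
    """Parse a Chinese numeral written with multiplier characters."""
    total = 0    # value of completed 萬/億/兆 groups
    section = 0  # value accumulated inside the current group
    current = 0  # most recent digit not yet multiplied
    for ch in text:
        if ch in CN_DIGITS:
            current = CN_DIGITS[ch]
        elif ch in CN_SMALL:
            # A bare 十 means 10, so a missing digit counts as 1.
            section += (current or 1) * CN_SMALL[ch]
            current = 0
        elif ch in CN_LARGE:
            total += (section + current) * CN_LARGE[ch]
            section = 0
            current = 0
        else:
            raise ValueError("unexpected character: " + ch)
    return total + section + current


print(int("१२३४५६७८९०२३४५".translate(DEVANAGARI_TO_ASCII)))
print(parse_chinese_numeral("十二兆三千四百五十六億七千八百九十萬兩千三百四十五"))
```

Both lines print the same integer, but only the first conversion is a table lookup; the second depends on grouping rules that a digit table cannot express.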

(In reply to comment #29)

[The previous comment was submitted by mistake in a very funny way.]

The patch looks good to me, in the sense that it doesn't seem to break anything
major. I also tested in Devanagari on my local wiki and it worked. A test page
in the live Hindi Wikipedia:
https://hi.wikipedia.org/wiki/User:Amire80/native_numbers .

However:

  1. (Probably) Small problem: It always runs parseFormattedNumber, even when
that is not needed. Maybe it should run it only on wikis that use such numbers.
  2. (Somewhat) Larger problem: This is correct if the requirement is to always
present the result in the native numbers and if it is considered OK to mix
native and non-native numbers. This may be fine, but is there a specification
somewhere or is it just a random decision?

Note that the hi-wp subpage is set up on the assumption that hi-wp uses Devanagari numerals by default. However, hi-wp has its default numerals set to Arabic numerals, so the test is half incorrect. A subpage on hi-wikt for the response test is [[:w:hi:wikt:User:Siddhartha Ghai/native numbers]]. You'll find that using Arabic numerals gives Arabic numerals, while using Devanagari numerals or mixed Arabic-Devanagari input gives errors.

Note: hi-wikt uses Devanagari numerals as default.

(In reply to comment #39)

Note:hi-wikt uses devanagari numerals as default.

But Devanagari numerals don't work with parser functions, which is (part of) what this bug is about.

That is {{#expr: १ + १}} returns an error, though the output of
{{#expr: 1 + 1}} would be 2.

I've set up a test on
http://hi.wiktionary.org/wiki/User:MarkAHershberger/bug34193

(In reply to comment #40)

But devanagari numerals don't work with parser functions which (part of) what
this bug is about.

I know. Just wanted to clarify that a test on hi-wp does not provide the correct picture (which can be seen at hi-wikt).

Just to add, it's really important from a user perspective to be able to use parser functions with non-Arabic numerals.

This doesn't seem like a tracking bug, so removing bug 2007 as depending on this one (and updating title and removing "tracking" keyword).

See https://en.wikipedia.org/wiki/User_talk:MarkAHershberger#bn:Template:Convert where [[User:Johnuniq]] indicates that he is doing Lua work on this bug. Maybe making this superfluous?

Certainly, it seems on-wiki control is better (in some sense) than this patch.

We're slowly switching away from parser functions to Lua. Would Lua have the same problem?

This is a very old problem. Can some developers add it to their wishlist? Almost 6 or 7 years have gone by on this.

Problem
When we use templates in Wikipedia, we cannot use Devanagari numerals as input. For example: https://hi.wikipedia.org/w/index.php?title=विनायक_दामोदर_सावरकर&action=edit

{{death date and age|1966|2|26|1883|5|28}} cannot be written like this:
{{death date and age|१९६६|२|२६|१८८३|५|२८}}

Who would benefit
All Indic languages would benefit, because most of them use Devanagari numerals or similar ones, as would every Wikipedia whose number system is not 1234567890, such as १२३४५६७८९० or ૧૨૩૪૫૬૭૮૯૦ or ১২৩৪৫৬৭৮৯০ or ୧୨୩୪୫୬୭୮୯୦ or ੧੨੩੪੫੬੭੮੯੦

Proposed solution
I don't know, but perhaps a different template or something similar could be made for this, so that all Indic languages can use it.

@MarkAHershberger Sir, can you update us? This now needs to be solved. The Hindi community wants to use Devanagari numerals.

Problem
When we use templates in Wikipedia, we cannot use Devanagari numerals as input. For example: https://hi.wikipedia.org/w/index.php?title=विनायक_दामोदर_सावरकर&action=edit

{{death date and age|1966|2|26|1883|5|28}} cannot be written like this:
{{death date and age|१९६६|२|२६|१८८३|५|२८}}

Who would benefit
All Indic languages would benefit, because most of them use Devanagari numerals or similar ones, as would every Wikipedia whose number system is not 1234567890, such as १२३४५६७८९० or ૧૨૩૪૫૬૭૮૯૦ or ১২৩৪৫৬৭৮৯০ or ୧୨୩୪୫୬୭୮୯୦ or ੧੨੩੪੫੬੭੮੯੦

Proposed solution
I don't know, but perhaps a different template or something similar could be made for this, so that all Indic languages can use it.

You can fix it by replacing every instance of {{{1}}} with {{formatnum:{{{1}}}|R}} in the template (and likewise for {{{2}}}, {{{3}}}, etc.).

Bawolff renamed this task from Add support for non-Arabic number systems to Make #expr accept localized digits in expressions. Oct 28 2017, 1:02 PM

Problem
When we use templates in Wikipedia, we cannot use Devanagari numerals as input. For example: https://hi.wikipedia.org/w/index.php?title=विनायक_दामोदर_सावरकर&action=edit

{{death date and age|1966|2|26|1883|5|28}} cannot be written like this:
{{death date and age|१९६६|२|२६|१८८३|५|२८}}

Who would benefit
All Indic languages would benefit, because most of them use Devanagari numerals or similar ones, as would every Wikipedia whose number system is not 1234567890, such as १२३४५६७८९० or ૧૨૩૪૫૬૭૮૯૦ or ১২৩৪৫৬৭৮৯০ or ୧୨୩୪୫୬୭୮୯୦ or ੧੨੩੪੫੬੭੮੯੦

Proposed solution
I don't know, but perhaps a different template or something similar could be made for this, so that all Indic languages can use it.

You can fix it by replacing every instance of {{{1}}} with {{formatnum:{{{1}}}|R}} in the template (and likewise for {{{2}}}, {{{3}}}, etc.).

Yes, that is possible, but when importing or updating many templates from en.wikipedia it is difficult and tedious.
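For bulk imports, the formatnum workaround could in principle be applied mechanically. The Python below is a rough sketch under stated assumptions, not an existing tool: `wrap_numeric_params` is a hypothetical helper, it only handles bare positional parameters such as {{{1}}} (not {{{1|default}}} or named parameters), and it wraps every such parameter whether or not it is numeric, so the output still needs review.

```python
import re

# Bare positional parameters that are not already wrapped in formatnum ... |R.
_PARAM = re.compile(r"(?<!formatnum:)\{\{\{(\d+)\}\}\}(?!\|R\}\})")


def wrap_numeric_params(wikitext):
    """Wrap each bare {{{n}}} in {{formatnum:{{{n}}}|R}}."""
    return _PARAM.sub(r"{{formatnum:{{{\1}}}|R}}", wikitext)


source = "{{#expr:{{{1}}} + {{{2}}}}}"
print(wrap_numeric_params(source))
```

Running the function a second time leaves already wrapped parameters untouched, so it can be re-applied after a template is re-imported.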