
Math expressions in RTL wikis should have LTR alignment
Open, MediumPublic

Assigned To
None
Authored By
Yamaha5
Aug 30 2011, 11:30 AM
Referenced Files
F7991: beforLTR.png
Nov 4 2015, 2:59 PM
F7997: IMAG0012.jpg
Nov 21 2014, 11:48 PM
F7994: irans_high_school_books.png
Nov 21 2014, 11:48 PM
F7995: url.txt
Nov 21 2014, 11:48 PM
F7993: collection.pdf
Nov 21 2014, 11:48 PM
F7992: after_JS_LTR.png
Nov 21 2014, 11:48 PM

Description

In wikis with RTL (right-to-left) content, if a line only contains a <math> tag, it should be left-aligned. Right now, this is

Observed:

beforLTR.png (310×1 px, 32 KB)

Expected:

after_JS_LTR.png (306×1 px, 32 KB)

This is currently achieved by a hack in FA WP Common.js that identifies <p> or <dd> elements that only contain a <math> element and applies a CSS modification to the p or dd, namely setting direction to ltr.

Details

Reference
bz30630

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 11:48 PM
bzimport set Reference to bz30630.
bzimport added a subscriber: Unknown Object (MLST).

Created attachment 8986
after JS code

after using JS code in fa.wiki

Attached:

after_JS_LTR.png (306×1 px, 32 KB)

Adding SPQRobin since he has done a lot of RTL work so he can look at this.

  1. Is this actually a formatting change that would be desirable?
  2. Does this apply to all RTL languages?
  3. If 1) and 2), this should probably just be a CSS bit.

Mathematics is RTL, so its formulas have to be written RTL (when the formula is on a line by itself).
This is not a problem in RTL wikis.
In fa.wiki we correct it with http://fa.wikipedia.org/wiki/%D9%85%D8%AF%DB%8C%D8%A7%D9%88%DB%8C%DA%A9%DB%8C:Common.js
but that is hacky, it causes some problems, and it doesn't work with extensions such as the
PDF export extension.

If we don't fix this, all LTR wikis will have to correct it with that JS code.

Created attachment 8995
incorrect direction result in PDF extension

Attached:

Mathematics is LTR, so its formulas have to be written LTR (when the formula is on a line by itself).
This is not a problem in LTR wikis.
In fa.wiki (RTL) we correct it with
http://fa.wikipedia.org/wiki/%D9%85%D8%AF%DB%8C%D8%A7%D9%88%DB%8C%DA%A9%DB%8C:Common.js
but that is hacky, it causes some problems, and it doesn't work with extensions such as the
PDF export extension.

If we don't fix this, all RTL wikis will have to correct it with that JS code.

(In reply to comment #4)

Mathematics is RTL, so its formulas have to be written RTL (when the formula is on a line by itself).
This is not a problem in RTL wikis.
In fa.wiki we correct it with
http://fa.wikipedia.org/wiki/%D9%85%D8%AF%DB%8C%D8%A7%D9%88%DB%8C%DA%A9%DB%8C:Common.js

RTL is "right to left" and describes the overall text direction for Arabic and Hebrew scripts. Farsi text written in Arabic script will be laid out as RTL, with some numerals, Latin text, etc as embedded bits of LTR ("left to right").

In the examples you link to above, it *looks* like all the math formulas are meant to be read as LTR ("left to right").

The ones that appear inline in body text seem to be getting shown much like any other LTR bit (English text, numerals, etc).

The one that appears on a standalone line is, in the "before" picture, laid out as LTR. It's also right-aligned, matching the overall look of the page.

The "after" picture shows the exact same LTR layout of the formula, but also left-aligns the whole chunk, making it stand out as rather distinct from the rest of the page.

Is this how math formulas are normally laid out in Arabic, Farsi, Urdu, Hebrew, etc texts?

but that is hacky, it causes some problems, and it doesn't work with extensions such as the
PDF export extension.

If we don't fix this, all LTR wikis will have to correct it with that JS code.

(In reply to comment #5)

Created attachment 8995 [details]
incorrect direction result in PDF extension

PDF export will be entirely separate software and bugs should be opened separately -- if about the PDF export on *.wikipedia.org etc that'll belong in with the Collection extension. There'll be a number of RTL and complex-script issues there.

Attached:

Created attachment 8997
Screenshot of Iran's high school books http://www.chap.sch.ir/MaghtaList.asp

Attached:

irans_high_school_books.png (527×646 px, 46 KB)

In Farsi, I am sure that mathematical formulas are laid out as in the "after JS" image.

Created attachment 8998
another screenshot from a physics book

Attached:

Created attachment 8999
another screenshot from a physics book

Attached:

IMAG0012.jpg (1×976 px, 911 KB)

Awesome -- these sorts of real-world usage examples are *very* helpful when we're evaluating things like this!

Will still want to double-check in other RTL languages to confirm that's generalizable.

Indeed.

Notified Amir about this bug, and adding him as CC. I suppose he can tell how it's written in Hebrew.

I checked a few printed Hebrew math textbooks. In Hebrew, the *directionality* of the formula itself is LTR. If the formula is in a paragraph by itself, the *alignment* (text-align) of the paragraph is usually 'center' or 'left', but it is conceivable that it would be 'right'.

If the default text-align of a paragraph that consists of nothing but a formula will be 'left', nobody in the Hebrew Wikipedia would complain.

It must, of course, be remembered that formulas can be inline.

The trick I think is that we need to make the _parent block_ LTR if all it contains is one (just one? what about one or more?) LTR math elements.

The JS on fa.wikipedia does this by checking all math bits (img.tex etc) to see if they have a <p> or <dd> parent that doesn't seem to have other stuff in it, and then adjusting directionality there. This may be doable through CSS instead (cleaner, doesn't require postprocessing) but ... I think it would require additional stuff we don't have.

This may really be a more hairy problem, akin to detecting that only LTR text is present in some paragraph and declaring it to be LTR-primary instead of RTL-primary. Urk!
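
A minimal sketch of the check described above, written against the DOM node shape (nodeType, nodeName, childNodes). This is not the actual fa.wikipedia Common.js code. The function names are made up for illustration, and it assumes formulas are rendered as <img class="tex">, as mentioned above. It accepts one or more formulas per block and ignores whitespace-only text nodes.

```javascript
// Sketch only: helper names are hypothetical, and "img.tex" as the
// rendered form of a formula is an assumption taken from this thread.
const ELEMENT_NODE = 1;
const TEXT_NODE = 3;

// True if the node looks like a rendered formula (<img class="tex">).
function isMathNode(node) {
  return node.nodeType === ELEMENT_NODE &&
    node.nodeName === 'IMG' &&
    String(node.className || '').split(/\s+/).includes('tex');
}

// True if the block holds one or more formulas and nothing else
// except whitespace-only text nodes.
function isMathOnlyBlock(block) {
  let mathCount = 0;
  for (const child of Array.from(block.childNodes)) {
    if (isMathNode(child)) {
      mathCount += 1;
    } else if (child.nodeType === TEXT_NODE && child.nodeValue.trim() === '') {
      continue; // whitespace between formulas does not count as content
    } else {
      return false; // visible text or any other element: leave the block alone
    }
  }
  return mathCount > 0;
}

// Set direction: ltr on every <p>/<dd> under root that contains only formulas.
// Returns the blocks that were changed.
function markMathOnlyBlocks(root) {
  const changed = [];
  for (const block of Array.from(root.querySelectorAll('p, dd'))) {
    if (isMathOnlyBlock(block)) {
      block.style.direction = 'ltr';
      changed.push(block);
    }
  }
  return changed;
}
```

In a browser this would be called as markMathOnlyBlocks(document) once the page content has loaded. It remains post-processing, so it shares the drawbacks mentioned in this thread (for example, it has no effect on PDF export).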

nzmoihue wrote:

I wrote that little JS on fa.wikipedia. I tried some CSS tricks that I knew, for example this:

p > img.tex:first-child:last-child {
  display: block;
  text-align: left;
}

But there are some problems here:

  0. I don't think this is cross-browser compatible.
  1. "display: block" does not work well (a wrapper is needed), so in this situation "text-align: left" has no effect, I think.
  2. This CSS cannot detect that the paragraph contains some text in addition to the equation.
  3. Two (or more) equations in a paragraph will be rejected by this.
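
Points 2 and 3 can be illustrated with a small JavaScript sketch. Plain objects stand in for DOM nodes, and the function names are made up for this illustration. Structural pseudo-classes such as :first-child and :last-child only count element siblings, so text sitting next to the formula is invisible to the selector.

```javascript
// Sketch: plain objects stand in for DOM nodes (nodeType 1 = element, 3 = text).

// Roughly what `p > img.tex:first-child:last-child` checks: the formula is
// the only *element* child of the paragraph. Text nodes are not considered.
function cssLikeOnlyChild(block) {
  const elements = block.childNodes.filter((n) => n.nodeType === 1);
  return elements.length === 1 &&
    elements[0].nodeName === 'IMG' &&
    elements[0].className === 'tex';
}

// What the alignment fix would additionally need to know: is there any
// non-whitespace text beside the formula?
function hasVisibleText(block) {
  return block.childNodes.some(
    (n) => n.nodeType === 3 && n.nodeValue.trim() !== ''
  );
}

const formula = { nodeType: 1, nodeName: 'IMG', className: 'tex' };
const standalone = { childNodes: [formula] };
const inline = { childNodes: [{ nodeType: 3, nodeValue: 'where ' }, formula] };
const twoFormulas = { childNodes: [formula, { ...formula }] };

console.log(cssLikeOnlyChild(standalone), hasVisibleText(standalone));   // true false
console.log(cssLikeOnlyChild(inline), hasVisibleText(inline));           // true true  (point 2)
console.log(cssLikeOnlyChild(twoFormulas), hasVisibleText(twoFormulas)); // false false (point 3)
```

The inline case matches the selector even though the paragraph has text, and the two-formula case is rejected even though it has none.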

volker.haas wrote:

The PDF rendering engine now renders math nodes left aligned.

The following examples were used to check for correctness:

http://fa.wikipedia.org/w/index.php?title=%DA%A9%D8%A7%D8%B1%D8%A8%D8%B1:Volker.haas&oldid=5505734

fixed mainly with:

https://github.com/pediapress/mwlib/commit/d56e5bc389fadcd2ef1f4d78bc163a3f3d106871

In the samples, if you look at the left side of a formula that sits inside running text, there is no space. For example:
http://fa.wikipedia.org/wiki/%DA%A9%D8%A7%D8%B1%D8%A8%D8%B1:Reza1615/pdf
In the first line of the sample, there is no space between the formula "P(A|B)" and "را" in the PDF version.

volker.haas wrote:

(In reply to comment #19)

In the samples, if you look at the left side of a formula that sits inside running text, there is no space. For example:
http://fa.wikipedia.org/wiki/%DA%A9%D8%A7%D8%B1%D8%A8%D8%B1:Reza1615/pdf
In the first line of the sample, there is no space between the formula "P(A|B)" and "را" in the PDF version.

I don't know exactly what goes wrong, but I fear there's not much I can do about it. Spacing around inline images is a bit flaky.

(In reply to comment #21)

it has also this problem

http://en.wikipedia.org/wiki/User:Reza1615/pdf

Please create a new bug for that. The reported issue is solved.

(In reply to comment #22)

(In reply to comment #21)

it has also this problem

http://en.wikipedia.org/wiki/User:Reza1615/pdf

Please create a new bug for that. The reported issue is solved.

It is solved only for PDF export; in RTL wikis nothing has changed.

sumanah wrote:

Now that we have deployed MediaWiki 1.18 to WMF wikis: is this problem still reproducible?

(In reply to comment #0)

Math extension equations in RTL wikis is not left aligned.

Quick update:
http://fa.wikipedia.org/wiki/%D9%82%D8%B6%DB%8C%D9%87_%D8%A8%DB%8C%D8%B2 shows that math expressions are LTR (because of the JavaScript in comment 0)
http://ar.wikipedia.org/wiki/%D9%85%D8%A8%D8%B1%D9%87%D9%86%D8%A9_%D8%A8%D8%A7%D9%8A%D8%B2 and http://he.wikipedia.org/wiki/%D7%97%D7%95%D7%A7_%D7%91%D7%99%D7%99%D7%A1#.D7.A7.D7.99.D7.A9.D7.95.D7.A8.D7.99.D7.9D_.D7.97.D7.99.D7.A6.D7.95.D7.A0.D7.99.D7.99.D7.9D they are RTL.

If I get it right, the only remaining situation when the problem applies are lines that have a math expression only and no other surrounding text?

Yes! It applies only to lines that contain no text and consist only of a <math> tag.

Just stumbled upon this bug via wikitech-l.

For background information on RTL+LTR and RTL+RTL, this old W3C document is still useful http://www.w3.org/TR/arabic-math/ -- just keep in mind that it's about MathML 2 -- MathML 3 now supports RTL/LTR fully.

nzmoihue wrote:

(In reply to comment #28)

Just stumbled upon this bug via wikitech-l.

For background information on RTL+LTR and RTL+RTL, this old W3C document is still
useful http://www.w3.org/TR/arabic-math/ -- just keep in mind that it's about
MathML 2 -- MathML 3 now supports RTL/LTR fully.

It is fantastic but not very related. This bug is about making math equations left-aligned, as in our books. Many users were using <div style="text-align: left"> or dir="ltr" to achieve this in Persian Wikipedia articles, so I wrote that little JS code to left-align a formula that is the only element in a line/paragraph.

Huji removed Huji as the assignee of this task. Nov 11 2016, 7:51 PM
Huji updated the task description.

The issue here is that the only way for this to work is to modify the CSS of the <p> or <dd> tag in which the math expression occurs, if and only if the entire content of that <p> or <dd> is a math expression. However, the Math extension does not have access to the surrounding <p> or <dd> tag (what is given to the Math extension is the inside of the <math> tag; it parses that and returns it in a <span> tag).

More simply put, the Math extension itself has no knowledge of whether what is passed to it is an inline math expression or one that is the entire content of a paragraph.

For instance, see this page. The first formula is inline, and the <p> tag that contains it also contains some other text. The second formula is not inline: the <p> tag that contains it does not contain any other visible elements. But the Math extension wouldn't know the difference; in both cases, the input to the Math extension is exactly identical.
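
To make that concrete, here is a toy JavaScript stand-in. It is not the Math extension's code (which is not JavaScript): renderMathTag and escapeHtml are hypothetical names, and the class="tex" attribute is an assumption. The only point is that the renderer sees the formula source and nothing about the paragraph around it.

```javascript
// Hypothetical helper: escape the characters that matter inside HTML text.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Hypothetical stand-in for the tag handler: it receives only the text
// between <math> and </math>, never the surrounding <p> or <dd>.
function renderMathTag(tex) {
  return '<span class="tex">' + escapeHtml(tex) + '</span>';
}

// Inline use: the caller places the result inside a sentence.
const inlineParagraph = 'The value ' + renderMathTag('x^2') + ' is positive.';

// Standalone use: the result is the whole paragraph.
const standaloneParagraph = renderMathTag('x^2');

// Both calls pass the same argument and get the same output back.
console.log(inlineParagraph.includes(standaloneParagraph)); // true
```

Only the code that assembles the paragraph knows which case applies, so any fix has to happen at that level or later.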