Add CSS classes based on writing system code
Open, Medium, Public, Feature

Description

It would be useful to have the languages that MediaWiki supports be aware of the alphabet in which they are written.

ISO 15924 script codes can be used for the most part, so it would be "latn" for English, French, Vietnamese, etc.; "cyrl" for Russian, Udmurt, etc.; "deva" for Hindi, Marathi, etc.; and so on.

This is useful for applying special styles that are common to all languages written in a script. For example, the notorious line-height fix is applied in shared.css by element name and language code. It could be greatly simplified by using the relevant script codes: instead of having selectors for bh, bho, hi, ks, mr, mai, ne, new, pi and sa, there would be only one selector for the class deva.

Of course, it will have to take into account language variants (Serbian Cyrillic / Latin, Kazakh Cyrillic / Latin / Arabic etc.) and also special cases not covered by ISO 15924, such as the internal differences between styles of Arabic (so Arabic may use "arab" and Urdu may use something like "arab-nastaleeq").

An easy way to implement this would be to add a variable to each Messages file, similarly to $rtl. (In fact, this can make $rtl obsolete, because the direction is inherent in the writing system.)
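A minimal sketch of the idea in JavaScript, under stated assumptions: the table, the function names `scriptFor` and `directionFor`, and the list of right-to-left scripts are all hypothetical and only cover the languages mentioned above. The point is that variants get their own script entry and that the direction can be derived from the script instead of being stored separately.

```javascript
// Hypothetical table: MediaWiki language code -> script code.
// Codes are lowercase ISO 15924 codes as written in this task;
// "arab-nastaleeq" is the ad hoc extension suggested above, not a registered code.
const SCRIPT_BY_LANGUAGE = new Map([
  ['en', 'latn'], ['fr', 'latn'], ['vi', 'latn'],
  ['ru', 'cyrl'], ['udm', 'cyrl'],
  ['hi', 'deva'], ['mr', 'deva'],
  ['ar', 'arab'], ['ur', 'arab-nastaleeq'],
  ['he', 'hebr'],
  // Language variants carry their own script.
  ['sr-ec', 'cyrl'], ['sr-el', 'latn'],
  ['kk-cyrl', 'cyrl'], ['kk-latn', 'latn'], ['kk-arab', 'arab'],
]);

// Scripts written right to left (illustrative subset, not a complete list).
const RTL_SCRIPTS = new Set(['arab', 'hebr']);

// Returns the script code for a language code, or null if it is unknown.
function scriptFor(languageCode) {
  return SCRIPT_BY_LANGUAGE.get(languageCode.toLowerCase()) ?? null;
}

// Derives the direction from the script, so no separate $rtl-like flag is needed.
function directionFor(languageCode) {
  const script = scriptFor(languageCode);
  if (script === null) {
    return null;
  }
  // "arab-nastaleeq" shares its direction with its base script "arab".
  const baseScript = script.split('-')[0];
  return RTL_SCRIPTS.has(baseScript) ? 'rtl' : 'ltr';
}
```

With this table, `scriptFor('sr-ec')` gives `'cyrl'`, `directionFor('ur')` gives `'rtl'`, and `directionFor('kk-latn')` gives `'ltr'`.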

Comment 1: In this bug, "writing system", "script" and "alphabet" are synonyms. I mention all three to make the bug easier to find.

Comment 2: I vaguely recall that there is a bug for this, but I cannot find it. If there is one, please mark this one as a dupe.


Version: unspecified
Severity: enhancement

Details

Reference
bz57045

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 2:39 AM
bzimport set Reference to bz57045.
bzimport added a subscriber: Unknown Object (MLST).

Where do you think these CSS classes should be added? If I understand the structure of the page correctly, the <html> tag has the language and directionality of the user interface language and the <div id="mw-content-text"> tag has the language and directionality of the page language. Should the CSS classes be added in these two places?

Interwikis could perhaps also be considered.

(In reply to comment #1)

Where do you think these CSS classes should be added? If I understand the
structure of the page correctly, the <html> tag has the language and
directionality of the user interface language and the
<div id="mw-content-text"> tag has the language and directionality of the
page language. Should the CSS classes be added in these two places?

Interwikis could perhaps also be considered.

Actually, any place where we currently use lang and dir.

<html>, <div id="mw-content-text">, interlanguage links, ULS, various elements of the Translate and Wikibase extensions. Some Day Soon, when VisualEditor will have full support for marking pieces of text in another language, then there, too. (It's already a beta feature.) Many of these are not in the core, but they will need it just as well, so there probably should be core PHP and JS functions that make it easy to add them everywhere.
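A rough sketch of what such a JS helper could look like, so that every place that currently emits lang and dir can emit the script class too. Everything here is an assumption made for illustration: the function name `langAttributes`, the `writing-` class prefix, and the tiny data table are hypothetical and not existing MediaWiki API.

```javascript
// Hypothetical per-language data; a real helper would read this from
// wherever the script information ends up being stored.
const LANGUAGE_DATA = new Map([
  ['en', { script: 'latn', dir: 'ltr' }],
  ['ru', { script: 'cyrl', dir: 'ltr' }],
  ['he', { script: 'hebr', dir: 'rtl' }],
]);

// Escapes a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Builds the lang, dir and class attributes for an element in one call.
// Unknown languages get only the lang attribute (plus any extra classes).
function langAttributes(languageCode, extraClasses = []) {
  const data = LANGUAGE_DATA.get(languageCode);
  const attrs = [`lang="${escapeAttr(languageCode)}"`];
  const classes = [...extraClasses];
  if (data) {
    attrs.push(`dir="${data.dir}"`);
    classes.push(`writing-${data.script}`);
  }
  if (classes.length > 0) {
    attrs.push(`class="${escapeAttr(classes.join(' '))}"`);
  }
  return attrs.join(' ');
}
```

For example, `langAttributes('he', ['mw-content-rtl'])` returns `lang="he" dir="rtl" class="mw-content-rtl writing-hebr"`, while `langAttributes('xx')` returns only `lang="xx"`.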

(In reply to comment #2)

Some Day Soon, when VisualEditor will have full support for marking pieces of text in another language, then there, too. (It's already a beta feature.)

... Just checked, and it doesn't seem to be deployed anywhere in Wikimedia. But it has been in development for months already.

I have two points related to this bug:

1/ Is the mapping between ISO 15924 codes and language codes available somewhere in MediaWiki? I searched quickly and didn’t find it.

2/ Thinking again about the CSS classes, there is a problem with adding them at multiple levels: if there is

<html class="writing-Latn"><div class="writing-Cyrl">здесь</div></html>

then the inner <div> is affected by both classes, "writing-Latn" and "writing-Cyrl", and there is no way to know which of the two is the innermost.
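To illustrate the nesting problem: two descendant selectors such as `.writing-Latn p` and `.writing-Cyrl p` have the same specificity, so CSS picks the winner by source order, not by which ancestor is closer. Finding the innermost class needs a nearest-ancestor search, sketched below. Plain objects with hypothetical `classes` and `parent` fields stand in for DOM elements so the snippet runs outside a browser; in a browser, `Element.closest()` performs a similar search.

```javascript
// Matches class names of the form used in the example, e.g. "writing-Cyrl".
const WRITING_CLASS = /^writing-(\w+)$/;

// Walks from the node up through its ancestors and returns the script of the
// first "writing-*" class found, i.e. the innermost one. Returns null if none.
function nearestScript(node) {
  for (let current = node; current; current = current.parent) {
    const match = current.classes
      .map((className) => WRITING_CLASS.exec(className))
      .find(Boolean);
    if (match) {
      return match[1];
    }
  }
  return null;
}

// Stand-ins for the elements in the example, plus an unclassed child.
const html = { classes: ['writing-Latn'], parent: null };
const div = { classes: ['writing-Cyrl'], parent: html };
const span = { classes: [], parent: div };
```

Here `nearestScript(span)` and `nearestScript(div)` both give `'Cyrl'`, and `nearestScript(html)` gives `'Latn'`.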

(In reply to comment #4)

I have two points related to this bug:

1/ Is the mapping between ISO 15924 codes and language codes available
somewhere in MediaWiki? I searched quickly and didn’t find it.

Not in core, but it's available in the UniversalLanguageSelector extension's "langdb" feature. Its most readable version can be found here:
https://github.com/wikimedia/jquery.uls/blob/master/data/langdb.yaml

(It's tracked in GitHub and periodically updated in Gerrit.)

I occasionally fantasize about separating the langdb into an independent jQuery library and adding it to core.

Slightly related: Bug 56908

@Amire80 I was incredibly happy to have stumbled into this task. While developing several of our products, I asked a similar question about 3 years ago, though one that in my opinion should in the mid-term be resolved by the W3C in the CSS standard. We've got the :lang() selector, so why not :lang-script() as an additional pseudo-class?
Would you be interested in working together on a write-up of when such a selector would come in handy?

@Amire80 I was incredibly happy to have stumbled into this task.

Thanks, it means a lot :)

For years, I've been having these big ideas, and it's not always easy to convince people they're necessary.

While developing several of our products, I asked a similar question about 3 years ago, though one that in my opinion should in the mid-term be resolved by the W3C in the CSS standard. We've got the :lang() selector, so why not :lang-script() as an additional pseudo-class?
Would you be interested in working together on a write-up of when such a selector would come in handy?

It would be nice to do it in a universally standard way, as you suggest, but my experience with the W3C till now has been that they are very reluctant about deriving a writing system from a language. I recommended deriving the dir from the lang: if lang="he", shouldn't the renderer assume dir="rtl" by default? Apparently, they really didn't want it. Their argument was that it will hurt "small" languages like Azerbaijani and Punjabi, which can be written in both LTR and RTL alphabets. I disagree with this argument, at least as far as these two languages go, because these languages have different codes for the LTR and RTL versions (az/azb, pa/pnb), and we have fairly active Wikipedias in all four of them, but this didn't convince them.

(I did manage to convince them to apply bidi isolation by default to elements that have an explicit dir attribute. My little contribution to the HTML/CSS standard.)

So yeah, it would be nice, but because it's similar to that thing in which I failed, I suspect that it will not be easy to actually get done. But hey, I'm willing to try again! Share a doc or a wiki page with me or something.

But until it succeeds, perhaps we should do something by ourselves, and it will be a real-world proof of concept for them...

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 12:23 PM
Aklapper removed a subscriber: wikibugs-l-list.
Winston_Sung moved this task from Vertical writing to RTL on the I18n board.