
Special case Traditional and Simplified Chinese in all language handling
Closed, Resolved | Public | 3 Estimated Story Points

Description

Background: Written Chinese has two variants, Traditional Chinese (used in Taiwan, Hong Kong and Macau) and Simplified Chinese (used in China and elsewhere). Beyond that, there are some usage differences within the same variant. The Chinese Wikipedia supports variants and adjusting for usage differences using the LanguageConverter feature in MediaWiki. Editors can add text in their preferred variant (and articles would frequently contain both), which is then converted on display based on user preferences or browser settings.

Observed behaviour: When device language is English and the user is not logged in, zhwiki is always shown without converting the variant.

Expected behaviour: There should be a way to choose the variant shown.

Steps to reproduce:

  1. Go to the Settings app
  2. Go to Language & input
  3. Change the device language to English (United States)
  4. Open the Wikipedia app
  5. Tap on the menu icon and choose More
  6. Set Wikipedia language to Chinese
  7. All pages are displayed without variant conversion

This is also the case when the user views a page in Chinese using the "Read in another language" feature. Note that variant conversion does happen when the device language is set to Traditional Chinese or Simplified Chinese, or if the user is logged in and has a variant preference set on zhwiki.

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 3:04 AM
bzimport set Reference to bz60743.
bzimport added a subscriber: Unknown Object (MLST).

Minor correction: The current entry for Chinese loads the source text (which could be a mix of both, depending on who edited the article) without converting it. It does not load Traditional Chinese by default.

Thanks for the clarification :) But I suppose the proposed solution (special case the two variants) is acceptable?

Yes. For the time being I don't think supporting showing the unconverted text or the other regionalised variants (zh-tw, -cn, -hk, -mo, -sg) would make for better UX.

Note that some other languages use the same language converter system such as sr; we may as well throw them all in the same special-casing system so we don't have to do more work later.

I'd recommend hardcoding a list of known languages-with-variants, and displaying the variants as multiple adjacent items in the main language list.

Then we use the combined language + variant to fetch from the right wiki (language in URL) and variant (usevariant=).
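A minimal sketch of the hard-coded list idea, assuming the language list stores one combined code per entry. The class name, method names and table contents here are hypothetical and are not taken from the app's code. A lookup table is used rather than splitting on the hyphen, because some wiki language codes (zh-min-nan, for example) contain hyphens without being variants.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: resolve a language-list entry to a wiki subdomain plus an optional variant. */
final class VariantRouting {
    // Hypothetical hard-coded table: variant code -> language code of the wiki that serves it.
    private static final Map<String, String> VARIANT_TO_BASE = new HashMap<>();
    static {
        VARIANT_TO_BASE.put("zh-hans", "zh");
        VARIANT_TO_BASE.put("zh-hant", "zh");
        VARIANT_TO_BASE.put("sr-ec", "sr");
        VARIANT_TO_BASE.put("sr-el", "sr");
    }

    /** Language code used in the URL; codes that are not known variants pass through unchanged. */
    static String wikiLanguage(String code) {
        String base = VARIANT_TO_BASE.get(code);
        return base != null ? base : code;
    }

    /** Variant to request, or null when the entry is a plain language. */
    static String variantOrNull(String code) {
        return VARIANT_TO_BASE.containsKey(code) ? code : null;
    }

    static String host(String code) {
        return wikiLanguage(code) + ".wikipedia.org";
    }
}
```

With this table, "zh-hant" resolves to the host zh.wikipedia.org with variant "zh-hant", while "zh-min-nan" is left alone and resolves to its own wiki.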

Hmmm, how will this complicate history & bookmarking? Should we combine them in the entries?

Deskana claimed this task.

This bug has insufficient context for us to act on it and is very confusing, so I am marking it as invalid.

Please provide more specific instructions on exactly what the problem is, how to reproduce it, and give suggestions for possible solutions.

Thanks!

This also comes up quite regularly in the app store reviews.

This is my understanding of the current situation:
Some improvements have already been made since this task was originally created.
We now have code in the WikipediaApp class that special-cases "zh" and sets the language variant as part of the Accept-Language header when requesting page content from the server.

Let's say user A has set her device to zh-TW (Chinese in Taiwan) and looks at a Chinese page then the page would be displayed in Traditional Chinese. The UI would also be displayed in Traditional Chinese. So far so good.

User B from mainland China, who has set his device to zh-CN, is also happy since he gets Simplified Chinese for both pages and UI.

The problem appears for user C, who has moved away from Taiwan and has set his device to English or any other language than Chinese. When he looks at a Chinese page he would always get Simplified Chinese for the page and the UI. There is no way for him to specify within the app that he wants to see everything in Traditional Chinese. The workaround would be to switch his device to zh-TW or zh-HK. Depending on the device model, that may or may not be easy to do. There is the MoreLocale 2 app, which lets you choose almost any language, but it is very cumbersome to switch back and forth if you want to keep your device in English.

Possible solutions:
One possible solution would be to split the "Chinese" entry into two separate entries: "Simplified Chinese" and "Traditional Chinese", so that the page language can be requested in the correct language variant. Either that, or have a submenu, or a toggle somewhere else in the settings. Currently I tend to favor the first approach since it requires no direct UI change.

In our languages_list.xml (and therefore generate_wiki_languages.py) we could make some modifications to allow for an optional variant parameter. Or we could hard-code that in LanguagePreference.java.

When the user selects a language with a variant parameter, the app would store the preferred language variant in another SharedPreference entry (probably one per base language). The code in WikipediaApp that special-cases the handling for zh would also have to be modified to use the SharedPreference value instead.
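The "one entry per base language" storage could look roughly like the sketch below. It uses a plain in-memory map as a stand-in for Android's SharedPreferences so that it runs anywhere. The key prefix, class name and method names are assumptions made for illustration, not the app's actual identifiers.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch: remember the preferred variant separately for each base language. */
final class VariantPreferences {
    // Hypothetical key prefix; one key per base language, e.g. "languageVariant_zh".
    private static final String KEY_PREFIX = "languageVariant_";

    // In-memory stand-in for a SharedPreferences file.
    private final Map<String, String> store = new HashMap<>();

    void setVariant(String baseLanguage, String variant) {
        store.put(KEY_PREFIX + baseLanguage, variant);
    }

    /** Returns the stored variant, falling back to the base language when none was chosen. */
    String getVariant(String baseLanguage) {
        String variant = store.get(KEY_PREFIX + baseLanguage);
        return variant != null ? variant : baseLanguage;
    }
}
```

Keeping one key per base language means a choice made for zh would not affect a later choice for another language that uses variants.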

Is this no longer an issue? I'm confused what context this bug is lacking. I no longer use an Android device, so I can't really help, but I'm sure there are Android users who could help clarify anything the devs need to resolve the issue. If this is a wontfix because viewing zhwiki on a non-Chinese-language device is considered too much of an edge case, could that be made clear?

(The way MobileFrontend handles the issue is that they show a single entry for Chinese in the language list, but once you're viewing the article in Chinese, clicking "Read in another language" shows all the Chinese variants as options at the top, above the list of other languages.)

Note that MoreLocale 2 is a lot more complicated to use now: you would have to root your device or have developer access. (See http://droider.eu/2013/11/03/how-to-get-morelocale2-to-work-on-4-2-and-above/)

Deskana changed the task status from Open to Stalled. Apr 13 2015, 9:47 PM

Is this no longer an issue? I'm confused what context this bug is lacking. I no longer use an Android device, so I can't really help, but I'm sure there are Android users who could help clarify anything the devs need to resolve the issue. If this is a wontfix because viewing zhwiki on a non-Chinese-language device is considered too much of an edge case, could that be made clear?

I closed this bug because we do not have clear steps to reproduce the problem. This is still the case. Can you please look at the recommended information for a bug and update the request? Thanks. https://www.mediawiki.org/wiki/Wikimedia_Apps/Team/Bug_reporting#Description

wctaiwan changed the task status from Stalled to Open. Apr 21 2015, 9:37 PM
wctaiwan updated the task description.

I've edited the task description. Let me know if any additional clarification is needed. (I'm testing this using a VirtualBox VM running Android.)

For reference, MobileFrontend handles this by showing a list of variants above the list of languages when the user clicks "Read in another language" on zhwiki, as in this screenshot:

Capture.png (632×460 px, 75 KB)

There is a bit of back and forth here. I can find some time to discuss this with Deskana today and submit some recommendations if there is UX impact.

Hello, please excuse my ignorance in asking some questions. Is this an issue with app wide language settings, article language settings, or both? Do we want to mimic mobile frontend? For reference, here's our current implementation for app wide settings:

2015-04-29-13-29-58-196727488.png (1×977 px, 367 KB)

Although this screenshot only shows Chinese and Classical Chinese, without the filter I also see some entries mentioned here including Wu, Gan, Hakka, and Cantonese. Here's the article of the day in Hakka:

2015-04-29-13-38-22-438979157.png (1×977 px, 546 KB)

Similarly, the article of the day in Cantonese:

2015-04-29-13-41-33-300197801.png (1×977 px, 768 KB)

I apologize for asking stupid questions, but the actions required for this task cannot be over communicated due to the language barrier.

After some discussion with Bernd and Michael:

The user-visible acceptance criteria are to replace "Chinese" with "Chinese (Simplified)" and "Chinese (Traditional)". These should work like any of the other language choices in the app-wide language or article language settings. The language choice should take effect for articles. Additional UI effects, such as for the edit summary, are not considered in scope.

Technical detail:

  • Ideally, Android will accept zh-Hans (simplified) and zh-Hant (traditional). If so, this would largely only require updates to languages_list.xml. If Android does not accept script designators, some hackery to store this extra information may be necessary.
  • Try to get rid of WikipediaApp.getAcceptLanguage().

Non-Chinese reader tips:

  • Simplified: strokes are simpler and akin to Latin sans-serif fonts ("without feet").
  • Traditional: strokes are more elaborate and akin to Latin serif fonts.

The quoted proposed solution sounds fine. It might be better if it still has a single Chinese entry for logged in users and respects the user preference set on zhwiki, but the added complexity is probably not worth it.

without the filter I also see some entries mentioned here including Wu, Gan, Hakka, and Cantonese

Those are distinct Wikipedias and it looks like none of them use LanguageConverter, so they don't require special handling.

Non-Chinese reader tips:

  • Simplified: strokes are simpler and akin to Latin sans-serif fonts ("without feet").
  • Traditional: strokes are more elaborate and akin to Latin serif fonts.

That's actually not quite true--there are sans-serif and serif fonts for both, just as there are for English. Those are probably system defaults. The simple/elaborate strokes distinction is accurate.

Note that it's crucial for the editing view not to convert the source text, regardless of the chosen variant (the current behaviour is correct). This is because changing a page to use one's preferred variant is considered vandalism on zhwiki. This probably isn't something that needs to be special-cased (I don't think there's a way to fetch the source text in a specific variant), but I'm mentioning just in case.

@bearND, @wctaiwan, apologies, I'm still struggling with the desired behavior. There is no zh-hans / zh-hant .wikipedia.org subdomain but the following endpoints exist:

Chinese: https://zh.wikipedia.org/wiki/%E5%86%B0%E6%B7%87%E6%B7%8B
Simplified: https://zh.wikipedia.org/zh-hans/%E5%86%B0%E6%B7%87%E6%B7%8B
Traditional: https://zh.wikipedia.org/zh-hant/%E5%86%B0%E6%B7%87%E6%B7%8B

The current behavior is to use zh.wikipedia.org and adjust the Accept-Language header. Do we want different endpoints, headers, or both? I believe the desired behavior is to use Accept-Language headers for all locales.
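For comparing hans against hant output, the three URL shapes listed above can be generated with a small helper. This is only a testing sketch: the class and method names are hypothetical, and URLEncoder performs form encoding, which is sufficient for titles like the one above but would also escape characters such as "/".

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch: build the unconverted and per-variant page URLs for one title. */
final class VariantUrls {
    /** A null variant yields the plain /wiki/ path; otherwise the variant is the path prefix. */
    static String pageUrl(String wikiLanguage, String variant, String title) {
        String pathPrefix = (variant == null) ? "wiki" : variant;
        return "https://" + wikiLanguage + ".wikipedia.org/" + pathPrefix + "/" + encodeTitle(title);
    }

    private static String encodeTitle(String title) {
        try {
            // MediaWiki titles use underscores in place of spaces.
            return URLEncoder.encode(title.replace(' ', '_'), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 support is mandatory on every JVM
        }
    }
}
```

Calling pageUrl("zh", "zh-hant", title) with the title from the links above reproduces the Traditional URL, and passing null as the variant reproduces the plain Chinese one.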

Niedzielski changed the task status from Open to Stalled. May 5 2015, 12:34 AM

@Niedzielski I'd prefer to just set the Accept-Language header appropriately and avoid using different endpoints. I think the links you have are good for testing, though, to compare hans vs. hant.
The Android resource system doesn't know hans/hant, but that doesn't matter for the Accept-Language header value (string) we build ourselves.
As I mentioned in a meeting with you earlier, at least while developing (and hopefully later, too) you should disable some of the complicated logic in WikipediaApp that sets the Accept-Language header, in favor of a value that comes from a SharedPreference or similar. That preference in turn gets set when the user selects a distinct language variant from the languages list ("Chinese" -> "Simplified Chinese", plus a new entry for "Traditional Chinese").
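Since the header value is just a string the app builds itself, deriving it from the selected list entry can be quite small. The sketch below is an assumption about how that might look, not the logic currently in WikipediaApp: the class name, method name and the q=0.9 fallback weight are all hypothetical.

```java
/** Sketch: derive an Accept-Language value from the selected language-list entry. */
final class AcceptLanguage {
    /**
     * selectedCode is the entry chosen in the language list (e.g. "zh-hant");
     * baseLanguage is the language of the wiki it belongs to (e.g. "zh").
     */
    static String headerFor(String selectedCode, String baseLanguage) {
        if (selectedCode.equals(baseLanguage)) {
            return baseLanguage; // no variant selected
        }
        // Prefer the variant, then fall back to the base language (hypothetical weight).
        return selectedCode + "," + baseLanguage + ";q=0.9";
    }
}
```

For the "Traditional Chinese" entry this yields "zh-hant,zh;q=0.9", independent of the device locale.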

Also, keep in mind that later we will also add language variant handling for a few more languages that use Cyrillic script. The Chinese variants are the more pressing ones right now, since more people are affected by them.

Change 210101 had a related patch set uploaded (by Niedzielski):
Add Simplified and Traditional Chinese dialects

https://gerrit.wikimedia.org/r/210101

Change 210101 merged by Dbrant:
Add Simplified and Traditional Chinese dialects

https://gerrit.wikimedia.org/r/210101

@Vibhabamba ping: do you think design needs to sign off on this, or can we move to ready for product signoff?