Implement LanguageConverter for yue (Cantonese)
Open, Low, Public, Feature

Description

Local discussion: https://zh-yue.wikipedia.org/w/index.php?title=Wikipedia:%E5%9F%8E%E5%B8%82%E8%AB%96%E5%A3%87_%28%E5%8D%94%E5%8A%A9%29&diff=816932&oldid=816911

Current local JavaScript converter: https://zh-yue.wikipedia.org/wiki/MediaWiki:Gadget-trad2sim.js

Draft implementation: Change-Id: Iee936baa0a42370a723b34b09a791bf0917dcdf4


Version: 1.23.0
Severity: enhancement

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 2:14 AM
bzimport set Reference to bz57106.
bzimport added a subscriber: Unknown Object (MLST).

Change 78504 had a related patch set uploaded by Liangent:
Implement LanguageConverter for yue (Cantonese)

https://gerrit.wikimedia.org/r/78504

This comment was removed by C933103.

If I understand correctly, yue Wikipedia is currently using JavaScript to provide transliteration into Hans (Simplified Chinese)? Shouldn't that be done by LanguageConverter too?

This is just requesting such action

Sorry, I just mixed something up in my mind.

Change 78504 had a related patch set uploaded (by Fomafix):
Implement LanguageConverter for yue (Cantonese)

https://gerrit.wikimedia.org/r/78504

Change 78504 had a related patch set uploaded (by Fomafix; owner: Liangent):
[mediawiki/core@master] Implement LanguageConverter for yue (Cantonese)

https://gerrit.wikimedia.org/r/78504

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 12:23 PM
Aklapper removed a subscriber: wikibugs-l-list.
Winston_Sung added a subscriber: shinjiman.

Also, there isn't any consensus in the Yue community on implementing any LanguageConverter functions for the Yue language.
Personally, I would oppose implementing the LanguageConverter functions for the Yue language.

This would not be an enhancement. I have worked at a newspaper publisher before and I know very well the pitfalls of any such language converters; the last thing I’d like to see is the typos typical of automatic language conversion being introduced onto my now-home wiki.

In fact I stopped using zhwiki partly because of the existence of the language converter. This ticket should be closed as invalid.

Change 78504 merged by jenkins-bot:

[mediawiki/core@master] Implement Language Converter for yue (Cantonese)

https://gerrit.wikimedia.org/r/78504

Winston_Sung changed the task status from Open to In Progress. May 21 2023, 4:24 AM
Winston_Sung assigned this task to liangent.

Given the above opposing comments and the various concerns mentioned in https://zh-yue.wikipedia.org/wiki/Wikipedia_talk:用字轉換, I believe it's best to hold off enabling the change on Wikipedia at the very least. Granted, some of the issues are from as early as 2006, and LC has evolved in the meantime; but surely it doesn't hurt to have a local discussion to gauge the community's attitude towards LC in 2023 first.

I believe it's best to hold off enabling the change on Wikipedia at the very least.

Gadget-tradsim-trad2sim.js in https://zh-yue.wikipedia.org/wiki/Special:Gadgets is already enabled by default, and the merged Gerrit change implements the same thing as the gadget: one-way transliteration for "Hant source, Hant + Hans view". This means only Hant can be used in the source text, which is the same as before the merged change.

This only makes the enabled Gadget a core functionality.
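For readers following along, here is a minimal sketch of what "Hant source, Hant + Hans view" means in practice. It is not the gadget's or the converter's actual code: the three-entry table, the function names and the `yue-hans` variant code are all assumptions made for illustration.

```python
# Hypothetical three-entry table; a real conversion table is far larger.
HANT_TO_HANS = {"廣": "广", "東": "东", "話": "话"}


def to_hans(text: str) -> str:
    """Map each Hant character to Hans; characters not in the table pass through."""
    return "".join(HANT_TO_HANS.get(ch, ch) for ch in text)


def render(source: str, variant: str) -> str:
    """Render stored Hant source text for the requested view variant."""
    if variant == "yue-hans":  # assumed variant code for the Hans view
        return to_hans(source)
    # Any other view shows the Hant source unchanged.
    return source


print(render("廣東話", "yue-hans"))
print(render("廣東話", "yue-hant"))
```

The conversion only ever runs at view time and only in one direction, so the stored source stays Hant regardless of which view a reader picks.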

Thanks for the clarification. Assuming the search behaviour remains the same (no normalization to Hans), personally I'm fine with this change. We have an AbuseFilter in place to block all LC syntax so those side effects can be ignored.

This seems to have changed the default UI language to be something other than yue (yue-hant presumably). By default, all interface messages are now shown as their MediaWiki defaults.

Right now no customized interface messages are visible unless users deliberately set their UI language preference to yue

Also, DiscussionTools is now having trouble loading some of the UI messages and is defaulting back to English.

Right now no customized interface messages are visible unless users deliberately set their UI language preference to yue

Might need to copy them into language subpages.

This issue had already been addressed and filed as another task.

Right now no customized interface messages are visible unless users deliberately set their UI language preference to yue

Might need to copy them into language subpages.

Doesn't this go against the goal of this task, that is implementing LC to the extent of reaching feature parity with the JS gadget (and no more)? I don't think anyone consented to this.

This issue had already been addressed and filed as another task.

Do you mind pointing me to the task?

Might need to copy them into language subpages.

I'd say this is actually a side effect which should be fixed.

Fixing as in changing the default language back to yue?

Is there any timeline for when this mess will be fixed? DiscussionTools and MobileFrontend are both failing to serve some of the messages translated to Cantonese. This is unacceptable.

I'll push a temporary fix.

Is there a list of failed messages and some example URLs?

Change 923640 had a related patch set uploaded (by Winston Sung; author: Winston Sung):

[mediawiki/core@master] YueConverter: Temporary fix for custom system message display

https://gerrit.wikimedia.org/r/923640

Before we dive too deep into T229992, I have a question:

Right now no customized interface messages are visible unless users deliberately set their UI language preference to yue

Somehow the default UI language is not yue for some or all users (?). Could you provide an example with screenshots? So that I don't have to try to understand local reports in yue (if any).

The default UI language is now yue-hant instead of yue, and the default yue messages are being shown instead of the locally customized messages.

You can verify this by going to any page on yue.wikipedia.org and setting the UI language to yue and yue-hant and observe the difference.
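A rough model of why the two settings could differ, assuming local customisations are looked up by page title, with `MediaWiki:<Key>` serving the content language and `MediaWiki:<Key>/<code>` serving any other language code. The message key, the texts and the function names below are hypothetical.

```python
CONTENT_LANGUAGE = "yue"

# Hypothetical local customisation, stored only under the content-language title.
LOCAL_PAGES = {"MediaWiki:Example": "locally customised text"}

# Hypothetical software default that ships with the code.
SOFTWARE_DEFAULTS = {"example": "software default text"}


def message_title(key: str, lang: str) -> str:
    """Build the page title that would hold a local override for this language."""
    base = "MediaWiki:" + key[0].upper() + key[1:]
    return base if lang == CONTENT_LANGUAGE else f"{base}/{lang}"


def get_message(key: str, lang: str) -> str:
    """Return the local override if its page exists, else the software default."""
    return LOCAL_PAGES.get(message_title(key, lang), SOFTWARE_DEFAULTS[key])


print(get_message("example", "yue"))
print(get_message("example", "yue-hant"))
```

Under this assumption, a UI language of yue-hant looks for a `/yue-hant` subpage that nobody has created, so the software default is shown instead of the customised text.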

Is there any timeline for when this mess will be fixed? DiscussionTools and MobileFrontend are both failing to serve some of the messages translated to Cantonese. This is unacceptable.

Dynamically loaded messages from the 2010 wikitext editor and ULS are also suffering from similar issues.

I've just put a note on the Technical Village Pump about this change. Let's see if there's any feedback on implementing this LanguageConverter module.

https://zh-yue.wikipedia.org/w/index.php?title=Wikipedia:%E5%9F%8E%E5%B8%82%E8%AB%96%E5%A3%87_(%E6%8A%80%E8%A1%93)&diff=prev&oldid=1989867

Please note that this change will affect Cantonese Wiktionary as well.

It turned out the culprit is the user option manager:

	private function loadOriginalOptions(
		UserIdentity $user,
		int $queryFlags = self::READ_NORMAL,
		array $data = null
	): array {
		$userKey = $this->getCacheKey( $user );
		$defaultOptions = $this->defaultOptionsLookup->getDefaultOptions();
		$isTempUser = $this->userNameUtils->isTemp( $user->getName() );
		if ( !$user->isRegistered() || $isTempUser ) {
			// For unlogged-in users, load language/variant options from request.
			// There's no need to do it for logged-in users: they can set preferences,
			// and handling of page content is done by $pageLang->getPreferredVariant() and such,
			// so don't override user's choice (especially when the user chooses site default).
			$variant = $this->languageConverterFactory->getLanguageConverter()->getDefaultVariant();
			$defaultOptions['variant'] = $variant;
			$defaultOptions['language'] = $variant;
			$this->originalOptionsCache[$userKey] = $defaultOptions;
			return $defaultOptions;
		}

But this should only happen for logged-out users, @H78c67c could you confirm?
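To make the effect of that branch easier to see, here is a simplified model of the excerpt above. It is a sketch, not the MediaWiki code: the site defaults and all names are assumptions, and it only covers the path shown in the excerpt.

```python
# Assumed site-wide defaults before any per-request override.
SITE_DEFAULTS = {"language": "yue", "variant": "yue"}


def load_original_options(is_registered: bool, is_temp: bool, default_variant: str) -> dict:
    """Mirror the branch in the excerpt: anonymous and temporary users get
    both 'variant' and 'language' overwritten with the converter's default variant."""
    options = dict(SITE_DEFAULTS)
    if not is_registered or is_temp:
        options["variant"] = default_variant
        options["language"] = default_variant
    return options


print(load_original_options(False, False, "yue-hant"))
print(load_original_options(True, False, "yue-hant"))
```

In this model, once the converter's default variant is yue-hant, every anonymous request ends up with yue-hant as its interface language, while registered users keep the site default.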

This happens for anonymous users, but I just created a new account and the default is yue-hant too.

So, right now there are two issues:

  1. Some messages loaded after the initial page load fall back to English
  2. All customised messages are being disregarded by default

If there isn't going to be a fix to both of these issues in the foreseeable future, can these changes be rolled back? It's ridiculous how little forethought there was and how little progress there is to make things right.

So, right now there are two issues:

  1. Some messages loaded after the initial page load fall back to English

Are there any examples? I never encountered this issue. If there are any, we should encounter them for zh-tw too (as zh-tw messages are stored under zh-hant).

  2. All customised messages are being disregarded by default

It's the same issue for zh wikis.

The proposed fix is demonstrated here: T229992#8892837 https://patchdemo.wmflabs.org/wikis/1208833be0/wiki/

So, right now there are two issues:

  1. Some messages loaded after the initial page load fall back to English

Are there any examples? I never encountered this issue. If there are any, we should encounter them for zh-tw too (as zh-tw messages are stored under zh-hant).

I can only reproduce this intermittently, but for example when trying to add a new topic on talk pages through DiscussionTools, the placeholder for the Title field (which should be 標題) is sometimes shown in the original English form. And when creating new pages with the 2010 editor, the toolbar items (進階, 特別字元, 幫手 and 引用) are sometimes shown as "Advanced", "Special characters", "Help" and "Cite".

I do want to note that I was able to consistently reproduce this issue on the first two or three days after the changes went live. It occurs less frequently now.

Looks like a system message caching issue.

Another issue that's been bugging me is that bare URLs in messages are now shown in this LanguageConverter syntax: -{R|https://example.org/}- (which apparently means "raw content"), even though it's not present in the message sources. I can't find any further info on how to get rid of it.

I think it's a known issue, but couldn't find the Phabricator task.

Can we just acknowledge that this current implementation of the yue LanguageConverter is a bad idea? This was pushed forward despite many existing issues with LanguageConverter and community opposition, while bringing negligible benefit to the project. (I'd even argue that the issues outweigh the benefits.) There is so little demand for the feature to begin with that you'd even have to dig up a 10-year-old thread on a defunct discussion page to justify this, and as far as I am aware the existing JavaScript gadget already covers most use cases. Also, to some extent it's more useful than LC, since it does not split control over the interface and content script variants into separate settings as LC does.

Change 929447 had a related patch set uploaded (by Winston Sung; author: Winston Sung):

[mediawiki/core@master] Revert "Implement Language Converter for yue (Cantonese)"

https://gerrit.wikimedia.org/r/929447

Change 923640 had a related patch set uploaded (by Winston Sung; author: Winston Sung):

[mediawiki/core@master] YueConverter: Temporarily switch "yue-Hant" language variant to "yue"

https://gerrit.wikimedia.org/r/923640

Revert submitted.

Re-implementation should be done after the major issues have been resolved.

Change 923640 abandoned by Winston Sung:

[mediawiki/core@master] YueConverter: Temporarily switch "yue-Hant" language variant to "yue"

Reason:

Replaced-by: I935cc23cbc2838c4338c5fb2220d8ec4cfb750a9

https://gerrit.wikimedia.org/r/923640

Change 929447 merged by jenkins-bot:

[mediawiki/core@master] Revert "Implement Language Converter for yue (Cantonese)"

https://gerrit.wikimedia.org/r/929447

Change 929647 had a related patch set uploaded (by Winston Sung; author: Winston Sung):

[mediawiki/core@wmf/1.41.0-wmf.13] Revert "Implement Language Converter for yue (Cantonese)"

https://gerrit.wikimedia.org/r/929647

Change 929647 merged by jenkins-bot:

[mediawiki/core@wmf/1.41.0-wmf.13] Revert "Implement Language Converter for yue (Cantonese)"

https://gerrit.wikimedia.org/r/929647

Mentioned in SAL (#wikimedia-operations) [2023-06-15T14:19:46Z] <lucaswerkmeister-wmde@deploy1002> Started scap: Backport for [[gerrit:929647|Revert "Implement Language Converter for yue (Cantonese)" (T59106 T337527)]]

Mentioned in SAL (#wikimedia-operations) [2023-06-15T14:21:11Z] <lucaswerkmeister-wmde@deploy1002> lucaswerkmeister-wmde and wsung: Backport for [[gerrit:929647|Revert "Implement Language Converter for yue (Cantonese)" (T59106 T337527)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet

Mentioned in SAL (#wikimedia-operations) [2023-06-15T14:29:39Z] <lucaswerkmeister-wmde@deploy1002> Finished scap: Backport for [[gerrit:929647|Revert "Implement Language Converter for yue (Cantonese)" (T59106 T337527)]] (duration: 09m 53s)

H78c67c changed the task status from In Progress to Stalled. Jun 17 2023, 1:08 AM
H78c67c lowered the priority of this task from Medium to Low.

I believe this new status best reflects the total lack of community support and the blocking issues with LanguageConverter that prevent this feature from being useful or beneficial.

Winston_Sung changed the task status from Stalled to Open. Jun 17 2023, 3:44 AM

Another issue that's been bugging me is that bare URLs in interface messages are now shown in this LanguageConverter syntax: -{R|https://example.org/}- (which apparently means "raw content"), even though it's not present in the message sources. I can't find any further info on how to get rid of it.

Can you give some steps to reproduce it (on wikis with converters)?