
Add Vietnamese (vi) input method
Open, Low, Public, Feature

Description

(In reply to Minh Nguyễn from bug 5309 comment #45)

many Vietnamese Wikipedia users rely on an IME script embedded via
a gadget, but gadgets are disabled at [[Special:UserLogin/signup]]. We’d
need to port the (rather complex) IME to ULS to keep the signup form
accessible.

We definitely should! It shouldn't be hard: jquery.ime only needs a comma-separated list of pairs that map input keys to result characters.
https://github.com/wikimedia/jquery.ime/wiki/Technical-Specification#input-method-definition
If you can't prepare such a table, could you ask other Vietnamese users?
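
For illustration, such a table might look roughly like the sketch below. The `id`, `name` and `patterns` fields follow my reading of the linked specification; the `vi-telex-sketch` id, the handful of Telex-like pairs, and the `typeKeys` helper are all hypothetical. The matcher is a simplified stand-in, not jquery.ime's actual engine, and the table is far too small to be a usable Vietnamese input method.

```javascript
// Hypothetical, deliberately tiny table of "input keys -> result character" pairs.
const viSketch = {
  id: "vi-telex-sketch", // made-up id, for illustration only
  name: "Vietnamese Telex (sketch)",
  patterns: [
    ["aa", "â"],
    ["ee", "ê"],
    ["oo", "ô"],
    ["dd", "đ"],
    ["aw", "ă"],
    ["ow", "ơ"],
    ["uw", "ư"],
  ],
};

// Simplified stand-in for a table-driven engine: after every keystroke,
// replace the end of the text if it matches one of the input sequences.
function typeKeys(def, keys) {
  let text = "";
  for (const key of keys) {
    text += key;
    for (const [input, output] of def.patterns) {
      if (text.endsWith(input)) {
        text = text.slice(0, -input.length) + output;
        break;
      }
    }
  }
  return text;
}
```

With this sketch, `typeKeys(viSketch, "ddaau")` gives "đâu", because each pair is applied as soon as its two keys are typed next to each other.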


Version: master
Severity: enhancement
See Also:
https://github.com/wikimedia/jquery.ime/issues/342

Details

Reference
bz63465

Event Timeline

bzimport raised the priority of this task from to Low. Nov 22 2014, 3:09 AM
bzimport set Reference to bz63465.
bzimport added a subscriber: Unknown Object (MLST).

Unfortunately, it isn't quite that simple. Vietnamese users expect IME-like functionality, not just a keyboard layout. For instance, with the [[Telex (IME)]] input method, "Vietje" should become "Việt", "truongwf" should become "trường", and if you're indecisive enough, "masfsfsfsfrxj" should become "mạ". "Wasshington" should become "Washington" rather than "Wáington" or "Wahíngton".
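
To make the difference from a plain key table concrete, here is a minimal sketch of word-level Telex handling that reproduces the four examples above. Everything in it is an assumption of mine rather than how AVIM or any other IME works: the `typeTelex` name, the rule that a repeated tone key cancels the tone and is typed literally, and the very simplified tone placement (which happens to give the "xóa" style). It ignores "dd", the "qu"/"gi" cases and real spelling checks.

```javascript
// Telex tone keys mapped to Unicode combining marks.
const TONES = { s: "\u0301", f: "\u0300", r: "\u0309", x: "\u0303", j: "\u0323" };
const VOWELS = "aăâeêioôơuưy";
const MODIFIED = "ăâêôơư";

const isVowel = (ch) => VOWELS.includes(ch.toLowerCase());

// Attach a combining mark to the character at `index` and recompose it.
function mark(text, index, combining) {
  const composed = (text[index] + combining).normalize("NFC");
  return text.slice(0, index) + composed + text.slice(index + 1);
}

// Locate the first run of vowels; start === end means the word has none.
function vowelCluster(base) {
  let start = 0;
  while (start < base.length && !isVowel(base[start])) start++;
  let end = start;
  while (end < base.length && isVowel(base[end])) end++;
  return { start, end };
}

// Simplified placement: prefer a modified vowel (ê, ơ, ...), otherwise the
// last vowel before a final consonant, otherwise the second-to-last vowel.
function toneIndex(base) {
  const { start, end } = vowelCluster(base);
  if (start === end) return -1;
  for (let i = end - 1; i >= start; i--) {
    if (MODIFIED.includes(base[i].toLowerCase())) return i;
  }
  if (end - start === 1) return start;
  return end < base.length ? end - 1 : end - 2;
}

function typeTelex(keys) {
  let base = ""; // letters typed so far, without the tone mark
  let tone = ""; // current tone key, or "" for no tone
  for (const key of keys) {
    const k = key.toLowerCase();
    const { start, end } = vowelCluster(base);
    const cluster = base.slice(start, end).toLowerCase();
    if (k in TONES && start < end) {
      if (tone === k) {
        tone = ""; // assumption: repeating a tone key cancels it...
        base += key; // ...and the key is then typed literally
      } else {
        tone = k; // a different tone key replaces the earlier one
      }
    } else if ("aeo".includes(k) && cluster.includes(k)) {
      // A repeated a/e/o adds a circumflex, even across consonants.
      base = mark(base, start + cluster.lastIndexOf(k), "\u0302");
    } else if (k === "w" && cluster.includes("uo")) {
      const i = start + cluster.indexOf("uo");
      base = mark(mark(base, i, "\u031B"), i + 1, "\u031B"); // uo -> ươ
    } else if (k === "w" && /[aou]/.test(cluster)) {
      const i = start + cluster.search(/[aou]/);
      base = mark(base, i, base[i].toLowerCase() === "a" ? "\u0306" : "\u031B");
    } else {
      base += key;
    }
  }
  const i = tone ? toneIndex(base) : -1;
  return i < 0 ? base : mark(base, i, TONES[tone]);
}
```

Even this toy version has to track the whole word, because the "j" in "Vietje" and the "w" in "truongwf" act on letters typed several keystrokes earlier.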

Much of this can be accomplished with regular expressions, but it might be easier to shoehorn an existing Vietnamese IME into jquery.ime. AVIM.js is very reliable and is available under an MIT license, but no one really understands the core code. bogo.js [1] is much cleaner but much more limited and only available under the MPL. Mudim [2] is a bit cleaner and is GPL2-licensed but has its own limitations, like an input buffer (not ideal for Vietnamese) and poor support for VIQR.

We'll definitely need support for a few input method-specific preferences, particularly tone mark placement (xóa vs. xoá) and spelling rule enforcement (Wahíngton vs. Washington).

AVIM works pretty well with VisualEditor and other features, so on the Wikimedia wikis, the only real benefit to ULS integration would be getting an IME on Special: pages. (The current workaround is to install a system IME or browser extension, or to use a bookmarklet.) I wouldn't recommend adopting a simplistic keyboard layout even as a stopgap measure, because then it'd be forced upon the Vietnamese wikis and literally no one would want to use it.

Full disclosure: I maintain a Firefox extension based on AVIM.js [3], but these comments are more from the perspective of a Vietnamese Wikipedian.

[1] https://github.com/lewtds/bogo.js
[2] https://code.google.com/p/mudim/
[3] http://avim.1ec5.org/en/

Forgot to link [[vi:MediaWiki:Gadget-AVIM.js]] and [[vi:MediaWiki:Gadget-AVIM portlet.js]].

I'd be happy to try and port AVIM to jquery.ime, but it may take a while. Fortunately, most of the complexity relates to text selection handling, which should be handled by jquery.ime already.

(In reply to Minh Nguyễn from comment #2)

I'd be happy to try and port AVIM to jquery.ime, but it may take a while.

That would be wonderful! Please assign this bug to yourself when you start working on it.

Fortunately, most of the complexity relates to text selection handling,
which should be handled by jquery.ime already.

Nice. If other features are missing and block the port, please file them as soon as possible at https://github.com/wikimedia/jquery.ime/issues .

(In reply to Nemo from comment #3)

(In reply to Minh Nguyễn from comment #2)

I'd be happy to try and port AVIM to jquery.ime, but it may take a while.

That would be wonderful! Please assign this bug to yourself when you start
working on it.

Indeed, this would be wonderful. I'll gladly help with that. It shouldn't be very hard. jquery.ime has pretty good docs for developers. Please contact me if you need any assistance.

As noted in T286863#7287345, @MikePlantilla has begun work on a jquery.ime-based reimplementation of the Vietnamese IME.

In case it helps, the AVIM Firefox extension’s repository has a battery of automated tests that ensures the entire corpus of words in the Free Vietnamese Dictionary Project can be typed using a variety of tone mark placements. I put together this integration test because the core AVIM code is difficult to reason about and impossible to unit test, but it would be applicable to any IME implementation. Feel free to rig it to test your implementation.
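
As a rough idea of what rigging such a test could look like, here is a minimal sketch. It is not the extension's actual test code: `toTelexKeys` and `findUntypeable` are hypothetical names, the `ime` argument stands for whatever function turns a key sequence into text in your implementation, and only one Telex spelling per word is generated (tone key at the end of the word), whereas the real suite covers several tone mark placements.

```javascript
// Combining marks, as they appear after NFD decomposition, mapped to Telex keys.
const SHAPE_KEYS = { "\u0306": "w", "\u031B": "w" }; // breve, horn
const TONE_KEYS = { "\u0301": "s", "\u0300": "f", "\u0309": "r", "\u0303": "x", "\u0323": "j" };

// Hypothetical helper: derive one possible Telex key sequence for a word.
function toTelexKeys(word) {
  let keys = "";
  let tone = "";
  let lastBase = "";
  for (const ch of word.normalize("NFD")) {
    if (ch === "\u0302") {
      keys += lastBase; // circumflex: repeat the vowel (e -> ee)
    } else if (ch in SHAPE_KEYS) {
      keys += SHAPE_KEYS[ch];
    } else if (ch in TONE_KEYS) {
      tone = TONE_KEYS[ch]; // typed once, at the end of the word
    } else if (ch === "đ" || ch === "Đ") {
      keys += ch === "đ" ? "dd" : "DD"; // đ has no canonical decomposition
    } else {
      keys += ch;
      lastBase = ch;
    }
  }
  return keys + tone;
}

// Return the words that `ime` fails to reproduce from their key sequence.
function findUntypeable(words, ime) {
  return words.filter(
    (word) => ime(toTelexKeys(word)).normalize("NFC") !== word.normalize("NFC")
  );
}
```

Feeding a dictionary word list through `findUntypeable` with the implementation under test would then list every word that cannot be typed with this key order.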

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:13 AM
Aklapper removed a subscriber: wikibugs-l-list.