VisualEditor: Implement script-specific cursoring for Devanagari if native cursoring is insufficient
Closed, Resolved
Public
8 Estimated Story Points

Assigned To

Authored By

	Mahitgar
	Sep 4 2013, 2:57 PM

Description

This is a language/script-specific diacritic bug, split from T53472: VisualEditor: Backspace deletes combined character clusters together with diacritics

For the Marathi language, written in the Devanagari script, the anuswara diacritic is ं
If one types the anuswara diacritic ं after any Devanagari letter, it is usually placed above that letter. For example, क is a Marathi/Devanagari letter; if I type क + ं I get कं
If I place the cursor before the cluster कं and press Delete, the whole cluster gets deleted. That is the right behaviour and needs to be retained (if the base letter is gone, there is no need for the anuswara diacritic ं).
Problem: In VisualEditor, if I place the cursor after the cluster (e.g. after कं) and press Backspace, the whole cluster gets deleted. Example: कं + Backspace key = the whole cluster is deleted.
Expected behaviour: cluster with anuswara diacritic (कं) + Backspace key = the rest of the cluster is retained (क stays) and only the anuswara diacritic ं is deleted.
Reason: During spelling correction we often need to retain the rest of the cluster and change only the diacritics. For the anuswara diacritic ं, spelling changes are required frequently for various reasons, such as a change between singular and plural forms. Retyping the whole cluster every time is cumbersome. Traditional source editing behaves properly and as expected; the problem occurs only in VisualEditor.
For example, please see this edit difference, in which a user had to change this diacritic in several places: https://mr.wikipedia.org/w/index.php?title=कृष्ण_श्रीनिवास_अर्जुनवाडकर&diff=1198833&oldid=1198826

Since users would be reluctant to use VE without the correct behaviour, I would prefer that this be treated as a bug, not just an enhancement, and be given a fair importance level.

Version: unspecified
Severity: normal

Details

Reference: bz53754

Related Objects

Status	Subtype	Assigned	Task
Open		None	T76456 Language Engineering tracker of trackers (tracking)
Resolved		Nikerabbit	T55014 Input Methods issues for supported languages (tracking)
Open	Release	None	T84936 Release VisualEditor-MediaWiki as "1.0"
Open		None	T57972 VE cursor test invokes ULS IME in Firefox on test2wiki
Invalid		Jdforrester-WMF	T35077 VisualEditor multilingual input / i18n issues (tracking)
Resolved		dchan	T51569 Make ULS input methods work in content editable divs of VisualEditor
Resolved		Jdforrester-WMF	T55754 VisualEditor: Implement script-specific cursoring for Devanagari if native cursoring is insufficient
Resolved		dchan	T53472 VisualEditor: Backspace deletes combined character clusters together with diacritics

Event Timeline

• bzimport raised the priority of this task from to Medium. Nov 22 2014, 1:49 AM

• bzimport added a project: VisualEditor-ContentLanguage.

• bzimport set Reference to bz53754.

Mahitgar created this task. Sep 4 2013, 2:57 PM

Per Bug 51472#c4, grapheme cluster handling for backspace is to be decided on a per-script basis. So this should be treated as the bug specifically for Devanagari.

Also note that I am confirming the bug for Hindi.

To further clarify the original report: Devanagari has various diacritics which can be applied to base Unicode characters. It also has a combining character, the halant (viram) ् (U+094D).

Currently, pressing backspace after a grapheme cluster containing one or more base characters with one or more diacritics and/or combining characters deletes the entire grapheme cluster. This is not the desired behaviour. Pressing delete before a cluster deletes the entire cluster. This is the desired behaviour.

Examples of diacritics: ँ (Chandrabindu) U+0901 ं (Bindu) U+0902 etc.

Examples of grapheme clusters:

One base character with one diacritic: कं ( क + ं ), कँ ( क + ँ ), कः ( क + ः )

One base character with multiple diacritics: किं ( क + ि + ं )

Multiple base characters with halant: श्र ( श + ् + र ), क्ष ( क + ् + ष ), प्र ( प + ् + र )

Multiple base characters with halant followed by diacritics: श्रिं (श + ् + र + ि + ं), क्षि ( क + ् + ष + ि ), प्रे ( प + ् + र + े )

System environment:
Win7 X64
Google Chrome 29.0.1547.62 m
Page used for testing: [[:w:hi:User:Siddhartha Ghai/sandbox]]

Expected behaviour:
Only one diacritic (the last one in the grapheme cluster), i.e. one Unicode character, is to be deleted. The rest of the grapheme cluster is to stay intact.

Examples used (not exhaustive):
Grapheme -> Grapheme after pressing backspace
कं -> क
कँ -> क
कः -> क
क् -> क
किं -> कि
श्र -> श्
क्ष -> क्
प्र -> प्
श्रिं -> श्रि
क्षि -> क्ष
प्रे -> प्र
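
The expected results listed above amount to removing exactly one code point from the end of the text. A minimal sketch in Python, assuming the text before the cursor is available as a string; the name `backspace_one_code_point` is hypothetical and this is not VisualEditor's code:

```python
def backspace_one_code_point(text: str) -> str:
    """Remove only the last Unicode code point, leaving the rest of the cluster."""
    return text[:-1]


# A few of the pairs from the list above, written as escapes to make the
# code points explicit.
EXPECTED = {
    "\u0915\u0902": "\u0915",                    # क + ं          -> क
    "\u0915\u094d": "\u0915",                    # क + ्          -> क
    "\u0915\u093f\u0902": "\u0915\u093f",        # क + ि + ं      -> क + ि
    "\u0936\u094d\u0930": "\u0936\u094d",        # श + ् + र      -> श + ्
    "\u0915\u094d\u0937\u093f": "\u0915\u094d\u0937",  # क + ् + ष + ि -> क + ् + ष
}

for before, after in EXPECTED.items():
    assert backspace_one_code_point(before) == after
```

Python strings are indexed by code point, which is why `text[:-1]` is enough here. All of the characters in these examples are in the Basic Multilingual Plane, so an implementation working on UTF-16 code units would give the same results for them.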

Current behaviour (blank indicates entire grapheme cluster was removed) (these results should be verified on other browser/OS combinations):
कं ->
कँ ->
कः ->
क् ->
किं ->
श्र -> श् (Working correctly)
क्ष -> क् (Working correctly)
प्र -> प् (Working correctly)
श्रिं -> श् (Deletes र + ि + ं , ie three unicode characters instead of one)
क्षि -> क् (Deletes ष + ि , ie two unicode characters instead of one)
प्रे -> प् (Deletes र + े , ie two unicode characters instead of one)
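
These results are consistent with backspace removing one base character together with every combining mark that follows it, with the halant counted as a mark. The sketch below is only a model that reproduces the results in this list; it is an assumption about the behaviour, not the actual VisualEditor or browser code, and the function name is hypothetical:

```python
import unicodedata


def backspace_base_plus_marks(text: str) -> str:
    """Model of the reported behaviour: drop trailing marks, then one base character."""
    end = len(text)
    # Skip backwards over combining marks (general categories Mn, Mc, Me).
    while end > 0 and unicodedata.category(text[end - 1]).startswith("M"):
        end -= 1
    # Remove the base character the marks were attached to, if there is one.
    return text[: max(end - 1, 0)]


# क + ं : the mark and its base are both removed, leaving nothing.
assert backspace_base_plus_marks("\u0915\u0902") == ""
# श + ् + र + ि + ं : र, ि and ं are removed, leaving श + ्.
assert backspace_base_plus_marks("\u0936\u094d\u0930\u093f\u0902") == "\u0936\u094d"
```

Under this model the halant only appears to be handled correctly in letter + halant + letter: the trailing letter carries no marks, so exactly one code point happens to be removed.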

Points to note:
Some IMEs may provide non-normalized input for characters such as फ़ (U+095E) in place of फ (U+092B) + ़ (U+093C), ढ़ (U+095D) in place of ढ (U+0922) + ़ (U+093C) etc. In such cases, the user may expect that pressing a backspace will only eliminate the diacritic, not the entire grapheme. So, VE may have to handle normalization in such cases.
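
To illustrate the normalization point: the single-code-point nukta letters are excluded from composition in Unicode, so NFC normalization turns them into base letter + nukta. A sketch using Python's standard `unicodedata` module follows; the helper name is hypothetical, and whether VE should normalize at all is left open here:

```python
import unicodedata


def backspace_after_normalizing(text: str) -> str:
    """Normalize to NFC, then remove only the last code point (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)[:-1]


FA_PRECOMPOSED = "\u095e"  # फ़ entered as a single code point

normalized = unicodedata.normalize("NFC", FA_PRECOMPOSED)
print([f"U+{ord(c):04X}" for c in normalized])
# ['U+092B', 'U+093C']

# After normalization, backspace can remove the nukta and keep the base letter फ.
assert backspace_after_normalizing(FA_PRECOMPOSED) == "\u092b"
```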

Results seem to indicate that the halant is handled only partially correctly: letter + halant + letter + backspace correctly gives letter + halant, but letter + halant + backspace deletes the entire grapheme instead of leaving the letter.

The remaining diacritics as of Unicode 3.0 come under the general categories Nonspacing Mark (Mn) and Spacing Combining Mark (Mc). (Note: this does not include Devanagari Extended, added in Unicode 6.0, or the Vedic Extensions, added in Unicode 6.1.)
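
The general category of each sign can be looked up rather than hard-coded. A small sketch with Python's `unicodedata` module follows; the helper name and the restriction to the main Devanagari block (U+0900 to U+097F) are assumptions made for illustration:

```python
import unicodedata


def is_devanagari_mark(ch: str) -> bool:
    """True for characters in U+0900..U+097F whose general category is Mn or Mc."""
    return "\u0900" <= ch <= "\u097f" and unicodedata.category(ch) in ("Mn", "Mc")


# Chandrabindu, bindu, visarga, vowel signs i and e, halant, and the letter क.
for ch in "\u0901\u0902\u0903\u093f\u0947\u094d\u0915":
    print(f"U+{ord(ch):04X}", unicodedata.category(ch), is_devanagari_mark(ch))
```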

Retitled for clarity; we're switching back to native cursoring and backspacing, which hopefully will fix this, but keeping this open and distinct in case not.

Jdforrester-WMF added a project: VisualEditor. Nov 27 2014, 12:37 AM

Jdforrester-WMF moved this task from To Triage to Freezer on the VisualEditor board. Nov 27 2014, 12:56 AM

In T55754#549797 in October 2013, @Jdforrester-WMF wrote:

we're switching back to native cursoring and backspacing, which hopefully will fix this

@Mahitgar: Do you know if this is still a problem in Devanagari? (I'm afraid I'm missing input method skills to test this myself.)

I'm provisionally deeming this fixed. We changed this behaviour significantly since this was filed. Please re-open if you find issues.

Jdforrester-WMF added a parent task: T51569: Make ULS input methods work in content editable divs of VisualEditor. Dec 8 2015, 7:28 PM

Jdforrester-WMF moved this task from Freezer to TR3: Language support on the VisualEditor board. Dec 8 2015, 7:44 PM

Jdforrester-WMF edited a custom field.
