Author: gangleri
Description:
Hello!
To my knowledge, the built-in editor can handle "magical character conversion",
which is activated for wikis with content language Esperanto.
This magical character conversion works as follows. The characters Ĉ, Ĝ, Ĥ, Ĵ,
Ŝ, Ŭ, ĉ, ĝ, ĥ, ĵ, ŝ, ŭ are stored in the database but are displayed as Cx, Gx,
Hx, Jx, Sx, Ux, cx, gx, hx, jx, sx, ux in the browser. See
[[eo:Vikipedio:Bugzilla_1512#Notes]].
*note* :eo: also has an "escape notation", such as displaying Cxx for a stored Cxx,
CxX for a stored CxX, etc.
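A minimal sketch of such a conversion, to make the idea concrete. This is not the code used on :eo:. The function names are hypothetical, and the escape rule is an assumption: every x that follows one of the twelve letters is doubled for display, so under this assumed rule a stored Cx would appear as Cxx. The real rule on :eo: may differ in detail.

```python
import re

PLAIN = "CGHJSUcghjsu"
HATTED = "ĈĜĤĴŜŬĉĝĥĵŝŭ"
TO_PLAIN = dict(zip(HATTED, PLAIN))
TO_HATTED = dict(zip(PLAIN, HATTED))

# A convertible letter (plain or accented) followed by any run of x/X.
_STORED = re.compile("([" + PLAIN + HATTED + "])([xX]*)")
# In the editor only the plain letters take part in the notation.
_EDITOR = re.compile("([" + PLAIN + "])([xX]*)")


def to_editor(stored):
    """Database text -> editor text (assumed rule: double every following x)."""
    def repl(m):
        letter, run = m.group(1), m.group(2)
        doubled = "".join(ch * 2 for ch in run)
        if letter in TO_PLAIN:
            # Accented letter: plain letter plus one marker x.
            return TO_PLAIN[letter] + "x" + doubled
        return letter + doubled
    return _STORED.sub(repl, stored)


def to_stored(edited):
    """Editor text -> database text (inverse of to_editor)."""
    def repl(m):
        letter, run = m.group(1), m.group(2)
        if len(run) % 2 == 1:
            # Odd run: the first x is the marker, the rest are doubled x's.
            return TO_HATTED[letter] + run[1::2]
        # Even run: only doubled (escaped) x's, the letter stays plain.
        return letter + run[::2]
    return _EDITOR.sub(repl, edited)


print(to_editor("ŝanĝo"))    # sxangxo
print(to_stored("sxangxo"))  # ŝanĝo
print(to_editor("Cx"))       # Cxx
```

Counting the x's is what keeps the round trip lossless: an odd run means "accented letter", an even run means "plain letter followed by literal x's".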
*Requirement of this bug report:*
It should be possible to set up the character conversion
- character by character
and / or
- by *include* range
and / or
- by *exclude* range
Examples:
If you want to distinguish between "minus" = "-" and &ndash; &#8211; &#x2013; = "–"
see: Unicode Character EN DASH - U+2013
There should be a syntax so that
"–" is shown as &ndash; in the editor but saved as "-".
There should be a syntax to match the "existing" magical character conversion in
Esperanto. If such a syntax or configuration were available, users could activate
it as they choose in their monobook definition.
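The three kinds of rule requested above (character by character, include range, exclude range) could be expressed roughly as follows. This is only a sketch: `ConversionRules` and its fields are hypothetical names, nothing like this exists in MediaWiki, and the precedence (exclude wins, then the per-character map, then include ranges) is just one possible choice.

```python
from dataclasses import dataclass, field


@dataclass
class ConversionRules:
    """Hypothetical rule set for what the editor shows instead of a stored character."""
    char_map: dict = field(default_factory=dict)  # stored character -> editor notation
    include: list = field(default_factory=list)   # (first, last) code points shown as &#nnnn;
    exclude: list = field(default_factory=list)   # (first, last) code points never converted

    @staticmethod
    def _in_ranges(cp, ranges):
        return any(first <= cp <= last for first, last in ranges)

    def to_editor(self, stored):
        out = []
        for ch in stored:
            cp = ord(ch)
            if self._in_ranges(cp, self.exclude):
                out.append(ch)                    # excluded: always shown as-is
            elif ch in self.char_map:
                out.append(self.char_map[ch])     # character-by-character rule
            elif self._in_ranges(cp, self.include):
                out.append("&#%d;" % cp)          # whole range in numeric notation
            else:
                out.append(ch)
        return "".join(out)


rules = ConversionRules(
    char_map={"\u2013": "&ndash;"},   # EN DASH shown by name
    include=[(0x0400, 0x04FF)],       # the Cyrillic block
    exclude=[(0x0410, 0x0410)],       # but leave CYRILLIC CAPITAL LETTER A alone
)
print(rules.to_editor("a \u2013 \u0410\u0431"))   # a &ndash; А&#1073;
```

A per-user setting would then only have to carry these three lists.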
This feature would allow detecting BiDi punctuation characters as mentioned in
bug 3819: strip phantom general punctuation characters from page titles
This feature would help to distinguish different kinds of whitespace such as "space"
and "tab" as mentioned in
bug 3894: white space characters, BiDi control characters should show up in diff
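As an illustration of the two points above, here is a small sketch that makes otherwise invisible characters show up. The function name is hypothetical, and the choice of Unicode general categories (separators, format and control characters) is an assumption about what should count as "invisible". It is display only and does not try to convert back.

```python
import unicodedata

# Categories treated as invisible here: space/line/paragraph separators,
# format characters (BiDi marks and overrides) and control characters (tab).
_INVISIBLE = {"Zs", "Zl", "Zp", "Cf", "Cc"}


def reveal_invisible(text):
    """Show invisible characters as &#xHHHH;, keeping plain space and newline."""
    out = []
    for ch in text:
        if ch in (" ", "\n"):
            out.append(ch)
        elif unicodedata.category(ch) in _INVISIBLE:
            out.append("&#x%04X;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


# The title from the URL field of this report: "Brion" followed by
# U+202D (LEFT-TO-RIGHT OVERRIDE) and U+202C (POP DIRECTIONAL FORMATTING).
print(reveal_invisible("Brion\u202d\u202c"))   # Brion&#x202D;&#x202C;
print(reveal_invisible("a\tb c"))              # a&#x0009;b c
```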
This feature would allow editing interlanguage links using the magic character
conversion as desired by Scot in bug 3615 comment 1:
bug 3615: blocks of code not handling magic character conversions in Esperanto
correctly - reason for page deletion
*notes*
a) The best way to introduce a feature is to offer it as an option. This will not
break code or irritate users.
b) Because of the "escape syntax", at some point it should be decided whether this
feature would break documentation pages with examples. If all the notations
&ndash; &#8211; &#x2013; = "–" are used there, these should not be changed when the
page is edited and saved again.
c) no details about the required syntax are specified here in order to avoid
limitations
The requested flexible magic character conversion built into the editor would
offer solutions / workarounds and would also be helpful for other reported bugs:
One could see in the source of a page
- if Unicode whitespace is used in an article title
bug 1414: Unicode whitespaces allowed in article title
- one could distinguish characters which look the same in different alphabets
but are coded differently
bug 1524: usernames should use unicode whitelist
bug 2290: user impersonation using homographs
bug 3885: title normalisation
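A rough sketch of how look-alike characters from different alphabets could be spotted. Python's standard library has no script property, so the first word of each character's Unicode name is used as a stand-in for its script. That is an approximation chosen for this example, not a proper script lookup, and the function names are hypothetical.

```python
import unicodedata


def scripts_used(text):
    """Approximate set of scripts among the letters of text."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")   # e.g. "CYRILLIC SMALL LETTER A"
            if name:
                scripts.add(name.split()[0])
    return scripts


def looks_mixed(text):
    """True if letters from more than one (approximate) script occur."""
    return len(scripts_used(text)) > 1


# "paypal" written with a Cyrillic "а" (U+0430) in second position:
print(looks_mixed("p\u0430ypal"))   # True
print(looks_mixed("paypal"))        # False
```

An editor view that shows the second name as p&#1072;ypal would make the same difference visible to the person editing.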
This feature would "normalise" the way characters are saved. This would /
should make search more efficient.
"Copy and paste" can be platform and program dependent. I have seen many broken
pages where Cyrillic characters in interlanguage links were saved as ????. This
could be avoided if a whole range were displayed only in &#nnnn; notation in
the editor and saved back as real Unicode.
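A sketch of that round trip for the Cyrillic block. The function names are hypothetical. So that pages which already contain a literal &#nnnn; as an example are not altered on save (note b above), every stored "&" is shown as &amp; in the editor. That escape is an assumption of this sketch, not an existing convention.

```python
import re

CYRILLIC = (0x0400, 0x04FF)


def show_range_as_refs(stored, first=CYRILLIC[0], last=CYRILLIC[1]):
    """Database text -> editor text with one range in &#nnnn; notation."""
    out = []
    for ch in stored:
        if ch == "&":
            out.append("&amp;")            # escape, so literal references survive
        elif first <= ord(ch) <= last:
            out.append("&#%d;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


_REF = re.compile(r"&#(\d{1,7});|&amp;")


def save_refs_as_unicode(edited):
    """Editor text -> database text: references become real characters."""
    def repl(m):
        if m.group(1) is None:
            return "&"                     # undo the escape
        cp = int(m.group(1))
        # Leave numbers outside the Unicode range untouched.
        return chr(cp) if cp <= 0x10FFFF else m.group(0)
    return _REF.sub(repl, edited)


stored = "[[ru:Москва]] and a literal &#8211; example"
edited = show_range_as_refs(stored)
print(edited)
print(save_refs_as_unicode(edited) == stored)   # True
```

The editor text contains only ASCII here, so a program that cannot handle Cyrillic has nothing to turn into ????.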
The same applies to detecting homoglyphs / homographs. The reports are mentioned above.
regards reinhardt [[user:gangleri]]
P.S. This request is only about implementing the basic feature. Please open
individual bugs for subfeatures wherever necessary.
Version: unspecified
Severity: enhancement
URL: http://test.leuksman.com/edit/User:Brion%E2%80%AD%E2%80%AC?oldid=9812