Author: gangleri
Description:
Sorry for this!
Hello!
a) I tested character normalisation, which seems to be part of title normalisation.
For precomposed versus non-precomposed characters this works fine:
[[User:Gangleri/tests/אָ]]
http://yi.wikipedia.org/wiki/User:Gangleri/tests/%EF%AC%AF
http://yi.wikipedia.org/wiki/User:Gangleri/tests/%D7%90%D6%B8
Both URLs point to the same page despite the different encoding.
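
A minimal sketch of the check in a): it decodes the two title suffixes from the URLs above and compares them before and after normalisation. It assumes NFC is the normalisation form applied to titles. The report itself only shows that both URLs reach the same page, and the helper name `code_points` is made up for illustration.

```python
import unicodedata
from urllib.parse import unquote

# The two percent-encoded title suffixes taken from the URLs above.
precomposed = unquote("%EF%AC%AF")    # one code point: U+FB2F
decomposed = unquote("%D7%90%D6%B8")  # two code points: U+05D0 + U+05B8


def code_points(s):
    """Hypothetical helper: list the code points of a string as U+XXXX."""
    return [f"U+{ord(c):04X}" for c in s]


print(code_points(precomposed))
print(code_points(decomposed))

# Assumption: titles are normalised to NFC before lookup.
nfc_a = unicodedata.normalize("NFC", precomposed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)
```

The raw strings differ, but after NFC both are the sequence U+05D0 U+05B8, which would explain why the two URLs resolve to one page.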
b) The bug's URL lists four different pages with optically identical titles.
There are "phantom" trailing general punctuation characters that generate different
URLs. Compare:
http://www.fileformat.info/info/unicode/char/202b/index.htm
Unicode Character 'RIGHT-TO-LEFT EMBEDDING' (U+202B)
UTF-8 (hex) 0xE2 0x80 0xAB (e280ab)
http://homepage1.nifty.com/nomenclator/unicode/data/punct.htm
The generated URLs are:
http://yi.wikipedia.org/wiki/User:Gangleri/tests/%E2%80%AB%D7%B0%D7%99%D7%A5
http://yi.wikipedia.org/wiki/User:Gangleri/tests/%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB
http://yi.wikipedia.org/wiki/User:Gangleri/tests/%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB%E2%80%AB
http://yi.wikipedia.org/wiki/User:Gangleri/tests/%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB%E2%80%AB%E2%80%AB
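
To make the difference between the four URLs visible, the following sketch decodes the title suffixes and counts the U+202B characters in each. Treating every character of Unicode general category "Cf" (format) as invisible is a simplification for this illustration, not a statement about how the wiki renders titles.

```python
import unicodedata
from urllib.parse import unquote

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING

# Title suffixes copied from the four URLs above.
encoded_titles = [
    "%E2%80%AB%D7%B0%D7%99%D7%A5",
    "%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB",
    "%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB%E2%80%AB",
    "%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB%E2%80%AB%E2%80%AB",
]

titles = [unquote(enc) for enc in encoded_titles]
rle_counts = [t.count(RLE) for t in titles]

# Simplification: drop all format characters (category "Cf") to get the visible part.
visible_parts = [
    "".join(c for c in t if unicodedata.category(c) != "Cf") for t in titles
]

for title, count, visible in zip(titles, rle_counts, visible_parts):
    print(len(title), count, [f"U+{ord(c):04X}" for c in visible])
```

All four titles share the same three visible letters (U+05F0 U+05D9 U+05E5) and differ only in how many U+202B characters they carry.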
There are many aspects to this:
a) possible vandalism - suggestion: Please evaluate whether "phantom" (= unnecessary)
leading or trailing punctuation should be stripped from database titles
++ this looks like a normalisation step
b) garbage in - garbage out
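
The stripping suggested in a) could look roughly like the sketch below. The function name `strip_bidi_controls` and the choice of which characters count as "phantom" are assumptions made for illustration; this is not existing wiki code, and only leading and trailing occurrences are removed.

```python
from urllib.parse import unquote

# Assumed set of "phantom" characters: the Unicode bidirectional controls.
BIDI_CONTROLS = (
    "\u200e\u200f"                    # LRM, RLM
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"        # LRI, RLI, FSI, PDI
)


def strip_bidi_controls(title: str) -> str:
    """Hypothetical helper: remove leading and trailing bidi controls only."""
    return title.strip(BIDI_CONTROLS)


# Title suffixes of the four test pages listed in this report.
encoded_titles = [
    "%E2%80%AB%D7%B0%D7%99%D7%A5",
    "%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB",
    "%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB%E2%80%AB",
    "%E2%80%AB%D7%B0%D7%99%D7%A5%E2%80%AB%E2%80%AB%E2%80%AB",
]

stripped = {strip_bidi_controls(unquote(enc)) for enc in encoded_titles}
print(len(stripped))
```

With this rule the four test titles collapse into a single title. Controls in the middle of a title are left alone, since the suggestion only concerns leading and trailing characters.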
Regards Reinhardt [[user:gangleri]]
P.S. I ran into this because of textual ambiguities at the Yiddish Wikipedia
relating to the usage of "tsvey vovn" versus "vov + vov", "tsvey-yudn" versus
"yud + yud" etc.
example 1: There is an article [[yi:וויץ]] but not [[yi:װיץ]] .
example 2: http://www.yiddishdictionaryonline.com/ contains "vey iz (tsu) mir"
which is written *there* both with "vov + vov" and "yud + yud". Nevertheless
http://www.cs.engr.uky.edu/~raphael/yiddish/makeyiddish.html translates with
"tsvey vovn" and "tsvey-yudn": װײ איז (צו) מיר!
It seems that automatic character substitution is not possible because of
ambiguities when three characters meet, as in
http://www.yiddishdictionaryonline.com/ at
farvunderung - פֿאַרווונדערונג , "farvundert" - פֿאַרווונדערט
and the other way around at
oyspruvn - אויספּרווון
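
The ambiguity can be reproduced in a short sketch. It checks that Unicode normalisation leaves "tsvey vovn" (U+05F0) distinct from "vov + vov" (U+05D5 U+05D5), then applies a naive left-to-right substitution to a run of three vovs. The readings "v then u" and "u then v" are inferred from the romanisations "farvunderung" and "oyspruvn" above and should be treated as an assumption.

```python
import unicodedata

VOV = "\u05d5"         # HEBREW LETTER VAV
TSVEY_VOVN = "\u05f0"  # HEBREW LIGATURE YIDDISH DOUBLE VAV

# None of the four normalisation forms maps the ligature to two vovs.
unified = {
    form: unicodedata.normalize(form, TSVEY_VOVN) == VOV + VOV
    for form in ("NFC", "NFD", "NFKC", "NFKD")
}
print(unified)

# Naive substitution: replace each pair of vovs, scanning left to right.
three_vovs = VOV * 3
greedy = three_vovs.replace(VOV + VOV, TSVEY_VOVN)

# Assumed intended spellings of the three-vov cluster:
v_then_u = TSVEY_VOVN + VOV  # "vu", as in farvunderung
u_then_v = VOV + TSVEY_VOVN  # "uv", as in oyspruvn

print(greedy == v_then_u, greedy == u_then_v)
```

The greedy replacement always yields the "v then u" form, so it would be right for one word and wrong for the other. A correct substitution would need knowledge of the word, not just of the character sequence.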
Version: unspecified
Severity: trivial
URL: http://yi.wikipedia.org/w/index.php?title=Special%3APrefixindex&from=Gangleri%2Ftests%2F%E2%80%AB%D7%B0%D7%99&namespace=2