blocks of code not handling magic character conversions in Esperanto correctly (as reason for page deletion etc.)
Closed, Invalid, Public

Description

Author: gangleri

Description:
Hello!

Sorry for this:

Most of the bugs relating to
bug 1512: magic character conversions in Esperanto cause problems with foreign
words, interlanguage links
were fixed.

However, it seems that there are still some blocks of code where the magic
character conversion for Esperanto is applied and wrong links are generated.

At the URL given for this bug you may find

16:49, 5. Okt 2005 Gangleri forigis "La Chaŭ-de-Fonds" (ne plu bezunata ĉar cimo
bugzilla:01512 nun estas fiksita --- la enteno estis : '#REDIRECT La
Chaŭ-de-Fonds' (kaj la sola kontribuinto estis 'Gangleri'))

Please note that the second *La Chaŭ-de-Fonds* is wrong. It should be a link to
*[[eo:La Chaux-de-Fonds]]*. At the URL, most of the redirects were valid
redirects. You should see blue links, not red ones!!!

Hope you understand the point. If not, please do not hesitate to contact me again.

Best regards and thanks in advance for fixing this. Reinhardt [[user:gangleri]]


Version: 1.6.x
Severity: trivial
URL: http://eo.wikipedia.org/w/index.php?title=Speciala%3ALog&type=delete&user=Gangleri

Details

Reference
bz3615

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 8:52 PM
bzimport set Reference to bz3615.
bzimport added a subscriber: Unknown Object (MLST).

Yekrats wrote:

Interwiki links (and directly typed URLs) that had an 'x' character were at one
time converted into Esperanto. cx became ĉ, gx became ĝ, ... and so on, and xx
became x. Thus, several interwiki links have been set up with the *x two-letter
codes into Esperanto.
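
The conversion described above can be sketched in Python. This is only an illustration reconstructed from the examples in this report, not the actual MediaWiki code; the name `x_to_unicode` and the exact rule (an odd run of x after c, g, h, j, s or u marks the diacritic, and each remaining pair of x collapses to one literal x) are assumptions.

```python
import re

# Base letters that take a diacritic in Esperanto, with their accented forms.
ACCENTED = {
    "c": "ĉ", "g": "ĝ", "h": "ĥ", "j": "ĵ", "s": "ŝ", "u": "ŭ",
    "C": "Ĉ", "G": "Ĝ", "H": "Ĥ", "J": "Ĵ", "S": "Ŝ", "U": "Ŭ",
}

# One of the base letters followed by a run of one or more x/X.
RUN = re.compile(r"([cghjsuCGHJSU])([xX]+)")


def x_to_unicode(text: str) -> str:
    """Decode x-system text under the assumed rule described above."""

    def decode(match: re.Match) -> str:
        letter, xs = match.group(1), match.group(2)
        if len(xs) % 2 == 1:
            # Odd run: the first x marks the diacritic, the rest are pairs.
            return ACCENTED[letter] + xs[1::2]
        # Even run: every pair of x collapses to one literal x.
        return letter + xs[0::2]

    return RUN.sub(decode, text)


print(x_to_unicode("sxangxo"))            # ŝanĝo
print(x_to_unicode("La Chaux-de-Fonds"))  # La Chaŭ-de-Fonds
print(x_to_unicode("Aldous HUXXLEY"))     # Aldous HUXLEY
```

The second call shows why foreign names are affected: any "ux" in a title is read as ŭ unless the x is doubled.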

However, a few months ago, this conversion tool went away. I'm not sure why or
how. But when it happened, it caused all of the interwiki links previously set
up with this x-system to become broken. There are quite a few links broken by
this, possibly hundreds.

Ideally, I'd like the old system to come back. It's more convenient to type the
ASCII sx instead of ŝ, and that would fix the major problem, broken links in the
interwiki.

However, if a bot could be written to repair the Esperanto links, that would fix
the interwiki issue, which is fairly important. -- User Yekrats.

gangleri wrote:

(In reply to comment #1)

Ideally, I'd like the old system to come back. It's more convenient to type the
ASCII sx instead of ŝ, and that would fix the major problem, broken links in the
interwiki.

Dear Scott

The way the *magical Esperanto character conversion* worked *between* wikis of
other languages and projects in Esperanto before the change you mention in your
comment was broken anyhow.

I support Brion's decision to define the *magical Esperanto character conversion*
as *local* to a project with *content language* Esperanto. Otherwise this
functionality would have had to be a part of *all* language projects. To my
understanding this is not realistic.

Because the InterLanguage links were broken anyhow, a fix was necessary one way
or another. The solution chosen for InterLanguage links *to* projects with
content language Esperanto is [[KISS]] (simple): users can use "copy and paste"
between languages as you would do with [[:hi:]], [[:ja:]], [[:ka:]], [[:ko:]],
[[:ur:]] and other projects where you do not need to understand the language and
/ or alphabet. This is straightforward for [[en:Wikipedia:bot]]s as well.

The main issue of this bug report relates to parts of the MediaWiki code which did
not handle the *magical Esperanto character conversion* correctly at the time I
opened the bug.

I identified the *magical Esperanto character conversion* in "pieces of
MediaWiki" messages as inserted in history fields and logs. If you look at this
carefully, it mainly relates to how the content of the variables $1, $2, etc. is
handled when submitted, and whether the *magical Esperanto character conversion*
procedure is activated or not.

I am sure that Brion can fix these "minor issues" at some point in time. The buggy
code might be spread "here and there", which complicates the fix / all fixes.

best regards reinhardt [[user:gangleri]]

gangleri wrote:

*testlinks*
I just deleted a useless redirect [[eo:Aldous_HUXXLEY]]. The correct title *is*
[[eo:Aldous_HUXLEY]]
Please compare what
http://eo.wikipedia.org/w/index.php?title=Special:Log&type=delete&user=Gangleri
is showing:
01:52, 15. Dec 2005 Gangleri forigis "Aldous HUXXLEY" (enhavis: '#REDIRECTAldous
HŬLEY' Aldous HUXLEY' ne plu bezonata)
It mentions the messy link [[eo:Aldous_HŬLEY]], deleted in October.

*request 1*
The comment should be
(enhavis: '#REDIRECTAldous HUXLEY' Aldous HUXXLEY' ne plu bezonata)

This was known before and reported in comments above.


Now you can try "page=foo" using one, two, three "X" and other coding characters:

http://eo.wikipedia.org/w/index.php?title=Speciala:Log&type=delete&page=Aldous_HUXLEY
[[eo:Aldous_HŬLEY]] was deleted in October.
16:35, 5. Okt 2005 Gangleri forigis "Aldous HŬLEY" (ne plu bezunata ĉar cimo
bugzilla:01512 nun estas fiksita --- la enteno estis : '#REDIRECT Aldous_HŬLEY'
(kaj la sola kontribuinto estis 'ArnoLagrange'))
http://eo.wikipedia.org/w/index.php?title=Speciala:Log&type=delete&page=Aldous_HUXXLEY
no result
http://eo.wikipedia.org/w/index.php?title=Speciala:Log&type=delete&page=Aldous_HUXXXLEY
no result
http://eo.wikipedia.org/w/index.php?title=Speciala:Log&type=delete&page=Aldous_HU%58XLEY
no result
http://eo.wikipedia.org/w/index.php?title=Speciala:Log&type=delete&page=Aldous_H%55%58%58LEY
no result
other combinations ...

I was not able to find a URL to filter that delete event.
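
The percent-encoded attempts above cannot behave differently from the plain spelling if the server decodes the URL before any title handling. That order is the usual one and is assumed here; it was not checked against the MediaWiki source. A short Python check with the standard library shows that the variants are the same string after decoding:

```python
from urllib.parse import unquote

# The "page" values tried in the URLs above that spell the title with two X.
variants = [
    "Aldous_HUXXLEY",
    "Aldous_HU%58XLEY",
    "Aldous_H%55%58%58LEY",
]

# Percent-decoding happens before the application sees the value,
# so %55 and %58 are just another way to write U and X.
decoded = {unquote(v) for v in variants}
print(decoded)  # {'Aldous_HUXXLEY'}
```

So escaping the X characters is not a way around the character conversion; only the number of X characters matters.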

*request 2*
Please fix the search. The magical character conversion is done at a place where it
shouldn't be.

Thanks for your efforts in advance.

best regards reinhardt [[user:gangleri]]

gangleri wrote:

(In reply to comment #3)

I was not able to find a URL to filter that delete event.

The right URL is:
http://eo.wikipedia.org/w/index.php?title=Speciala:Log&type=delete&page=Aldous_HUXXXXLEY
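
Why four X are needed can be sketched as the inverse of the x-system decoding: if the "page" parameter is decoded once before the lookup, every literal x that follows c, g, h, j, s or u has to be doubled in the URL. The Python function below is a hypothetical illustration consistent with the URLs in this report; `unicode_to_x` and its exact rules are assumptions, not the actual MediaWiki code.

```python
import re

# Accented Esperanto letters and their x-system spellings.
PLAIN = {
    "ĉ": "cx", "ĝ": "gx", "ĥ": "hx", "ĵ": "jx", "ŝ": "sx", "ŭ": "ux",
    "Ĉ": "Cx", "Ĝ": "Gx", "Ĥ": "Hx", "Ĵ": "Jx", "Ŝ": "Sx", "Ŭ": "Ux",
}

# A base or accented letter followed by a (possibly empty) run of x/X.
RUN = re.compile(r"([cghjsuCGHJSUĉĝĥĵŝŭĈĜĤĴŜŬ])([xX]*)")


def unicode_to_x(text: str) -> str:
    """Encode a stored title so that one decoding pass restores it."""

    def encode(match: re.Match) -> str:
        letter, xs = match.group(1), match.group(2)
        # Double every literal x so the decoder reads it as a plain x.
        doubled = "".join(x + x for x in xs)
        return PLAIN.get(letter, letter) + doubled

    return RUN.sub(encode, text)


print(unicode_to_x("Aldous_HUXXLEY"))  # Aldous_HUXXXXLEY
print(unicode_to_x("Aldous_HUXLEY"))   # Aldous_HUXXLEY
```

Under these assumptions the stored title "Aldous HUXXLEY" has to be written with four X in the URL, which matches the URL found above.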

gangleri wrote:

*note*

When deleting a page the automatic reason for page deletion was *displayed* as
enhavis: '#REDIRECT[[Aldous HUXLEY]]'
I added
[[Aldous HUXXLEY]]' ne plu bezonata (because I was not aware whether the characters should be
in Esperanto notation or not).

As a result of this "misunderstanding"
what was added in the database *displays* as:
enhavis: '#REDIRECTAldous HŬLEY' Aldous HUXLEY' ne plu bezonata

What would be required is that the reason for deletion be displayed as
enhavis: '#REDIRECT[[Aldous HUXXLEY]]'
because this is an edit box where Esperanto cx, gx, hx, jx, ux etc. notation is used by
default, the same as in every other edit box at eo: .

I assume that
http://eo.wikipedia.org/w/index.php?title=Speciala:Log&type=delete&page=Aldous_HUXXXXLEY
is wrong because the "page" parameter expects cx, gx, hx, jx, ux etc. notation.

In other URLs this is not the case; compare:
http://en.wikipedia.org/wiki/eo:Aldous_HUXLEY
http://eo.wikipedia.org/wiki/Aldous_HUXLEY
and
http://en.wikipedia.org/wiki/eo:Aldous_HUXXLEY
http://eo.wikipedia.org/wiki/Aldous_HUXXLEY

another block of code:
Please go to
http://eo.wikipedia.org/wiki/Aldous_HUXXLEY (page does not exist)
and log *in*.
After login you return to
http://eo.wikipedia.org/wiki/Aldous_HUXLEY (existing page)

*undelete*
has the same problem:
Undeleting [[eo:Aldous_HUXXLEY]]
http://eo.wikipedia.org/wiki/Speciala:Undelete/Aldous_HUXXLEY
returns with the message:
"La artikolo [[Aldous_HUXLEY]] estas sukcese restarigita. ..."
see
http://eo.wikipedia.org/wiki/Speciala:Log
where there is an entry about completion
Please verify the timestamps at Special:Undelete with those from the history of
[[:eo:Aldous_HUXLEY]] .

but after the undelete, try
http://eo.wikipedia.org/wiki/Speciala:Undelete/Aldous_HUXXLEY
*nothing* will happen.
I was able to undelete neither the last revision nor all 4 revisions.

another block of code:
Please go to
http://eo.wikipedia.org/wiki/Aldous_HUXXLEY (page does not exist)
and log *out*.
After logging out you return to
http://eo.wikipedia.org/wiki/Aldous_HUXLEY (existing page)

Hope that these examples help to identify many of the pieces of code that need fixing.

best regards reinhardt [[user:gangleri]]

gangleri wrote:

*more*
I tried to create a user name:
Cx Gx Hx Jx Sx Ux cx gx hx jx sx ux CXX GXx HXX JXx SXX UXx cxX gxx hxX jxx sxX uxx
but received the error:
Ensaluta eraro: Vi ne enmetis validan salutnomon.

However the user name field changed to:
Ĉ Ĝ Ĥ Ĵ Ŝ Ŭ ĉ ĝ ĥ ĵ ŝ ŭ CX GX HX JX SX UX cx gx hx jx sx ux

I am puzzled whether this is OK, because submitting it another time changes the user name.

Finally I created:
[[eo:user:Ĉ ĝ Ĥ ĵ Ŝ ŭ Ĉ ĝ Ĥ ĵ Ŝ ŭ]] and
[[eo:user:Ĉ ĝ Ĥ ĵ Ŝ ŭ CX gx HX jx SX ux]]
the last using
Ĉ ĝ Ĥ ĵ Ŝ ŭ CXX gxX HXx jxx SXX uxX

  1. please go to

http://eo.wikipedia.org/wiki/Speciala:Movepage/Vikipediisto:%C4%88_%C4%9D_%C4%A4_%C4%B5_%C5%9C_%C5%AD_CX_gx_HX_jx_SX_ux
It shows:
Movu paĝon: Vikipediisto:Ĉ ĝ Ĥ ĵ Ŝ ŭ Ĉ ĝ Ĥ ĵ Ŝ ŭ
Al nova titolo: Vikipediisto:Ĉ ĝ Ĥ ĵ Ŝ ŭ Ĉ ĝ Ĥ ĵ Ŝ ŭ
This is wrong.

  2. The screen dump following in the next attachment shows an example about blocking

There are more edit boxes; some use the magical character conversion, but the one for
the user name does not.

gangleri wrote:

screen dump for bug 03615

Attached:

bugzilla_03615_001.jpg (335×758 px, 54 KB)

gangleri wrote:

http://eo.wikipedia.org/w/index.php?title=Special:Search&search=cxx&fulltext=cxx
is a URL I learned at other wikis ([[he:]])

Please try to do the following:
a) insert cxx in the editbox under the navigation menu
b) click "Serĉu"
c) the url is now

http://eo.wikipedia.org/wiki/Speciala:Search?search=cxx&fulltext=Ser%C4%89u
in the editbox under the navigation menu you still have "cxx"
in the editbox beside "Trovu" you have "cx"

d) click on "Trovu"
e) the url changes to

http://eo.wikipedia.org/w/index.php?title=Special%3ASearch&search=cx&fulltext=Trovu

notice that "cxx" changed to "cx"
in the editbox under the navigation menu you have now "cx"
beside "Trovu" you have now "ĉ"

*request*
As in all edit boxes, the input syntax should be "magic Esperanto conversion". This
means that in both fields you should be able to enter "ĉ" or "cx" alternatively. This is
the case now.
*But* the input in the field under the navigation bar should be preserved. The
relevant cases are cxx cxxxx ... uxx uxxxx etc.
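
The steps above behave as if the search text were decoded once per submit. A small Python sketch of that effect follows, under the assumption that decoding works on runs of x after c, g, h, j, s or u (an odd run marks the diacritic, pairs collapse to one literal x); `decode_once` and `decode_times` are hypothetical names, not MediaWiki functions.

```python
import re

# Base letters paired with their accented forms.
ACCENTED = dict(zip("cghjsuCGHJSU", "ĉĝĥĵŝŭĈĜĤĴŜŬ"))


def decode_once(text: str) -> str:
    """One pass of the assumed x-system decoding."""

    def decode(match: re.Match) -> str:
        letter, xs = match.group(1), match.group(2)
        if len(xs) % 2:
            # Odd run: first x marks the diacritic, remaining pairs collapse.
            return ACCENTED[letter] + xs[1::2]
        # Even run: every pair collapses to one literal x.
        return letter + xs[::2]

    return re.sub(r"([cghjsuCGHJSU])([xX]+)", decode, text)


def decode_times(text: str, passes: int) -> str:
    """Apply the decoding repeatedly, as repeated form submits would."""
    for _ in range(passes):
        text = decode_once(text)
    return text


for query in ["cxx", "cxxxx", "uxx"]:
    print(query, "->", decode_times(query, 1), "->", decode_times(query, 2))
# cxx -> cx -> ĉ
# cxxxx -> cxx -> cx
# uxx -> ux -> ŭ
```

Because a second pass changes the result again, the text only survives if each field is decoded exactly once, or if the form is refilled with the original input.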

*note*
f) http://he.wikipedia.org/wiki/special:Search?search=%D7%90%D7%90%D7%90
using three *alephs* displays differently in IE and Firefox.
Firefox shows only *two* alephs. I will identify this bug at bugzilla.mozilla.

g) however, the URLs look different if you compare the parameters

I did not see something like "&fulltext=Trovu" at he:

best regards reinhardt [[user:gangleri]]

gangleri wrote:

(In reply to comment #9)

http://eo.wikipedia.org/w/index.php?title=Special:Search&search=cxx&fulltext=cxx

"fulltext=cxx" is INVALID

*note*
f) http://he.wikipedia.org/wiki/special:Search?search=%D7%90%D7%90%D7%90
using three *alephs* displays differently in IE and Firefox.
Firefox shows only *two* alephs.

see [[bugzilla:04332]]: RTL characters disapear in the Search edit box at
special:Search

I will identify this bug at bugzilla.mozilla.

still open

g) however, the URLs look different if you compare the parameters
I did not see something like "&fulltext=Trovu" at he:

is INVALID

best regards reinhardt [[user:gangleri]]

gangleri wrote:

(In reply to comment #10)

see [[bugzilla:04332]]: RTL characters disapear in the Search edit box at
special:Search

wrong link:
see bug 4332: RTL characters disapear in the Search edit box at special:Search
sorry!

Restored bug from flood attack.

I still think that the Esperanto-specific magical conversion is a hack that should not be present in the MediaWiki software itself, but should only be offered as an edit tool that manually converts selected characters to actual Esperanto. This should then be supported only within a gadget for the page editor, or a more general gadget for use in various forms (like the Search form). The conversion should be explicitly requested; otherwise nothing should be done automagically, as this breaks too often.

This also means that the Esperanto messages (in the MediaWiki: namespace) need to be created on TranslateWiki with the correct characters. You may ask TranslateWiki to add a gadget for Esperantists who can't type the circumflexes on their keyboard, but I really suggest that the Esperanto wiki document how to create or install a keyboard layout that supports the circumflex diacritic directly. This is really easy in Windows and Linux; I don't know if it is as easy on MacOSX, but I suppose that tools similar to MSKLC for Windows also exist there.

If there remain incorrect links using extra "x" characters, or links where characters were converted to Esperanto that should not have been converted on EO:WP, this has to be fixed within the articles. But the magic conversion after pressing the Publish button is really a bad thing with unsolvable problems.

I've implemented input method rules for Esperanto in the Narayam input method extension in r85504.

This relies on JavaScript-capable browsers that work with modern UTF-8 and have working fonts -- usually now a good assumption, whereas it wasn't in 2002 :) -- but if all goes well, we should be able to dump the old internal conversions and use this instead.

The client-side input method handles *all* text input fields, and doesn't rely on server-side processing remembering what type of field it's dealing with. Yay!

See also bug 21781.

Marking this as dependent on 21781; dumping the old conversion for the new Narayam ext's conversion will be much nicer... but that itself is pending on bug 28668 -- it needs to be modified to work reliably on all fields too. :)

Aklapper subscribed.

$wgEditEncoding which was only used for the Esperanto language character conversion got removed in 1.28 in T62677. Input methods are nowadays provided by the UniversalLanguageSelector. Closing this old task.