failure in interlanguage links with leading zeros ( &+#+0+nnnn+; )
Closed, Resolved (Public)

Description

Author: gangleri

Description:
Dear friends,

the first interlanguage link in
http://en.wikipedia.org/w/index.php?title=Oltenita&oldid=10808167
will fail; the second will not.

ro:Olteni&#0355;a
ro:Olteni&#355;a
are displayed properly in (my) browser (Mozilla Firefox)

It would be nice to fix this because it is just another "have to know".

The page is corrected. See:
http://en.wikipedia.org/w/index.php?title=Oltenita&diff=10808185&oldid=10808167

'''Note:''' The example of this behaviour relates to a [[Latin-1]] wiki. It has
not been tested on [[UTF-8]] or right-to-left wikis.

Best regards
Reinhardt [[gangleri]]


Version: unspecified
Severity: minor
OS: Windows XP
Platform: PC
URL: http://en.wikipedia.org/w/index.php?title=Oltenita&oldid=10808167

Details

Reference
bz1636

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 8:13 PM
bzimport set Reference to bz1636.
bzimport added a subscriber: Unknown Object (MLST).

gangleri wrote:

Hello! The bug has been reported at [[:sourceforge:projects/pywikipediabot]].

http://sourceforge.net/mailarchive/forum.php?thread_id=6739695&forum_id=36014

None of the hexadecimal variants ( +x as in &+#+x+0+nnnn+; ) has been tested so far.

Regards Reinhardt [[User:Gangleri]]

P.S. I wonder if pictures from [[commons:]] are available here
[[Image:Smiley.png|16px|;-)]]

gangleri wrote:

[[en:User:Gangleri/tests/bugzilla:1636]] also shows other behaviors where links
are not generated at all, see :eo:ŝanĝo

(regarding comment #2)
The correct link is [[sourceforge:projects/pywikipediabot]] (if interwiki links work
here; otherwise http://sourceforge.net/projects/pywikipediabot ).

Reinhardt

MediaZilla wrote:

HTML entities are not correctly converted to characters.

ro:Olteni&#355;a links to [[:ro:Olteniţa]] with a "ţ" (lowercase t with cedilla)
ro:Olteni&#0355;a links to [[:ro:Olteniía]] with an "í" (lowercase i with acute accent)

This "í" is actually #237 (decimal), which is #0355 (octal). I think "0355" is incorrectly converted to an
integer as an octal number; leading zeros should be stripped before the conversion.

By the way: Internet Explorer 6, Firefox 1.0 and Opera 8.0 display #355 and #0355 as the same character "ţ".
[[User:Richie]]
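
The arithmetic in the comment above can be checked with a few lines of Python. This is only an illustration of the two readings of the digit string "0355"; it is not MediaWiki code, and the variable names are made up for the example.

```python
# Illustration only: the same digit string read in two different bases.
digits = "0355"  # the digits from the entity &#0355;

as_decimal = int(digits, 10)  # leading zero ignored: 355
as_octal = int(digits, 8)     # digits read as octal: 3*64 + 5*8 + 5 = 237

print(as_decimal, repr(chr(as_decimal)))  # code point 355 is U+0163, t with cedilla
print(as_octal, repr(chr(as_octal)))      # code point 237 is U+00ED, i with acute accent
```

The second reading matches the wrong link target reported above, which supports the octal explanation.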

wfMungeToUtf8() was incorrectly passing decimal values in a way that caused leading zeros to be interpreted as octal.

Fixed in CVS HEAD and REL1_4 and put live. Fix will be in 1.4.3.
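
For readers who want to see the intended behaviour, here is a minimal sketch in Python of decoding numeric character references so that leading zeros never change the result. It is not the actual wfMungeToUtf8() fix (MediaWiki is written in PHP, and the patch itself is not quoted in this report); the function name `decode_numeric_entities` and the regular expression are assumptions made for the example. It also covers the hexadecimal form mentioned earlier in the thread.

```python
import re

# Hypothetical helper for illustration; not MediaWiki's wfMungeToUtf8().
# Group 1 captures hexadecimal digits (&#xhhhh;), group 2 decimal digits (&#nnnn;).
_NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def decode_numeric_entities(text: str) -> str:
    """Replace &#nnnn; and &#xhhhh; references with the characters they name."""

    def _replace(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        if hex_digits is not None:
            code_point = int(hex_digits, 16)
        else:
            # Always parse as base 10, so a leading zero never selects octal.
            code_point = int(dec_digits, 10)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Code point outside the Unicode range: leave the reference as written.
            return match.group(0)

    return _NUMERIC_ENTITY.sub(_replace, text)


with_zero = decode_numeric_entities("ro:Olteni&#0355;a")
without_zero = decode_numeric_entities("ro:Olteni&#355;a")
hex_form = decode_numeric_entities("ro:Olteni&#x0163;a")
print(with_zero == without_zero == hex_form)
```

With this approach all three spellings of the reference resolve to the same title, which is the behaviour the report asks for.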