
Lingo breaks external links containing umlauts
Closed, Declined
Public

Description

Hi, when I add a link to an external destination containing German umlauts, the link no longer works once I save the page.
In preview mode everything works correctly; the link is broken only after the page has been saved.

The link text looks OK, it shows: /Preisübersicht, but the link target behind the text is broken: /Preisübersicht/

If I make an internal link to a page with umlauts, it works.

Example:
This works fine: <html><a href="file://folderA\Preisübersicht\">Preisübersicht</a></html>. The code generated by the page is: <a class="external text" href="file:///FolderABC\Preisübersicht\" rel="nofollow"> Preisübersicht</a>

Doesn't work: [file:///folderAB\Preisübersicht\ Preisübersicht]
The code generated by the preview function is: <a class="external text" href="file:///folderABC\Preisübersicht\" rel="nofollow">Preisübersicht</a>
The code generated by the "normal" page (after saving) is: <a class="external text" href="file:///folderABC%5CPreis%C3%BCbersicht%5C" rel="nofollow">Preisübersicht</a>. This is wrong: the codes %C3 and %BC are Ã and ¼.
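For reference, a minimal sketch of where %C3%BC comes from. It assumes the saved link was percent-encoded as UTF-8, and that whatever displays the broken target decodes those bytes as Latin-1; neither assumption is confirmed in this report. The variable names are illustrative only.

```python
from urllib.parse import quote, unquote

name = "Preisübersicht"

# quote() percent-encodes the UTF-8 bytes of each non-ASCII character.
# "ü" is the two bytes 0xC3 0xBC in UTF-8, hence %C3%BC.
encoded = quote(name)

# Decoding the same bytes as UTF-8 restores the umlaut.
as_utf8 = unquote(encoded)

# Decoding them as Latin-1 (an assumption about the consumer) turns each
# byte into its own character: 0xC3 becomes "Ã" and 0xBC becomes "¼".
as_latin1 = unquote(encoded, encoding="latin-1")

print(encoded)
print(as_utf8)
print(as_latin1)
```

Under these assumptions the encoded form itself is intact; only the charset used to read it back differs.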

This error occurs only when the page also contains a word from my glossary. So the complete page text is only:

[file:///folderAB\Preisübersicht\ Preisübersicht]
HTML

(HTML is defined on my glossary page)

If I delete the string "HTML" from the page, everything works fine. So my problem is the Lingo extension.

Software Version
MediaWiki 1.21.3
PHP 5.3.27 (cgi-fcgi)
MySQL 5.1.42-community
Lingo 0.4.2

see also: Project:Support_desk#x.5BSolved.5D_Problem_with_Encoding_only_in_external_links_39254


Version: unspecified
Severity: normal

Details

Reference
bz62040

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:03 AM
bzimport set Reference to bz62040.
bzimport added a subscriber: Unknown Object (MLST).

thomas.ramm wrote:

This bug happens with Internet Explorer 11 on Windows 7 64-bit. I can't test with a different browser because only IE11 is allowed/installed within my company.

I will not fix this for various reasons.

  • The created links are correct. URLs containing umlauts, or any data character other than ALPHA / DIGIT / "-" / "." / "_" / "~" (apart from reserved delimiters such as "/" and ":"), have to have these characters percent-encoded. Otherwise they do not conform to the applicable standard (http://tools.ietf.org/html/rfc3986#section-2.3)
  • The HTML code is produced by the PHP DOM module. This module is used by Lingo to efficiently parse the page text for occurrences of glossary terms. There is no option to coerce the module into producing broken HTML.
  • This issue only applies to URLs of the file: protocol. Other external links work just fine. Try http://www.stupidedia.org/stupi/Heiz%C3%B6lr%C3%BCcksto%C3%9Fabd%C3%A4mpfung
  • Browsers will (or should) not normally follow local links on websites served from a remote server anyway.
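
The encoding rule from the first point can be sketched in Python as follows. This is only an illustration of RFC 3986 percent-encoding, not the code path Lingo or the PHP DOM module actually takes; the helper name `percent_encode` is hypothetical, and the path is the one from the report.

```python
from urllib.parse import quote

def percent_encode(segment: str) -> str:
    """Percent-encode everything except the RFC 3986 unreserved set.

    With safe="" only ALPHA, DIGIT, "-", ".", "_" and "~" pass through
    unchanged (Python 3.7+); every other character is emitted as the
    %XX escapes of its UTF-8 bytes.
    """
    return quote(segment, safe="")

# The path from the report: the backslash (0x5C) and "ü" (0xC3 0xBC in
# UTF-8) are outside the unreserved set and therefore get escaped.
path = "folderABC\\Preisübersicht\\"
href = "file:///" + percent_encode(path)
print(href)

# Unreserved characters are left alone.
print(percent_encode("a-b.c_d~e"))
```

The resulting href matches the one produced on the saved page, which is consistent with the statement that the generated link follows the standard.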

That said, I am willing to consider patches (or detailed strategies to fix this). If you can provide a way to fix this, please re-open this bug.