
collection extension doesn't convert URL to unicode
Open, Medium, Public

Description

I made a simple test at
http://en.wikipedia.org/wiki/User:Yamaha5/pdf
As you can see, both links point to the same URL, but the first one is percent-encoded. It would be better if the Collection extension converted such URLs to Unicode before rendering.

Converter code in Python (MediaWiki has problems with the characters {, |, } and space in URLs, so I replace them with their percent-encoded forms):

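import re
import urllib

def UnicodeURL(text):
    # Match URL bodies starting at '//' up to whitespace, '|', '}', ']' or end of text.
    RE = re.compile(ur'//.*?(?=[\s|}\]]|$)')
    fa_Urls = RE.findall(text)
    for URL in fa_Urls:
        try:
            # Drop anything from a following HTML tag onward.
            URL = URL.split('<')[0]
            # Decode percent-escapes, then re-encode characters that break wiki markup.
            new_URL = urllib.unquote(URL.encode('utf8')).decode('utf8').replace(u' ', u'%20').replace(u'{', u'%7B').replace(u'|', u'%7C').replace(u'}', u'%7D')
            text = text.replace(URL, new_URL)
        except UnicodeError:
            # Escapes that are not valid UTF-8: leave this URL unchanged.
            continue
    return text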
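
The converter above targets Python 2 (ur'' literals, urllib.unquote). For anyone trying the same idea on Python 3, here is a minimal sketch. It is not part of the Collection extension or the original report. The names unicode_url, URL_BODY and REENCODE are made up for illustration, and it assumes the percent-escapes are meant to be UTF-8.

```python
import re
from urllib.parse import unquote

# Characters that break wiki markup if left literal in a URL (per the report).
REENCODE = {' ': '%20', '{': '%7B', '|': '%7C', '}': '%7D'}

# URL body starting at '//' up to whitespace, '|', '}', ']' or end of text.
URL_BODY = re.compile(r'//.*?(?=[\s|}\]]|$)')


def unicode_url(text):
    """Replace percent-encoded URL bodies in text with readable Unicode."""
    for url in URL_BODY.findall(text):
        # Drop anything from a following HTML tag onward.
        url = url.split('<')[0]
        try:
            # errors='strict' makes invalid UTF-8 escapes raise instead of
            # being silently replaced with U+FFFD.
            decoded = unquote(url, errors='strict')
        except UnicodeDecodeError:
            continue  # leave this URL untouched
        # Re-encode the markup-sensitive characters.
        for char, escaped in REENCODE.items():
            decoded = decoded.replace(char, escaped)
        text = text.replace(url, decoded)
    return text
```

For example, unicode_url('[http://fa.wikipedia.org/wiki/%D8%B3%D9%84%D8%A7%D9%85 link]') turns the escaped path into the readable word سلام, while an escaped pipe such as %7C stays escaped so it cannot be mistaken for wiki syntax.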

Version: unspecified
Severity: normal

Details

Reference
bz59681

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 2:19 AM
bzimport added a project: Collection.
bzimport set Reference to bz59681.
bzimport added a subscriber: Unknown Object (MLST).