
urlencode returns %26lrm%3B instead of percent-encoded UTF-8
Closed, Invalid
Public

Description

Author: omniplex

Description:
{{urlencode:{{DIRMARK}}}} returns %26lrm%3B

The LTR mark is U+200E, and according to a highly suspicious
algorithm published on [[m:Help:URL#Percent-Encoding]] that
should be:
2: -0010---0000---0000---1110
3: ----0010---000000---001110
4: ----0010-10000000-10001110
7: 11100010-10000000-10001110

> 11100010-10000000-10001110 => %E2%80%8E

See RFC 3987; it's straightforward (excluding punycode for
domains).
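
The bit-level steps above can be cross-checked with a short script. This is only an illustrative sketch in Python, not MediaWiki's own urlencode code; the helper names utf8_three_bytes and percent_encode are made up for this example, and the helper only covers the three-byte range that U+200E falls into.

```python
from urllib.parse import quote


def utf8_three_bytes(cp: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF as three UTF-8 bytes.

    Hypothetical helper for illustration; surrogates are not rejected.
    """
    if not 0x0800 <= cp <= 0xFFFF:
        raise ValueError("only the three-byte UTF-8 range is handled here")
    b1 = 0xE0 | (cp >> 12)           # 1110xxxx: top 4 bits
    b2 = 0x80 | ((cp >> 6) & 0x3F)   # 10xxxxxx: middle 6 bits
    b3 = 0x80 | (cp & 0x3F)          # 10xxxxxx: low 6 bits
    return bytes([b1, b2, b3])


def percent_encode(data: bytes) -> str:
    """Write every byte as %XX with uppercase hex digits."""
    return "".join(f"%{b:02X}" for b in data)


lrm = "\u200e"  # LEFT-TO-RIGHT MARK
manual = percent_encode(utf8_three_bytes(ord(lrm)))
library = quote(lrm)  # the standard library encodes as UTF-8 by default

print(manual)   # %E2%80%8E
print(library)  # %E2%80%8E
```

Both the manual bit shuffling and the standard library agree with the %E2%80%8E expected in the report.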


Version: 1.7.x
Severity: normal
Platform: Other
URL: http://meta.wikimedia.org/wiki/Help:URL#Percent-encoding

Details

Reference
bz6219

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 9:16 PM
bzimport set Reference to bz6219.
bzimport added a subscriber: Unknown Object (MLST).

{{DIRMARK}} returns &lrm;, so this would encode to %26lrm%3b.
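
To illustrate the difference, here is a small Python sketch (again an illustration, not the MediaWiki implementation): percent-encoding the literal entity text gives the reported result, while encoding the character the entity stands for gives the UTF-8 form. The variable names are chosen for this example only.

```python
from html import unescape
from urllib.parse import quote

entity_text = "&lrm;"  # the six ASCII characters of the HTML entity

# Encoding the entity text itself: '&' becomes %26 and ';' becomes %3B.
encoded_entity = quote(entity_text, safe="")

# Resolving the entity first yields U+200E, which encodes as UTF-8 bytes.
encoded_char = quote(unescape(entity_text), safe="")

print(encoded_entity)  # %26lrm%3B
print(encoded_char)    # %E2%80%8E
```

Python emits uppercase hex digits; %3B and %3b are equivalent in a percent-encoded URL.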

omniplex wrote:

Something was off with my caffeine level. I only looked at what I get for {{DIRMARK}};
it was translated to UTF-8 later (too late for urlencode). But somebody has since fixed
{{DIRMARK}} to return UTF-8, so urlencode now works as expected.