
Magic words (predefined templates) {{UTF-8DECODE:}} {{UTF-8ENCODE:}} (optional {{UTF-8ENCODE8:}} {{UTF-8ENCODE16:}})
Closed, Invalid
Public

Description

Author: gangleri

Description:
Hello!

A lot of code out there is written in &html_entity;, &#nnnn;, &#xnnn; or
%xx%yy%zz notation.
There are arguments for using either UTF-8 characters or &html_entity;, &#nnnn;,
&#xnnn;, %xx%yy%zz notation, depending on available fonts, objectives (trying to
read / debug / trace the code), etc.

*request*
New predefined templates {{UTF-8DECODE:}} and {{UTF-8ENCODE:}} should act like
{{SUBST:}} and change the page source. {{SUBSTUTF-8DECODE:}} and
{{SUBSTUTF-8ENCODE:}} would be "horrible" magic words.

*reason*
Such encoding and decoding is very painful. With preview, copy and paste,
characters can be lost. Decoding currently requires external tools, and copying
%xx%yy%zz notation from the browser's URL should belong to the past.

In the example from the URL,
{{UTF-8DECODE:משתמש זה
דובר '''[[:Category:User
he|עברית]]''' כמעט
כ'''[[:Category:User he-4|שפת אם]]'''.}}
would give
משתמש זה דובר '''[[:Category:User he|עברית]]''' כמעט כ'''[[:Category:User
he-4|שפת אם]]'''.
which displays here in MediaZilla according to the bidirectional algorithm.
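
To make the requested decoding concrete, here is a minimal sketch in Python (not MediaWiki code). The function name utf8_decode is hypothetical, and the sketch assumes that %xx sequences are UTF-8 bytes and that the character references follow HTML rules. Note that this naive version also decodes &amp; into a bare &, which may not be desirable inside wikitext.

```python
import html
from urllib.parse import unquote


def utf8_decode(text: str) -> str:
    """Hypothetical helper: turn %xx%yy escapes and HTML character
    references (&name;, &#nnnn;, &#xhhh;) into plain characters."""
    # Percent-decoding first; the bytes are interpreted as UTF-8.
    text = unquote(text, encoding="utf-8", errors="replace")
    # Then resolve named, decimal and hexadecimal character references.
    return html.unescape(text)


if __name__ == "__main__":
    # Both lines print the Hebrew word used in the example above.
    print(utf8_decode("%D7%A2%D7%91%D7%A8%D7%99%D7%AA"))
    print(utf8_decode("&#1506;&#1489;&#1512;&#1497;&#1514;"))
```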

{{UTF-8ENCODE:}} should do the opposite. An encoding using HTML entities might
need to be supported for better readability (by default or as an option). In
that case UTF-8ENCODE should encode characters without a named entity either in
decimal notation or in hexadecimal notation.
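
A rough sketch of the encoding direction, again in Python rather than MediaWiki code. The function name utf8_encode and the parameters base and use_entities are assumptions made for illustration only; they are not part of the request. The sketch leaves ASCII untouched, optionally uses HTML 4 named entities where one exists, and otherwise falls back to decimal or hexadecimal notation.

```python
from html.entities import codepoint2name


def utf8_encode(text: str, base: int = 10, use_entities: bool = False) -> str:
    """Hypothetical helper: replace non-ASCII characters with character
    references. base=10 gives &#nnnn;, base=16 gives &#xhhh;."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # ASCII stays readable as-is.
            out.append(ch)
        elif use_entities and cp in codepoint2name:
            # Prefer a named entity when HTML 4 defines one.
            out.append(f"&{codepoint2name[cp]};")
        elif base == 16:
            out.append(f"&#x{cp:X};")
        else:
            out.append(f"&#{cp};")
    return "".join(out)


if __name__ == "__main__":
    print(utf8_encode("caf\u00e9"))                     # decimal notation
    print(utf8_encode("caf\u00e9", base=16))            # hexadecimal notation
    print(utf8_encode("caf\u00e9", use_entities=True))  # named entity
```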

Please comment on whether the variants {{UTF-8ENCODE8:}} {{UTF-8ENCODE16:}}
{{UTF-8ENCODEENTITY:}} are required.

best regards reinhardt [[user:gangleri]]


Version: unspecified
Severity: enhancement
URL: http://epov.org/wd-gemet/index.php?title=Template%3AUser_he-4&diff=392686&oldid=391182

Details

Reference
bz5256

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 9:09 PM
bzimport added a project: MediaWiki-Parser.
bzimport set Reference to bz5256.
bzimport added a subscriber: Unknown Object (MLST).

gangleri wrote:

I assume that for {{UTF-8DECODE:}}

&amp; should never be decoded as &
*but*
& should be escaped as &amp;
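
One possible reading of this comment is that an ampersand must always leave the decoder in escaped form, while every other character reference is resolved. The Python sketch below illustrates that reading only; the function name decode_keep_amp is hypothetical, and percent-decoding is left out to keep the example short.

```python
import html
import re

# Matches a character reference (numeric or named) or a lone ampersand.
_AMP_OR_REF = re.compile(
    r"&(?:(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);)?"
)


def decode_keep_amp(text: str) -> str:
    """Hypothetical helper: decode character references, but keep every
    ampersand in the result escaped as &amp;."""
    def repl(match: re.Match) -> str:
        if match.group(1) is None:
            # A bare "&" that is not part of a reference gets escaped.
            return "&amp;"
        decoded = html.unescape(match.group(0))
        # References that resolve to "&" (&amp;, &#38;, &#x26;) stay escaped.
        return "&amp;" if decoded == "&" else decoded
    return _AMP_OR_REF.sub(repl, text)


if __name__ == "__main__":
    print(decode_keep_amp("a &amp; b"))  # the escaped ampersand is kept
    print(decode_keep_amp("a & b"))      # the bare ampersand is escaped
```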