
� in truncated captions
Closed, Resolved (Public)

Description

Using the Serbian user interface, � is displayed at the end of some Serbian
definitions (e.g. [[WiktionaryZ:biologija]]). --[[User:Red Baron|Red Baron]]
11:34, 29 August 2006 (CEST)

:It looks fine to me (Firefox). [[User:GerardM|GerardM]] 13:54, 1 September 2006
(CEST)
: It happens to me in French too: the � appears at the end of the part of the
definition that is displayed (before we click on the "+" sign) when the letter
at the cut point is a special character (é, è, ...). (Is that clear?)
[[User:Kipcool|Kipcool]] 19:25, 8 September 2006 (CEST)
:: Seen again today on the above-mentioned article (with Serbian interface).
[[User:Kipcool|Kipcool]] 14:47, 17 November 2006 (CET)

Taken from: http://wiktionaryz.org/index.php?title=Insect_room&oldid=596181


Version: unspecified
Severity: major
Platform: PC
URL: http://www.omegawiki.org/Expression:biologija

Details

Reference
bz8095

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 9:33 PM
bzimport set Reference to bz8095.
bzimport added a subscriber: Unknown Object (MLST).

Created attachment 4608
Screenshot of anomaly

Attached:

Incorrect_characters_in_OW_in_sr_UI.png (157×589 px, 6 KB)

Still happens with UI language set to 'sr' on
http://www.omegawiki.org/Expression:biologija. Added a screenshot and updated URL.

This bug is still there; the problem is that the truncation of the displayed text uses plain old substr (see OmegaWiki/Editor.php), which cuts at bytes, not at UTF-8 characters. So if the boundary happens to be inside a UTF-8 character, the result is an invalid UTF-8 sequence, displayed as � in most browsers.
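
The failure described above can be reproduced outside the wiki. The following is only a minimal sketch in Python, not the code from OmegaWiki/Editor.php: the function name `truncate_bytes`, the sample caption, and the 9-byte limit are all assumptions chosen so that the cut lands in the middle of a two-byte character.

```python
def truncate_bytes(text: str, limit: int) -> bytes:
    """Cut the UTF-8 encoding of `text` after `limit` bytes,
    the way a byte-oriented substr would."""
    return text.encode("utf-8")[:limit]


# Hypothetical caption; "ž" is encoded as two bytes (0xC5 0xBE) in UTF-8.
sample = "nauka o životu"

# "nauka o " is 8 bytes, so a 9-byte cut keeps only the first byte of "ž".
cut = truncate_bytes(sample, 9)

# A browser substitutes U+FFFD for the broken sequence; errors="replace"
# imitates that behaviour here.
shown = cut.decode("utf-8", errors="replace")
print(shown)
```

When the limit happens to fall between two characters, the same call returns valid UTF-8, which would explain why only some captions are affected.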

You need either to use mb_substr (or implement a similar function), or to clean the resulting string by dropping the incomplete trailing sequence (UTF-8 is self-synchronizing, so finding where that sequence starts is not a problem).
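
Both options can be sketched as follows. This is an illustration in Python under stated assumptions, not the fix that was applied to OmegaWiki: `truncate_chars` plays the role that mb_substr would play in PHP, and `trim_partial_utf8` is a hypothetical helper that cleans a byte-truncated string, assuming the text was valid UTF-8 before the cut.

```python
def truncate_chars(text: str, limit: int) -> str:
    """Option 1: cut at character boundaries (what mb_substr does in PHP)."""
    return text[:limit]


def trim_partial_utf8(data: bytes) -> bytes:
    """Option 2: drop an incomplete multi-byte sequence from the end of `data`.

    Assumes `data` was valid UTF-8 before it was cut at a byte offset.
    """
    i = len(data) - 1
    steps = 0
    # Step back over at most three continuation bytes (bit pattern 10xxxxxx).
    while i >= 0 and steps < 3 and (data[i] & 0xC0) == 0x80:
        i -= 1
        steps += 1
    if i < 0:
        return data  # empty input, or no lead byte to inspect
    lead = data[i]
    # The lead byte announces how long its sequence should be.
    if lead < 0x80:
        expected = 1
    elif (lead & 0xE0) == 0xC0:
        expected = 2
    elif (lead & 0xF0) == 0xE0:
        expected = 3
    elif (lead & 0xF8) == 0xF0:
        expected = 4
    else:
        return data  # not a lead byte; leave the input untouched
    have = len(data) - i
    return data if have >= expected else data[:i]


# Hypothetical caption; "ž" is two bytes (0xC5 0xBE) in UTF-8.
sample = "nauka o životu"
broken = sample.encode("utf-8")[:9]      # ends with a lone 0xC5
cleaned = trim_partial_utf8(broken).decode("utf-8")
print(repr(cleaned))
print(repr(truncate_chars(sample, 9)))
```

The trimming helper never has to look back more than three bytes, because a lead byte is always distinguishable from a continuation byte; this is the self-synchronizing property mentioned above. Note that the two options measure the limit differently (characters versus bytes), so they can return captions of different lengths for the same limit.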