
mb_check_encoding() change in php 5.4.10
Closed, Resolved, Public

Description

testIsUtf8WithMbstring is failing in PHP 5.4.10 with the strings:

  • "\xf8\x88\x80\x80\x80" (#6)
  • "\xfc\x84\x80\x80\x80\x80" (#7)
  • "\xf7\xbf\xbf\xbf" (#11)
  • "\xfb\xbf\xbf\xbf\xbf" (#12)
  • "\xf4\x90\x80\x80" (#18)

mb_check_encoding( $string, 'UTF-8' ) returns false for these strings in PHP 5.4.10 (did they make the checks stricter?)
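To see what the five strings have in common, here is a minimal sketch that decodes each one with the original UTF-8 bit layout, which allowed sequences of up to six bytes. The helper name nominalCodePoint is hypothetical and is not part of MediaWiki or mbstring. It suggests that every failing string encodes a value above U+10FFFF, the top of Unicode.

```php
<?php
// Hypothetical helper, not part of MediaWiki: decode one multi-byte sequence
// using the historical UTF-8 layout (lead bytes up to 0xFD, 2 to 6 bytes).
// Returns the nominal code point, or null if the bytes do not form exactly
// one such sequence. Overlong forms are not checked here.
function nominalCodePoint( $bytes ) {
	if ( $bytes === '' ) {
		return null;
	}
	$lead = ord( $bytes[0] );
	if ( $lead < 0x80 ) {
		return strlen( $bytes ) === 1 ? $lead : null;
	}
	// Count the leading 1 bits of the lead byte: that is the sequence length.
	$length = 0;
	for ( $mask = 0x80; $lead & $mask; $mask >>= 1 ) {
		$length++;
	}
	if ( $length < 2 || $length > 6 || strlen( $bytes ) !== $length ) {
		return null;
	}
	// The bits below the first 0 bit of the lead byte are payload.
	$codePoint = $lead & ( $mask - 1 );
	for ( $i = 1; $i < $length; $i++ ) {
		$byte = ord( $bytes[$i] );
		if ( ( $byte & 0xC0 ) !== 0x80 ) {
			return null; // not a continuation byte
		}
		$codePoint = ( $codePoint << 6 ) | ( $byte & 0x3F );
	}
	return $codePoint;
}

$failing = array(
	"\xf8\x88\x80\x80\x80",     // #6
	"\xfc\x84\x80\x80\x80\x80", // #7
	"\xf7\xbf\xbf\xbf",         // #11
	"\xfb\xbf\xbf\xbf\xbf",     // #12
	"\xf4\x90\x80\x80",         // #18
);
foreach ( $failing as $string ) {
	$codePoint = nominalCodePoint( $string );
	printf(
		"%s => U+%X (%s U+10FFFF)\n",
		bin2hex( $string ),
		$codePoint,
		$codePoint > 0x10FFFF ? 'above' : 'within'
	);
}
```

The nominal values come out as U+200000, U+4000000, U+1FFFFF, U+3FFFFFF and U+110000, so a validator that stops at U+10FFFF would reject all five.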

Hashar, I'm assigning this to you as 750db30d9 author.


Version: 1.21.x
Severity: normal

Details

Reference
bz43679

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 22 2014, 1:23 AM
bzimport set Reference to bz43679.

Did it succeed with a previous 5.4 release, or did it start failing with 5.4.10?

I do not have time to look at this right now. I guess we want our test to expect a different result whenever PHP is 5.4 or later.

This also happens for me on PHP 5.4.4-14 (Debian), so the change predates 5.4.10.
Most likely this item from the PHP 5.4.0 NEWS file indicates the change: "Ill-formed UTF-8 check for security enhancements. (Rui)"
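The version-dependent expectation suggested earlier in this thread could be sketched as below. This is only an illustration: the function name expectedForOutOfRange is hypothetical, and the 5.4.0 cutoff is an assumption based on the NEWS entry quoted above, not something verified against the mbstring source.

```php
<?php
// Hypothetical test helper: pick the expected mb_check_encoding() result for
// a string that encodes a value above U+10FFFF.
// Assumption: the stricter check is present from PHP 5.4.0 onwards.
function expectedForOutOfRange( $legacyExpected, $phpVersion = PHP_VERSION ) {
	if ( version_compare( $phpVersion, '5.4.0', '>=' ) ) {
		return false; // stricter mbstring rejects the string
	}
	return $legacyExpected; // keep whatever the old test expected
}

var_dump( expectedForOutOfRange( true, '5.3.10' ) ); // bool(true)
var_dump( expectedForOutOfRange( true, '5.4.10' ) ); // bool(false)
```

Branching on the version keeps the test passing on both sides of the change, at the cost of encoding a PHP implementation detail in the test suite.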

Related URL: https://gerrit.wikimedia.org/r/65300 (Gerrit Change I026eff69236187ce3d2ff2fc261a5e1d9cd88b24)

Both patches mentioned here have received -1's and likely need rework.

Change 48743 had a related patch set uploaded by PleaseStand:
Adapt StringUtils::isUtf8 to the top of Unicode at U+10FFFF

https://gerrit.wikimedia.org/r/48743

Change 48743 merged by jenkins-bot:
Adapt StringUtils::isUtf8 to the top of Unicode at U+10FFFF

https://gerrit.wikimedia.org/r/48743
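For readers who want to see what a check capped at U+10FFFF can look like without mbstring, here is a minimal sketch. It is not the merged change; it is a regular expression built from the well-formed byte ranges in the Unicode Standard (Table 3-7), and the function name isWellFormedUtf8 is hypothetical.

```php
<?php
// Hypothetical sketch, not the code from change 48743: accept only byte
// sequences that are well-formed UTF-8 up to U+10FFFF.
function isWellFormedUtf8( $string ) {
	$pattern = '/\A(?:
		  [\x00-\x7F]                        # U+0000..U+007F
		| [\xC2-\xDF][\x80-\xBF]             # U+0080..U+07FF
		| \xE0[\xA0-\xBF][\x80-\xBF]         # U+0800..U+0FFF, no overlongs
		| [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # U+1000..U+CFFF, U+E000..U+FFFF
		| \xED[\x80-\x9F][\x80-\xBF]         # U+D000..U+D7FF, no surrogates
		| \xF0[\x90-\xBF][\x80-\xBF]{2}      # U+10000..U+3FFFF
		| [\xF1-\xF3][\x80-\xBF]{3}          # U+40000..U+FFFFF
		| \xF4[\x80-\x8F][\x80-\xBF]{2}      # U+100000..U+10FFFF
	)*\z/x';
	// preg_match() returns 1 on match, 0 on no match, false on error.
	return preg_match( $pattern, $string ) === 1;
}

var_dump( isWellFormedUtf8( "\xf4\x8f\xbf\xbf" ) ); // bool(true), U+10FFFF
var_dump( isWellFormedUtf8( "\xf4\x90\x80\x80" ) ); // bool(false), #18 above
```

On very long inputs a pattern like this may hit PCRE backtracking or recursion limits, in which case preg_match() returns false and the sketch reports the string as invalid.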