Page MenuHomePhabricator

wfBaseConvert gives incorrect results and sometimes raises warning if using gmp and input starts with '0x' but base != 16 ('x' is valid in base 34-36)
Closed, ResolvedPublic

Description

Author: fred

Description:
For example:

  • converting '0x10' from base36 to base36 results in '000g' instead of '0x10'
  • converting '0x1234567890abhaer' from base36 to base36 returns '000000000000000000, and raises this warning: "Warning: gmp_init(): Unable to convert variable to GMP - wrong type! in /home/fredemmott/test2.php on line 3"; presumably this is because 'h' isn't valid in an executable string

Tested with PHP 5.5.9 and HHVM 3.2.0 against Mediawiki master and 1.22wmf22

Isolated examples showing the cause:

<?php
$in = '0x1234567890abhaer';
$gmp = gmp_init($in, 36);
$val = gmp_strval($gmp, 36);
var_dump(str_pad($val, strlen($in), '0', STR_PAD_LEFT));

<?php
$in = '0x10';
$gmp = gmp_init($in, 36);
$val = gmp_strval($gmp, 36);
var_dump(str_pad($val, strlen($in), '0', STR_PAD_LEFT));

This issue was found by WfBaseConvertTest::testIdentity intermittently failing on HHVM's CI system since we added GMP support. Cause found by fuzzing:

<?php

function test() {

$chars = '0123456789abcdefghijklmnopqrstuvwxyz';
$base = mt_rand(2, 36);
$len = mt_rand(10, 100);
$str = '';
for ($i = 0; $i < $len; ++$i) {
  $str .= $chars[mt_rand(0, $base - 1)];
}
$gmp = gmp_init($str, $base);
$val = gmp_strval($str, $base);
$val = str_pad($val, strlen($str), '0', STR_PAD_LEFT);
if ($val !== $str) {
  throw new Exception('Not same - base '.$base."\n-".$str."\n+".$val);
}

}

for($i = 0; $i < 100000; ++$i) {

test();

}


Version: 1.24rc
Severity: normal
OS: Linux
See Also:
https://bugs.php.net/bug.php?id=50175
https://bugs.php.net/bug.php?id=55398

Details

Reference
bz69249

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 3:28 AM
bzimport set Reference to bz69249.
bzimport added a subscriber: Unknown Object (MLST).

fred wrote:

s/executable string/hexadecimal string/... oops.

Thanks for taking the time to report this!

Note to myself: includes/GlobalFunctions.php in Core.

CC'ing Aaron as he's worked on wfBaseConvert before.

fred wrote:

Just to be clear re the hiphop keyword: this bug also exists if using PHP5 with the gmp extension.

I think this is a case where gmp_init() is not behaving as documented on php.net. ltrim( $input, '0' ) should suffice as a workaround.

The documentation:

The base may vary from 2 to 36. If base is 0 (default value), the actual base is determined from the leading characters: if the first two characters are 0x or 0X, hexadecimal is assumed, otherwise if the first character is "0", octal is assumed, otherwise decimal is assumed.

[It doesn't say "if and only if", though there is no mention of autodetection when base is not 0.]

PHP 5.6, from convert_to_gmp() in ext/gmp/gmp.c:

if (Z_STRLEN_P(val) > 2) {
if (numstr[0] == '0') {

		if (numstr[1] == 'x' || numstr[1] == 'X') {
			base = 16;
			skip_lead = 1;
		} else if (base != 16 && (numstr[1] == 'b' || numstr[1] == 'B')) {
			base = 2;
			skip_lead = 1;
		}

}
}

[Why is this code even present when simply leaving it out, according to the GMP documentation https://gmplib.org/manual/Assigning-Integers.html#Assigning-Integers, would produce the documented behavior?]

HHVM, from variantToGMPData():

//Figure out what Base to use based on the ~*data*~
if (strLength > 1 && dataString[0] == '0') {

if (strLength > 2) {
  if (dataString[1] == 'x' || dataString[1] == 'X') {
    base = 16;
    dataString = dataString.substr(2);
  } else if (base < 12 && (dataString[1] == 'b'
                           || dataString[1] == 'B')) {
    base = 2;
    dataString = dataString.substr(2);
  }
}
if (base == -1) {
  base = 8;
}

} else if (strLength == 0) {

dataString = s_gmp_0.get();

}

if (base == -1) {

base = GMP_DEFAULT_BASE;

}

[Also note the use of -1 rather than 0 for autodetect.]

This is a very old bug in PHP, likely caused by the following commit:

https://github.com/php/php-src/commit/cc065b33514a3ef6533de6b5406db954a5f4b9c4

Notice that the base == 0 check was lost. Then the committer noticed a problem and "fixed" it:

https://github.com/php/php-src/commit/6829710dceb0d39a06b7560a493a462e7504657b

(In reply to Kevin Israel (PleaseStand) from comment #6)

https://bugs.php.net/bug.php?id=50175
https://bugs.php.net/bug.php?id=55398

Fixed for PHP 5.6.1. Just have to fix HHVM and add a workaround to MediaWiki for older PHP versions.

Change 157870 had a related patch set uploaded by PleaseStand:
wfBaseConvert(): Work around PHP Bug #50175

https://gerrit.wikimedia.org/r/157870

Change 157870 merged by jenkins-bot:
wfBaseConvert(): Work around PHP Bug #50175

https://gerrit.wikimedia.org/r/157870