-rakkaus:#mediawiki-i18n- [18-Nov-2014 14:18:47 UTC] PHP Notice: Uninitialized string offset: 710 in /www/translatewiki.net/w/includes/libs/JavaScriptMinifier.php on line 571
Description
Details
- Reference
- bz73556
Related Objects
Event Timeline
Excerpt:
class JavaScriptMinifier { public static function minify( $s, $statementsOnOwnLine = false, $maxLineLength = 1000 ) { // .. $out = ''; $pos = 0; $length = strlen( $s ); $lineLength = 0; $last = ';'; while( $pos < $length ) { // .. $out .= $token; $lineLength += $end - $pos; $last = $s[$end - 1]; $pos = $end;
The while loop manipulates $end to reflect where we are in the code. In some cases it advanced artificially based on changes or assumptions it made. I guess it must be off in some cases.
Does this one happen often on translatewiki? Can you produce a stack trace (so that we know whether it's triggered by ResourceLoaderFileModule, WikiModule, something else, etc. And ideally also an excerpt of $s (e.g. the first 20 characters) which will probably identify what the input script was.
@Nemo: Can you help get a test case? By getting a stack trace we'll narrow down whehter this is from a file module, wiki module or other generated script. And by getting the argument we'll identify the script in question.
I finally got a backtrace and a URL. There must be some sort of cache as the notice is not reproducible by accessing the url.
It seems the reason is missing / from beginning of a comment: https://translatewiki.net/wiki/User:Arystanbek/common.js
2014-12-04 11:48:09 translatewiki.net translatewiki_net-bw_: [6d10d1d5] /w/load.php?debug=false&lang=kk-cyrl&modules=user&only=scripts&skin=vector&user=Arystanbek&version=20141204T114746Z&* ErrorException from line 571 of /www/translatewiki.net/w/includes/libs/JavaScriptMinifier.php: PHP Notice: Uninitialized string offset: 710 #0 /www/translatewiki.net/w/includes/libs/JavaScriptMinifier.php(571): MWExceptionHandler::handleError(8, 'Uninitialized s...', '/www/translatew...', 571, Array) #1 /www/translatewiki.net/w/includes/resourceloader/ResourceLoader.php(194): JavaScriptMinifier::minify('/*?User:Arystan...', false, 1000) #2 /www/translatewiki.net/w/includes/resourceloader/ResourceLoader.php(1035): ResourceLoader->filter('minify-js', '/*?User:Arystan...') #3 /www/translatewiki.net/w/includes/resourceloader/ResourceLoader.php(647): ResourceLoader->makeModuleResponse(Object(ResourceLoaderContext), Array, Array) #4 /www/translatewiki.net/w/load.php(45): ResourceLoader->respond(Object(ResourceLoaderContext)) #5 {main}
I can still reproduce this:
> $a = json_decode("\" * \u0411\u0435\u0442 \u043c\u04d9\u0442\u0456\u043d\u0456\u043d \u041c\u0435\u0434\u0438\u0430\u0423\u0438\u043a\u0438 \u0445\u0430\u0431\u0430\u0440 \u0430\u0442\u0442\u0430\u0440\u044b\u043d\u0430 \u0430\u0443\u044b\u0441\u0442\u044b\u0440\u044b\u043f \u043a\u04e9\u0440\u0441\u0435\u0442\u0435\u0442\u0456\u043d \u0441\u0456\u043b\u0442\u0435\u043c\u0435\u043d\u0456 \u049b\u04b1\u0440\u0430\u043b\u0434\u0430\u0440 \u0431\u04e9\u043b\u0456\u043c\u0456\u043d\u0435 \u049b\u043e\u0441\u0443\n * Revision: 1.0\n * Author: Edokter\n * *Source code: [[:en:MediaWiki:Gadget-ShowMessageNames.js]]\n *\/\n \n$( document ).ready( function() {\n mw.util.addPortletLink(\n 'p-tb',\n location.href.replace( location.hash, '' ) + ( location.search ? '&' : '?' ) + 'uselang=qqx',\n '\u0416\u04af\u0439\u0435 \u0445\u0430\u0431\u0430\u0440\u044b\u043d\u044b\u04a3 \u0430\u0442\u0430\u0443\u043b\u0430\u0440\u044b',\n 't-messagenames',\n '\u0411\u0435\u0442 \u043c\u04d9\u0442\u0456\u043d\u0456\u043d \u041c\u0435\u0434\u0438\u0430\u0423\u0438\u043a\u0438 \u0445\u0430\u0431\u0430\u0440 \u0430\u0442\u0430\u0443\u043b\u0430\u0440\u044b\u043d\u0430 \u0430\u0443\u044b\u0441\u0442\u044b\u0440\u044b\u043f \u043a\u04e9\u0440\u0441\u0435\u0442\u0443'\n );\n});\""); > echo JavaScriptMinifier::minify($a); PHP Notice: Uninitialized string offset: 643 in /vagrant/mediawiki/includes/libs/JavaScriptMinifier.php on line 579 PHP Stack trace: PHP 3. eval() /vagrant/mediawiki/maintenance/eval.php:78 PHP 4. JavaScriptMinifier::minify() /vagrant/mediawiki/maintenance/eval.php(78) : eval()'d code:1
Test case: JavaScriptMinifier::minify('*/');
The * is recognized as a punctuation character, and the / is recognized as a regex start, that's where there are increments of 2, $end reaches 3, then we hit the "break" in the loop and $end is staying overflown.
Possible fix, inserting near line 484:
// Regexp literal, search to the end, skipping over backslash escapes and // character classes for( ; ; ) { do{ $end += strcspn( $s, '/[\\', $end ) + 2; } while( $end - 2 < $length && $s[$end - 2] === '\\' ); $end--; if( $end - 1 >= $length || $s[$end - 1] === '/' ) { break; } do{ $end += strcspn( $s, ']\\', $end ) + 2; } while( $end - 2 < $length && $s[$end - 2] === '\\' ); $end--; }; + if( $end > $length ) { + $end--; + } // Search past the regexp modifiers (gi) while( $end < $length && ctype_alpha( $s[$end] ) ) { $end++; }
Change 399861 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] JavaScriptMinifier: Improve docs around parsing of regexp literals
@Od1n Thank you for your patch. To our knowledge, the impact of this PHP notice is low because the fallback behaviour when the error happens seems to result in the correct outcome. As such, this has relatively low priority, and relatively high risk of breaking something. As such, I cannot promise that this will be implemented soon.
However, I will review the code thoroughly and consider your patch. As a starting point, I have improved the test coverage of this class, removed some unused code, and improved the documentation.
- https://gerrit.wikimedia.org/r/399853 (JavaScriptMinifier: Improve code coverage)
- https://gerrit.wikimedia.org/r/399855 (JavaScriptMinifier: Remove support for unused statementsOnOwnLine parameter)
- https://gerrit.wikimedia.org/r/399861 (JavaScriptMinifier: Improve docs around parsing of regexp literals)
Indeed, the error happens only on invalid input. Still, there is a clear possibility of overflow here, fixing it is good to do. You should consider as luck it resulted in just a php notice :)
But please consider my above patch only as a starting point, a demonstration of concept. It probably could be improved.
Change 399861 merged by jenkins-bot:
[mediawiki/core@master] JavaScriptMinifier: Improve docs around parsing of regexp literals
@Od1n Now that we have full code/test coverage, I'd be willing to consider your patch. I'll look at it this later this week.
If you are familiar using the command prompt (also known as Terminal), could you consider submitting the patch to Gerrit? See Developer access and Gerrit for details. Alternatively, if you Git already, you can use Patch uploader (no git-review, ssh or Gerrit account needed).
Change 403331 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] JavaScriptMinifier: Improve docs for parsing of string literals
Change 403332 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] JavaScriptMinifier: Fix "Uninitialized offset" in string and regexp parsing
@Od1n I've used your patch as starting point, but applied it slightly differently.
- Instead of checking outside the do-while loop, I've placed the check closer to the point where I believe the incorrect offset is created. I did so by changing an existing check, this helps avoid introducing a second check and should make the code easier to follow. However, if you believe there are cases where this problem can be reached for the other branches in the do-while loop, let me know and we can consider adding a second check there, and/or by placing it outside the loop like you had originally.
- I noticed the same problem exists with the parsing of string literals, and applied a similar fix there.
The added tests in https://gerrit.wikimedia.org/r/403332 consistently produce PHP Notice: Uninitialized string offset when run without the patch to JavaScriptMinifier.php.
Thank you for your work on this.
I trust you about it, you probably have better understood that code than myself.
I think the most important is to apply the "overflow fixes" only in code paths where such overflows can technically happen. If ever there are doubts, better fix a bit less than too much.
So, you don't think there is the same issue with the 2nd do/while of the for loop? I haven't checked it thoroughly, but it just looks very similar, with these n+2 steps. That's why I added the "overflow fix" after the for loop in my demonstration patch, to cover the two do/while's.
Would you be able to write a test that breaks the 2nd do/while?
Change 403331 merged by jenkins-bot:
[mediawiki/core@master] JavaScriptMinifier: Improve docs for parsing of string literals
Change 403332 merged by jenkins-bot:
[mediawiki/core@master] JavaScriptMinifier: Fix "Uninitialized offset" in string and regexp parsing
Change 406972 had a related patch set uploaded (by Krinkle; owner: Krinkle):
[mediawiki/core@master] JavaScriptMinifier: Fix "Uninitialized offset" in regexp char class parsing