Page MenuHomePhabricator

JS syntax errors should not affect other modules in a load.php request
Closed, ResolvedPublic

Description

When ResourceLoader combines multiple .js scripts into a single bundle, they're given separate function contexts (thus separate local variable namespaces) and are executed separately with try/catch wrappers to isolate faults.

However, a JavaScript syntax error will cause the entire batch of modules to fail, as nothing will be executed in the first place...

This can let a broken gadget for instance break important parts of the UI if they end up getting bundled together:

unmatched ) in regular expression <- syntax err in a gadget
addButton is not defined <- editing widgets haven't been loaded either

A syntax validity check might be a good thing to throw in before (or after?) minification.


Version: 1.18.x
Severity: normal

Details

Reference
bz28626

Event Timeline

bzimport raised the priority of this task from to High.Nov 21 2014, 11:33 PM
bzimport set Reference to bz28626.

Would minified CSS run into the same problem?

CSS parsing generally recovers from syntax errors (like HTML usually does), as opposed to rejecting the entire file at the first error, so in most cases only the damaged rule itself will be broken.

Could still be an issue if there's a major error at the end of one file that breaks interpretation of the beginning of the next file that's been appended, but probably much lower impact.

Are there any JS syntax checkers out there that are GPL compatible? It would be nice if we could add one into ResourceLoader's minifier. Unfortunately, most of them seem to be forks of JSLint which isn't GPL-compat.

(In reply to comment #3)

Are there any JS syntax checkers out there that are GPL compatible? It would be
nice if we could add one into ResourceLoader's minifier. Unfortunately, most of
them seem to be forks of JSLint which isn't GPL-compat.

The current minifier essentially parses JS, it may be possible to tweak it to validate the syntax too.

(In reply to comment #1)

Would minified CSS run into the same problem?

Except for core modules, CSS isn't combi-minified. Each module has an mw.loader.implement()-call with an argument for the module javascript context, and an argument with the CSS. The CSS is unable to infect/mess up other CSS.

(In reply to comment #4)

The current minifier essentially parses JS, it may be possible to tweak it to
validate the syntax too.

The minifier uses a very reduced js grammar to simplify the code and speed up the process. Expanding it to fully validate the syntax would make it similar to jsminplus, which is way slower and more complex.
I'd suggest to only check the syntax on save for gadget pages and user js pages and display a warning to the user if it fails.

miao_the_cat wrote:

I was wondering if anyone has a solution for this bug? I am facing some difficulty with a Joomla-based jwiki Mediawiki installation. The install goes through without any problems but I cannot see the edit toolbar when trying to edit. I can't load an external editing toolbar either. I realised after stepping through the Chrome browser that all the .js files related to jwiki Mediawiki (eg. edit.js, wikibits.js) are not loading during the session.

Any advice on how I can proceed will be greatly appreciated.

Cheers,

Christopher

(In reply to comment #8)

I was wondering if anyone has a solution for this bug? I am facing some
difficulty with a Joomla-based jwiki Mediawiki installation. The install goes
through without any problems but I cannot see the edit toolbar when trying to
edit. I can't load an external editing toolbar either. I realised after
stepping through the Chrome browser that all the .js files related to jwiki
Mediawiki (eg. edit.js, wikibits.js) are not loading during the session.

Any advice on how I can proceed will be greatly appreciated.

Are you using MediaWiki 1.17? Are you getting any JavaScript errors in the console?

miao_the_cat wrote:

I am not exactly using Mediawiki. I am using jwiki, which is an extension of Joomla content management system. It was ported over based on Mediawiki 1.15.0. Info about jwiki is found here: http://hallowelt.biz/software/joomlawiki/

The error I see in the console is:
Uncaught ReferenceError: addButton is not defined

I tracked the function addButton to Edit.js file (which exists in Mediawiki as well) and realised that none of the javascript files are loaded in under the resources tab - which I assume means that somehow Chrome couldn't load the js files.

(In reply to comment #10)

I am not exactly using Mediawiki. I am using jwiki, which is an extension of
Joomla content management system. It was ported over based on Mediawiki
1.15.0.

This bug is about a feature introduced in 1.17.0, so your issue is very likely to be unrelated.

miao_the_cat wrote:

Apologies then. Please disregard my comments and queries.

(In reply to comment #7)

I'd suggest to only check the syntax on save for gadget pages and user js pages
and display a warning to the user if it fails.

That might be a stop-gap measure. If so, I'd suggest to even refuse to save for any js pages in MediaWiki-namespace on syntax errors. For user js, a mere warning may be sufficient.

I was directed here because of the Commons upload problem where the Category selection disappeared from all English upload forms named: Basic, Special and Transfer - (all essentially are the same). Here is the screenshot of the problem.
http://commons.wikimedia.org/wiki/File:Upload_options.jpg. To me, this is a major drawback as I have thousands of images to upload and the new upload wizard hinders the process in the context of my work. Please fix this ASAP. Thank you.

(In reply to comment #14)

I was directed here because of the Commons upload problem where the Category
selection disappeared from all English upload forms named: Basic, Special and
Transfer - (all essentially are the same). Here is the screenshot of the
problem.
http://commons.wikimedia.org/wiki/File:Upload_options.jpg. To me, this is a
major drawback as I have thousands of images to upload and the new upload
wizard hinders the process in the context of my work. Please fix this ASAP.
Thank you.

I don't see how that is even *remotely* related to this bug.

(In reply to comment #15)

(In reply to comment #14)

I was directed here because of the Commons upload problem where the Category
selection disappeared from all English upload forms named: Basic, Special and
Transfer - (all essentially are the same). Here is the screenshot of the
problem.
http://commons.wikimedia.org/wiki/File:Upload_options.jpg. To me, this is a
major drawback as I have thousands of images to upload and the new upload
wizard hinders the process in the context of my work. Please fix this ASAP.
Thank you.

I don't see how that is even *remotely* related to this bug.

Sorry if I posted here by mistake, but this is the bug number I was given in the Commons village pump in this post -->http://commons.wikimedia.org/wiki/Commons:Village_pump#Missing_categories_selection_from_English_upload_forms

Just for the record, having a JSLint or JSHint (that is GPL) compatible in PHP doesn't (directly) help this bug getting fixed.

Many lints and hints are good practices or could cause constructions not to work, or exceptions to be thrown. But just checking syntax errors probably requires a different (possibly simpler) kind of validator.

This could be derived from JSHint or JSLint-ish tools, though.

But the kind of thing we should probably check for:

  • Unclosed braces, brackets, quotes, apostrophes and parentheses.
  • Closing too many of the above.
  • Anything else ?

Since I'm brushing up on my low-level parsing skills anyway ;) I'm making a quick stab at building a PHP-based recursive-descent parser following the lexical & syntactical grammars in the ECMAScript standard -- this should be able to determine pretty cleanly what does/doesn't parse, and could be invoked at page save time to give feedback on identifying syntax errors in editable .js pages.

May or may not end up reasonable enough to actually run it within ResourceLoader; like minification, the result can be cached so it might actually be reasonable to do. :)

(In reply to comment #18)

Since I'm brushing up on my low-level parsing skills anyway ;) I'm making a
quick stab at building a PHP-based recursive-descent parser following the
lexical & syntactical grammars in the ECMAScript standard -- this should be
able to determine pretty cleanly what does/doesn't parse, and could be invoked
at page save time to give feedback on identifying syntax errors in editable .js
pages.

Just to save you a bit of work, note that there is already JSMinPlus [1][2], which is pretty much a JS parser, written in pure PHP. It should be easy to adapt it a bit to make it a syntax checker.

[1] Desription: http://crisp.tweakblogs.net/blog/1856/jsmin+-version-13.html
[2] Source: http://files.tweakers.net/jsminplus/jsminplus.zip

Did you guys decide to use jsmin* or did brion write his own? How close are we to closing this?

I've done a little prelim poking at a custom parser for my own edification (mostly to make sure I better understood the ECMAScript specs) but it's a relatively low priority for me.

It looks like jsmin+'s parser is indeed legit per description at http://crisp.tweakblogs.net/blog/1665/a-new-javascript-minifier-jsmin+.html -- that's a good catch, paul!

On quick peek, looks like it's got a standalone JSParser class that produces a parse tree (and throws exceptions on parse error) which could be called explicitly, or simply used via the JSMinPlus class which calls through to the parser, then flattens the parse tree into minified source.

License is... MPL 1.1/GPL 2.0/LGPL 2.1 so it should be fine to integrate into MediaWiki (yay!)

It's probably worth comparing jsmin+'s minification qualities to our current home-grown minifier; it sounds like just swapping to jsmin+ as a minifier might be more robust (since it should be able to detect a JavaScript parse failure and throw an error that lets us skip the bad file) but we could also use just the parser as a verifier.

If nothing else, I'd strongly recommend grabbing it and using it for the minification unit tests (to verify that minified source parses correctly).

The library doesn't appear to have been updated since May 2009 and its parser is based on the JS-based implementation in https://github.com/mozilla/narcissus/wiki -- so it might need/want some slight tweaks for newer ECMAScript standard updates (though it's unlikely we'll be using any such new features in our code!)

Created attachment 8752
Quick patch adding jsmin+ and running its JS parser on JavaScriptMinifier test case results

This quick patch adds jsminplus.php into libs/, makes it and its JSParser available via autoloader, and calls the parser on post-minified JS bits in JavaScriptMinifierTest.

Unfortunately many of the test cases don't appear to be valid in the first place...

..EE.E....E.EE................E.........................

Time: 0 seconds, Memory: 23.50Mb

There were 7 errors:

  1. JavaScriptMinifierTest::testJavaScriptMinifierOutput with data set #2 ('\' Foo \\\' bar \\ baz \\\' quox \' .', '\' Foo \\\' bar \\ baz \\\' quox \'.')

Exception: Parse error: Unexpected token; token 3 expected in file 'minify-test.js' on line 1

  1. JavaScriptMinifierTest::testJavaScriptMinifierOutput with data set #3 ('\\" Foo \\" bar \\\\n baz \\" quox " .', '\\" Foo \\" bar \\\\n baz \\" quox ".')

Exception: Parse error: Illegal token in file 'minify-test.js' on line 1

  1. JavaScriptMinifierTest::testJavaScriptMinifierOutput with data set #5 ('/ Foo \\/ bar [ / \\] / ] baz / .', '/ Foo \\/ bar [ / \\] / ] baz /.')

Exception: Parse error: Unexpected token; token 3 expected in file 'minify-test.js' on line 1

  1. JavaScriptMinifierTest::testJavaScriptMinifierOutput with data set #10 ('return

x;', 'return
x;')
Exception: Parse error: Invalid return in file 'minify-test.js' on line 1

  1. JavaScriptMinifierTest::testJavaScriptMinifierOutput with data set #12 ('continue

x;', 'continue
x;')
Exception: Parse error: Invalid continue in file 'minify-test.js' on line 1

  1. JavaScriptMinifierTest::testJavaScriptMinifierOutput with data set #13 ('break

x;', 'break
x;')
Exception: Parse error: Invalid break in file 'minify-test.js' on line 1

  1. JavaScriptMinifierTest::testJavaScriptMinifierOutput with data set #30 ('return/ x/g', 'return/ x/g')

Exception: Parse error: Invalid return in file 'minify-test.js' on line 1

The return/continue/break bits appear to be problematic because they don't appear in a legitimate context (return must be in a function, continue/break must be in a looping construct). The quoted strings... ugh I haven't tried to decipher those yet.

Attached:

(In reply to comment #21)

It's probably worth comparing jsmin+'s minification qualities to our current
home-grown minifier; it sounds like just swapping to jsmin+ as a minifier might
be more robust (since it should be able to detect a JavaScript parse failure
and throw an error that lets us skip the bad file) but we could also use just
the parser as a verifier.

AFAIR JsMinPlus had been considered as a minifier for Mediawiki but turned out to be much slower than the alternatives, see bug 27691 for some benchmarks.

(In reply to comment #22)

Unfortunately many of the test cases don't appear to be valid in the first
place...

Yeah, they weren't designed to be complete js programs, just little fragments to test edge cases that are likely to be broken by other minifiers. Should be straightforward to make them valid programs, though.

Ok r91591 adds JSMin+ to libs and uses its parser on the minifier unit tests (now tweaked to be valid JS programs).

r91608 adds a $wgResourceLoaderValidateJs option (on by default) and uses JSMin+'s parser to validate JS coming from files or wiki pages into ResourceLoader. On parse error, the bad script chunk is replaced with a JS exception throw which lists the file/page name, line number, and parse error.

This ought to do the job, but I recommend some more thorough testing to make sure it doesn't break anything & some performance testing -- it does look like JSParser is a bit sluggish but I haven't really solidly compared its behavior.

Advantage of using a real parser here is of course that we can report errors based on the original line numbers rather than forcing people to switch to debug mode just to get a location.

Per notes about memory usage from TranslateWiki (bug 29784) I've disabled validation for file-backed JS modules by default in r91914.

Large libraries like jQuery & jQuery.ui when bundled up can hit the memory limits sooner than we like; while the core jQuery survived at my default setting, it would fail at 80M for me.

The primary target for validation is on-wiki pages that are editable, while things on disk are provided with MediaWiki core & extensions and should already be debugged, so they don't really need to be validated anyway.

This may still though die on largeish input from the wiki, so would be nice to cut it down. PHP/Zend's in-memory representation of zvals is notoriously inefficient, so the parse tree structure is likely super wasteful if it hasn't been optimized specifically for memory.

(In reply to comment #25)

This ought to do the job, but I recommend some more thorough testing to make
sure it doesn't break anything & some performance testing -- it does look like
JSParser is a bit sluggish but I haven't really solidly compared its behavior.

It has been a month and there aren't reports of problems on TWN. Closing this.

Bugs in implmentation or performance should be reported separately.

Removing the bug 29784 dependency -- it should still be fixed but isn't as big a problem for smaller site JS files.