Page MenuHomePhabricator

QUnit tests for jquery.makeCollapsible and jquery.color failing intermittently on Jenkins
Closed, ResolvedPublic

Description

QUnit tests for jquery.makeCollapsible are often failing on Jenkins. I've seen this at least three times in the last week or so, most recently on https://gerrit.wikimedia.org/r/#/c/124051/ . Every time the failure was within the jquery.makeCollapsible tests (but I'm not entirely sure if on the same exact test).

This might be related to the fake timers being used in them.


Version: 1.23.0
Severity: major

Details

Reference
bz63579

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 3:17 AM
bzimport set Reference to bz63579.

Warning: PhantomJS timed out, possibly due to a missing QUnit start() call.\x07 Use --force to continue

On https://gerrit.wikimedia.org/r/#/c/124366/ and some more of my changes from last days

Possible regression from gerrit 117153

Please have a look into this tests. Thanks.

  • Bug 63647 has been marked as a duplicate of this bug. ***

This has nothing to do with a time limit of any kind. The qunit tests run quicker than ever and have recently been sped up a lot further by getting rid of slow animations and http calls to the server etc.

This is failing because for some reason QUnit unexpectedly dies at which point PhantomJS never gets the 'end' event callback and thus times out. It does not time out because the tests are too long or anything.

mediawiki-core-qunit is now sometime stuck. I killed a few jobs earlier today that ran for multiple hours.

Just killed https://integration.wikimedia.org/ci/job/mediawiki-core-qunit/19068/console which was running for 10 minutes.

Might want to add adjust the timeout of jobs to something more reasonable than the current 360 minutes.

The internal time out is 30 seconds.

It hanging for more than 30 seconds is new, that didn't happen until today.

Both that 30s timeout and the 360min timeout should not be triggered because the tests aren't actually taking more than < 10-20 seconds to run. Something is causing it to get stuck or in some kind of loop...

This was resolved after the following upgrades were made:

The latter is what made the false failures go away, so presumably the changes in those QUnit releases fixed it (possibly the changes in v1.12.0 and v1.14.0 that cache setTimeout and clearTimeout to protect against mocks).

It's recurred. See https://integration.wikimedia.org/ci/job/mwext-GuidedTour-qunit/220/console . It's from about 6 hours ago.

It's causing spurious failures in GuidedTour (likely other extensions too). E.g.:

19:08:11 >> Message: after expanding: contentItem is visible
19:08:11 >> Actual: false
19:08:11 >> Expected: true

Someone really needs to take a look at this. It is causing major friction in the continuous integration pipeline due to spurious failures.

Krinkle/Antoine: can you take a look please.

This is an old report. Been fixed a while ago.

Hm.. will give it another look. Although I doubt it's a bug in core.

I'm going to work on this as well, since it's a recurring problem affecting the Growth team.

Actually, the first failure (at least sometimes, such as 220 mentioned above) is jquery.color.

From the full log (https://integration.wikimedia.org/ci/job/mwext-GuidedTour-qunit/220/consoleFull) (without full it shows the tail):

19:08:06 jquery.color - animate...ERROR
19:08:06 >> Message: Expected 1 assertions, but 0 were run

Change 143481 had a related patch set uploaded by Mattflaschen:
Use QUnit.asyncTest even for fake timers, to fix spurious failures

https://gerrit.wikimedia.org/r/143481

Considering it is only affecting tests related to asynchronous Date manipulation, I suspect one or more of the extensions the Growth team maintains is interfering with that interface.

That was the case a few months back as well (that time it was an infrastructure library inside Grunt/Jenkins causing it).

I can't reproduce this in MediaWiki core or any of the other extensions I can find. The test for MediaWiki core alone or those run for extensions such as VisualEditor and other projects are all consistently passing.

(In reply to Krinkle from comment #19)

Considering it is only affecting tests related to asynchronous Date
manipulation, I suspect one or more of the extensions the Growth team
maintains is interfering with that interface.

I'm not sure what you mean by "asynchronous Date manipulation". Sinon's fake timer methods are synchronous, AFAICT, just a simulation where you advance the clock manually.

As far as the actual code (jquery.color.js and jquery.makeCollapsible.js), they are both using .promise (https://api.jquery.com/promise/) to monitor completed actions in the fx queue.

GuidedTour (one of the extensions this bug affects) does not have any asynchronous tests, nor does GettingStarted. Although we use setTimeout, we certainly don't do anything weird like redefining it.

EventLogging (one of our dependencies) does use asynchronous tests, but we're hardly the only extension to use EventLogging. (Note, EventLogging is not a Growth extension). I don't know if any extensions with QUnit tests use EventLogging, but I suspect they do.

If you see a bug (or unexpected interaction) which we can fix in a Growth extension, let us know.

However, practically speaking, we should focus on getting this bug fixed. My next step (as mentioned on Gerrit) will be to try use PhantomJS's remote debugging to step through and find exactly what's happening.

Doesn't look like we have a GettingStarted example yet, so:

https://integration.wikimedia.org/ci/job/mwext-GettingStarted-qunit/58/consoleFull

(Jul 3, 2014 12:13:51 AM per https://integration.wikimedia.org/ci/job/mwext-GettingStarted-qunit/58/)

Seems to be the same set. Again, jquery.color is the first error.

If it affects any other extensions, please add.

Turning the tests from sync to async in core is the functional equivalent of removing the assertion "test framework works as expected". It's like boldly removing unit tests or parser tests because when executed with a certain extension, they no longer match expected behaviour.

I can reproduce this consistently but only with those extensions installed. With plenty of other extensions installed it works fine. I don't see anything obvious this extension that might cause it, but that wasn't the case earlier this year either and eventually we did find the problem there too.

After static analysis yielded nothing, I turned to removing code locally until you find out what's causing it. Maybe it'll turn out to be a trigger that calls back into mediawiki-core, maybe not. But the surface layer triggering it isn't in core.

Change 145479 had a related patch set uploaded by Mattflaschen:
Fix spurious QUnit failures related to fake timers

https://gerrit.wikimedia.org/r/145479

You were right, I didn't need to make it async.

Side note, the SinonJS docs imply that useFakeTimers() uses the current date by default. Unfortunately (this would have fixed half of my patch), this is not true, so I submitted a patch for those docs at https://github.com/sinonjs/sinon-docs/pull/2 .

Also, this really should've been a separate bug. The bug as reported was fixed and affected core. This looks similar (it's something with timers and animations), but a different exception is thrown, different unit tests are failing, and it's only affecting extensions (doesn't mean the bug isn;t in core, but it's a separate bug).

This isn't a "something is wrong with qunit and timers" tracking bug.

Change 145487 had a related patch set uploaded by Mattflaschen:
Don't actually show guiders in unit tests

https://gerrit.wikimedia.org/r/145487

(In reply to Krinkle from comment #25)

Also, this really should've been a separate bug. The bug as reported was
fixed and affected core.

The original symptoms appeared very close (not just "it's something with timers and animations", but specifically jquery.makeCollapsible).

I didn't dive deeper to figure out the underlying interactions until later.

Change 145487 merged by jenkins-bot:
Don't actually show guiders in unit tests

https://gerrit.wikimedia.org/r/145487

Change 143481 abandoned by Krinkle:
Use QUnit.asyncTest even for fake timers, to fix spurious failures

Reason:
See Ibada489a58295

https://gerrit.wikimedia.org/r/143481

I disabled the GuidedTour GUI that was triggering this during tests (1e37db808e1f9cc8daffb61e4303ce53d701e25a). Krinkle's b90c69cdf70ae55a7ae608b074893d140d0a96b6 ("QUnit: Detect unfinished jQuery effects or ajax requests in teardown") has been merged to core, which should detect similar issues in the future.

I will abandon the other patch, since if isolation deficiencies (e.g. running effects queue) are fully detected and error'ed, the runner doesn't need to enforce that isolation.

Change 145479 abandoned by Mattflaschen:
Fix spurious QUnit failures related to fake timers

Reason:
I'm okay with throwing an error (which is already merged), rather than having the test runner work to clean it up. The important thing is the test author/maintainer has a clear indication of the problem, which allows them to easily fix it.

https://gerrit.wikimedia.org/r/145479