
Users experiencing "Script not responding" errors on 1.24wmf11 wikis – EventLogging triggering itself with deprecation warnings
Closed, ResolvedPublic

Description

Multiple reports about problems with scripts served by bits.wikimedia.org.

Bits is very slow and there are a lot of script errors.

See https://commons.wikimedia.org/wiki/Commons:Forum#Anmeldung.3F


Version: wmf-deployment
Severity: major

Details

Reference
bz67420

Event Timeline

bzimport raised the priority of this task to Unbreak Now!. Nov 22 2014, 3:27 AM
bzimport set Reference to bz67420.

Can't reproduce with Firefox 30 or Chrome 35.
Are these all Monobook skin users?

traceroute might also be welcome.

I don't have the problem myself, so I can't run a traceroute, but a user has reported:
Kabel Deutschland (traceroute wikimedia.org via multiple dynip.superkabel.de to ip4.tinet.net)

I experienced this issue just now as well on meta. FF30, using Vector.

My urls looked like https://bits.wikimedia.org/meta.wikimedia.org/load.php?debug=false&lang=en&modules=mediawiki.inspect%7Cschema.DeprecatedUsage&skin=vector&version=20140701T181452Z&*:2 and https://bits.wikimedia.org/meta.wikimedia.org/load.php?debug=false&lang=en&modules=jquery%2Cmediawiki%2CSpinner%7Cjquery.triggerQueueCallback%2CloadingSpinner%2CmwEmbedUtil%7Cmw.MwEmbedSupport&only=scripts&skin=vector&version=20140701T181452Z:6

After hitting continue about 5 times (my browser locked up in between) I hit "Stop script" and gave up.

I haven't run into anything like this in browsing enwp or mw.o, but those are both in MonoBook.

Traceroute is running...

$ traceroute bits.wikimedia.org
traceroute to bits-lb.eqiad.wikimedia.org (208.80.154.234), 64 hops max, 52 byte packets
1 192.168.1.1 (192.168.1.1) 0.882 ms 0.681 ms 0.664 ms
2 adsl-99-19-51-254.dsl.pltn13.sbcglobal.net (99.19.51.254) 10.564 ms
  12.83.88.205 (12.83.88.205) 10.675 ms 11.433 ms
3 12.122.114.29 (12.122.114.29) 11.906 ms 15.319 ms 11.902 ms
4 192.205.37.58 (192.205.37.58) 40.980 ms 79.860 ms 40.158 ms
5 ae-7.r20.snjsca04.us.bb.gin.ntt.net (129.250.5.52) 42.310 ms * 38.981 ms
6 ae-4.r21.asbnva02.us.bb.gin.ntt.net (129.250.4.102) 108.718 ms 109.822 ms 108.737 ms
7 ae-2.r04.asbnva02.us.bb.gin.ntt.net (129.250.4.207) 113.969 ms 108.205 ms 111.447 ms
8 * * *
9 * * *

And * * * up to #64.

I ran into this on mediawiki.org, using the MonoBook skin. Bug 67404 says it happens on Modern. I didn't mention it earlier, but I'm using OSX.

This is only occurring on non-Wikipedias, leading me to guess it's something that was deployed in 1.24wmf11.

*** Bug 67404 has been marked as a duplicate of this bug. ***

On OS X too, FF 30, and having the same issues. Not on any Wikipedias.

Reedy: This is potentially a blocker for Wikipedias deploy today. I'll let you decide on course of action since I'll be out.

I also experienced this exactly once now on Commons on a File: page, Firefox 30 on Fedora 20, Vector skin, being logged in.

I had this several times in the last few days on mw.org, but only on FF 17 behind a company proxy, so I thought it wasn't a problem with MediaWiki/Wikimedia. Unfortunately I can't do a traceroute here :/

With no solid evidence to back this up, I have a hunch that the
immediate cause is cde08292 "Add json2.js polyfill (v2014-02-04; with
module skip function)" and the underlying cause is the newly
introduced skip function support in ResourceLoader.

Circumstantial evidence:

  • This happens on various very different wikis (Wikidata, Commons, Wikisource), which hints towards something in core or a very widely deployed extension, and I see no interesting changes in extensions at first glance on [[mw:MediaWiki_1.24/wmf11]].
  • There have been very few other JavaScript changes in core wmf11 (apart from that, only three changes: 96c65cf8 fa2d7e8c d57d9edd).
  • That change modifies the 'mediawiki.inspect' module to use the new functionality, and mediawiki.inspect appears in the problematic URL from comment 4.

It would be very helpful if someone who can reproduce this on a local
installation could try reverting cde08292 and checking if that fixes
the issue. (I am unable to reproduce even on WMF wikis :( )

Change 143880 had a related patch set uploaded by Krinkle:
Revert addition of json2.js

https://gerrit.wikimedia.org/r/143880

JSON2 is only loaded in old browsers. And just in case the feature test wasn't passing, I confirmed just now in Chrome 35 and Firefox 29 and 30 that when "loading" the json module, JSON.stringify and JSON.parse are not redefined and nothing is fetched or executed from that module.

Also, to clarify, the usage in mw.inspect had no impact either, because the reference to $.toJSON is a direct pointer to JSON.stringify. Nothing changed in this regard for modern browsers from what I can tell.

(Re the revert patch that was already proposed, if we want to revert at all, it should be sufficient to remove the (two) uses in core to resolve the immediate issue, if that in fact resolves this at all.)
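For readers unfamiliar with the idea, the kind of feature test discussed in this comment can be sketched as follows. This is an illustration only: the names canSkipJsonPolyfill and pickToJSON are hypothetical, and the code is not taken from ResourceLoader or jquery.json. It assumes a polyfill is skipped whenever native JSON.stringify and JSON.parse exist, and that a $.toJSON-style alias simply points at the native function in that case.

```typescript
// Hypothetical sketch; not the actual ResourceLoader or jquery.json code.
interface JsonLike {
  stringify?: unknown;
  parse?: unknown;
}

// Returns true when the environment already provides native JSON support,
// so a json2.js-style polyfill would not need to be fetched or executed.
function canSkipJsonPolyfill(env: { JSON?: JsonLike }): boolean {
  const json = env.JSON;
  return !!json
    && typeof json.stringify === "function"
    && typeof json.parse === "function";
}

// A $.toJSON-style alias: use native stringify when it exists,
// otherwise fall back to a caller-supplied implementation.
function pickToJSON(
  env: { JSON?: JsonLike },
  fallback: (value: unknown) => string
): (value: unknown) => string {
  return canSkipJsonPolyfill(env)
    ? (env.JSON!.stringify as (value: unknown) => string)
    : fallback;
}

const modern = { JSON }; // any current runtime with native JSON
const legacy = {};       // stand-in for an old browser without JSON

console.log(canSkipJsonPolyfill(modern), canSkipJsonPolyfill(legacy));
```

In the modern case nothing is redefined, which matches the observation above that the module has no effect in Chrome 35 or Firefox 29/30.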

Change 143894 had a related patch set uploaded by Jforrester:
Use 'json' module instead of deprecated 'jquery.json'

https://gerrit.wikimedia.org/r/143894

*** Bug 67478 has been marked as a duplicate of this bug. ***

Change 143894 merged by jenkins-bot:
Use 'json' module instead of deprecated 'jquery.json'

https://gerrit.wikimedia.org/r/143894

Change 143896 had a related patch set uploaded by Jforrester:
Update EventLogging to pull in rEEVL1e9adcfc5102

https://gerrit.wikimedia.org/r/143896

Change 143880 abandoned by Bartosz Dziewoński:
Revert addition of json2.js

Reason:
Turns out I was mostly wrong. While doing this would probably fix the issue, I just accidentally happened to pinpoint the right commit (I had one chance in four after all). See comment 19 on the bug.

https://gerrit.wikimedia.org/r/143880

(In reply to Lupo from comment #19)

Deprecation loop? See

https://commons.wikimedia.org/w/index.php?title=Commons:Forum&diff=128061457&oldid=128054357

Yes, that does seem to be the case. I filed bug 67481 to request safeguards be added to the deprecation logger so this doesn't happen again next time.
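One possible shape for such a safeguard is a re-entrancy guard around the reporter, sketched below. Everything here is hypothetical (the DeprecationReporter class and the transport callback are invented for illustration) and it is not the fix that was actually deployed. It assumes the loop arises because sending a deprecation report itself runs code that raises another deprecation warning.

```typescript
// Hypothetical sketch of a re-entrancy guard; not the actual EventLogging code.
type Transport = (message: string, reporter: DeprecationReporter) => void;

class DeprecationReporter {
  private reporting = false;
  private readonly send: Transport;
  sent: string[] = [];
  dropped = 0;

  constructor(send: Transport) {
    this.send = send;
  }

  warn(message: string): void {
    if (this.reporting) {
      // A warning raised while a report is being sent is counted and
      // dropped instead of being reported, which breaks the cycle.
      this.dropped += 1;
      return;
    }
    this.reporting = true;
    try {
      this.sent.push(message);
      this.send(message, this);
    } finally {
      this.reporting = false;
    }
  }
}

// A transport that itself relies on something deprecated, so every send
// raises a fresh warning. Without the guard this would recurse forever.
const reporter = new DeprecationReporter((_message, self) => {
  self.warn("transport used a deprecated module");
});

reporter.warn("Use of jquery.json is deprecated");
console.log(reporter.sent.length, reporter.dropped);
```

The guard trades completeness for safety: warnings raised during a send are lost, but the page can no longer lock up on them.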

(In reply to Andre Klapper from comment #1)

Can't reproduce with Firefox 30 or Chrome 35.

Your random number generator simply didn't produce a number < 0.01
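To put a rough number on this, the sketch below computes how likely a tester is to hit a sampled code path at least once. It assumes the 0.01 figure is an independent per-page-view sampling rate, which the thread does not spell out; the function names are hypothetical.

```typescript
// Hypothetical helpers; the 1% per-page-view rate is an assumption.

// Probability that a sampled code path runs at least once in `trials`
// independent page views, given a per-view sampling rate.
function chanceOfAtLeastOneHit(sampleRate: number, trials: number): number {
  return 1 - Math.pow(1 - sampleRate, trials);
}

// Smallest number of page views needed to reach a target probability
// of seeing the sampled path at least once.
function trialsForConfidence(sampleRate: number, target: number): number {
  return Math.ceil(Math.log(1 - target) / Math.log(1 - sampleRate));
}

const rate = 0.01;
console.log(chanceOfAtLeastOneHit(rate, 10));
console.log(trialsForConfidence(rate, 0.5));
```

Under these assumptions a handful of test page loads will usually miss the problem, while a large user population will hit it constantly.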

Apart from bug 67481, what is WMF's strategy to avoid a bug like this [not reproducible by design, affected a large number of users though] in future?

This should now be fixed. Sorry for the disruption. Will recommend that an incident report is written up.