beta labs getting 503 Service unavailable or slow
Closed, Resolved (Public)

Description

Since around 2014-07-22 20:00 UTC, Labs URLs such as http://en.wikipedia.beta.wmflabs.org/wiki/Main_Page have been failing for me. Before that, Flow pages loaded, but I got the database locked error when trying to add content.

http://en.wikipedia.beta.wmflabs.org/w/api.php either times out or takes a long time. A basic load.php request works fine, but a complex one such as http://en.wikipedia.beta.wmflabs.org/w/load.php?debug=true&lang=en&modules=ext.flow.new&skin=vector&version=20140524T012822Z&* does not.

bd808 reported some successes on Special:Version, so the failures are intermittent.
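
Because the behaviour varies between requests, a single manual page load says little. The following is a minimal sketch, not something taken from this report, of how one might time repeated requests against the URLs above and record whether each returned a status code, timed out, or failed outright. The function name `probe`, the 10-second default timeout, and the injectable `opener` parameter are all illustrative assumptions.

```python
import socket
import time
import urllib.error
import urllib.request


def probe(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url once and return (outcome, elapsed_seconds).

    outcome is the HTTP status code (e.g. 200 or 503), or the string
    "timeout" or "error". `opener` is a hypothetical hook so the network
    call can be replaced when experimenting offline.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout) as resp:
            resp.read()              # include body transfer in the timing
            outcome = resp.status
    except urllib.error.HTTPError as exc:
        outcome = exc.code           # server answered with an error status
    except urllib.error.URLError as exc:
        # Connect-phase timeouts arrive wrapped in URLError.
        timed_out = isinstance(exc.reason, (socket.timeout, TimeoutError))
        outcome = "timeout" if timed_out else "error"
    except (socket.timeout, TimeoutError):
        outcome = "timeout"          # timeout while reading the response
    except OSError:
        outcome = "error"
    return outcome, time.monotonic() - start
```

Calling `probe` in a loop for each URL, say ten times with a short pause between attempts, and tallying the outcomes would show whether a given endpoint fails consistently or only some of the time.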


Version: unspecified
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=68413
https://bugzilla.wikimedia.org/show_bug.cgi?id=68459

Details

Reference
bz68407

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 3:40 AM
bzimport set Reference to bz68407.
bzimport added a subscriber: Unknown Object (MLST).

The apache error logs (/data/project/logs/apache-error.log) show quite a few errors from proxy_fcgi:

[Tue Jul 22 21:59:53.156019 2014] [proxy_fcgi:error] [pid 21637] [client 10.68.16.16:50309] AH01067: Failed to read FastCGI header
[Tue Jul 22 21:59:53.156796 2014] [proxy_fcgi:error] [pid 21637] (70014)End of file found: [client 10.68.16.16:50309] AH01075: Error dispatching request to :
[Tue Jul 22 21:59:53.194949 2014] [proxy:error] [pid 21626] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed
[Tue Jul 22 21:59:53.195038 2014] [proxy_fcgi:error] [pid 21626] [client 10.68.16.12:7820] AH01079: failed to make connection to backend: 127.0.0.1
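
To see which of these errors dominates, the log can be tallied by module and AH code. The sketch below is an illustration only: `count_errors` and `LINE_RE` are made-up names, and the regular expression assumes the Apache 2.4 error log layout seen in the four lines quoted above. A high count of AH00957 (connection refused) relative to AH01067 (failed to read FastCGI header) would suggest the FastCGI backend on 127.0.0.1:9000 was not listening at all for much of the time, rather than dying mid-request.

```python
import re
from collections import Counter

# Matches the "[module:level]" field and the first "AHnnnnn" code on a line.
# Assumes the Apache 2.4 error log layout shown in the excerpt above.
LINE_RE = re.compile(r"\[(?P<module>\w+):(?P<level>\w+)\].*?(?P<code>AH\d{5})")


def count_errors(lines):
    """Return a Counter keyed by (module, AH code) for the given log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("module"), match.group("code"))] += 1
    return counts


# The four lines quoted in this comment, used as sample input.
SAMPLE = [
    "[Tue Jul 22 21:59:53.156019 2014] [proxy_fcgi:error] [pid 21637] [client 10.68.16.16:50309] AH01067: Failed to read FastCGI header",
    "[Tue Jul 22 21:59:53.156796 2014] [proxy_fcgi:error] [pid 21637] (70014)End of file found: [client 10.68.16.16:50309] AH01075: Error dispatching request to :",
    "[Tue Jul 22 21:59:53.194949 2014] [proxy:error] [pid 21626] (111)Connection refused: AH00957: FCGI: attempt to connect to 127.0.0.1:9000 (*) failed",
    "[Tue Jul 22 21:59:53.195038 2014] [proxy_fcgi:error] [pid 21626] [client 10.68.16.12:7820] AH01079: failed to make connection to backend: 127.0.0.1",
]

if __name__ == "__main__":
    for (module, code), n in count_errors(SAMPLE).most_common():
        print(module, code, n)
```

To run it against the real log, pass an open file handle for /data/project/logs/apache-error.log to `count_errors` instead of `SAMPLE`.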

The /tmp directories on deployment-mediawiki01 and deployment-mediawiki02 are full of HHVM crash stack traces.

Most of the crashes I'm seeing right now are for bug 68413.

Resolved by https://gerrit.wikimedia.org/r/#/c/148743/ and https://gerrit.wikimedia.org/r/#/c/148754/. (That isn't to say that all availability or performance issues on Labs are resolved.)