HTTP 503 error when requesting linked data for large entities
Closed, Resolved, Public

Description

Author: jclarke

Description:
For some entities, Wikidata's Special:EntityData linked data URIs return HTTP 503.

$ curl -I https://www.wikidata.org/wiki/Special:EntityData/Q30.json
HTTP/1.1 503 Service Unavailable

$ curl -I https://www.wikidata.org/wiki/Special:EntityData/Q30.nt
HTTP/1.1 503 Service Unavailable

$ curl -I https://www.wikidata.org/wiki/Special:EntityData/Q30.rdf
HTTP/1.1 503 Service Unavailable

Access via api.php works fine:

$ curl -I "https://www.wikidata.org/w/api.php?action=wbgetentities&ids=q30&format=json"
HTTP/1.1 200 OK

Other examples:

  • Q30
  • Q148
  • Q145
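
To repeat the check above for every listed entity and format in one go, a small helper along these lines could be used. This is only a sketch: the function names `entity_data_url` and `check_entities` are made up for illustration, and it assumes `curl` is available.

```shell
# Build the Special:EntityData URL for an entity ID ($1) and a format extension ($2).
entity_data_url() {
  printf 'https://www.wikidata.org/wiki/Special:EntityData/%s.%s\n' "$1" "$2"
}

# Print "<status code> <url>" for each given entity in the json, nt and rdf formats.
check_entities() {
  for id in "$@"; do
    for fmt in json nt rdf; do
      url=$(entity_data_url "$id" "$fmt")
      # -s: silent, -I: HEAD request, -o /dev/null: discard the headers,
      # -w '%{http_code}': print only the numeric status code.
      status=$(curl -s -I -o /dev/null -w '%{http_code}' "$url")
      printf '%s %s\n' "$status" "$url"
    done
  done
}

# Usage: check_entities Q30 Q148 Q145
```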

Version: unspecified
Severity: major
Whiteboard: varnish u=dev c=infrastructure p=0

Details

Reference
bz60003

Event Timeline

bzimport raised the priority of this task to High. (Nov 22 2014, 2:50 AM)
bzimport set Reference to bz60003.
bzimport added a subscriber: Unknown Object (MLST).

Hypothesis: large entities trigger an Out Of Memory error, because SpecialEntityData uses output buffering on the JSON, effectively doubling the memory footprint of the serialized entity.

So, according to Chris & Chad, since this error doesn't show in fatal.log, it's not an OutOfMemory error. And it's not a timeout either. We'll need assistance from someone with shell access to identify the problem.

The error is:

Request: GET http://www.wikidata.org/wiki/Special:EntityData/Q30.json, from 10.64.0.102 via cp1065 cp1065 ([10.64.0.102]:3128), Varnish XID 2665166859
Forwarded for: 87.138.110.76, 91.198.174.103, 208.80.154.77, 10.64.0.102
Error: 503, Service Unavailable at Thu, 06 Feb 2014 17:30:26 GMT

It seems to come from Varnish. Maybe we are hitting a memory limit or some other limit?

According to Katie, it's a problem with Varnish:

[18:31] <aude> DanielK_WMDE: it says "Varnish XID 2665166859"
[18:32] <aude> tells me it probably is varnish
[18:33] <aude> it's a text varnish cache
[18:33] <aude> cp1065.eqiad.wmnet

So I suppose we'll have to ask Mark about it.

This is indeed a Varnish problem: fetching the data directly from one of the Apaches works fine (and fast).

Varnish 503s with this error message:

170 FetchError   c straight insufficient bytes

which I think is a problem with compression that we've seen before. Does the result vary depending on whether you send an Accept-Encoding header of gzip/deflate?
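
One way to run that comparison is sketched below. The function name `status_with_encoding` is hypothetical; it issues a GET with `curl`, discards the body, and prints only the status code, adding the Accept-Encoding header when a second argument is given.

```shell
# Print the HTTP status code for URL $1.
# If $2 is given, send it as the Accept-Encoding request header.
status_with_encoding() {
  url=$1
  encoding=${2:-}
  if [ -n "$encoding" ]; then
    curl -s -o /dev/null -w '%{http_code}\n' -H "Accept-Encoding: $encoding" "$url"
  else
    curl -s -o /dev/null -w '%{http_code}\n' "$url"
  fi
}

# Usage:
#   status_with_encoding https://www.wikidata.org/wiki/Special:EntityData/Q30.json
#   status_with_encoding https://www.wikidata.org/wiki/Special:EntityData/Q30.json "gzip, deflate"
```

If the two calls return different status codes, that would support the compression theory; identical results would not rule it out, since caches in between may behave differently per request.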

Change 131746 had a related patch set uploaded by Hoo man:
Don't set Content-Length in EntityDataRequestHandler

https://gerrit.wikimedia.org/r/131746

Change 131746 merged by jenkins-bot:
Don't set Content-Length in EntityDataRequestHandler

https://gerrit.wikimedia.org/r/131746
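
Since the merged change stops setting Content-Length, a rough way to check a response for a mismatch between the declared and the delivered length is sketched here. It assumes the "insufficient bytes" fetch error means fewer body bytes arrived than Content-Length announced; the function names `declared_length` and `compare_lengths` are made up for illustration.

```shell
# Print the Content-Length value found in a header dump file ($1),
# such as one written by "curl -D". Prints nothing if the header is absent.
declared_length() {
  # Header names are case-insensitive; strip the trailing carriage return.
  awk -F': *' 'tolower($1) == "content-length" { sub(/\r$/, "", $2); print $2 }' "$1"
}

# Fetch URL $1 and print the declared length next to the bytes actually received.
compare_lengths() {
  headers=$(mktemp)
  body=$(mktemp)
  # No Accept-Encoding is sent here, so the body is expected to arrive uncompressed.
  curl -s -D "$headers" -o "$body" "$1"
  echo "declared: $(declared_length "$headers")"
  echo "received: $(wc -c < "$body" | tr -d ' ')"
  rm -f "$headers" "$body"
}

# Usage: compare_lengths https://www.wikidata.org/wiki/Special:EntityData/Q30.json
```

An empty "declared" value means the response carried no Content-Length header, which is what one would expect to see once the change is deployed.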