
HTTP server at download.wikipedia.org should set Last-Modified header
Closed, Resolved, Public

Description

Author: jcsahnwaldt

Description:
lighttpd at http://download.wikipedia.org currently does not set the Last-Modified header in the response for most files.

Example: the response to a GET or HEAD request for http://download.wikipedia.org/enwiki/latest/enwiki-latest-interwiki.sql.gz does not include a Last-Modified header, but the response for http://download.wikipedia.org/enwiki/latest/enwiki-latest-interwiki.sql.gz-rss.xml does.

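The probable cause is described in lighttpd issue 1236: "If there is no content-type set, we remove the Last-Modified header" (http://redmine.lighttpd.net/issues/1236). Files such as .gz and .bz2 apparently have no MIME type assigned, so the header is dropped for them.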

To fix this, it should suffice to add the following lines to the lighttpd config file:

mimetype.assign = (
    ".gz" => "application/x-gzip",
    ".bz2" => "application/x-bzip"
)

The Last-Modified header would be helpful for many things, e.g. to set the correct timestamp for the local copy of the file and to use wget timestamping.
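As a rough illustration of the first use case, here is a minimal Python sketch that copies a Last-Modified value onto a local file's modification time. The function name apply_last_modified is hypothetical and not part of any existing tool; the sketch assumes the header value has already been obtained from a response.

```python
import os
from datetime import timezone
from email.utils import parsedate_to_datetime


def apply_last_modified(path, last_modified):
    """Set the mtime of `path` from an HTTP Last-Modified header value.

    Hypothetical helper for illustration only.
    """
    # Parse the HTTP date format, e.g. "Sat, 15 Jan 2011 08:02:05 GMT".
    remote_time = parsedate_to_datetime(last_modified)
    # Treat a date without zone information as UTC, as HTTP dates are in GMT.
    if remote_time.tzinfo is None:
        remote_time = remote_time.replace(tzinfo=timezone.utc)
    timestamp = remote_time.timestamp()
    # Keep the access time unchanged and update only the modification time.
    atime = os.stat(path).st_atime
    os.utime(path, (atime, timestamp))
    return timestamp
```

Calling apply_last_modified("enwiki-latest-interwiki.sql.gz", header_value) after a download would give the local copy the same timestamp as the file on the server, which is roughly what wget timestamping relies on.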


Version: unspecified
Severity: normal
URL: http://download.wikipedia.org

Details

Reference
bz21575

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:48 PM
bzimport set Reference to bz21575.
bzimport added a subscriber: Unknown Object (MLST).

A strong upvote for this bug: it would make it possible to automatically detect a new version of a dump file, and it is a small change to make.
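To illustrate the detection idea, a client could compare the Last-Modified value with the modification time of its local copy. The following is only a sketch under that assumption; is_remote_newer is a hypothetical name, and fetching the header (for example with a HEAD request) is left out.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_remote_newer(last_modified, local_mtime):
    """Return True if the Last-Modified header value is later than local_mtime.

    `local_mtime` is a POSIX timestamp, e.g. os.stat(path).st_mtime.
    Hypothetical helper for illustration only.
    """
    remote_time = parsedate_to_datetime(last_modified)
    # Treat a date without zone information as UTC, as HTTP dates are in GMT.
    if remote_time.tzinfo is None:
        remote_time = remote_time.replace(tzinfo=timezone.utc)
    local_time = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return remote_time > local_time
```

A mirror script could run this check periodically and download the dump again only when it returns True.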

Sounds like something that should be done, per comment 1 above.

jeluf wrote:

I've added this to the config file:

mimetype.assign += (
    ".gz" => "application/x-gzip",
    ".bz2" => "application/x-bzip"
)

Result:

HEAD http://download.wikipedia.org/enwiki/latest/enwiki-latest-interwiki.sql.gz

200 OK
Connection: close
Date: Sun, 30 Jan 2011 09:29:49 GMT
Accept-Ranges: bytes
ETag: "4224011594"
Server: lighttpd/1.4.26
Content-Length: 7905
Content-Type: application/x-gzip
Last-Modified: Sat, 15 Jan 2011 08:02:05 GMT <-------------
Client-Date: Sun, 30 Jan 2011 09:29:49 GMT
Client-Peer: 208.80.152.185:80
Client-Response-Num: 1