Streaming large files is unacceptably slow
Closed, Resolved (Public)

Description

Now that all uploaded files are served by Varnish (as of late April/early May), we're seeing major issues with streaming of large video files. This is likely due to the limited streaming support in the Varnish 3.0.2 release we're running:

https://www.varnish-software.com/blog/http-streaming-varnish

"Varnish 3.0 has some support for streaming objects, but I can't say that I'm very happy with the semantics. It only allows one client to stream the object and others are placed on hold while the object is being streamed. If this client is a 3G phone downloading a 100 megabyte object other clients are likely to give up waiting and go and do something else."

Mark has enabled this limited streaming support, currently for files > 100 MB.

Whether it's due to these or other limitations, files > 100 MB currently play very unreliably, especially on first play, and still sometimes return 503 errors.

Files such as these take many seconds to start playing, or fail completely:

http://upload.wikimedia.org/wikipedia/commons/b/b5/Monthly_Metrics_Meeting_March_1%2C_2012.ogv

http://upload.wikimedia.org/wikipedia/commons/e/ec/Monthly_Metrics_Meeting_April_5%2C_2012.ogv

http://upload.wikimedia.org/wikipedia/commons/f/f3/WMF_Analytics_Day_-_HBase.ogv
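To put a number on the startup delay for files like the ones above, something along these lines could be used. It is only a rough sketch: the function name `time_to_first_byte` and the 30-second timeout are illustrative choices, not part of any existing tooling, and it measures a single plain GET rather than what a browser's video player actually does.

```python
import http.client
import time
from urllib.parse import urlsplit


def time_to_first_byte(url, timeout=30.0):
    """Return (status, seconds) for one GET of `url`.

    `seconds` covers connecting, sending the request and waiting until
    the first byte of the body is available. Error statuses such as 503
    are returned as-is rather than raised.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)

    # Keep the path exactly as given (including escapes such as %2C).
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query

    start = time.monotonic()
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        resp.read(1)  # block until the first body byte arrives (or EOF)
        return resp.status, time.monotonic() - start
    finally:
        # Drop the connection without downloading the rest of the file.
        conn.close()
```

Calling `time_to_first_byte(...)` with one of the URLs above a few times in a row would show whether the first (uncached) request is noticeably slower than later ones, and whether any attempt comes back as a 503.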

We'll need to investigate fuller solutions, whether it's backporting the current experimental streaming branch for upload hosts ( https://github.com/mbgrydeland/varnish-cache-streaming ), reverting to Squid, further changing config, etc. We'll also need to make sure these issues aren't exacerbated further with Swift.


Version: unspecified
Severity: major

Details

Reference
bz36577

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 12:26 AM
bzimport set Reference to bz36577.
bzimport added a subscriber: Unknown Object (MLST).

Resolved for now by reverting to Squid; we'll investigate setting up a separate cluster for large video streaming (maybe using Squid, maybe a patched Varnish if it's stable enough).

Mark has applied and tested the most recent Varnish streaming patches and again switched uploads to eqiad. However, this issue is recurring with large files and is being investigated right now. It should be fixed within the next couple of days, either by identifying the cause or by reverting.

In the meantime, if you're playing/downloading a very large file, you may have to suffer an unacceptably long wait on first play.

(We'll also need to make this work on both HTTP and HTTPS.)

This particular issue should now be fixed. However, due to lack of support for range requests, we're still having video issues with Varnish, filed as bug 40779. Closing this bug, adding same WMF CCs to 40779.
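For anyone wanting to check the range-request behaviour mentioned above, here is a small sketch. It relies only on standard HTTP semantics: a server that honours `Range: bytes=first-last` answers with status 206 and a `Content-Range` header, while one that ignores it typically answers 200 with the whole file. The helper names `honours_range` and `probe_range` are made up for this example, and the check does not say anything about how Varnish is configured.

```python
import http.client
import re
from urllib.parse import urlsplit

# Matches e.g. "bytes 0-1023/734003200" or "bytes 0-1023/*".
_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def honours_range(status, content_range, first, last):
    """True if the response looks like a valid answer to bytes=first-last."""
    if status != 206 or not content_range:
        return False
    m = _CONTENT_RANGE.fullmatch(content_range.strip())
    # The slice must start where we asked; it may end early on short files.
    return bool(m) and int(m.group(1)) == first and int(m.group(2)) <= last


def probe_range(url, first=0, last=1023, timeout=30.0):
    """Send one ranged GET to `url` and report whether it was honoured."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        conn.request("GET", path, headers={"Range": f"bytes={first}-{last}"})
        resp = conn.getresponse()
        return honours_range(resp.status, resp.getheader("Content-Range"), first, last)
    finally:
        # Close without reading the body, in case the full file was sent.
        conn.close()
```

Seeking in a video player depends on ranged requests like this one, so a `False` result from `probe_range(...)` on a large .ogv file would be consistent with the problem tracked in bug 40779.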

Confirmed fixed now; Varnish is back in production for caching media files. Please reopen and provide details if you still see these issues.