
Make upload.wikimedia.org Cross Origin compatible (CORS)
Closed, Resolved, Public

Description

As with bug 25886, and as previously discussed (for example at http://www.gossamer-threads.com/lists/wiki/wikitech/220659), it would be useful for various next-generation user scripts, gadgets, etc. to have direct access to the raw contents of uploaded files.

This is especially true for SVG and raster images, which, if properly marked, could be loaded via XHR or img+canvas for editing and similar uses. Currently, such tools must either have server-side support (like the ApiSvgProxy plugin), use an offsite proxy on their own domain, or go via JSONP.

See Cross Origin Resource Sharing: http://www.w3.org/TR/access-control/


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=20298

Details

Reference
bz28700

Event Timeline

bzimport raised the priority of this task to High. Nov 21 2014, 11:25 PM
bzimport set Reference to bz28700.

Could you provide the specific header that should be set?

I believe we want:

Access-Control-Allow-Origin: *

A quick test confirms I can fetch files cross-domain in Firefox with a regular XHR with this header added. (Some browsers require a little more fiddling.)
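To make the effect of that header concrete, here is a rough sketch of the check a browser applies before letting a page read a cross-origin response. It is a simplification under stated assumptions: the function name `cors_allows_read` is made up for illustration, and credentialed requests (where `*` is not accepted) are ignored.

```python
def cors_allows_read(origin, response_headers):
    """Return True if a page at `origin` may read this cross-origin response.

    Simplified sketch: credentialed requests are not modelled.
    """
    # Header names are case-insensitive, so normalise them before the lookup.
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    allowed = headers.get("access-control-allow-origin")
    if allowed is None:
        # No CORS header at all: the response stays unreadable to the page.
        return False
    # Either the wildcard or an exact match of the requesting origin is enough.
    return allowed == "*" or allowed == origin
```

With the wildcard value, any origin passes this check, which is the behaviour wanted for publicly readable uploads.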

Specification draft is at http://www.w3.org/TR/cors/. IE8 uses a different client-side object (not XMLHttpRequest) for cross-domain requests, but relies on the same Access-Control-Allow-Origin header (see http://msdn.microsoft.com/en-us/library/dd573303(v=vs.85).aspx), so even IE8 ought to be workable with it.

I would think this bug would ideally be expanded to allow CORS for the API page itself as well. That would let JavaScript applications access the API without the GET limitations of JSONP, and would also avoid JSONP's security problems: because browsers currently have no specific content type for JSONP (it's not JSON, nor should it be JavaScript), a site can execute arbitrary JavaScript, not merely the callback requested by the user.

Besides that, my impression as a web developer is that JSONP is a lesser-known technique than Ajax, so I think you'd also be promoting the API usage more widely.

(In reply to comment #3)

I would think this bug would ideally be expanded to allow CORS for the API page
itself as well--that would allow JavaScript applications to access the API
without the GET limitations of JSONP and also avoids its security problems (a
site can execute arbitrary JavaScript based on JSONP's current lack of a
specific content-type in browsers (its not JSON, nor should it be JavaScript),
not merely the callback requested by the user).

Besides that, my impression as a web developer is that JSONP is a lesser-known
technique than Ajax, so I think you'd also be promoting the API usage more
widely.

The API already supports CORS, see bug 19907. This code is live on Wikimedia wikis already, but it's not configured, so no CORS headers are actually served right now.

Giving this to Roan to deploy CORS headers with high hopes. :)

(In reply to comment #5)

Giving this to Roan to deploy CORS headers with high hopes. :)

I can and will enable CORS for the API per bug 19907 comment 12, but this bug is about upload.wikimedia.org, and I don't have access to the box that serves that. Ariel typically does upload-related things, so passing the buck :)

We need the additional headers Access-Control-Allow-Methods and Access-Control-Max-Age for pre-flight requests, I believe. Am I correct in assuming we would only want this for retrieval of image files or for thumbnails (perhaps generated by a 404 handler), i.e. GET? Anything else would start to make me nervous.

I don't have a clue whether we would need to do something with the upload squids as well. Anybody?

(In reply to comment #7)

We need the additional headers Access-Control-Allow-Methods and
Access-Control-Max-Age for pre-flight requests, I believe.

Nah, let's not bother with preflighted stuff. There's pretty much no use for that for upload.wikimedia.org

Am I correct in
assuming we would only want this for retrieval of image files or for thumbnails
(perhaps generated by a 404 handler), i.e. GET? Anything else would start to
make me nervous.

For POST as well, but that doesn't actually *do* anything on upload anyway, does it? Besides, these requests are already allowed, the only thing that changes is that the requestor will be able to read the response. The only case in which this is dangerous, to my knowledge, is if it contains anti-CSRF tokens, but those don't appear anywhere near upload.wikimedia.org.
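For reference, a simplified sketch of when a browser sends a preflight at all, based on my reading of the "simple request" rules in the CORS draft. The names `needs_preflight` and `author_headers` are made up for illustration, and real browsers apply further conditions that are not modelled here.

```python
# Methods and author-set headers that the CORS draft treats as "simple".
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_HEADERS = {"accept", "accept-language", "content-language"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method, author_headers=None):
    """Return True if the request would trigger an OPTIONS preflight."""
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in (author_headers or {}).items():
        lname = name.lower()
        if lname in SIMPLE_HEADERS:
            continue
        if lname == "content-type":
            # Only the media type matters; parameters such as charset are ignored.
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type in SIMPLE_CONTENT_TYPES:
                continue
        # Any other author-set header makes the request non-simple.
        return True
    return False
```

A plain GET or HEAD for an image never reaches the preflight path in this model, which is why Access-Control-Allow-Origin alone is enough for the use case here.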

I don't have a clue whether we would need to do something with the upload
squids as well. Anybody?

Sounds like we would have to, in order to also serve the headers on old cached images.

Is anything else in MW needed before we can just DO this?

Ariel, are you the right person to do it, or would there be someone more suitable for doing this?

Ariel will probably have time to rebuild squid (ugh!) on the week of
Jul 11, 2011. Check back then.

Nothing in MediaWiki at all is needed for this to happen -- it's purely a web server configuration issue.

What's needed is:

  1. configuration of the backend web servers for upload.wikimedia.org to add headers on newly-served files
  2. configuration of the frontend squid servers for upload.wikimedia.org to add headers on already-cached files (optional: if we'd done #1 months ago the old entries would have long expired by now :)

Removing "shell" keyword for things that aren't directly doable by shell users etc

Restoring "shell" keyword for things that only a subset of shell users can possibly do.

Removing shell keyword if exists

What is the current status of this? Has this been resolved?

(In reply to comment #16)

What is the current status of this? Has this been resolved?

I don't believe so.

afeldman wrote:

It doesn't seem that an ops request was ever made for this - I just opened RT 2107 for the apache/squid change.

So, I've committed https://gerrit.wikimedia.org/r/#/c/33652/ that adds a Swift middleware to push Access-Control-Allow-Origin: * to all unauthenticated GET/HEADs. No support for preflight requests or anything more complicated than that. I'd appreciate a review from a Python/WSGI speaker :)
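For readers who don't have the Gerrit change in front of them, the general shape of such a middleware might look like the sketch below. This is not the code from the change: the class name `AllowOriginMiddleware` is hypothetical, and treating "no X-Auth-Token header" as "unauthenticated" is an assumption made purely for illustration.

```python
class AllowOriginMiddleware:
    """Wrap a WSGI app and add a wildcard CORS header to anonymous reads."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        method = environ.get("REQUEST_METHOD", "")
        # Assumption: a request without X-Auth-Token counts as unauthenticated.
        anonymous = "HTTP_X_AUTH_TOKEN" not in environ

        def cors_start_response(status, headers, exc_info=None):
            if method in ("GET", "HEAD") and anonymous:
                # Drop any existing value so the header is not sent twice.
                headers = [
                    (name, value)
                    for name, value in headers
                    if name.lower() != "access-control-allow-origin"
                ]
                headers.append(("Access-Control-Allow-Origin", "*"))
            return start_response(status, headers, exc_info)

        return self.app(environ, cors_start_response)
```

Wrapping is the usual WSGI pattern, e.g. `app = AllowOriginMiddleware(app)`. Other methods and token-bearing requests pass through untouched, and nothing here answers OPTIONS preflights.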

Now, this obviously won't solve this for already cached content. We can either deploy this and wait for the content to be revalidated, or wait until we switch to Varnish so we can also (or instead) do CORS in VCL.

I'd be inclined to just push it and let cached content revalidate over time... it'd be nice to start using it on new files at least. :)

Both https://gerrit.wikimedia.org/r/#/c/33652/ and https://gerrit.wikimedia.org/r/34264 have been merged, and new content in Swift comes with CORS now.

As explained earlier, the same doesn't apply to already cached content; not sure what you want to do about the bug report.

Oh, something a bit related too: for entirely different reasons, upload.wikimedia.org has also been serving a crossdomain.xml with <allow-access-from domain="*"/> for a while now.

(In reply to comment #21)

As explained earlier, the same doesn't apply for already cached content; not
sure what do you want to do about the bug report.

The dream of every engineer is that a problem will resolve itself if you just ignore it long enough. This is one of those rare cases where it actually happens to be true.