
Content-Type for JavaScript through ResourceLoader doesn't specify a charset
Closed, Resolved, Public

Description

Requesting a page such as http://bits.wikimedia.org/fr.wikipedia.org/load.php?debug=true&lang=fr&modules=site&only=scripts currently returns a Content-Type of "text/javascript". It should be "text/javascript; charset=utf-8".
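
For illustration only, here is a minimal Python sketch of the check described above: it reads the charset parameter from a Content-Type value and appends "charset=utf-8" when none is present. The function names charset_of and with_utf8_charset are hypothetical and not part of ResourceLoader, which is written in PHP; this only shows the intended header shape.

```python
from email.message import Message


def charset_of(content_type):
    """Return the charset parameter of a Content-Type value, or None.

    Hypothetical helper for illustration; not part of MediaWiki.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter exists.
    return msg.get_content_charset()


def with_utf8_charset(content_type):
    """Append '; charset=utf-8' if the header value has no charset yet."""
    if charset_of(content_type) is None:
        return f"{content_type}; charset=utf-8"
    return content_type


if __name__ == "__main__":
    print(with_utf8_charset("text/javascript"))
```

With the header value from this report, with_utf8_charset("text/javascript") yields the value requested above, and a value that already carries a charset is returned unchanged.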


Version: unspecified
Severity: minor
URL: http://bits.wikimedia.org/fr.wikipedia.org/load.php?debug=true&lang=fr&modules=site&only=scripts

Details

Reference
bz28208

Event Timeline

bzimport raised the priority of this task from to High. Nov 21 2014, 11:31 PM
bzimport set Reference to bz28208.

Does this actually cause anyone problems?

Daniel Friesen helpfully pointed out that the URL itself demonstrates the problem.

Created attachment 8359
Sets character set to UTF-8 within the HTTP header's Content-Type field.

Here's a patch that will do the trick. I would have applied it, but I wanted to get some input on one concern.

Should we really be assuming that the JavaScript and CSS content is UTF-8? Should we be detecting the encoding and using that? What happens when we are combining multiple encodings into a single response? Should we convert everything to UTF-8?

I'm looking for some opinions, especially those based in experience with handling text encoding within MediaWiki.
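
To make the mixed-encoding concern concrete, here is a rough Python sketch (not the attached patch, and not MediaWiki code) of the "convert everything to UTF-8" option: each source is decoded using its known encoding and re-encoded as UTF-8 before the pieces are joined. It assumes the encoding of every source is already known; it does not attempt detection. The names to_utf8 and combine_as_utf8 are hypothetical.

```python
def to_utf8(raw, source_encoding="utf-8"):
    """Re-encode bytes from a known source encoding to UTF-8.

    Hypothetical helper; assumes source_encoding is correct for raw.
    """
    return raw.decode(source_encoding).encode("utf-8")


def combine_as_utf8(parts):
    """Join (bytes, encoding) pairs into a single UTF-8 byte string.

    Every part is normalized first, so the combined response can be
    served under a single 'charset=utf-8' declaration.
    """
    return b"\n".join(to_utf8(raw, enc) for raw, enc in parts)


if __name__ == "__main__":
    parts = [
        ("var a = 'é';".encode("latin-1"), "latin-1"),
        ("var b = 'é';".encode("utf-8"), "utf-8"),
    ]
    print(combine_as_utf8(parts).decode("utf-8"))
```

If every source is already UTF-8, the conversion step leaves the bytes unchanged and the header change alone is sufficient.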


Everything in MediaWiki is UTF-8 nowadays (including the source files). I'd say go for it.