
Server does not accept MIME RFC2231 encoded header values
Closed, Invalid, Public

Description

Values in MIME headers that are encoded using RFC 2231 are not accepted. The only header value that isn't ASCII (and thus might need to be encoded) is in ApiUpload, so this currently only affects uploads with non-ASCII characters in the filename.

Another reading of this bug is to ask why 'filename' is in the header of the chunk/file entry of the MIME request at all, since the server apparently ignores it. But that would just mask the underlying issue: there is no way to get Unicode data in header values to the server (except by just using the server's encoding, but as far as I can see that is not MIME compliant, and Python 3's library doesn't support it).


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=73661

Details

Reference
bz73662

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:57 AM
bzimport set Reference to bz73662.
bzimport added a subscriber: Unknown Object (MLST).

I note that one of the header examples you gave in bug 73661,

Content-disposition: =?utf-8?b?Zm9ybS1kYXRhOyBuYW1lPSJmaWxlIjsgZmlsZW5hbWU9?=
 =?utf-8?b?IsOcMi5qcGci?=

is not actually valid. See RFC 2047 section 5.
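
To see what that header actually carries, the two base64 ("b") encoded-words can be decoded by hand. The following is only a rough sketch: the helper name `decode_b_encoded_words` and the regular expression are illustrative assumptions, not part of any library, and only the "b" encoding is handled. It shows that the whole parameter list, including the quoted filename, was wrapped in encoded-words, which RFC 2047 section 5 does not allow for parameters of a structured header such as Content-Disposition.

```python
import base64
import re

# Matches one RFC 2047 encoded-word that uses the "b" (base64) encoding.
# Hypothetical helper pattern for illustration only.
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[bB]\?([^?]*)\?=")


def decode_b_encoded_words(value):
    """Concatenate the decoded payloads of all base64 encoded-words in value."""
    parts = []
    for match in ENCODED_WORD.finditer(value):
        charset, payload = match.group(1), match.group(2)
        parts.append(base64.b64decode(payload).decode(charset))
    return "".join(parts)


header_value = (
    "=?utf-8?b?Zm9ybS1kYXRhOyBuYW1lPSJmaWxlIjsgZmlsZW5hbWU9?= "
    "=?utf-8?b?IsOcMi5qcGci?="
)
decoded = decode_b_encoded_words(header_value)
print(decoded)
```

The decoded text is an ordinary `form-data; name=...; filename=...` parameter list, so a receiver that does not undo the encoded-words never sees the `name` or `filename` parameters at all.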

However, the other one,

Content-disposition: form-data; name="file"; filename*=utf-8''%C3%9C.jpg

is also not correctly recognized.

But that doesn't have anything to do with MediaWiki, as PHP itself is not correctly handling such encoded parameters when populating $_POST and $_FILES. If this gets fixed in PHP, MediaWiki should accept it fine.
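
For reference, the `filename*=utf-8''%C3%9C.jpg` form above can be produced and read back with Python 3's standard library. This is a minimal sketch, assuming a client that builds the Content-Disposition value itself; `build_content_disposition` and `decode_extended_value` are hypothetical helper names, and the decoder handles only a single, non-continued extended value with a charset present.

```python
from email.utils import encode_rfc2231
from urllib.parse import unquote


def build_content_disposition(field_name, filename):
    """Return a form-data Content-Disposition value for an upload part.

    ASCII filenames use the plain quoted form; anything else uses the
    RFC 2231 extended form (filename*=charset'language'percent-encoded).
    """
    try:
        filename.encode("ascii")
    except UnicodeEncodeError:
        extended = encode_rfc2231(filename, charset="utf-8")
        return 'form-data; name="{}"; filename*={}'.format(field_name, extended)
    return 'form-data; name="{}"; filename="{}"'.format(field_name, filename)


def decode_extended_value(value):
    """Decode a single charset'language'percent-encoded value."""
    charset, _language, encoded = value.split("'", 2)
    return unquote(encoded, encoding=charset)


print(build_content_disposition("file", "\u00dc.jpg"))
print(decode_extended_value("utf-8''%C3%9C.jpg"))
```

Whether the receiving side understands the extended form is a separate question; as noted above, PHP does not when populating `$_FILES`.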

(In reply to Brad Jorsch from comment #1)

> I note that one of the header examples you gave in bug 73661,
>
> Content-disposition: =?utf-8?b?Zm9ybS1kYXRhOyBuYW1lPSJmaWxlIjsgZmlsZW5hbWU9?=
>  =?utf-8?b?IsOcMi5qcGci?=
>
> is not actually valid. See RFC 2047 section 5.

Yes, and it is noted as garbage in that bug. We should find the Python 2 bug for that.

> However, the other one,
>
> Content-disposition: form-data; name="file"; filename*=utf-8''%C3%9C.jpg
>
> is also not correctly recognized.
>
> But that doesn't have anything to do with MediaWiki, as PHP itself is not
> correctly handling such encoded parameters when populating $_POST and
> $_FILES. If this gets fixed in PHP, MediaWiki should accept it fine.

Shouldn't this be filed as a bug against PHP, and this task tracked as an 'upstream' bug? I couldn't find a PHP bug about this, but it is very possible I missed it because I'm not familiar with the terms PHP uses.

https://www.mediawiki.org/wiki/API:Upload only says the following about these fields:

file - File contents
chunk - Chunk contents

paraminfo says type 'upload'; that is all.

https://en.wikipedia.org/w/api.php?action=paraminfo&modules=upload

API:Upload suggests it should look like

Content-Disposition: form-data; name="file"; filename="Apple.gif"

But that doesn't address non-US-ASCII filenames.

It looks like we can send any value as the filename in Content-disposition.
The following is copied from my rough analysis on https://gerrit.wikimedia.org/r/#/c/174677/ (I would appreciate any corrections or historical titbits from MediaWiki devs):

FWIW, this filename value is exposed to MediaWiki extensions via the WebRequestUpload method getName.

http://git.wikimedia.org/blob/mediawiki%2Fcore.git/c1826209e739d51359bcea37ff4116eed9bd971c/includes%2FWebRequest.php#L1173

($fileInfo comes from $_FILES which is http://php.net/manual/en/reserved.variables.files.php)

Interestingly, Safari sends Unicode filenames to the server using HTML encoding (probably numeric character references such as &#123;), which are decoded by Sanitizer.php: http://git.wikimedia.org/blob/mediawiki%2Fcore.git/c1826209e739d51359bcea37ff4116eed9bd971c/includes%2FSanitizer.php#L32
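
As an illustration of what that kind of HTML encoding looks like for a filename, here is a small Python sketch. It assumes the references are decimal numeric character references; the helper names `to_char_refs` and `from_char_refs` are made up for this example, and it does not claim to reproduce what Safari or Sanitizer.php actually do.

```python
import html


def to_char_refs(filename):
    """Replace every non-ASCII character with a decimal character reference."""
    return filename.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_char_refs(value):
    """Turn HTML character references back into the characters they name."""
    return html.unescape(value)


encoded = to_char_refs("\u00dc.jpg")
print(encoded)
print(from_char_refs(encoded))
```

The encoded form is pure ASCII, which is why it passes through a header value that cannot otherwise carry Unicode.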

WebRequestUpload method getName does not appear to be used in the current MediaWiki codebase, but it is used (badly) by some (probably broken) MediaWiki extensions. I quickly checked the v1.16 codebase, and can't see any use of getName to be concerned about.