(patch) Mimetypes for Microsoft's 2007 office formats set wrongly
Closed, Resolved (Public)

Description

patch for MediaWiki r65816

The file ./includes/mime.types assigns the old ppt, doc, etc. mimetypes to the new Office 2007 file extensions (pptx, docx, etc.) as well. This is wrong, and it causes problems especially with files served via the wfStreamFile() function: when docx/pptx/... files are downloaded on machines running Windows, the file names get changed to fit the reported mimetype (i.e. the actual file name gets .doc/.ppt/... appended). Microsoft's Office applications can no longer open the renamed files, since they fail to recognize them (yes, one should file a bug at MS about this ;-).

The solution is to use the correct mimetypes, as found on the Web, e.g. at http://www.bram.us/2007/05/25/office-2007-mime-types-for-iis/
A patch is attached.
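
To make the renaming problem concrete, here is a minimal sketch in Python. It is a simplified model of the behaviour described above, not how Windows or any browser actually decides on a file name, and it is not MediaWiki code. The `DEFAULT_EXT` table and the `saved_name` helper are hypothetical names introduced only for this illustration.

```python
# Illustrative table: the extension a client might consider "default"
# for a given MIME type. Assumption for this sketch, not a real registry.
DEFAULT_EXT = {
    "application/msword": "doc",
    "application/vnd.ms-powerpoint": "ppt",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document": "docx",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation": "pptx",
}


def saved_name(filename, reported_mime):
    """Simplified model: append the type's default extension if it differs."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    expected = DEFAULT_EXT.get(reported_mime)
    # Unknown type, or extension already matches: keep the name unchanged.
    if expected is None or ext == expected:
        return filename
    return f"{filename}.{expected}"


# Old mimetype reported for a docx file: the name gets ".doc" appended.
print(saved_name("report.docx", "application/msword"))
# Correct mimetype reported: the name is left alone.
print(saved_name(
    "report.docx",
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
))
```

In this model, the mismatch between the reported type and the real extension is what triggers the renaming, so reporting the matching type avoids it.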


Version: unspecified
Severity: normal

Details

Reference
bz23688

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 11:00 PM)
bzimport set Reference to bz23688.
bzimport added a subscriber: Unknown Object (MLST).

P.S. this bug specifically affects users of the img_auth.php image authorisation script since it uses wfStreamFile().

Why not the docm, dotm, etc. file extensions? These are Office XML files with macros enabled.

Yes, one could add further mime types as in the above link. But these file types do have mimetypes that start like the old ppt, doc, etc. types, so I reasoned that (1) they are very old and nobody has had any problem with them yet, and (2) the Microsoft tools may manage to open them anyway since the format might be compatible. If there are similar problems in these cases, one could of course use other mimetypes here as well.

For these specifically Microsoft-related issues, it might also be possible to use the infamous application/octet-stream as a default. This should at least prevent any renaming, and may suffice to let the operating system apply some more brains to figure out what the file is supposed to be.

Somewhat more authoritative source for MS mimetypes.

http://blogs.msdn.com/b/vsofficedeveloper/archive/2008/05/08/office-2007-open-xml-mime-types.aspx

I think we should just add all of them. I see no reason to do only half the work in this case. They are documented, so we should implement them as documented. We have already had too many unnecessary problems with mismatched mime types and file extensions.
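
For illustration, the commonly documented Office 2007 types can be collected in one table and rendered as mime.types lines. This is a sketch in Python, not the attached patch: it assumes the Apache-style layout of a MIME type followed by its extension, the list may not be exhaustive, and `OOXML_MIME_TYPES` and `to_mime_types_lines` are hypothetical names.

```python
# Extension -> MIME type for Office 2007 formats (plain and macro-enabled).
# The list may be incomplete; check it against the Microsoft documentation.
OOXML_MIME_TYPES = {
    "docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "dotx": "application/vnd.openxmlformats-officedocument.wordprocessingml.template",
    "docm": "application/vnd.ms-word.document.macroEnabled.12",
    "dotm": "application/vnd.ms-word.template.macroEnabled.12",
    "xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "xltx": "application/vnd.openxmlformats-officedocument.spreadsheetml.template",
    "xlsm": "application/vnd.ms-excel.sheet.macroEnabled.12",
    "xltm": "application/vnd.ms-excel.template.macroEnabled.12",
    "xlam": "application/vnd.ms-excel.addin.macroEnabled.12",
    "xlsb": "application/vnd.ms-excel.sheet.binary.macroEnabled.12",
    "pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "potx": "application/vnd.openxmlformats-officedocument.presentationml.template",
    "ppsx": "application/vnd.openxmlformats-officedocument.presentationml.slideshow",
    "pptm": "application/vnd.ms-powerpoint.presentation.macroEnabled.12",
    "potm": "application/vnd.ms-powerpoint.template.macroEnabled.12",
    "ppsm": "application/vnd.ms-powerpoint.slideshow.macroEnabled.12",
    "ppam": "application/vnd.ms-powerpoint.addin.macroEnabled.12",
}


def to_mime_types_lines(mapping):
    """Render one '<mime type> <extension>' line per entry, sorted by extension."""
    return [f"{mime} {ext}" for ext, mime in sorted(mapping.items())]


for line in to_mime_types_lines(OOXML_MIME_TYPES):
    print(line)
```

Keeping one extension per line makes it easy to spot an extension that is still attached to one of the old doc/xls/ppt entries.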

docx and the other new formats are zip-based. Due to bug 23642 (MimeMagic::detectZipType does not recognize microsoft formats), they still won't be detected properly.
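
As a rough idea of what content-based detection for these zip files could look like, here is a sketch in Python. It does not describe how MimeMagic::detectZipType works or how bug 23642 was fixed. It assumes the package contains a `[Content_Types].xml` entry that names the main part's content type, and it uses a plain substring check rather than real XML parsing. `MAIN_PART_TO_MIME` and `detect_ooxml_type` are hypothetical names.

```python
import io
import zipfile

# Content type of the main document part, as declared in [Content_Types].xml,
# mapped to the MIME type of the whole package.
MAIN_PART_TO_MIME = {
    "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml":
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml":
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "application/vnd.openxmlformats-officedocument.presentationml.presentation.main+xml":
        "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def detect_ooxml_type(data):
    """Return the package MIME type for zip bytes, or None if unrecognized."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            manifest = zf.read("[Content_Types].xml").decode("utf-8", "replace")
    except (zipfile.BadZipFile, KeyError):
        # Not a zip file at all, or a zip without the OOXML manifest.
        return None
    for part_type, mime in MAIN_PART_TO_MIME.items():
        if part_type in manifest:
            return mime
    return None
```

A plain zip archive without the manifest returns None here, so a caller could still fall back to application/zip for it.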