
MediaWiki incorrectly detects OOXML types of files saved by OpenOffice
Open, Medium, Public

Description

MS Office puts [Content_Types].xml at the beginning of the ZIP file, but OpenOffice does not: it usually puts it at the end. MimeMagic::detectZipType() checks only the first ZIP entry, so it cannot identify the type of such files. They are detected as application/zip and cannot be uploaded because the MIME type and the file extension do not match.
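To make the ordering problem concrete, here is a minimal Python sketch (not MediaWiki code). It builds two in-memory archives with the same entries in different order and compares a check that reads only the first local file header with one that looks at every entry. The helper names (build_archive, first_entry_name, has_content_types) and the placeholder entry contents are made up for illustration.

```python
import io
import struct
import zipfile

CONTENT_TYPES = "[Content_Types].xml"


def build_archive(entry_names):
    """Write placeholder entries into an in-memory ZIP, in the given order."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name in entry_names:
            zf.writestr(name, b"<placeholder/>")
    return buffer.getvalue()


def first_entry_name(data):
    """Read the file name from the first local file header only."""
    if data[:4] != b"PK\x03\x04":
        raise ValueError("not a ZIP local file header")
    # The 2-byte file name length sits at offset 26; the name starts at offset 30.
    (name_length,) = struct.unpack_from("<H", data, 26)
    return data[30:30 + name_length].decode("cp437")


def has_content_types(data):
    """Look for the entry anywhere in the archive, regardless of position."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return CONTENT_TYPES in zf.namelist()


# Hypothetical layouts: MS Office style (entry first) and OpenOffice style (entry last).
ms_style = build_archive([CONTENT_TYPES, "_rels/.rels", "word/document.xml"])
oo_style = build_archive(["_rels/.rels", "word/document.xml", CONTENT_TYPES])

for label, data in (("MS Office order", ms_style), ("OpenOffice order", oo_style)):
    print(label, first_entry_name(data) == CONTENT_TYPES, has_content_types(data))
```

A first-entry check accepts only the first layout, while the whole-archive check accepts both.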


Version: unspecified
Severity: normal

Details

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 12:14 AM
bzimport set Reference to bz35607.
bzimport added a subscriber: Unknown Object (MLST).

Vitaliy:
I guess I need to allow a specific content type on a private wiki installation first? If so, what are the exact steps to reproduce this, as per default this format is not supported (see bug 2089 and bug 17497)?
(Note: This is documented on https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads#Configuring_file_types )

With which MediaWiki version was this tested?

I tried to upload the file "test.odt", created with LibreOffice 3.4.6 OOO340m1 (Build:602), via https://test2.wikipedia.org/wiki/Special:Upload .
I get the error
"Permitted file types: png, gif, jpg, jpeg, xcf, pdf, mid, ogg, ogv, svg, djvu, tiff, tif, ogg, ogv, oga, webm."

(In reply to comment #1)

> Vitaliy:
> I guess I need to allow a specific content type on a private wiki installation
> first? If so, what are the exact steps to reproduce this, as per default this
> format is not supported (see bug 2089 and bug 17497)?

I am 90% sure those bugs have been fixed as of MW 1.18, but I haven't tested. (The second one has some legacy parts that probably haven't been fixed, but that is irrelevant to the current bug.)

> (Note: This is documented on
> https://www.mediawiki.org/wiki/Manual:Configuring_file_uploads#Configuring_file_types
> )

> With which MediaWiki version was this tested?
>
> I tried to upload the file "test.odt", created with LibreOffice 3.4.6 OOO340m1
> (Build:602), via https://test2.wikipedia.org/wiki/Special:Upload .
> I get the error
> "Permitted file types: png, gif, jpg, jpeg, xcf, pdf, mid, ogg, ogv, svg, djvu,
> tiff, tif, ogg, ogv, oga, webm."

Yes, we don't allow OpenDocument files on Wikimedia servers, except possibly on the Foundation site. (That's a separate issue.)

You would want to add the line:
$wgFileExtensions[] = 'odt';

to enable odt documents to be uploaded.

However, I think this bug is about OpenXML files, not OpenDocument files. We don't use [Content_Types].xml for detecting odt files, only for detecting OpenXML files. Presumably this bug is about saving in the MS format from OpenOffice?

This bug is likely still open, as MimeMagic is still doing what Vitaliy describes ( https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=includes/MimeMagic.php;h=65a2a6fa79b7bb69118768b69736e7431d396c29;hb=HEAD#l801 ). The relevant question is whether the MediaWiki code is in the wrong here (assuming I correctly understood comment 0 as referring to OpenXML files saved by OpenOffice).

My first impression was that OpenOffice was wrong here. However, the MIME type registration for OOXML does not provide any specific magic number.
It is exactly the same as for ZIP, which has made this painful the whole time. Compare with http://www.iana.org/assignments/media-types/application/vnd.oasis.opendocument.text for instance, where OpenDocument, by placing the mimetype file as the first entry, made detection easy.
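For comparison, this is roughly what the OpenDocument layout buys you. It is a Python sketch, not MediaWiki's detection code. It assumes the package stores "mimetype" as the first entry, uncompressed and without an extra field, so the name sits at byte offset 30 and the type string at offset 38. The helper names build_odt_like and sniff_opendocument are hypothetical.

```python
import io
import struct
import zipfile

ODT_MIME = b"application/vnd.oasis.opendocument.text"


def build_odt_like():
    """Build a tiny in-memory package with 'mimetype' stored first."""
    buffer = io.BytesIO()
    # zipfile defaults to ZIP_STORED, so the mimetype entry is left uncompressed.
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("mimetype", ODT_MIME)
        zf.writestr("content.xml", b"<placeholder/>")
    return buffer.getvalue()


def sniff_opendocument(data):
    """Return the OpenDocument MIME type found in the first entry, or None."""
    if data[:4] != b"PK\x03\x04" or data[30:38] != b"mimetype":
        return None
    (method,) = struct.unpack_from("<H", data, 8)  # 0 means stored (uncompressed)
    (size,) = struct.unpack_from("<I", data, 22)   # uncompressed size of the entry
    if method != 0:
        return None
    # Assumption: no extra field, so the data follows the 8-byte name directly.
    value = data[38:38 + size]
    if value.startswith(b"application/vnd.oasis.opendocument."):
        return value.decode("ascii")
    return None


print(sniff_opendocument(build_odt_like()))
```

Because everything needed is within the first few dozen bytes, a detector never has to locate or parse the ZIP central directory. OOXML offers no such fixed-position marker.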

Test document

The bug is still relevant in git master, and I think that even if OO.o is wrong here, MediaWiki should support these "wrong" versions, because newer versions of OO.o/LO.o behave the same way and MS Word accepts such documents. MW still detects them as application/zip, so the bug really should be fixed. An example document (an empty one) is attached.
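One order-independent approach could look like the Python sketch below. It is not the patch from the Gerrit change and not MediaWiki code. It finds [Content_Types].xml wherever it is in the archive and derives the document type from the main part's content type by stripping the ".main+xml" suffix. The function names and the sample XML are assumptions made for the example, and a real upload handler would need a hardened XML parser for untrusted input.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

CONTENT_TYPES = "[Content_Types].xml"
OOXML_PREFIX = "application/vnd.openxmlformats-officedocument."
MAIN_SUFFIX = ".main+xml"
DOCX_MAIN = OOXML_PREFIX + "wordprocessingml.document" + MAIN_SUFFIX

# Hypothetical minimal [Content_Types].xml for a word-processing document.
SAMPLE_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Override PartName="/word/document.xml" ContentType="' + DOCX_MAIN + '"/>'
    '</Types>'
).encode("utf-8")


def build_docx_like(content_types_last):
    """Build a tiny in-memory archive with [Content_Types].xml first or last."""
    entries = [("word/document.xml", b"<placeholder/>")]
    if content_types_last:
        entries.append((CONTENT_TYPES, SAMPLE_XML))
    else:
        entries.insert(0, (CONTENT_TYPES, SAMPLE_XML))
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for name, payload in entries:
            zf.writestr(name, payload)
    return buffer.getvalue()


def detect_ooxml_type(data):
    """Return the OOXML MIME type, or application/zip if none is declared."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        if CONTENT_TYPES not in zf.namelist():
            return "application/zip"
        root = ET.fromstring(zf.read(CONTENT_TYPES))
    for element in root.iter():
        content_type = element.get("ContentType", "")
        if content_type.startswith(OOXML_PREFIX) and content_type.endswith(MAIN_SUFFIX):
            return content_type[:-len(MAIN_SUFFIX)]
    return "application/zip"


print(detect_ooxml_type(build_docx_like(content_types_last=False)))
print(detect_ooxml_type(build_docx_like(content_types_last=True)))
```

Both layouts yield the same type. The cost is that the detector has to read the central directory at the end of the file instead of only sniffing the first bytes.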

Change 70830 had a related patch set uploaded by VitaliyFilippov:
(bug 35607) Fix OXML type detection

https://gerrit.wikimedia.org/r/70830

Change 70830 had a related patch set uploaded (by Jdlrobson):
(bug 35607) Fix OXML type detection

https://gerrit.wikimedia.org/r/70830

Change 70830 abandoned by Jdlrobson:
(bug 35607) Fix OXML type detection

Reason:
Assuming work here is abandoned given the timeframe and inactivity. Please reopen and rebase if not the case.

https://gerrit.wikimedia.org/r/70830