Page MenuHomePhabricator

UploadWizard should have more specific error message for files failing with 'verification-error' ("This file might be corrupt, or have the wrong extension")
Closed, ResolvedPublic

Description

If you try to upload a file that contains HTML in the EXIF data using Special:Upload it gives you the following error:
"This file contains HTML or script code that may be erroneously interpreted by a web browser."

If you try to upload the same file using UploadWizard, it gives you the following error:
"This file might be corrupt, or have the wrong extension."

It would be nice if the UploadWizard error were more specific.


Version: unspecified
Severity: enhancement

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 21 2014, 11:54 PM
bzimport added a project: UploadWizard.
bzimport set Reference to bz30095.
bzimport added a subscriber: Unknown Object (MLST).

Hi, Ryan, two questions:

  1. That message, I think, doesn't always show up where it used to. Could you confirm that this still happens?
  1. Could you attach a file with HTML in the EXIF data for testing purposes?

If you just fulfill number 2, I can test number 1.

Thanks!

Recently, a patch to MediaWiki core [0] changed the behavior of API warnings. We now get those at the first step, even if we don't handle them necessarily.

Could you try to reproduce this bug, and let us know if it still occurs? If so, the solution would likely be very different from before this patch.

[0] https://gerrit.wikimedia.org/r/#/c/9261/

Thanks!

This is not yet fixed.

This is a bug in the errors handlers. An API error is a code, an info and an optional 'details' array of message keys.

We use the code to get a UW specific error description. Normally this description is in 'info', but we don't seem to use that one. The details are not passed at all.

In this specific case we have:

code: 'verification-error',
info: "This file did not pass file verification",
details: [ "uploadscripted"]

The info corresponds to the msg of 'verification-error'
The uploadscripted corresponds to a msg key for ""This file contains HTML or script code that may be erroneously interpreted by a web browser."

Instead we display api-error-verification-error. Interesting observation, these api-error-* messages seem to be solely for UW, but are in the mediawiki-core translations.

For the entire error handler, we ought to implement something the likes of

processVerificationError() of SpecialUpload.php

we should expand showError() to showError(code, info, details) and pass details if the api error result has them. We can then also move the filetype-banned-type handling into showError.

Here is an example of a flickr file with some html in EXIF.

Uploading https://www.flickr.com/photos/arthurhsu/8004910780/in/set-72157631578196004/ file using flickr version of UploadWizard results in "This file might be corrupt, or have the wrong extension." error.

Upload of the same file manually (via local download) through UploadWizard gives "Internal error: Server failed to store temporary file." error.

The old fashion Special:Upload gives the most clear "This file contains HTML or script code that may be erroneously interpreted by a web browser." error. Now I was able to figure out what is the issue, stripped all EXIF from the file and reuploded as https://commons.wikimedia.org/wiki/File:Coopers_Rock,_WV_-_Tomb_Raider_Roof_-_2.jpg

UploadWizard should be able to strip HTML from the EXIF data, but if not than at least give more meaningful error message.

Another example: https://www.flickr.com/photos/arthurhsu/8004906216/in/set-72157631578196004/

This came up again on commons VP this week.

This file might be corrupt, or have the wrong extension. as an error message. The file was 2M but I was on a flaky Internet connection so I put it on upload problems. When on a more stable Internet connection I tried again and got the same error message. Then I switched to the old Uploader that gave a more detailed error message: This file contains HTML or script code that may be erroneously interpreted by a web browser. See the FAQ for more information. From there I looked in my file. It turned out the entry contentScriptType="text/ecmascript" under the svg tag was the issue.

matmarex renamed this task from UploadWizard should have more specific error message for files that contain HTML in the EXIF data to UploadWizard should have more specific error message for files failing with 'verification-error' ("This file might be corrupt, or have the wrong extension").May 12 2016, 5:36 PM
matmarex claimed this task.
matmarex set Security to None.
matmarex removed a subscriber: wikibugs-l-list.
matmarex added subscribers: matmarex, Zppix.

Copied from the merged task:


UploadWizard doesn't provide any details for 'verification-error' ("This file might be corrupt, or have the wrong extension"). This is a problem, since sometimes the file is not obviously corrupted and displays just fine in various applications, but MediaWiki rejects it. This also occurs for uploads blocked by AbuseFilter (T132866), but I think that is separate, as for AF we'd ideally just use a different error altogether.

Example:

(reported at https://commons.wikimedia.org/w/index.php?title=Commons:Upload_help&oldid=195480550#Upload_difficulties.2C_sourced_from_Wikimedia_Commons).

API response for it:

{
  "error": {
    "code": "verification-error",
    "info": "This file did not pass file verification: This SVG file contains an illegal namespace \"&ns_sfw;\".",
    "details": [
      "uploadscriptednamespace",
      "&ns_sfw;"
    ],
    "*": "See http://localhost:3080/w/api.php for API usage"
  }
}

I think this has become much easier after T47843, since the API is now able to return localised error messages (previously they were hardcoded to English). I've been putting off working on this because it would have been a lot of boring copy-paste work, which shouldn't be necessary now.

For testing, here's another file which fails with verification-error: filetype-bad-ie-mime.

badie.png (1 KB)

Change 327500 had a related patch set uploaded (by Matthias Mullie):
Do not lose message parameters in UploadFromChunks::verifyChunk()

https://gerrit.wikimedia.org/r/327500

Change 326134 had a related patch set uploaded (by Matthias Mullie):
Simplify error handling

https://gerrit.wikimedia.org/r/326134

Change 328373 had a related patch set uploaded (by Matthias Mullie):
Use parsed error messages straight from API

https://gerrit.wikimedia.org/r/328373

Change 327500 merged by jenkins-bot:
Do not lose message parameters in UploadFromChunks::verifyChunk()

https://gerrit.wikimedia.org/r/327500

Change 328373 merged by jenkins-bot:
Use parsed error messages straight from API

https://gerrit.wikimedia.org/r/328373