
Special:Upload isn't normalizing spaces in file extension correctly
Closed, ResolvedPublic

Description

Try uploading a file with a space in its extension:

  1. Go to Special:Upload
  2. Pick some .jpg file
  3. Change the destination filename field to end in ". jpg" (note space between . and jpg)
  4. Submit!

It should translate the " " to "_" immediately, then reject the upload as a whole, since "._jpg" is not a recognized extension.

However, it seems to be allowing the upload through: the file type is detected using the trimmed "jpg" form, but the web server of course doesn't know "._jpg" and serves the file as a generic binary download instead of displaying it, which causes confusion.

Some on-wiki discussion:
http://en.wikipedia.org/wiki/Wikipedia:VPT#Image_does_not_display.2C_rather_attempts_to_download


Version: unspecified
Severity: enhancement
URL: http://en.wikipedia.org/wiki/File:%22City_of_India,Actress_Prachee_Adhikari_for_some_promotion,New_Delhi_1_-_Oct_2007._jpg

Details

Reference
bz16772

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:29 PM
bzimport set Reference to bz16772.

herd wrote:

Another oddity: With optional file renaming on, you can rename a file from "._jpg" to ".jpg" but you cannot rename it back, so whatever check the rename path uses does work. ^_^

Should there not be one specific centralized function that detects extension (and overall filename) validity? Or consistency in the way it is used, at least?

Offhand it looks like the problem is that on upload it's doing the extension validity check on the name *before* normalizing spaces to "_", and that check is doing a trim() on the extension after separating it from the ".".

So:

  1. The "_" normalization should be done as a first step.
  2. That trim() should be found and killed. ;)

And yes, the checks should probably be centralized more than they already are.
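A minimal sketch of the ordering problem described above, in Python rather than MediaWiki's actual PHP. The function names and the ALLOWED_EXTENSIONS whitelist are hypothetical and only illustrate the two orderings; this is not the real upload code.

```python
# Hypothetical whitelist; the real list of permitted extensions is configurable.
ALLOWED_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def split_extension(name):
    """Split on the last "." and return (base, extension)."""
    base, dot, ext = name.rpartition(".")
    return (base, ext) if dot else (name, "")


def buggy_check(dest_name):
    """Validate first (trimming the extension), normalize spaces afterwards."""
    _, ext = split_extension(dest_name)
    if ext.strip().lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed: %r" % ext)
    # Normalization happens too late: " jpg" already passed as "jpg".
    return dest_name.replace(" ", "_")


def fixed_check(dest_name):
    """Normalize spaces first, then validate the extension without trimming."""
    normalized = dest_name.replace(" ", "_")
    _, ext = split_extension(normalized)
    if ext.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("extension not allowed: %r" % ext)
    return normalized
```

With these definitions, buggy_check("Photo. jpg") accepts the upload and stores it as "Photo._jpg", while fixed_check("Photo. jpg") rejects it because "_jpg" is not on the whitelist.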

charlottethewebb wrote:

while we're on this subject can we also normalize all file extensions to three letters, lowercase?

i'd argue that allowing "Foobar.jpg", "Foobar.JPG", "Foobar.JPEG", etc. to be different files is a bug rather than a helpful feature.

what's that 2^3 + 2^4 = twenty-four possible files named "Foobar" in the same format? More?

course i probably have to File A Separate Ticket For This.

herd wrote:

(In reply to comment #3)

> while we're on this subject can we also normalize all file extensions to three
> letters, lowercase?

.djvu ?

charlottethewebb wrote:

> .djvu ?

Wikipedia says:

Filename extension .djvu, .djv

Don't know which one is preferred, or whether we even allow this format, but if we do we should pick one suffix to favor and normalize the other to it.

Or maybe djvu is just a double entendre meaning that my complaint is a duplicate of some other bug (probably is, didn't check, never had any luck with the search anyway).

File a separate bug for that, please. :)

(Note we do normalize extensions for deleted file storage. It's not necessarily the 3-letter variant that's normal [if a 3-letter variant exists!] but there is a preferred variant.)
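As a rough illustration of normalizing to a preferred variant, here is a sketch in Python. The PREFERRED_EXTENSION table and the normalize_extension helper are assumptions made up for this example; which variants MediaWiki actually prefers is not stated in this thread.

```python
# Hypothetical alias table: maps a lowercased extension to its preferred variant.
PREFERRED_EXTENSION = {
    "jpeg": "jpg",
    "jpe": "jpg",
    "djv": "djvu",  # the preferred form need not be the 3-letter one
}


def normalize_extension(name):
    """Lowercase the extension and replace known aliases with the preferred form."""
    base, dot, ext = name.rpartition(".")
    if not dot:
        return name  # no extension to normalize
    ext = ext.lower()
    return base + "." + PREFERRED_EXTENSION.get(ext, ext)
```

Under this table, "Foobar.jpg", "Foobar.JPG" and "Foobar.JPEG" all map to the same name, "Foobar.jpg".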

herd wrote:

(In reply to comment #5)

> Or maybe djvu is just a double entendre meaning that my complaint is a
> duplicate of some other bug (probably is, didn't check, never had any luck with
> the search anyway).

Possibly a dupe of bug 4421 or bug 12992 (note these are alternate proposals to a similar resolution).

Also, Plz friend me (since we're turning this into a social networking site). (this is a joke)

*** Bug 16512 has been marked as a duplicate of this bug. ***