
Duplicate image reuploading should be forbidden
Closed, Declined (Public)

Description

Author: Bryan.TongMinh

Description:
Uploading the same image over itself just wastes space, log entries and the like, and should therefore throw an error.

Should probably introduce a new return value for UploadBase::verifyUpload.
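
The idea can be pictured with a minimal sketch, written in Python rather than MediaWiki's PHP. Everything named here is hypothetical and for illustration only: the `VerifyResult` enum, its `DUPLICATE_OF_CURRENT` value and the `verify_upload` function are not part of `UploadBase`, and SHA-1 is an assumed choice of content hash.

```python
import hashlib
from enum import Enum
from typing import Optional


class VerifyResult(Enum):
    """Hypothetical status codes; not the real UploadBase constants."""
    OK = 0
    EMPTY_FILE = 1
    DUPLICATE_OF_CURRENT = 2  # the proposed new return value


def content_hash(data: bytes) -> str:
    # SHA-1 is an assumption here; any content hash would serve the sketch.
    return hashlib.sha1(data).hexdigest()


def verify_upload(new_data: bytes, current_hash: Optional[str]) -> VerifyResult:
    """Compare the incoming file against the hash of the current revision."""
    if not new_data:
        return VerifyResult.EMPTY_FILE
    # current_hash is None when no file exists under this name yet.
    if current_hash is not None and content_hash(new_data) == current_hash:
        return VerifyResult.DUPLICATE_OF_CURRENT
    return VerifyResult.OK
```

A caller would treat `DUPLICATE_OF_CURRENT` like any other verification failure and show the uploader an error instead of writing a new revision and log entry.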


Version: 1.15.x
Severity: enhancement

Details

Reference
bz15676

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 21 2014, 10:19 PM
bzimport set Reference to bz15676.
bzimport added a subscriber: Unknown Object (MLST).

mike.lifeguard+bugs wrote:

See also bug 14171.

Are we sure that the hashes don't have errors? IIRC there was an issue where non-identical images had identical hashes; this was probably discovered at the time of the "massive image loss" so perhaps Tim knows more?

Don't forget that if thumbnail generation breaks, reuploading may be the only way to reset it.

mike.lifeguard+bugs wrote:

(In reply to comment #2)

Don't forget that if thumbnail generation breaks, reuploading may be the only
way to reset it.

That shouldn't be (and AFAIK isn't) the case.

Bryan.TongMinh wrote:

Are there any other reasons not to implement this? Else I will go ahead.

I see no good reason to go forwards with this, and think it should be scrapped. Possible problems include:

  • If an image gets corrupted, but a stored hash is the one being compared, the system breaks.
  • Uploading the image again is, for people who aren't utter experts in Wiki work (and that includes me, an admin on Commons), the only obvious way to regenerate thumbnails.
  • There is no evidence of actual good that this will do.
  • There is a small chance of two images having the same hash. As the number of uploads increases, the chance of this affecting someone does as well.
  • The case for implementation is not established.

Please do not go forwards with this.

There is no good reason to restrict this behaviour in the absence of evidence that it is causing any problem.
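
To put the hash-collision concern above in rough numbers, here is a small sketch of the standard birthday-bound approximation. It rests on assumptions that the thread does not state: a 160-bit hash such as SHA-1, hash values that behave as uniformly random, and a purely hypothetical figure of one hundred million files. It says nothing about corrupted stored hashes or software bugs, which are separate failure modes.

```python
import math


def collision_probability(num_files: int, hash_bits: int) -> float:
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -num_files * (num_files - 1) / 2 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


# Hypothetical scale: 100 million files, 160-bit hash (assumed SHA-1).
p = collision_probability(100_000_000, 160)
print(f"chance of any accidental collision: {p:.2e}")
```

Under these assumptions the chance of any two distinct files sharing a hash is vanishingly small, although it does grow roughly with the square of the number of uploads, as the comment points out.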

brad9626 wrote:

It shouldn't be "forbidden" (at the very least admins should be able to override it if needed), but a little warning would be nice. I hate when upload logs are filled with identical versions because the uploader simply picked the wrong file name (e.g. the original version again instead of the new one), or because they, for some unexplained reason, revert to the same version multiple times in a row. It happens more than you'd think. Sure, user carelessness/inexperience is at fault here, but software can help.

Actually, I don't know why something like that isn't already in place for duplicates in general. One of my favorite features of Bryan's Flickr uploader is that it tells you if the file is already uploaded (it doesn't always work, but it does 95% of the time in my experience). There have been a couple of times I uploaded an image directly (after doing a reasonable search on Commons) only to find out afterward that we already have it, thanks to the automatically generated duplicate link provided on the description page. It's like, oh, thanks for letting me know ''now''.

What we want to do is help prevent accidental duplicate uploads, not intentional ones. So even if the hashes get corrupted and the system fails (what, 0.1% of the time?), it's OK since a human makes the final call.

Bryan.TongMinh wrote:

Seems not worth the hassle at all.

I also see every now and then new users re-upload with the intention of changing the description page.
They remember that there was a form when uploading, so they re-upload: they fill in the entire Information template again and then upload the same file under the same name.

When a user re-uploads (either by 'new version' or by 'revert') and the file to upload is the same as the last revision, it should either be blocked for non-admins or trigger a warning that can be ignored (like with warn abusefilters and forceeditsummary, where the 2nd attempt goes through).

My 2c
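
The two options suggested above (block for non-admins, or warn once and let the second attempt through) could look roughly like this. It is only a sketch in Python: the `DuplicateUploadGuard` class, its `block_non_admins` switch and the string results are hypothetical names, not MediaWiki, AbuseFilter or forceeditsummary code.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class DuplicateUploadGuard:
    """Hypothetical guard: warn on an identical re-upload, allow the retry."""
    block_non_admins: bool = False
    _warned: Set[Tuple[str, str, str]] = field(default_factory=set)

    def check(self, user: str, filename: str, new_hash: str,
              current_hash: str, is_admin: bool = False) -> str:
        # A file that differs from the last revision is never affected.
        if new_hash != current_hash:
            return "ok"
        # Option 1: hard block for non-admins.
        if self.block_non_admins and not is_admin:
            return "blocked"
        # Option 2: warn the first time, let the repeated attempt through.
        key = (user, filename, new_hash)
        if key in self._warned:
            self._warned.discard(key)
            return "ok"
        self._warned.add(key)
        return "warning"
```

With the default settings the first identical re-upload returns `"warning"` and an immediate retry by the same user returns `"ok"`, so a human still makes the final call.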

Bryan.TongMinh wrote:

*** Bug 28374 has been marked as a duplicate of this bug. ***