
Images listed in database have missing files
Closed, Declined. Public

Description

Author: overlordq

Description:
The URL field points to one example.

I compiled lists of the affected files (those I could find) on enwiki and commonswiki:

http://toolserver.org/~overlordq/404commons_path.txt
http://toolserver.org/~overlordq/404images_path.txt


Version: unspecified
Severity: critical
URL: http://en.wikipedia.org/wiki/Image:Ninpou_Chou_TitleScreen.jpg

Details

Reference
bz14635

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 10:09 PM
bzimport set Reference to bz14635.
bzimport added a subscriber: Unknown Object (MLST).

overlordq wrote:

Gmaxwell was kind enough to restore four of these images from backups he had, but those four images triggered another bug.

As discussed in #wikimedia-tech yesterday, some images were sharing the SHA-1 hash 'phoiac9h4m842xq45sp7s6u21eteeq1'. This was attributed to them being 'old', dating from before images were SHA-1 hashed.

However, all four of the images Gmaxwell uploaded have the same hash, phoiac9h4m842xq45sp7s6u21eteeq1. I don't know whether this is related, but action=purge doesn't seem to fix the problem as it did with the previous set of images.

(In reply to comment #1)

See also bug 17051. phoiac9h4m842xq45sp7s6u21eteeq1 is the hash of the empty string, which suggests some breakage in the SHA-1 hashing code. That breakage seems to be fixed, however: I just re-uploaded [[commons:File:5von10.jpg]] (which previously had the hash phoiac9h4m842xq45sp7s6u21eteeq1) over itself, and it now has the right hash, or at least a hash different from phoiac9h4m842xq45sp7s6u21eteeq1. There are also images (and old revisions of images) with img_sha1='' out there, which is probably a symptom of yet another former bug.
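
For anyone who wants to check the empty-string claim locally, here is a minimal Python sketch. It assumes the stored value is the SHA-1 digest written in base 36 with lowercase digits and zero-padded to 31 characters; the helper name sha1_base36 is made up for this example and is not MediaWiki code.

```python
import hashlib

# Digit alphabet for base 36: 0-9 followed by lowercase a-z (assumed encoding).
BASE36_DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def sha1_base36(data: bytes, width: int = 31) -> str:
    """Return the SHA-1 of `data` in base 36, left-padded with zeros to `width`."""
    # Interpret the 20-byte digest as one big-endian integer.
    n = int.from_bytes(hashlib.sha1(data).digest(), "big")
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(BASE36_DIGITS[rem])
    # Digits were collected least-significant first, so reverse before padding.
    return "".join(reversed(digits)).rjust(width, "0")


if __name__ == "__main__":
    # Hash of zero bytes of input, i.e. the "empty string" case from this report.
    print(sha1_base36(b""))
```

If the result for empty input matches the hash quoted above, a row carrying that value was most likely hashed from no data at all (for example, a file that could not be read at hashing time), rather than from real image content.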

Useful lists:
Last 5 revisions of [[commons:File:5von10.jpg]]: http://commons.wikimedia.org/w/api.php?action=query&titles=File:5von10.png&gdflimit=max&prop=imageinfo&iiprop=sha1|user|comment|timestamp&iilimit=5
List of files with the same hash as [[commons:File:Amphipodredkils.jpg]]: http://commons.wikimedia.org/w/api.php?action=query&titles=File:Amphipodredkils.jpg&generator=duplicatefiles&gdflimit=max&prop=imageinfo&iiprop=sha1
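
The two queries above can also be built programmatically. The following is only a sketch: the parameter names and values are copied from the URLs in this comment, while the function names are hypothetical. The gdflimit parameter is left out of the revisions query on the assumption that it only applies to the duplicatefiles generator.

```python
from urllib.parse import urlencode

# Endpoint taken from the URLs listed in this comment.
API = "http://commons.wikimedia.org/w/api.php"


def revisions_url(title: str, limit: int = 5) -> str:
    """URL listing SHA-1, user, comment and timestamp for a file's last revisions."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "imageinfo",
        "iiprop": "sha1|user|comment|timestamp",
        "iilimit": limit,
    }
    return API + "?" + urlencode(params)


def duplicates_url(title: str) -> str:
    """URL listing files that share a hash with `title`, with each file's SHA-1."""
    params = {
        "action": "query",
        "titles": title,
        "generator": "duplicatefiles",
        "gdflimit": "max",
        "prop": "imageinfo",
        "iiprop": "sha1",
    }
    return API + "?" + urlencode(params)


if __name__ == "__main__":
    print(revisions_url("File:5von10.jpg"))
    print(duplicates_url("File:Amphipodredkils.jpg"))
```

Note that urlencode percent-encodes ":" and "|", so the printed URLs look slightly different from the hand-written ones but carry the same parameters.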