
getimagesize on PHP before 5.3.3 doesn't work for images that have more than 10 padding bytes between segments
Closed, Resolved, Public

Description

Author: codrinb

Description:
The thumbnails are not generated for certain images. The dimensions of the images are shown as 0x0, and sometimes the EXIF data is not extracted.

See the details at:
http://commons.wikimedia.org/wiki/Commons:Help_desk#Thumbnail.2FEXIF_issues_with_uploaded_images

I suspect a bug, and I think someone should check the logs for the thumbnail generation scripts.


Version: unspecified
Severity: major
URL: http://commons.wikimedia.org/wiki/Commons:Help_desk#Thumbnail.2FEXIF_issues_with_uploaded_images

Details

Reference
bz31588

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:50 PM
bzimport set Reference to bz31588.
bzimport added a subscriber: Unknown Object (MLST).

[[File:Costesti Cetatuie Dacian Fortress 2011 - Tower House Two.jpg]] seems to still show up 0x0 after some action=purge'ing ... a local upload of the original file to trunk or 1.18wmf1 seems to pick up file info just fine.

Could be a live issue, such as something trying to do the metadata fetch or purge that doesn't actually have NFS access...

codrinb wrote:

I noticed that the EXIF information for the failed files is incomplete (Caption, Keywords and other fields missing), by comparison with the successful ones.
For example, compare http://commons.wikimedia.org/wiki/File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Tower_House_One_Close_Up-3.jpg (successful) with http://commons.wikimedia.org/wiki/File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Stairs_and_Drain.jpg (failed). But this might be another bug altogether.

codrinb wrote:

I noticed that the EXIF information for the failed files is incomplete
(Caption, Keywords and other fields missing), by comparison with the
successful ones.
For example, compare
http://commons.wikimedia.org/wiki/File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Tower_House_One_Close_Up-3.jpg
(successful) with http://commons.wikimedia.org/wiki/File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Stairs_and_Drain.jpg (failed).
But this might be another bug altogether.
[[commons:File:Costesti_Cetatuie_Dacian_Fortress_2011_-_Stairs_and_Drain.jpg]]

Hmm, worth double-checking the version of PHP that's running; might have a less stable older version of the exif module or something.

codrinb wrote:

Also, if you follow the original link of the posting ( http://commons.wikimedia.org/wiki/Commons:Help_desk#Thumbnail.2FEXIF_issues_with_uploaded_images ), people found a workaround by rotating the image back and forth. But this can't be a solution, as there are many files in this state.

codrinb wrote:

This file with a missing thumbnail doesn't even have the EXIF extracted: http://commons.wikimedia.org/wiki/File:Orastie_Ethnography_Museum_2011_-_Dacian_Mandrel_and_Spiral_Bracelet.JPG, while others have partially extracted EXIF data.

(In reply to comment #6)

This file with a missing thumbnail doesn't even have the EXIF extracted:
http://commons.wikimedia.org/wiki/File:Orastie_Ethnography_Museum_2011_-_Dacian_Mandrel_and_Spiral_Bracelet.JPG,
while others have partially extracted EXIF data

That image doesn't have Exif data because MediaWiki's JPEG metadata support doesn't skip padding bytes properly; I'll commit a fix for that some time later.

This is probably unrelated to the file not thumbnailing properly.
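
To illustrate the padding handling mentioned in this comment: in a JPEG file every segment starts with 0xFF followed by a marker byte, and any marker may be preceded by extra 0xFF fill bytes. The following is a minimal Python sketch of a segment reader that tolerates such padding. It is not MediaWiki's actual code; the names iter_segments and find_exif_payload are hypothetical and exist only for this example.

```python
import struct

# Markers that carry no length field: TEM, RST0-RST7, SOI, EOI.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))


def iter_segments(data: bytes):
    """Yield (marker, payload) for each segment up to and including SOS."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    pos = 2
    while pos < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected 0xFF at offset {pos}")
        # Skip the marker prefix and any number of 0xFF fill bytes.
        while pos < len(data) and data[pos] == 0xFF:
            pos += 1
        if pos >= len(data):
            return
        marker = data[pos]
        pos += 1
        if marker in STANDALONE:
            yield marker, b""
            if marker == 0xD9:  # EOI: end of image
                return
            continue
        if pos + 2 > len(data):
            raise ValueError("truncated segment header")
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        if length < 2 or pos + length > len(data):
            raise ValueError("invalid segment length")
        yield marker, data[pos + 2:pos + length]
        pos += length
        if marker == 0xDA:  # SOS: entropy-coded data follows, stop here
            return


def find_exif_payload(data: bytes):
    """Return the first APP1 payload that starts with the Exif header, or None."""
    for marker, payload in iter_segments(data):
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return payload
    return None
```

A reader that instead expects the marker byte immediately after a single 0xFF would misread the padding and could miss the APP1 segment entirely.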

(In reply to comment #7)

(In reply to comment #6)

This file with a missing thumbnail doesn't even have the EXIF extracted:
http://commons.wikimedia.org/wiki/File:Orastie_Ethnography_Museum_2011_-_Dacian_Mandrel_and_Spiral_Bracelet.JPG,
while others have partially extracted EXIF data

That image doesn't have Exif data because MediaWiki's JPEG metadata support
doesn't skip padding bytes properly; I'll commit a fix for that some time
later.

r99477 (I still highly doubt this has much to do with the bug being reported here though)

r99477 (I still highly doubt this has much to do with the bug being reported
here though)

Actually, several of the example images given here have strings of 0xFF padding bytes after the XMP data; perhaps there is some bug in PHP's getimagesize with files that do that.
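
One way to check a file for such padding is to count the 0xFF fill bytes in front of each marker. The sketch below is a rough Python illustration under stated assumptions: the function name fill_byte_runs is hypothetical, the threshold of 10 is taken from this task's title rather than from PHP's source, and the header is assumed to contain only length-prefixed segments before the scan data.

```python
import struct

PADDING_LIMIT = 10  # assumption: taken from the title of this task


def fill_byte_runs(data: bytes):
    """Return [(marker, fill_count)] for each marker before the scan data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    runs = []
    pos = 2
    while pos < len(data):
        start = pos
        while pos < len(data) and data[pos] == 0xFF:
            pos += 1
        if pos == start or pos >= len(data):
            raise ValueError(f"no marker found at offset {start}")
        marker = data[pos]
        # One 0xFF is the marker prefix itself; the rest is padding.
        runs.append((marker, pos - start - 1))
        pos += 1
        if marker == 0xD9:  # EOI: nothing follows
            break
        if pos + 2 > len(data):
            raise ValueError("truncated segment header")
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        if length < 2:
            raise ValueError("invalid segment length")
        pos += length  # jump over the length field and the payload
        if marker == 0xDA:  # SOS: entropy-coded data follows, stop here
            break
    return runs


def suspicious_markers(data: bytes):
    """Return the markers preceded by more than PADDING_LIMIT fill bytes."""
    return [marker for marker, count in fill_byte_runs(data) if count > PADDING_LIMIT]
```

Running suspicious_markers over the bytes of an uploaded file would give a quick hint as to whether it falls into the affected group, without decoding any image data.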

codrinb wrote:

Sounds like you are getting closed to the issue. All I can add is that the images were prepared (downloaded from camera, some rotated, keywords/location/caption added to EXIF) with Picasa 3.8 in Windows 7.

Also, per the initial thread, it seems like if you remove the EXIF Orientation tag from the horizontal pictures, using a tool like exiftool, the Thumbnail and full EXIF can be generated after the upload of the "corrected" image. This doesn't apply to vertical ones, in that they need to also be rotated.

So, in my mind, whatever PHP script tried to read the EXIF metadata upon upload and/or possibly tried to rotate the images, failed.

codrinb wrote:

Meant to say "closer to". Don't know how to re-edit a posted comment...

Sorry I took so long to look at this further.

This is an upstream bug in PHP ( https://bugs.php.net/bug.php?id=33210 ). It should be fixed in PHP 5.3.3 and later (Special:Version says we're currently at 5.3.2).

So essentially we need to upgrade PHP, or I suppose manually apply the relevant fix. However, since the number of images affected is very small (most people want their lossy-compression formats to give a small file size, not to fill it with random padding bytes), this is probably a very low-priority reason to upgrade.

By the way, in the meantime, re-saving the images with another program may "fix" them.
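
As an alternative to re-saving, the padding could in principle be removed without re-encoding the image. The following Python sketch copies the header segments while dropping extra 0xFF fill bytes, then copies the scan data unchanged. It is untested against the affected Commons files, so whether the result satisfies getimagesize on an unpatched PHP is an assumption; the function name strip_fill_bytes is hypothetical.

```python
import struct


def strip_fill_bytes(data: bytes) -> bytes:
    """Return a copy of a JPEG with fill bytes between header segments removed."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    out = bytearray(b"\xff\xd8")
    pos = 2
    while pos < len(data):
        start = pos
        # Consume the marker prefix together with any 0xFF padding.
        while pos < len(data) and data[pos] == 0xFF:
            pos += 1
        if pos == start or pos >= len(data):
            raise ValueError(f"no marker found at offset {start}")
        marker = data[pos]
        pos += 1
        if marker == 0xD9:  # EOI in the header: write it and stop
            out += b"\xff\xd9"
            break
        if pos + 2 > len(data):
            raise ValueError("truncated segment header")
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        if length < 2 or pos + length > len(data):
            raise ValueError("invalid segment length")
        # Write exactly one 0xFF prefix, the marker, and the untouched segment.
        out += b"\xff" + bytes([marker]) + data[pos:pos + length]
        pos += length
        if marker == 0xDA:  # SOS: copy the entropy-coded data verbatim
            out += data[pos:]
            break
    return bytes(out)
```

Because the segment payloads and the compressed data are copied byte for byte, this would not cause the generation loss that re-saving in an image editor does. Originals should still be kept until the output has been checked.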

codrinb wrote:

If you look at this category on Commons, it looks pretty bad, and it is impractical to re-upload each image. At least, I don't know of any tool to automate this:

http://commons.wikimedia.org/wiki/Category:Costesti_Cetatuie_Dacian_Fortress

Any update on the ETA for this?

Bumping the priority since bawolff interrupted my vacation (but, to be honest, I was on IRC)... hopefully someone sees this before I get back

Moving priority back down. Mark H, could you make sure an RT ticket is filed for this one?

(In reply to comment #15)

Moving priority back down. Mark H, could you make sure an RT ticket is filed
for this one?

https://rt.wikimedia.org/Ticket/Display.html?id=2330

For the WMF deployment part, "all application servers are now running 5.3.10-1ubuntu3.4+wmf1" hence Wikimedia servers are not affected anymore.
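
For anyone checking their own installation against the 5.3.3 threshold cited in this task, note that version strings must be compared numerically: as plain strings, "5.3.10" sorts before "5.3.3". Below is a minimal Python sketch of such a check. The helper names are hypothetical, and since distribution packages sometimes backport fixes, the result is only a heuristic.

```python
import re

FIXED_IN = (5, 3, 3)  # first PHP release with the fix, per this task


def php_version_tuple(version: str):
    """Extract the leading major.minor.patch numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def has_getimagesize_fix(version: str) -> bool:
    """Return True if the version is at or above the release carrying the fix."""
    return php_version_tuple(version) >= FIXED_IN
```

Any vendor suffix such as "-1ubuntu3.4+wmf1" is ignored, so only the upstream version number takes part in the comparison.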

(In reply to comment #12)

This is an upstream bug in PHP ( https://bugs.php.net/bug.php?id=33210 ). It
should be fixed in PHP 5.3.3 and later

http://www.mediawiki.org/wiki/Download says
"MediaWiki requires PHP 5.3.2+".

So a requirements bump of MediaWiki would fix the problem?

Removing "ops" keyword as there's nothing left to do for ops here.

I'm going to call this fixed.

*It no longer affects WMF servers
*It's an upstream issue, and upstream has fixed the issue

It's an obscure issue involving an odd feature of a file format, so I don't think we should bump our version requirements just for this issue. However, I don't see much point in keeping this bug open. I suggest that if anyone else encounters this issue, we just tell them to upgrade.

Gilles raised the priority of this task from Medium to Unbreak Now!. Dec 4 2014, 10:20 AM
Gilles added a project: Multimedia.
Gilles moved this task from Untriaged to Done on the Multimedia board.
Gilles lowered the priority of this task from Unbreak Now! to Medium. Dec 4 2014, 11:21 AM