
PDF containing jpx/jp2 encoded images from Archive.org uploaded to Commons won't thumbnail
Closed, Resolved (Public)

Description

I uploaded to the Wikimedia Commons the PDF "File:John Stuart Mill, Considerations on Representative Government (1st ed, 1861).pdf" (https://commons.wikimedia.org/wiki/File:John_Stuart_Mill,_Considerations_on_Representative_Government_(1st_ed,_1861).pdf) which I had downloaded from Archive.org, but none of the pages will thumbnail properly. I tried purging the page to no avail.

I reported this problem at the Commons Village Pump, and another editor said that when he tried to view the thumbnail at https://upload.wikimedia.org/wikipedia/commons/thumb/7/70/John_Stuart_Mill%2C_Considerations_on_Representative_Government_%281st_ed%2C_1861%29.pdf/page1-76px-John_Stuart_Mill%2C_Considerations_on_Representative_Government_%281st_ed%2C_1861%29.pdf.jpg he encountered this error message:

"Error creating thumbnail: convert: no decode delegate for this image format `/tmp/magick-hg7YMuoz' @ error/constitute.c/ReadImage/532."
"convert: missing an image filename `/tmp/transform_b1c9d0271ec9-1.jpg' @ error/convert.c/ConvertImageCommand/3011."


Version: wmf-deployment
Severity: normal

Details

Reference
bz59975

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 2:40 AM
bzimport set Reference to bz59975.
bzimport added a subscriber: Unknown Object (MLST).

That PDF contains jpx/jp2 encoded images. Presumably ImageMagick's convert lacks a JPEG 2000 decoder.
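One rough way to check whether a PDF carries JPEG 2000 image streams is to look for the /JPXDecode filter name in the raw file bytes. The sketch below is only a heuristic with hypothetical helper names (count_jpx_filters, pdf_uses_jpx); it does not parse the PDF, so it could miss encrypted files or names written with unusual escaping.

```python
from pathlib import Path

# /JPXDecode is the PDF stream filter name used for JPEG 2000 image data.
JPX_FILTER = b"/JPXDecode"


def count_jpx_filters(pdf_bytes: bytes) -> int:
    """Count literal occurrences of the /JPXDecode filter name."""
    return pdf_bytes.count(JPX_FILTER)


def pdf_uses_jpx(path: str) -> bool:
    """Return True if the file at `path` mentions /JPXDecode anywhere."""
    return count_jpx_filters(Path(path).read_bytes()) > 0
```

A non-zero count suggests that whichever program renders the pages needs a working JPEG 2000 decoder.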

Or ImageMagick lacks the "advanced" JPEG2000 stuff. I see that it is listed at https://en.wikipedia.org/wiki/JPEG_2000#Application_support as having only "basic" JPEG2000 support. The OpenJPEG library is listed as having the advanced stuff, but support for it was added to ImageMagick just a few days ago: http://www.imagemagick.org/script/changelog.php (2013-12-30).

(In reply to comment #2)

Or ImageMagick lacks the "advanced" JPEG2000 stuff. I see that is listed at https://en.wikipedia.org/wiki/JPEG_2000#Application_support as having only "basic" JPEG2000 support. The OpenJPEG library is listed as having the advanced stuff, but that was added to ImageMagick just a few days ago: http://www.imagemagick.org/script/changelog.php (2013-12-30).

It's probably the fault of the old-ish version of Ghostscript we use. gs converts the page to a JPEG file first, and then we use ImageMagick to resize it. gs would be the program responsible for interpreting the JPEG2000 stuff. The error message would mention convert, because convert would choke on the lack of input when gs errors out. (And there is no Ghostscript error output because gs doesn't have its stderr redirected, only convert does, which is probably a mistake.)
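The two-stage flow described here (gs renders one page to JPEG, then convert resizes it) can be sketched roughly as follows. This is an illustration under assumptions, not the code actually running on the servers: the function names, the 150 dpi default and the exact option set are made up for the example. Capturing stderr per stage makes it visible which program failed, instead of only seeing convert complain about empty input.

```python
import subprocess


def build_gs_command(pdf_path: str, page: int, dpi: int = 150) -> list:
    """Ghostscript command that renders a single page as JPEG to stdout."""
    return [
        "gs",
        "-sDEVICE=jpeg",
        "-sOutputFile=-",        # write the image to stdout
        f"-dFirstPage={page}",
        f"-dLastPage={page}",
        f"-r{dpi}",
        "-dBATCH",
        "-dNOPAUSE",
        "-q",                    # keep the banner out of the image data on stdout
        pdf_path,
    ]


def build_convert_command(width: int, dst_path: str) -> list:
    """ImageMagick command that reads an image from stdin and resizes it."""
    return ["convert", "-", "-resize", f"{width}x", dst_path]


def render_thumbnail(pdf_path: str, page: int, width: int, dst_path: str) -> None:
    """Run both stages and report which one failed, with its own stderr."""
    gs = subprocess.run(build_gs_command(pdf_path, page), capture_output=True)
    if gs.returncode != 0 or not gs.stdout:
        msg = gs.stderr.decode(errors="replace")
        raise RuntimeError(f"gs produced no image (exit {gs.returncode}): {msg}")
    conv = subprocess.run(
        build_convert_command(width, dst_path), input=gs.stdout, capture_output=True
    )
    if conv.returncode != 0:
        msg = conv.stderr.decode(errors="replace")
        raise RuntimeError(f"convert failed (exit {conv.returncode}): {msg}")
```

With this shape, an empty gs result is reported as a gs problem rather than surfacing later as a "no decode delegate" message from convert.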

(And no ghostscript error output as gs doesn't have its stderr
redirected, only convert does, which is probably a mistake)

Making that side issue into bug 59986.

Locally, gs complains about invalid jpx blocks, but then ignores them and renders the image.

My local (fairly old) gs version is 8.71.

(And no ghostscript error output as gs doesn't have its stderr
redirected, only convert does, which is probably a mistake)

I misread the source code: gs errors should have been redirected, which implies that gs is exiting with no output and no errors, which is odd.

Maybe someone could run gs without the -q option to see if that makes a difference (but on my install, -q doesn't affect stderr output...). Also, does anyone know the version of gs used on the servers?
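A small sketch of that experiment: render one page with and without -q and keep the exit code, stdout and stderr of each run side by side. The helper names and the output path argument are hypothetical; the image goes to a file here, so that the banner gs prints to stdout without -q does not get mixed into the image data.

```python
import subprocess
import sys


def run_and_capture(cmd: list) -> tuple:
    """Run a command and return (exit code, stdout bytes, stderr bytes)."""
    proc = subprocess.run(cmd, capture_output=True)
    return proc.returncode, proc.stdout, proc.stderr


def gs_diagnostic_command(pdf_path: str, page: int, out_path: str, quiet: bool) -> list:
    """Ghostscript command rendering one page to a file, with or without -q."""
    cmd = [
        "gs",
        "-sDEVICE=jpeg",
        f"-sOutputFile={out_path}",
        f"-dFirstPage={page}",
        f"-dLastPage={page}",
        "-dBATCH",
        "-dNOPAUSE",
    ]
    if quiet:
        cmd.append("-q")
    cmd.append(pdf_path)
    return cmd


def compare_quiet(pdf_path: str, page: int, out_path: str) -> dict:
    """Map quiet=True/False to the captured result of the matching gs run."""
    return {
        quiet: run_and_capture(gs_diagnostic_command(pdf_path, page, out_path, quiet))
        for quiet in (True, False)
    }


def gs_version() -> str:
    """Ask the installed gs for its version string."""
    _, out, _ = run_and_capture(["gs", "--version"])
    return out.decode().strip()
```

Calling compare_quiet on the failing PDF and comparing the two stderr values would show whether -q hides anything, and gs_version answers the version question on whichever host it is run.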

Another example: [[:commons:File:Geneva Convention 1864 - CH-BAR - 29355687.pdf]]

It also contains jp2-encoded page images and won't thumbnail. The file is nearly 100 MB and contains 8 pages.

fgiunchedi subscribed.

Looks like thumbs for that PDF are still failing to generate. However, IIRC the jobrunners aren't on HHVM (and thus Trusty) yet. Is this fixed in newer upstream versions?

Looks like thumbs for that PDF are still failing to generate. However, IIRC the jobrunners aren't on HHVM (and thus Trusty) yet. Is this fixed in newer upstream versions?

The image scalers haven't been updated yet, and they are the part that needs to be updated for the new version to be used. See T84842. However, I do not know whether the updated version will fix this bug.

matmarex claimed this task.
matmarex subscribed.

The file appears to be thumbnailing correctly today.