Page MenuHomePhabricator

Segfault when getting EXIF data of some tif images
Closed, ResolvedPublic

Description

reedy@ubuntu64-web-esxi:/var/www/wiki/mediawiki/core$ php maintenance/importImages.php --comment-ext=txt --user=Reedy /tmp/uploads --overwrite
Import Images

Pressbilder_till_Ars_11,_reprofotografi_av_bild_från_resealbum_-_Hallwylska_museet_-_87785.tif exists, overwriting...Segmentation fault
reedy@ubuntu64-web-esxi:/var/www/wiki/mediawiki/core$

Initial stuff from gdb

Import Images

Pressbilder_till_Ars_11,_reprofotografi_av_bild_från_resealbum_-_Hallwylska_museet_-_87785.tif exists, overwriting...
Program received signal SIGSEGV, Segmentation fault.
php_ifd_get16u (value=0xfffffffff9f44320, motorola_intel=0) at /build/buildd/php5-5.4.9/ext/exif/exif.c:1095
1095 /build/buildd/php5-5.4.9/ext/exif/exif.c: No such file or directory.


Version: 1.22.0
Severity: normal
See Also:
https://github.com/facebook/hiphop-php/issues/1154
https://rt.wikimedia.org/Ticket/Display.html?id=6453

Details

Reference
bz55541

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 2:21 AM
bzimport set Reference to bz55541.

Also noting it's an issue in multiple versions of php:

PHP 5.3.10-1ubuntu3.6+wmf1 with Suhosin-Patch (cli) (built: May 15 2013 23:19:18)
PHP 5.4.9-4ubuntu2.3 (cli) (built: Sep 4 2013 19:32:25)

https://github.com/php/php-src/blob/PHP-5.4.9/ext/exif/exif.c#L1095

(gdb) bt
#0 php_ifd_get16u (value=0xfffffffff9e7b278, motorola_intel=0) at /build/buildd/php5-5.4.9/ext/exif/exif.c:1095
#1 0x000000000049d5fe in exif_iif_add_value (image_info=0x7fffffff9d30, section_index=0, name=0x0, tag=-102256004, format=0, length=3,

value=0x49d7cf <exif_process_IFD_TAG+383>, motorola_intel=-102256008) at /build/buildd/php5-5.4.9/ext/exif/exif.c:1753

#2 0x000000000049d7cf in exif_process_IFD_TAG (ImageInfo=0x7fffffff9d30, dir_entry=0x0, offset_base=0x22fc <Address 0x22fc out of bounds>, IFDlength=13927872,

displacement=4192710974, section_index=764, ReadNextIFD=4846524, tag_table=0x65c2d4 <vspprintf+52>) at /build/buildd/php5-5.4.9/ext/exif/exif.c:1803

#3 0x000000000049f3bc in exif_process_IFD_in_TIFF (ImageInfo=0x7fffffff9d30, dir_offset=6771448, section_index=0) at /build/buildd/php5-5.4.9/ext/exif/exif.c:3703
#4 0x000000000049f863 in exif_read_file.constprop.13 (ImageInfo=0x7fffffff9d30,

FileName=0x7fffe89e3138 "/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_från_resealbum_-_Hallwylska_museet_-_87785.tif", read_thumbnail=0)
at /build/buildd/php5-5.4.9/ext/exif/exif.c:3797

#5 0x00000000004a01a9 in zif_exif_read_data (ht=-102256008, return_value=0x7fff00000000, return_value_ptr=0x778c04, this_ptr=0x1, return_value_used=23832424)

at /build/buildd/php5-5.4.9/ext/exif/exif.c:3959

#6 0x00007ffff4dab5de in xdebug_execute_internal (current_execute_data=0x7ffff7fa7928, return_value_used=1) at /build/buildd/xdebug-2.2.1/build-php5/xdebug.c:1483
#7 0x000000000075f17e in zend_do_fcall_common_helper_SPEC (execute_data=0x7ffff7fa7928) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:644
#8 0x0000000000718c17 in execute (op_array=0x16a5098) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:410
#9 0x00007ffff4dab168 in xdebug_execute (op_array=0x16a5098) at /build/buildd/xdebug-2.2.1/build-php5/xdebug.c:1391
#10 0x000000000075f499 in zend_do_fcall_common_helper_SPEC (execute_data=0x7ffff7fa74e0) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:669
#11 0x0000000000718c17 in execute (op_array=0x169e990) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:410
#12 0x00007ffff4dab168 in xdebug_execute (op_array=0x169e990) at /build/buildd/xdebug-2.2.1/build-php5/xdebug.c:1391
#13 0x000000000075f499 in zend_do_fcall_common_helper_SPEC (execute_data=0x7ffff7fa7190) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:669
#14 0x0000000000718c17 in execute (op_array=0x167dbb0) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:410
#15 0x00007ffff4dab168 in xdebug_execute (op_array=0x167dbb0) at /build/buildd/xdebug-2.2.1/build-php5/xdebug.c:1391
#16 0x000000000075f499 in zend_do_fcall_common_helper_SPEC (execute_data=0x7ffff7fa6628) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:669
#17 0x0000000000718c17 in execute (op_array=0x161d588) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:410
#18 0x00007ffff4dab168 in xdebug_execute (op_array=0x161d588) at /build/buildd/xdebug-2.2.1/build-php5/xdebug.c:1391
#19 0x000000000075f499 in zend_do_fcall_common_helper_SPEC (execute_data=0x7ffff7fa6498) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:669
#20 0x0000000000718c17 in execute (op_array=0x1621b78) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:410
#21 0x00007ffff4dab168 in xdebug_execute (op_array=0x1621b78) at /build/buildd/xdebug-2.2.1/build-php5/xdebug.c:1391
#22 0x000000000075f499 in zend_do_fcall_common_helper_SPEC (execute_data=0x7ffff7fa4060) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:669
#23 0x0000000000718c17 in execute (op_array=0x7ffff7fd78f8) at /build/buildd/php5-5.4.9/Zend/zend_vm_execute.h:410
#24 0x00007ffff4dab168 in xdebug_execute (op_array=0x7ffff7fd78f8) at /build/buildd/xdebug-2.2.1/build-php5/xdebug.c:1391
#25 0x00000000006b8a8c in zend_execute_scripts (type=-134383448, retval=0x300000008, file_count=32767) at /build/buildd/php5-5.4.9/Zend/zend.c:1309
#26 0x0000000000658733 in php_execute_script (primary_file=0x7fffffff9d30) at /build/buildd/php5-5.4.9/main/main.c:2482
#27 0x0000000000761943 in do_cli (argc=0, argv=0x7fffffffe874) at /build/buildd/php5-5.4.9/sapi/cli/php_cli.c:988
#28 0x000000000042c8d0 in main (argc=32767, argv=0xdfd210) at /build/buildd/php5-5.4.9/sapi/cli/php_cli.c:1364

root@ubuntu64-web-esxi:/tmp# gdb php
GNU gdb (GDB) 7.5.91.20130417-cvs-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/...
Reading symbols from /usr/bin/php5...Reading symbols from /usr/lib/debug/usr/bin/php5...^[[Adone.
done.
(gdb)
(gdb) directory /tmp/php5-5.4.9/
Source directories searched: /tmp/php5-5.4.9:$cdir:$cwd
(gdb) source /tmp/php5-5.4.9/.gdbinit
Redefine command "set_ts"? (y or n) [answered Y; input not from terminal]
Redefine command "____executor_globals"? (y or n) [answered Y; input not from terminal]
Redefine command "print_cvs"? (y or n) [answered Y; input not from terminal]
Redefine command "dump_bt"? (y or n) [answered Y; input not from terminal]
Redefine command "printzv"? (y or n) [answered Y; input not from terminal]
Redefine command "____printzv_contents"? (y or n) [answered Y; input not from terminal]
Redefine command "____printzv"? (y or n) [answered Y; input not from terminal]
Redefine command "____print_const_table"? (y or n) [answered Y; input not from terminal]
Redefine command "print_const_table"? (y or n) [answered Y; input not from terminal]
Redefine command "____print_ht"? (y or n) [answered Y; input not from terminal]
Redefine command "print_ht"? (y or n) [answered Y; input not from terminal]
Redefine command "print_htptr"? (y or n) [answered Y; input not from terminal]
Redefine command "print_htstr"? (y or n) [answered Y; input not from terminal]
Redefine command "____print_ft"? (y or n) [answered Y; input not from terminal]
Redefine command "print_ft"? (y or n) [answered Y; input not from terminal]
Redefine command "____print_inh_class"? (y or n) [answered Y; input not from terminal]
Redefine command "____print_inh_iface"? (y or n) [answered Y; input not from terminal]
Redefine command "print_inh"? (y or n) [answered Y; input not from terminal]
Redefine command "print_pi"? (y or n) [answered Y; input not from terminal]
Redefine command "____print_str"? (y or n) [answered Y; input not from terminal]
Redefine command "printzn"? (y or n) [answered Y; input not from terminal]
Redefine command "printzops"? (y or n) [answered Y; input not from terminal]
Redefine command "zbacktrace"? (y or n) [answered Y; input not from terminal]
Redefine command "zmemcheck"? (y or n) [answered Y; input not from terminal]
(gdb) run /var/www/wiki/mediawiki/core/maintenance/importImages.php --comment-ext=txt --user=Reedy /tmp/uploads --overwrite
Starting program: /usr/bin/php5 /var/www/wiki/mediawiki/core/maintenance/importImages.php --comment-ext=txt --user=Reedy /tmp/uploads --overwrite
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
warning: the debug information found in "/usr/lib/debug//usr/lib/php5/20100525/mysql.so" does not match "/usr/lib/php5/20100525/mysql.so" (CRC mismatch).

warning: the debug information found in "/usr/lib/debug/usr/lib/php5/20100525/mysql.so" does not match "/usr/lib/php5/20100525/mysql.so" (CRC mismatch).

warning: the debug information found in "/usr/lib/debug//usr/lib/php5/20100525/mysqli.so" does not match "/usr/lib/php5/20100525/mysqli.so" (CRC mismatch).

warning: the debug information found in "/usr/lib/debug/usr/lib/php5/20100525/mysqli.so" does not match "/usr/lib/php5/20100525/mysqli.so" (CRC mismatch).

warning: the debug information found in "/usr/lib/debug//usr/lib/php5/20100525/pdo_mysql.so" does not match "/usr/lib/php5/20100525/pdo_mysql.so" (CRC mismatch).

warning: the debug information found in "/usr/lib/debug/usr/lib/php5/20100525/pdo_mysql.so" does not match "/usr/lib/php5/20100525/pdo_mysql.so" (CRC mismatch).

[New Thread 0x7fffe975b700 (LWP 23078)]
[Thread 0x7fffe975b700 (LWP 23078) exited]
Import Images

Pressbilder_till_Ars_11,_reprofotografi_av_bild_från_resealbum_-_Hallwylska_museet_-_87785.tif exists, overwriting...
Program received signal SIGSEGV, Segmentation fault.
php_ifd_get16u (value=0xfffffffffa3fb318, motorola_intel=0) at /build/buildd/php5-5.4.9/ext/exif/exif.c:1095
1095 /build/buildd/php5-5.4.9/ext/exif/exif.c: No such file or directory.
(gdb) zbacktrace
[0x7ffff7fa7928] exif_read_data("/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif", "0", true) /var/www/wiki/mediawiki/core/includes/media/Exif.php:302
[0x7ffff7fa74e0] Exif->__construct("/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif", "LE") /var/www/wiki/mediawiki/core/includes/media/BitmapMetadataHandler.php:268
[0x7ffff7fa7190] BitmapMetadataHandler::Tiff("/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif") /var/www/wiki/mediawiki/core/includes/media/Tiff.php:82
[0x7ffff7fa6628] TiffHandler->getMetadata(object[0x1c2f7b8], "/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif") /var/www/wiki/mediawiki/core/includes/filebackend/FSFile.php:136
[0x7ffff7fa6498] FSFile->getProps(true) /var/www/wiki/mediawiki/core/includes/filebackend/FSFile.php:241
[0x7ffff7fa4060] FSFile::getPropsFromPath("/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif") /var/www/wiki/mediawiki/core/maintenance/importImages.php:231

Slightly different with PagedTiffHandler installed

php_ifd_get16u (value=0xfffffffffa152920, motorola_intel=0) at /build/buildd/php5-5.4.9/ext/exif/exif.c:1095
1095 /build/buildd/php5-5.4.9/ext/exif/exif.c: No such file or directory.
(gdb) zbacktrace
[0x7ffff7fa8d00] exif_read_data("/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif", "0", true) /var/www/wiki/mediawiki/core/includes/media/Exif.php:302
[0x7ffff7fa88b8] Exif->__construct("/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif", "LE") /var/www/wiki/mediawiki/core/includes/media/BitmapMetadataHandler.php:268
[0x7ffff7fa7428] BitmapMetadataHandler::Tiff("/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif") /var/www/wiki/mediawiki/extensions/PagedTiffHandler/PagedTiffHandler.image.php:174
[0x7ffff7fa7190] PagedTiffImage->retrieveMetaData() /var/www/wiki/mediawiki/extensions/PagedTiffHandler/PagedTiffHandler_body.php:514
[0x7ffff7fa6628] PagedTiffHandler->getMetadata(object[0x143bd50], "/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif") /var/www/wiki/mediawiki/core/includes/filebackend/FSFile.php:136
[0x7ffff7fa6498] FSFile->getProps(true) /var/www/wiki/mediawiki/core/includes/filebackend/FSFile.php:241
[0x7ffff7fa4060] FSFile::getPropsFromPath("/tmp/uploads/Pressbilder_till_Ars_11,_reprofotografi_av_bild_fr\37777777703\37777777645n_resealbum_-_Hallwylska_museet_-_87785.tif") /var/www/wiki/mediawiki/core/maintenance/importImages.php:231

Amusingly, it completes under valgrind

root@ubuntu64-web-esxi:/var/www/wiki/mediawiki/core# valgrind php maintenance/importImages.php --comment-ext=txt --user=Reedy /tmp/uploads

22771== Memcheck, a memory error detector

22771== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.

22771== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info

22771== Command: php maintenance/importImages.php --comment-ext=txt --user=Reedy /tmp/uploads

22771

Import Images

Importing Pressbilder_till_Ars_11,_reprofotografi_av_bild_från_resealbum_-_Hallwylska_museet_-_87785.tif...

done.

Found: 1
Added: 1

22771

22771== HEAP SUMMARY:

22771== in use at exit: 148,542 bytes in 1,646 blocks

22771== total heap usage: 421,034 allocs, 419,388 frees, 122,387,624 bytes allocated

22771

22771== LEAK SUMMARY:

22771== definitely lost: 410 bytes in 7 blocks

22771== indirectly lost: 3,640 bytes in 5 blocks

22771== possibly lost: 0 bytes in 0 blocks

22771== still reachable: 144,492 bytes in 1,634 blocks

22771== suppressed: 0 bytes in 0 blocks

22771== Rerun with --leak-check=full to see details of leaked memory

22771

22771== For counts of detected and suppressed errors, rerun with: -v

22771== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)

root@ubuntu64-web-esxi:/var/www/wiki/mediawiki/core#
root@ubuntu64-web-esxi:/var/www/wiki/mediawiki/core# php maintenance/importImages.php --comment-ext=txt --user=Reedy /tmp/uploads
Import Images

Pressbilder_till_Ars_11,_reprofotografi_av_bild_från_resealbum_-_Hallwylska_museet_-_87785.tif exists, skipping

Found: 1
Skipped: 1
root@ubuntu64-web-esxi:/var/www/wiki/mediawiki/core# php maintenance/importImages.php --comment-ext=txt --user=Reedy /tmp/uploads --overwrite
Import Images

Pressbilder_till_Ars_11,_reprofotografi_av_bild_från_resealbum_-_Hallwylska_museet_-_87785.tif exists, overwriting...Segmentation fault
root@ubuntu64-web-esxi:/var/www/wiki/mediawiki/core#

Very much outside MediaWiki. Need to (I will) file an upstream bug with PHP for this issue

<bd808> If you can't you should file a bug about it
<bd808> Or just stuff it in a github public repo with a test script to reproduce.
<bd808> I think $data = exif_read_data( $this->file, 0, true ); in your minimum test case if I understand the trace Faidon gave.

Can you give a link to the TIFF file so that I can reproduce this?

I have the file. It is an interesting bug. The pointer offset_base is deliberately assigned a value before the start of its associated buffer. It is a kind of virtual start-of-file pointer -- when you add a file offset to offset_base, you get a pointer to the memory that holds the file data at that offset.

The bug occurs when the IFD offset is larger than the heap address for the file buffer. This only happens when both the heap address is small (e.g. if it is allocated from brk()) and the IFD offset is large. In this case, offset_base wraps around past the start of the address space and becomes a large positive pointer. This is mostly harmless, except for when a tag value is located so far before the start of the IFD that its virtual pointer also wraps around to a large positive value. Then in exif_process_IFD_TAG():

if (byte_count > IFDlength || offset_val > IFDlength-byte_count || value_ptr < dir_entry) {

The "value_ptr < dir_entry" condition should be true, because the value is before the IFD in the file, but because value_ptr has wrapped around and dir_entry hasn't, it is false. So this code incorrectly assumes that value_ptr is inside the already-loaded buffer.

Possibly needs reporting in hhvm as well, as most of the exif code is lifted straight from zend.

(In reply to comment #10)

Possibly needs reporting in hhvm as well, as most of the exif code is lifted
straight from zend.

Yep:

HipHop Warning: exif_process_IFD_in_TIFF/7123: Buffer overrun (0x16114a00c, 0x1179e63c6, 2, 0) in /www/test.php on line 2
HipHop Warning: Invalid TIFF file in /www/test.php on line 2

Reported with hhvm too: https://github.com/facebook/hiphop-php/issues/1154

Tim, do you know which are the erroneous values that the code is falling over? Just wondering if there's anything we can do to modify those values so we can import the images into Commons

(In reply to comment #12)

Tim, do you know which are the erroneous values that the code is falling
over?
Just wondering if there's anything we can do to modify those values so we can
import the images into Commons

Maybe a null conversion in ImageMagick would fix it. Or something like http://paste.tstarling.com/p/IKYsdr.html would probably work as a temporary patch.

*** Bug 57282 has been marked as a duplicate of this bug. ***

This is still causing issues on production, ori is cleaning again.

We should Puppetize a cron job to clean up /tmp ASAP.

In case of emergency, roots can ssh to palladium and run

salt 'mw*' cmd.run 'find /tmp -name \*tif -mmin +60  -exec rm {} \; '

(In reply to comment #16)

We should Puppetize a cron job to clean up /tmp ASAP.

In case of emergency, roots can ssh to palladium and run

salt 'mw*' cmd.run 'find /tmp -name \*tif -mmin +60  -exec rm {} \; '

There is/should already be one to clean up thumbnail stuff.... Maybe tif isn't included...

(In reply to comment #16)

We should Puppetize a cron job to clean up /tmp ASAP.

Done in https://gerrit.wikimedia.org/r/#/c/96915/. A cron job will run on the app servers 34 minutes past the hour every hour, purging *.tif files from /tmp that haven't been touched in an hour or longer. This should buy us some breathing room. The cron job should be removed once this bug is fixed.

(In reply to comment #17)

There is/should already be one to clean up thumbnail stuff.... Maybe tif
isn't included...

You're probably thinking of the 'removetmpfiles' cron job in manifests/imagescaler.pp. It only runs on image scalers.

So now that we have a workaround thansk to Ori, is anybody investigating the potential root cause?

(In reply to comment #20)

So now that we have a workaround thansk to Ori, is anybody investigating the
potential root cause?

Cleaning up temp files is not a workaround for a PHP segfault. The root cause is in comment #8 and the fix is in comment #13.

The root cause is in comment #8 and the fix is in comment #13.

Request to apply patch and repackage tracked in RT 6453; Alexandros agreed to take it on.

(In reply to comment #22)

The root cause is in comment #8 and the fix is in comment #13.

Request to apply patch and repackage tracked in RT #6453; Alexandros agreed
to take it on.

Alex rebuilt PHP with Tim's patch, packaged it, and deployed it to the cluster (thank you!). I removed the cron job kludge from Puppet.

Gilles raised the priority of this task from High to Unbreak Now!.Dec 4 2014, 10:11 AM
Gilles moved this task from Untriaged to Done on the Multimedia board.
Gilles lowered the priority of this task from Unbreak Now! to High.Dec 4 2014, 11:22 AM