
Error "Cannot decode Ogg file: Invalid page at offset X"
Closed, Invalid, Public

Description

Since yesterday, I have been getting this error (along with "Error in transcode" statements) for files uploaded by my bot. Since I haven't changed anything on my end, I assume there has been a change in Ogg handling.

Some of the affected files:


Version: master
Severity: normal

Details

Reference
bz57048

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 2:39 AM
bzimport set Reference to bz57048.

A test case: several videos from the same source uploaded in a row, some of them displaying OK, others not:
OK:
https://commons.wikimedia.org/wiki/File:Expansion-of-the-Gateway-MultiSite-Recombination-Cloning-Toolkit-pone.0077724.s001.ogv

Not OK:
https://commons.wikimedia.org/wiki/File:Expansion-of-the-Gateway-MultiSite-Recombination-Cloning-Toolkit-pone.0077724.s002.ogv

It has been suggested (see
https://commons.wikimedia.org/w/index.php?title=User_talk%3AOpen_Access_Media_Importer_Bot&diff=109635590&oldid=109056863 ) that the problem may relate to some video files having frame sizes that are not 0 mod 4. In that case, it would likely be a bug on our end that just happened to surface now.
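One way to test the frame-size suggestion would be a small check like the one below. It is only a sketch: the function names are hypothetical, and the 16-pixel block size reflects the Theora convention that the encoded frame is a multiple of 16 in each dimension while the visible picture may be smaller. It is not taken from the bot or from MediaWiki.

```python
def is_aligned(width, height, modulus=4):
    """Return True if both dimensions are 0 mod `modulus`."""
    return width % modulus == 0 and height % modulus == 0


def padded_frame_size(width, height, block=16):
    """Round the picture size up to the next multiple of `block`.

    Assumption: the encoder pads the visible picture to a frame whose
    dimensions are multiples of 16, as Theora encoders do.
    """
    def round_up(n):
        # Ceiling division, then scale back up to a multiple of `block`.
        return -(-n // block) * block

    return round_up(width), round_up(height)


if __name__ == "__main__":
    # Example size only; substitute the dimensions of an affected file.
    w, h = 322, 242
    print(is_aligned(w, h), padded_frame_size(w, h))
```

Running this over the dimensions of the working and failing uploads would show whether the failures actually correlate with unaligned sizes.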

Information related to this bug is also tracked at
https://github.com/erlehmann/open-access-media-importer/issues/112 .

Another test case would be
http://v2v.cc/~j/theora_testsuite/322x242_not-divisible-by-sixteen-framesize.ogg
and the other files listed as "a decoder must play all these files without problems to comply with the theora specification" on
http://wiki.xiph.org/TheoraTestsuite (inaccessible to me, but visible via http://webcache.googleusercontent.com/search?q=cache:0vtwCYPjp6MJ:http://wiki.xiph.org/TheoraTestsuite%2Btheora+test+videos&safe=off&oe=UTF-8&hl=en&ct=clnk ). I have not uploaded the file to Commons due to unclear copyright.

So it looks like we can't extract metadata from the file (that's the "Cannot decode Ogg file" error). From there, I think it falls back to treating the file as audio. The transcode error output is:

'/usr/bin/avconv' -y -i '/tmp/localcopy_2ad9c8f12118-1.ogv' -vn -aq '1' -ar '44100' -ac '2' -acodec 'libvorbis' /tmp/transcode_ogg26f67012fce2-1.ogg

Exitcode: 1
Memory: 4194304

avconv version 0.8.9-4:0.8.9-0ubuntu0.12.04.1, Copyright (c) 2000-2013 the Libav developers

built on Nov  9 2013 19:08:00 with gcc 4.6.3

[theora @ 0x7048c0] 7 bits left in packet 82
Input #0, ogg, from '/tmp/localcopy_2ad9c8f12118-1.ogv':

Duration: 00:00:08.83, start: 0.000000, bitrate: 165 kb/s
  Stream #0.0: Video: theora, yuv420p, 352x288 [PAR 1:1 DAR 11:9], 30 tbr, 30 tbn, 30 tbc

Output #0, ogg, to '/tmp/transcode_ogg26f67012fce2-1.ogg':
Output file #0 does not contain any stream

Which is essentially saying: you asked to transcode only the audio (-vn stands for "no video"), but this file has no audio stream, only video.
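To illustrate the point, the stream kinds can be read off the "Input" section of output like the above before deciding whether an audio-only transcode makes sense. This is a rough sketch with hypothetical helper names; it only parses text in the format shown in this log and is not how the transcoding code actually decides.

```python
import re

# Matches lines such as "Stream #0.0: Video: theora, ..." and also
# "Stream #0:1(eng): Audio: vorbis, ..." as printed by avconv/ffmpeg.
STREAM_RE = re.compile(r"Stream #\d+[.:]\d+.*?: (Video|Audio|Subtitle|Data)\b")


def stream_types(avconv_stderr):
    """Return the kinds of streams listed under 'Input #...' sections."""
    kinds = []
    in_input = False
    for line in avconv_stderr.splitlines():
        if line.startswith("Input #"):
            in_input = True
        elif line.startswith("Output #"):
            in_input = False
        elif in_input:
            match = STREAM_RE.search(line)
            if match:
                kinds.append(match.group(1).lower())
    return kinds


def has_audio(avconv_stderr):
    """True if at least one input stream is reported as audio."""
    return "audio" in stream_types(avconv_stderr)
```

For the log above, `stream_types` would report a single video stream, so an invocation with -vn has nothing left to write.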

When I put this file through ogginfo (ogginfo is not what we use, but another program for getting info about ogg files), the output is:

Processing file "20131113132838!How-Do-Ants-Make-Sense-of-Gravity-A-Boltzmann-Walker-Analysis-of-Lasius-niger-Trajectories-on-pone.0076531.s002.ogv"...

New logical stream (#1, serial: 218cbe29): type theora
Theora headers parsed for stream 1, information follows...
Version: 3.2.1
Vendor: Xiph.Org libtheora 1.1 20090822 (Thusnelda)
Width: 1158
Height: 722
Total image: 1168 by 736, crop offset (0, 14)
Framerate 60/1 (60.00 fps)
Pixel aspect ratio 1:1 (1.000000:1)
Frame aspect 1.603878:1
Colourspace unspecified
Pixel format 4:2:0
Target bitrate: 0 kbps
Nominal quality setting (0-63): 48
User comments section follows...
title=
album=How Do Ants Make Sense of Gravity? A Boltzmann Walker Analysis of Lasius niger Trajectories on Various Inclines
artist=Khuong A, Lecheval V, Fournier R, Blanco S, Weitz S, Bezian J, Gautrais J
copyrights=Khuong et al
license=http://creativecommons.org/licenses/by/3.0/
description=Short recording of a typical tracking session. The program tracks the location of the ant within a 40×40 pixels square centered around its location in the previous frame. Filtering and thresholding the background color yields a binary representation of this area, with white pixels corresponding to the ant and background speckles. A partition algorithm detects the largest spot, from which the centroid is extracted (red dot).
date=2013

WARNING: Hole in data (6 bytes) found at approximate offset 4500 bytes. Corrupted Ogg.
WARNING: Hole in data (105 bytes) found at approximate offset 4500 bytes. Corrupted Ogg.
WARNING: Hole in data (425 bytes) found at approximate offset 4500 bytes. Corrupted Ogg.
WARNING: Hole in data (221 bytes) found at approximate offset 4500 bytes. Corrupted Ogg.
WARNING: Expected frame 21, got 27
WARNING: Expected frame 36, got 43
WARNING: Invalid header page, no packet found
WARNING: Invalid header page in stream 2, contains multiple packets
WARNING: illegally placed page(s) for logical stream 2
This indicates a corrupt Ogg file: Ogg muxing constraints violated, new stream before EOS of all previous streams.
New logical stream (#2, serial: 1eb6d2d0): type invalid
WARNING: stream start flag not set on stream 2
WARNING: sequence number gap in stream 2. Got page 5 when expecting page 0. Indicates missing data.
WARNING: sequence number gap in stream 1. Got page 6 when expecting page 5. Indicates missing data.
WARNING: discontinuity in stream (1)
WARNING: Expected frame 51, got 21
WARNING: granulepos in stream 1 decreases from 106 to 84
WARNING: Hole in data (684 bytes) found at approximate offset 130500 bytes. Corrupted Ogg.
[.. This warning about 100 times..]
WARNING: Invalid header page, no packet found
WARNING: Invalid header page in stream 3, contains multiple packets
WARNING: illegally placed page(s) for logical stream 3
This indicates a corrupt Ogg file: Ogg muxing constraints violated, new stream before EOS of all previous streams.
New logical stream (#3, serial: 401dbc3b): type invalid
WARNING: stream start flag not set on stream 3
WARNING: sequence number gap in stream 3. Got page 22 when expecting page 0. Indicates missing data.
WARNING: Hole in data (541 bytes) found at approximate offset 144000 bytes. Corrupted Ogg.
WARNING: Hole in data (207 bytes) found at approximate offset 1084500 bytes. Corrupted Ogg.
[... Hole in data warning about 300 more times ...]
WARNING: EOS not set on stream 1
Theora stream 1:
Total data length: 58046 bytes
Playback length: 0m:00.349s
Average bitrate: 1326.765714 kb/s
WARNING: EOS not set on stream 2

Which suggests that this file may simply be broken (?)
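For anyone who wants to check a file independently of ogginfo, a minimal page walker along these lines reports where the page structure first stops making sense. It is a sketch, not the code that produces the "Invalid page at offset X" message: the function name is hypothetical, it follows the standard Ogg page header layout (27-byte header starting with "OggS", followed by a segment table), and it does not verify page checksums.

```python
import struct

# Ogg page header: capture pattern, version, header type, granule position,
# stream serial number, page sequence number, CRC, number of segments.
HEADER = struct.Struct("<4sBBqIIIB")  # 27 bytes, little-endian, no padding


def first_invalid_page_offset(data):
    """Walk Ogg pages in `data` (bytes).

    Returns the byte offset at which a well-formed page could not be read,
    or None if the whole buffer is a clean sequence of pages.
    Note: CRCs are not checked, so some corruption will go unnoticed.
    """
    offset = 0
    while offset < len(data):
        if len(data) - offset < HEADER.size:
            return offset  # truncated header
        capture, version, _flags, _granule, _serial, _seq, _crc, nsegs = (
            HEADER.unpack_from(data, offset)
        )
        if capture != b"OggS" or version != 0:
            return offset  # not a page boundary
        table_start = offset + HEADER.size
        table = data[table_start:table_start + nsegs]
        if len(table) < nsegs:
            return offset  # truncated segment table
        end = table_start + nsegs + sum(table)  # body length = sum of lacing values
        if end > len(data):
            return offset  # truncated page body
        offset = end
    return None


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        print(first_invalid_page_offset(f.read()))
```

A clean result here would not prove the file is valid, but an early offset would support the idea that the data was damaged before upload.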

Also, when I tried to play the file, mplayer could only play about a second of it before exiting.

Thanks for checking.

We have done some further checks on our end, and it seems likely that there was a problem with our server, though we haven't yet tracked it down. For details, see
https://github.com/erlehmann/open-access-media-importer/issues/112#issuecomment-28667840 .

Comments 4-6, by two different people, imply that the test cases were broken. Hence closing as INVALID.