
Unknown error: "stasherror" for >100 MB PDF file upload to Commons
Closed, Resolved, Public

Description

Author: shijualex

Description:
This bug is filed as per [[:commons:Help:Server-side_upload]].

Please upload the PDF file of a Public Domain document from this link (http://archive.org/details/englishmalayalam00tobirich) to Commons. It is around 130 MB.

Rename the file as "English_malayalam_sabdakosam_tobias_1907.pdf" while uploading.


Version: 1.22.0
Severity: major
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=36587

Details

Reference
bz51730

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:53 AM
bzimport set Reference to bz51730.

Please enable chunked uploads in your Preferences (Preferences => Uploads => Experimental features) and upload it yourself; no server-side upload is required for such small files.
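For context, chunked uploading splits the file into pieces that are posted to the API one at a time and assembled server-side in the upload stash. A rough sketch of that flow in Python (illustrative only; the parameters follow the action API's action=upload, but logging in and fetching the CSRF token are assumed to have happened already):

import os
import requests

API = "https://commons.wikimedia.org/w/api.php"

def upload_chunked(session, path, filename, csrf_token, chunk_size=4 * 1024 * 1024):
    # Post the file to the upload stash in 4 MiB chunks.
    filesize = os.path.getsize(path)
    offset = 0
    filekey = None
    with open(path, "rb") as f:
        while offset < filesize:
            chunk = f.read(chunk_size)
            data = {
                "action": "upload", "format": "json",
                "filename": filename, "filesize": filesize,
                "offset": offset, "stash": 1, "token": csrf_token,
            }
            if filekey:
                data["filekey"] = filekey  # continue the same stash entry
            resp = session.post(API, data=data, files={
                "chunk": (filename, chunk, "application/octet-stream"),
            }).json()["upload"]
            filekey = resp["filekey"]
            offset += len(chunk)
    # Publish the assembled stash entry under its final name.
    return session.post(API, data={
        "action": "upload", "format": "json",
        "filename": filename, "filekey": filekey, "token": csrf_token,
    }).json()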

shijualex wrote:

Actually, initially I tried that. But it is showing an error and the upload is not happening. Browser: Firefox 22.0. OS: Windows 7. That is why I logged a bug.

(In reply to comment #2)

But it is showing an error

If there is an error, see https://www.mediawiki.org/wiki/How_to_report_a_bug

Unknown error: "stasherror" is what I'm getting. Not especially helpful.

sreejithk2000 wrote:

Screenshot of error

This is the error I am getting and the message is not helpful.

Attached:

Capture.PNG (294×941 px, 12 KB)

I can confirm I also received this error. The error mentioned in comment 4 occurred when I tried uploading the file for the second time.

Moving to MediaWiki/Uploading - this looks to be an issue in the backend part of uploading.

shijualex wrote:

@Tomasz W. Kozlowski, could you please upload the file mentioned in comment 1 to Commons?

See comments 4 and 6: I tried and could not upload it.

shijualex wrote:

I am changing the priority of this bug to "high". The reason is that, due to this bug, as of today there is no way to upload files larger than 100 MB to Wikimedia Commons. This bug requires immediate attention from developers.

I can confirm that when I tried to upload it, it hung at the {"upload":{"result":"Poll","stage":"queued"}} stage.
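For anyone unfamiliar with that response: after the last chunk is posted, assembly can happen asynchronously, and the client is expected to poll action=upload with checkstatus=1 until the result stops being "Poll". A sketch of the polling loop (same illustrative assumptions as the snippet above):

import time

def wait_for_assembly(session, filekey, csrf_token, timeout=300, interval=5):
    # Poll until the stash entry leaves the job queue or we give up.
    deadline = time.time() + timeout
    while time.time() < deadline:
        resp = session.post(API, data={
            "action": "upload", "format": "json",
            "filekey": filekey, "checkstatus": 1, "token": csrf_token,
        }).json()["upload"]
        if resp["result"] != "Poll":
            return resp  # e.g. "Success", or an error such as "stasherror"
        # The bug reported here: the reply stays {"result": "Poll", "stage": "queued"}.
        time.sleep(interval)
    raise TimeoutError("upload still queued after %d seconds" % timeout)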

Testing locally, it seems a significant amount of time is spent extracting the file metadata/text layer when assembling chunks (roughly 18 seconds versus about 2 minutes).

I suspect this is what's causing problems in this specific case. (However, there's the larger issue of poor error reporting, and the fact that anything is timing out this quickly in the first place.)
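To get a feel for that cost outside MediaWiki: PDF metadata and the text layer come from external tools (the PdfHandler extension shells out to pdfinfo and pdftotext), so a quick hypothetical timing harness along these lines reproduces the measurement locally:

import subprocess
import time

def time_cmd(*cmd):
    # Run a command, discard its output, and report wall-clock seconds.
    start = time.time()
    subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    return time.time() - start

pdf = "English_malayalam_sabdakosam_tobias_1907.pdf"
print("pdfinfo:   %.1f s" % time_cmd("pdfinfo", pdf))
print("pdftotext: %.1f s" % time_cmd("pdftotext", pdf, "-"))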

Mark: Could you take a look at this?

OTOH, locally I'm also getting the following during the check-status phase:

{"upload":{"result":"Poll","stage":"queued"}}
Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 2 bytes) in /var/www/w/git/includes/normal/UtfNormal.php on line 295

Which is a giant WTF, since that type of request should not take much memory. However, that seems more likely to be something wrong with my local setup.
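For scale, 134217728 bytes is the stock 128 MiB PHP memory_limit, and string normalization needs both the input and the output in memory at once, so normalizing a text layer extracted from a 130 MB PDF could plausibly approach that limit. A toy Python illustration of the memory shape (unicodedata standing in for UtfNormal; not MediaWiki code):

import unicodedata

# Pretend text layer on the order of tens of MB, roughly what a large scanned PDF yields.
text_layer = "word " * (10 * 1024 * 1024)  # ~50 MB string
# The normalized copy is allocated while the original is still referenced,
# so peak memory is roughly double the input size.
normalized = unicodedata.normalize("NFC", text_layer)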

For reference, I was able to successfully upload the DjVu version: http://commons.wikimedia.org/wiki/File:English_malayalam_sabdakosam_tobias_1907.djvu, so I guess the problem is with the PDF version of the file.

Shiju Alex: Could you update the file info for that DjVu file as appropriate.

Note, just to be clear: I'm not suggesting that this bug is solved just because the DjVu file worked. Obviously there are still issues.

shijualex wrote:

For DjVu it worked because its size is around 89 MB (that is, less than 100 MB). So I guess this bug (not being able to upload files larger than 100 MB) is still valid.

Let me locate another PDF that is more than 100 MB to verify this.

(In reply to comment #17)

For DjVu it worked because its size is around 89 MB (that is, less than 100 MB). So I guess this bug (not being able to upload files larger than 100 MB) is still valid.

I suspect it has more to do with uploading a PDF > 100 MB. It may be a combination of the file format and the smaller size that made the DjVu file work. (Note that I still used chunked uploading to upload the DjVu file.)

Let me locate another PDF that is more than 100 MB to verify this.

More interesting would be a PDF file that is ~89 MB, to check whether the problem is something to do with our metadata extraction on PDF files.

More interesting would be a PDF file that is ~89 MB, to check whether the problem is something to do with our metadata extraction on PDF files.

I tried uploading [[commons:File:Eugene and Frederick Sutermeister 1906.pdf]] (85 MB) to test2.wikipedia.org using chunked uploading. It did not work. This further suggests it's related to our PDF handling.

(In reply to comment #19)

More interesting would be a PDF file that is ~89 MB, to check whether the problem is something to do with our metadata extraction on PDF files.

I tried uploading [[commons:File:Eugene and Frederick Sutermeister 1906.pdf]] (85 MB) to test2.wikipedia.org using chunked uploading. It did not work. This further suggests it's related to our PDF handling.

Err, never mind - Eugene and Frederick Sutermeister 1906-test.pdf did get uploaded to test2; I just didn't notice the success message. I still suspect the issue is related to PDFs.

Isn't this bug just the revival of bug 36587?

(In reply to comment #21)

Isn't this bug just the revival of bug 36587?

Probably related, but not precisely the same. It seems PDFs are much more likely to trigger it, due to a much more expensive metadata operation that could be optimized.

(In reply to comment #22)

(In reply to comment #21)

Isn't this bug just the revival of bug 36587?

Probably related, but not precisely the same. It seems PDFs are much more likely to trigger it, due to a much more expensive metadata operation that could be optimized.

If that's the issue, it looks similar to what happened with TIFF files, a problem which IIRC was resolved by this piece of config:

$wgTiffMaxMetaSize = 1048576;
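If oversized metadata is likewise the culprit for PDFs, the fix would have the same shape: cap the metadata size and skip the expensive handling past that point, rather than letting one file exhaust the request. A minimal sketch of that guard in Python (the name and the 1 MiB cap simply mirror the TIFF setting above; this is not an existing PdfHandler option):

MAX_META_SIZE = 1048576  # 1 MiB, mirroring $wgTiffMaxMetaSize

def capped_metadata(blob):
    # Refuse to keep metadata too large to parse safely; the file itself
    # still uploads, just without the rich metadata/text layer.
    if len(blob) > MAX_META_SIZE:
        return None
    return blob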
matmarex subscribed.

I was able to upload the PDF file to Commons today (I only uploaded it to the stash and didn't publish it, since the DjVu version is already uploaded).

pasted_file (943×1 px, 110 KB)