Page MenuHomePhabricator

Uploads sometimes are incomplete: file is there, but image page is missing
Open, HighPublic

Description

We've had at the Commons recently several cases where uploads were not done completely. The images showed up, the file history showed even the full upload summary, but the corresponding image description pages were not created. As a result, if the user used the license dropdown to specify the license, that license was lost completely (because it doesn't show up in the upload summary).

An example is http://commons.wikimedia.org/wiki/Image:Palencia_-_San_Pablo_15.JPG the image page was created manually on the day after the upload, see the page history at http://commons.wikimedia.org/w/index.php?title=Image:Palencia_-_San_Pablo_15.JPG&action=history . The uploader was notified and did add the license after that, so all is well for this image now. But there are other such images where the licenses were lost, and where the uploaders did not react.

I've observed this behavior twice so far, once on July 31 from about 10:30 to 11:50 UTC, and now on August 31 from 17:19 to 17:39 UTC.

See also http://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard/Archive_10#Server_hiccups and http://commons.wikimedia.org/w/index.php?title=Commons:Administrators%27_noticeboard/Attention&oldid=14027516#Problems_with_upload


Version: unspecified
Severity: normal
See Also:
T63898 (restricted task)
T34551: Descriptionless files (Missing page_latest referential integrity issue)

Details

Reference
bz15430

Event Timeline

bzimport raised the priority of this task from to High.Nov 21 2014, 10:16 PM
bzimport set Reference to bz15430.
bzimport added a subscriber: Unknown Object (MLST).

Assigning to Bryan as he rewrote the image handling code recently.

Bryan.TongMinh wrote:

The new code is not active yet on Wikimedia.

(In reply to comment #2)

The new code is not active yet on Wikimedia.

Hmm. In that case, assigned to you because you know a lot about that part of the code ;)

Seriously, though, it would probably be worth investigating/trying whether your rewrite fixes this.

Here's another example where it's even more evident that there was no image
page:
http://commons.wikimedia.org/wiki/Image:Basketball_29082008666_(5).jpg

The only edit to the image page was my nominating it for deletion:
http://commons.wikimedia.org/w/index.php?title=Image:Basketball_29082008666_(5).jpg&action=history

http://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard/Archive_10#Server_hiccups mentions "a bunch of wiki has a problem messages around that time"
I think this may be the same problem reported on #wikimedia-tech, with some apache trying to write to the old master (now read only) on external storage.
I recommend adding some check to the upload code for the case where the description can't be saved.

IRC reports were as follows (UTC time):
16:15:56 <[X]> The last attempted database query was: (SQL query hidden) from within function "ExternalStoreDB::store". MySQL returned error "1290: The MySQL server is running with the --read-only option so it cannot execute this statement (10.0.2.102)".
17:18:30 <Mike_lifeguard> ...within function "ExternalStoreDB::store". MySQL returned error "1290: The MySQL server is running with the --read-only option so it cannot execute this statement (10.0.2.107)".

Bryan.TongMinh wrote:

Apparently LocalFile::recordUpload2 continues unconditionally after Article::doEdit without any error checking whatsoever. Should be fixed.

(In reply to comment #5)

http://commons.wikimedia.org/wiki/Commons:Administrators%27_noticeboard/Archive_10#Server_hiccups
mentions "a bunch of wiki has a problem messages around that time"
I think this may be the same problem reported on #wikimedia-tech, with some
apache trying to write to the old master (now read only) on external storage.
I recommend adding some check to the upload code for the case where the
description can't be saved.

IRC reports were as follows (UTC time):
16:15:56 <[X]> The last attempted database query was: (SQL query hidden) from
within function "ExternalStoreDB::store". MySQL returned error "1290: The MySQL
server is running with the --read-only option so it cannot execute this
statement (10.0.2.102)".
17:18:30 <Mike_lifeguard> ...within function "ExternalStoreDB::store". MySQL
returned error "1290: The MySQL server is running with the --read-only option
so it cannot execute this statement (10.0.2.107)".

Related to bug 15391?

mike.lifeguard+bugs wrote:

(In reply to comment #5)

IRC reports were as follows (UTC time):
17:18:30 <Mike_lifeguard> ...within function "ExternalStoreDB::store". MySQL
returned error "1290: The MySQL server is running with the --read-only option
so it cannot execute this statement (10.0.2.107)".

That one was attempting to save either [[m:Talk:Spam blacklist]] or [[m:Spam blacklist]] (forget which one) not uploading an image. Not sure whether that's relevant or not though.

(In reply to comment #6)

Apparently LocalFile::recordUpload2 continues unconditionally after
Article::doEdit without any error checking whatsoever. Should be fixed.

Yeah, looks like bad transaction wrapping there.

Still, LB should rollback the last transaction if a db failure comes up.

Two more of these, uploaded 2008-09-12:

http://commons.wikimedia.org/wiki/Image:Fibersensing.JPG

http://commons.wikimedia.org/wiki/Image:Audolici.JPG

Any progress on this issue? Since we do lose licensing information when this occurs, this is a pretty serious failure, even if I haven't seen it anymore recently. Has this been fixed or not?

Bryan.TongMinh wrote:

Patch

This patch checks the result of Article::doEdit.

I am not sure whether I did the transaction stuff correctly. Somebody review.

attachment patch.txt ignored as obsolete

Bryan.TongMinh wrote:

Fixed in r45058.

Reverted in r45230

Article::doEdit() opens and commits transactions of its own, so the transaction we started earlier will already be committed and cannot be rolled back at this point unless one of two particular failure modes hits (ArticleSave hook aborts or mLatest isn't set). A more general fix needs to be devised...

Bryan.TongMinh wrote:

Updated patch

Patch that swaps the order around: first the article is editted and if that succeeded the upload is recorded.

Can't test it myself atm because my PHP uploading is broken.

attachment patch.txt ignored as obsolete

This seems like it would be more likely to break, and in a less recoverable fashion.

Current order of operations appears to be:

  1. Physically move in the file
  2. Save 'image'/'oldimage' record
  3. Save log entry
  4. Add/update description page

Failing at the page save at least should leave us with a working file record, history, and log update, unless we hit a rare case where the actual image record update gets rolled back [if the log and page saves don't start any transactions, but then fail].

Order of operations for the patch is:

  1. Physically move in the file
  2. Add/update description page
  3. Save 'image'/'oldimage' record
  4. Save log entry

If we fail at 2) now, we have a file sitting around with no record. If this is an update to an existing file, we have a broken history and an untraceable, unexplained, unlogged change -- very bad.

Either we need to accept that failing the page is ok, or we need to be able to roll back the change to the actual file -- or else do it after, and be able to roll back all our DB and page updates if the file move fails. Icky. :(

Bryan.TongMinh wrote:

(In reply to comment #16)

Order of operations for the patch is:

  1. Physically move in the file
  2. Add/update description page
  3. Save 'image'/'oldimage' record
  4. Save log entry

If we fail at 2) now, we have a file sitting around with no record. If this is
an update to an existing file, we have a broken history and an untraceable,
unexplained, unlogged change -- very bad.

In case this is an existing file, we never reach Article::doEdit, so it can never fail at that.

Either we need to accept that failing the page is ok, or we need to be able to
roll back the change to the actual file -- or else do it after, and be able to
roll back all our DB and page updates if the file move fails. Icky. :(

What I think is that we should accept that LocalFile::recordUpload2 does not always succeed. In case it doesn't it should be guaranteed that it did not make any changes and leave it to the caller to clean up the file.

The current situation is that in case it fails, it may or may not leave a log entry depending on what caused the failure. This seems to me less desirable than a clear guarantee that no log entry has been left.

(In reply to comment #17)

In case this is an existing file, we never reach Article::doEdit, so it can
never fail at that.

Mmm, true. That should simplify things a bit at least. :)

Either we need to accept that failing the page is ok, or we need to be able to
roll back the change to the actual file -- or else do it after, and be able to
roll back all our DB and page updates if the file move fails. Icky. :(

What I think is that we should accept that LocalFile::recordUpload2 does not
always succeed. In case it doesn't it should be guaranteed that it did not make
any changes and leave it to the caller to clean up the file.

The current situation is that in case it fails, it may or may not leave a log
entry depending on what caused the failure. This seems to me less desirable
than a clear guarantee that no log entry has been left.

We might want to consider what happens if various parts fail. Eg, if the edit succeeds, but the log entry fails, then we have to delete the page, the image record, and the file?

See

http://commons.wikimedia.org/wiki/User:Ilmari_Karonen/Zombie_images

for a list of more such images without description page (and, unfortunately, without licensing info).

http://commons.wikimedia.org/wiki/File:Arepas_cooked_with_firewood.jpg file and image entry is there but no image page or log entry.
It was uploaded at today's swapdeath of the servers.

And another one:

http://commons.wikimedia.org/wiki/File:1000_Minin.jpg

Uploaded 2010-05-17, image page was created only when the file was tagged as "no permission".

Bryan.TongMinh wrote:

Delete image row on Article::doEdit failure

Cooked up something simpler: since we know that the only thing that we did before editing the article is adding a row to the image table, we can safely delete this row.

Attached:

Bryan.TongMinh wrote:

(In reply to comment #24)

Created attachment 8136 [details]
Delete image row on Article::doEdit failure

*sigh*

The condition before deleting should obviously be ( !$reupload && !status->isOk() )

Attached:

Looks acceptable, given that we're not going to get consensus on doing the transaction bigger, which is what really should be done, due to fear of locks. I would perform an explicit commit before doEdit(), though.

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

I'm not sure if that belongs into this bug report: https://en.wikipedia.org/wiki/File:Aljazeera.svg was deleted and restored, but while restoring something went wrong - the image was available but not the description page. An user in Commons was aware of this bug and told the workaround is to delete and restore the image again, see log at https://en.wikipedia.org/w/index.php?title=Special:Log&page=File%3AAljazeera.svg

  • Bug 34228 has been marked as a duplicate of this bug. ***
  • Bug 37394 has been marked as a duplicate of this bug. ***

Bryan.TongMinh wrote:

(In reply to comment #24)

Created attachment 8136 [details]
Delete image row on Article::doEdit failure

Cooked up something simpler: since we know that the only thing that we did
before editing the article is adding a row to the image table, we can safely
delete this row.

Getting back to this again, my original reasoning was wrong. There are four possible cases:

  1. File does not exist, page does not exist
  2. File does not exist, page does exist
  3. File does exist, page does not exist
  4. File does exist, page does exist

In cases 3 & 4 we move the image row to oldimage, which is a cumbersome operation to recover. In case 1 & 3 we call Article::doEdit(), which potentially fails. If Article::doEdit() fails, and we are in case 3, this patch leaves a dangling oldimage row with no image row.

Attached:

Created attachment 14083
Screenshot showing the issue. Uploader of the photo this is a derivative of is https://commons.wikimedia.org/wiki/User:Red_Castle - License: unknown

This happened again at https://commons.wikimedia.org/wiki/File:D%C3%A9l%C3%A9gation_g%C3%A9n%C3%A9rale_du_Qu%C3%A9bec_%C3%A0_Londres_2013.JPG and
https://commons.wikimedia.org/wiki/File:Montreal_Place_Londres_2013.JPG

Timeframe: 2013-12-13 01:54:00 - 2013-12-13 01:54:00

Their uploads are also not listed under their contributions: https://commons.wikimedia.org/wiki/Special:Contributions/Red_Castle

(change visibility) 01:46, 13 December 2013 (diff | hist) . . (+1,221)‎ . . N User:Red Castle ‎ (←Created page with 'Bonjour, je s...') (current)
(change visibility) 01:45, 13 December 2013 (diff | hist) . . (0)‎ . . User:Dr Wilson ‎ (→‎Licence) (current) [rollback: 1 edit]
(change visibility) 06:35, 8 November 2013 (diff | hist) . . (+389)‎ . . User talk:Harfang ‎ (→‎Drapeau de l'Acadie: new section) (current) [rollback: 1 edit]

Another user was also affected:
https://commons.wikimedia.org/wiki/File:Ghost_creepy_2013-12-12_17-52.jpg

Attached:

commons_revision_missing_not_in_user_language.png (1×1 px, 280 KB)

https://commons.wikimedia.org/wiki/Commons:Village_pump#Upload_wizard_error

I got a weird error when I uploaded [[File:Facebook users by age population pyramid.png]]. When I try to tag it for deletion, I get "edit conflict". Can someone delete it? The correct file is at [[File:P opulation pyramid of Facebook users by age.png]]. --Dennis Bratland (talk) 03:29, 16 December 2013 (UTC)


The revision #0 of the page named "Facebook users by age population pyramid.png" does not exist.

Thank you all for bringing up this issue, which has gone on too long.

This issue will require major re-architecture work. We plan to revisit it in coming weeks as part of next quarter's improvements to the upload pipeline by the multimedia team. We will post here again once we have discussed possible solutions as a team in early January.

Thanks for your patience, and sorry we haven't been able to address this sooner.

(In reply to Fabrice Florin from comment #36)

Thank you all for bringing up this issue, which has gone on too long.

This issue will require major re-architecture work. We plan to revisit it in
coming weeks as part of next quarter's improvements to the upload pipeline
by the multimedia team. We will post here again once we have discussed
possible solutions as a team in early January.

Fabrice: Any news?

Unassigning, I have been working exclusively on MediaViewer for the last two months, and that probably won't change for a while.

Liuxinyu970226 updated the task description. (Show Details)
Liuxinyu970226 updated the task description. (Show Details)