
increase of internal_api_error_DBQueryError, internal_api_error_FileBackendError at Wikimedia Commons
Closed, Resolved · Public

Details

Reference
bz53768

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:50 AM
bzimport set Reference to bz53768.
bzimport added a subscriber: Unknown Object (MLST).

"This happened with about a dozen files. I had to do a lot of refreshing and delete button clicking to finally get them to delete. It only happened for a short time, and then I was able to delete 50+ further images with no issues."

We would be glad if you could give us (the community) a notice in case there are serious hiccups. I know there are ganglia [1] and other monitors but I, for example, do not understand how it could have a Current Load Avg of 108% (>100%). Does it mean that 8% of the operations fail?

[1] https://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&s=by+name&c=Swift%2520pmtpa&tab=m&vn=

Created attachment 13245
exceptions on wikimedia server

The autoreport https://commons.wikimedia.org/wiki/MediaWiki_talk:Gadget-AjaxQuickDelete.js/auto-errors/5#Autoreport_by_AjaxQuickDelete_1059957175808
shows:

API request failed (internal_api_error_DBQueryError): Database query error at Wed, 04 Sep 2013 17:10:14 GMT served by mw1205 ++++
Task: movePage
NextTask: removeTemplate
LastTask: reloadPage
Page: File:Rue_Général-Nouvion.JPG (hist • logs • abuse log)
Skin: vector
Contribs Log before error


(In reply to comment #2)

We would be glad if you could give us (the community) a notice in case there
are serious hiccups.

https://wikitech.wikimedia.org/wiki/Server_admin_log is probably a good first place to check (though very technical, and not helpful in this case).

According to ops there was a "Swift outage at about 18:00 UTC yesterday (Sep 04). As far as the backlog indicates, Swift processes were running at 100% CPU; restarted all of the daemons at the same time."

Rainer: Do you know if this is still (occasionally?) an issue, or can this be considered obsolete (RESOLVED WORKSFORME) nowadays?

(In reply to comment #2)

"This happened with about a dozen files. I had to do a lot of refreshing and
delete button clicking to finally get them to delete. It only happened for a
short time, and then I was able to delete 50+ further images with no issues."

We would be glad if you could give us (the community) a notice in case there
are serious hiccups. I know there are ganglia [1] and other monitors but I,
for example, do not understand how it could have a Current Load Avg of 108%
(>100%). Does it mean that 8% of the operations fail?

I imagine that means something along the lines of: the server has multiple cores, and the figure is the sum of the loads on all the cores. So on a dual-core system, 108% might mean 54% on each core (that is just a guess; don't quote me on it, it might be wrong).
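
To make the arithmetic of that guess concrete, here is a minimal Python sketch. It assumes the reported figure is a plain sum of per-core load percentages, which is only the guess above and not confirmed Ganglia behaviour; the function name `per_core_load` is hypothetical.

```python
def per_core_load(total_percent: float, cores: int) -> float:
    """Split a summed load percentage evenly across cores.

    Assumption (unconfirmed): the monitor reports the sum of the
    per-core loads, so dividing by the core count gives the average
    load on each core.
    """
    if cores <= 0:
        raise ValueError("cores must be a positive integer")
    return total_percent / cores


# Under this model, 108% on a dual-core machine is 54% per core,
# i.e. a value above 100% would not imply that any operations fail.
print(per_core_load(108.0, 2))
```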

Rainer: Do you know if this is still (occasionally?) an issue, or can this be considered obsolete (RESOLVED WORKSFORME) nowadays?

(In reply to Andre Klapper from comment #7)
internal_api_error_FileBackendError does not occur anymore
internal_api_error_DBQueryError still happens while moving files or creating file pages shortly after deleting the file (and file description page)

according to https://commons.wikimedia.org/w/index.php?title=MediaWiki_talk:Gadget-AjaxQuickDelete.js/auto-errors

I think there was another bug report about internal_api_error_DBQueryError? Can you verify this, Andre? If that's the case, please close this one. Thank you.

(In reply to Rainer Rillke @commons.wikimedia from comment #8)

internal_api_error_DBQueryError still happens while moving files or creating
file pages shortly after deleting the file (and file description page)

I think there was another bug report about internal_api_error_DBQueryError?

Rainer: https://bugzilla.wikimedia.org/show_bug.cgi?id=37519 ?
https://bugzilla.wikimedia.org/show_bug.cgi?id=40927 might be a dup?

Rainer: https://bugzilla.wikimedia.org/show_bug.cgi?id=37519 ?

Definitely a dupe. Marking as a dupe for that one.

https://bugzilla.wikimedia.org/show_bug.cgi?id=40927 might be a dup?

bug 40927 comment 6 indicates these may be dupes. I'm not 100% sure.

*** This bug has been marked as a duplicate of bug 37519 ***
Gilles raised the priority of this task from High to Unbreak Now!. Dec 4 2014, 10:11 AM
Gilles moved this task from Untriaged to Done on the Multimedia board.
Gilles lowered the priority of this task from Unbreak Now! to High. Dec 4 2014, 11:21 AM