
Fix and restart job runners for Wikimedia wikis
Closed, Resolved (Public)

Description

Seems like they haven't been doing too much since 1.17...


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=27914

Details

Reference
bz27727

Event Timeline

bzimport raised the priority of this task to High. (Nov 21 2014, 11:24 PM)
bzimport set Reference to bz27727.

Rebuild a new version of wikimedia-job-runner and install it on the job runners, with the fixes in r82821 and r82822.

*** Bug 27782 has been marked as a duplicate of this bug. ***

The job queue on English Wikipedia has been down for well over a week now and there are over 400,000 jobs stacked up waiting. See http://en.wikipedia.org/wiki/Help_talk:Job_queue#Where.27s_it_gone.3F

If this is not the correct place please advise exactly where we should be filing the bug report.

They're running, but enwiki, being the biggest, will be the last to be dealt with.

So that the job runners don't get stuck on the bigger wikis for a long time, they rank the smaller wikis higher...

It would seem that, with about 12k jobs being added to it in a 4 or 5 hour period, there are still big long-running tasks going on. Most likely a rename.

A quick look shows that the first enwiki entry is currently a rename...
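For illustration only, the smallest-first ranking described above could be sketched like this. This is not the actual wikimedia-job-runner code; the function name `smallest_first` and the wiki names and counts other than enwiki's roughly 400,000 are made up for the example.

```python
def smallest_first(pending):
    """Return wiki names ordered by pending job count, fewest first.

    `pending` maps a wiki name to its number of queued jobs (hypothetical
    input shape). Wikis with nothing queued are skipped.
    """
    busy = [wiki for wiki, count in pending.items() if count > 0]
    return sorted(busy, key=lambda wiki: pending[wiki])


# Hypothetical snapshot: only the enwiki figure comes from this report.
pending = {"enwiki": 400000, "dewiki": 12000, "smallwiki": 30, "idlewiki": 0}
order = smallest_first(pending)
```

Under this ordering the largest queue is always visited last, so as long as smaller wikis keep receiving jobs it may wait a long time.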

I'm asking to find out if the job runners are still running fine. Will report back

This sounds like a recipe for resource starvation. Surely taking age into account is part (I would say it should be all) of the mix?

Bear in mind that from a utilitarian POV a job entry in a big wiki queue is affecting more pages for more people...

If an even simpler scheduler were wanted - count the number of entries in a queue and do n% of them before moving to the next queue. This would get through the backlog at t0 at more or less the same time for each wiki.
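A minimal sketch of that n% idea, assuming each wiki's queue is an in-memory deque and the quotas are fixed from the counts at t0. All names here (`snapshot_quotas`, `run_pass`, `passes_to_clear`) are hypothetical and nothing below reflects how the real job runners are written.

```python
from collections import deque


def snapshot_quotas(queues, percent):
    """Count each queue at t0 and return a per-pass quota of percent% of it.

    Integer ceiling division, so a non-empty queue always gets at least 1.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    return {wiki: -(-len(queue) * percent // 100) for wiki, queue in queues.items()}


def run_pass(queues, quotas):
    """Take up to quota jobs from each queue in turn, oldest first."""
    done = []
    for wiki, queue in queues.items():
        for _ in range(min(quotas[wiki], len(queue))):
            done.append((wiki, queue.popleft()))
    return done


def passes_to_clear(queues, percent):
    """Repeat passes until every queue is empty; return how many were needed."""
    quotas = snapshot_quotas(queues, percent)
    passes = 0
    while any(queues.values()):
        run_pass(queues, quotas)
        passes += 1
    return passes


# Hypothetical backlog sizes, chosen only to show the effect.
queues = {"enwiki": deque(range(1000)), "smallwiki": deque(range(10))}
quotas_at_t0 = snapshot_quotas(queues, 10)
needed = passes_to_clear(queues, 10)
```

With 10% per pass, both queues here empty on the tenth pass, regardless of size. Jobs added after t0 are not counted in the quotas and would need a fresh snapshot.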

There is no concept of age in the job queue, bar that a lower job ID is older.

I have logged a bug asking for the timestamp to be recorded as well, so maybe it will be possible to take age into account in future.

*** Bug 27863 has been marked as a duplicate of this bug. ***

nuuanu wrote:

Do you have a time frame on when the job queue on English Wikipedia will be dealt with? As of March 6, it's getting worse, not better.
http://en.wikipedia.org/wiki/Help_talk:Job_queue#Where.27s_it_gone.3F

Now 529550 (over half a million) pending jobs. We're keeping a periodic update at the en:wp page mentioned in the last post. This works out at about 28000 per day.

*** Bug 27953 has been marked as a duplicate of this bug. ***

The backlog has now cleared and most job runners are sleeping. Marking fixed.