
[OPS] db1020.eqiad.wmnet is down (impacts Gerrit, OTRS, EventLogging)
Closed, Resolved (Public)

Description

T50061:

Guice provision errors:

1) Cannot open ReviewDb
  at com.google.gerrit.server.util.ThreadLocalRequestContext$1.provideReviewDb(ThreadLocalRequestContext.java:70)
  while locating com.google.gerrit.reviewdb.server.ReviewDb

1 error

Interaction with git over port 29418 is fine, though.

Details

Reference
bz73555

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 3:51 AM
bzimport added projects: Gerrit, acl*sre-team.
bzimport set Reference to bz73555.
bzimport added a subscriber: Unknown Object (MLST).

According to puppet, Gerrit uses the DB 'm2-master.eqiad.wmnet'

From site.pp:

  # m2 shard
  node /^db10(20)\.eqiad\.wmnet/ {
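
For anyone double-checking which hosts that node pattern covers, here is a minimal sketch in Python. Puppet evaluates node regexes with Ruby, so using Python's `re` module is an assumption that happens to hold for this simple pattern. The helper name `is_m2_node` is hypothetical and is not part of site.pp.

```python
import re

# Pattern copied from the site.pp excerpt quoted above.
# Assumption: Python's re treats this pattern the same way Ruby does.
M2_NODE = re.compile(r"^db10(20)\.eqiad\.wmnet")


def is_m2_node(hostname: str) -> bool:
    """Return True if the hostname matches the m2 node pattern."""
    # search() with a leading ^ only matches at the start of the string.
    return M2_NODE.search(hostname) is not None


if __name__ == "__main__":
    # db1020 is the m2 master from this ticket; db1046 is the failover target.
    for host in ("db1020.eqiad.wmnet", "db1046.eqiad.wmnet"):
        print(host, is_m2_node(host))
```

Only db1020 matches, which is consistent with db1020 being the single m2 host defined by that node block.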

Mysqld died:

mysqld processes on db1020 is CRITICAL: PROCS CRITICAL: 0 processes with command name mysqld
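
The alert above comes from a process-count check. A rough sketch of that kind of check follows, assuming a Linux host where `ps -e -o comm=` lists one command name per line. This is not the monitoring plugin's actual code; the function names are hypothetical and the status string only imitates the alert line quoted in this ticket.

```python
import subprocess


def count_procs(command_name: str) -> int:
    """Count running processes whose command name equals command_name."""
    # Assumption: `ps -e -o comm=` prints bare command names without a header.
    out = subprocess.run(
        ["ps", "-e", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return sum(1 for line in out.splitlines() if line.strip() == command_name)


def procs_status(command_name: str, count: int) -> str:
    """Format a status line; zero matching processes is treated as CRITICAL."""
    state = "OK" if count >= 1 else "CRITICAL"
    noun = "process" if count == 1 else "processes"
    return f"PROCS {state}: {count} {noun} with command name {command_name}"


if __name__ == "__main__":
    print(procs_status("mysqld", count_procs("mysqld")))
```

With zero mysqld processes running, this produces the same CRITICAL text as the alert on db1020.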

The DB is also used by EventLogging and OTRS.

<springle> !log fail over m2 to m2-slave (db1046); investigating db1020

Gerrit is back, now using the other MySQL server. Sean Pringle is investigating the db1020 failure. There is not much more to do on this ticket.

Krinkle set Security to None.