Whenever I deploy integration/slave-scripts from tin.eqiad.wmnet, the minions get stuck in a pending state.
$ ssh tin.eqiad.wmnet
$ cd /srv/deployment/integration/slave-scripts
$ git deploy start
$ git pull
$ git deploy sync
...
- INFO : created tag 'integration/slave-scripts-20140205-131314' ...
Running: sudo salt-call -l quiet publish.runner deploy.fetch 'integration/slave-scripts'
Repo: integration/slave-scripts; checking tag: integration/slave-scripts-20140205-131314
2 minions pending (2 reporting)
Continue? ([d]etailed/[C]oncise report,[y]es,[n]o,[r]etry):
The detailed report shows:
Continue? ([d]etailed/[C]oncise report,[y]es,[n]o,[r]etry): d
Repo: integration/slave-scripts; checking tag: integration/slave-scripts-20140205-131314
lanthanum.eqiad.wmnet: integration/slave-scripts-20131210-164114 (fetch: 1 [started: 3 mins, last-return: 2 mins])
lanthanum.eqiad.wmnet: integration/slave-scripts-20131210-164114 (fetch: 1 [started: 3 mins, last-return: 3 mins])
2 minions pending (2 reporting)
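To make the mismatch in that report easier to spot, here is a minimal sketch that filters the detailed output for minions whose listed tag differs from the tag being checked. It assumes the report format is exactly as pasted above (`<host>: <tag> (fetch: ...)`) and that the tag on each minion line is the one the minion last reported; `stale_minions` is a hypothetical helper name, not part of git deploy or salt.

```shell
# Hypothetical helper (not part of git deploy): read a detailed report on
# stdin and print "<host> <reported tag>" for every minion line whose tag
# differs from the expected one passed as $1.
stale_minions() {
    awk -v expected="$1" '
        # Assumed minion line format: "<host>: <tag> (fetch: ...)".
        # Header, summary and prompt lines do not match this shape.
        $1 ~ /:$/ && $3 ~ /^\(/ {
            host = substr($1, 1, length($1) - 1)   # drop the trailing colon
            if ($2 != expected) print host, $2
        }
    '
}
```

Fed the report above with the tag created by this sync as the argument, it would list lanthanum.eqiad.wmnet twice, each time with the older 20131210 tag.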
Looking at the minion lanthanum.eqiad.wmnet in /srv/deployment/integration/slave-scripts, the tag has been fetched:
lanthanum$ git tag |fgrep integration/slave-scripts-20131210-164114
integration/slave-scripts-20131210-164114
lanthanum$
I then continue:
Continue? ([d]etailed/[C]oncise report,[y]es,[n]o,[r]etry): y
Running: sudo salt-call -l quiet publish.runner deploy.checkout 'integration/slave-scripts,False'
Repo: integration/slave-scripts; checking tag: integration/slave-scripts-20140205-131314
2 minions pending (2 reporting)
Continue? ([d]etailed/[C]oncise report,[y]es,[n]o,[r]etry): d
Repo: integration/slave-scripts; checking tag: integration/slave-scripts-20140205-131314
lanthanum.eqiad.wmnet: integration/slave-scripts-20131210-164114 (fetch: 1 [started: 0 mins, last-return: 0 mins])
lanthanum.eqiad.wmnet: integration/slave-scripts-20131210-164114 (fetch: 1 [started: 0 mins, last-return: 0 mins])
2 minions pending (2 reporting)
Continue? ([d]etailed/[C]oncise report,[y]es,[n]o,[r]etry):
On the minion, the tag has been checked out properly.
At that point I just confirm with (y).
My guess is that salt is somehow broken on lanthanum.eqiad.wmnet and is not reporting back properly.
Version: wmf-deployment
Severity: normal