
puppet labsstatus not reported when using role::puppet::self
Closed, Declined (Public)

Description

The list of instances in OpenStack Manager has a column listing the status of puppet run on each instance.

On the deployment-prep and integration projects, we are now using our own puppetmasters (deployment-salt.eqiad.wmflabs and integration-puppetmaster.eqiad.wmflabs, respectively). All instances end up with a 'stale' status.

I have no idea how the status is generated and collected. It would be very nice to have a way to report the status from independent puppetmasters.


Version: unspecified
Severity: normal

Details

Reference
bz63296

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:59 AM
bzimport set Reference to bz63296.
bzimport added a subscriber: Unknown Object (MLST).

The check is easy: it's done by a reporter that's on every puppetmaster. Currently, though, that status is relayed to wikitech via nova instance metadata, which means that it can only get reported properly if the puppetmaster has OpenStack auth, which self-hosted instances generally don't.

Can we have a custom reporter that writes the status of each node locally to a flat file? We could then fetch and render it somewhere.
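As a rough illustration of the flat-file idea, here is a minimal Ruby sketch of the file-writing part only. The helper names (`status_line`, `record_status`) and the tab-separated layout are assumptions made for this example, not anything taken from the existing reporter. In a real report processor, something like `record_status` would presumably be called from the processor's `process` method with the report's host, status and time.

```ruby
require 'time'

# Hypothetical helper: one line per node, "<host>\t<status>\t<ISO 8601 UTC time>".
def status_line(host, status, time)
  [host, status, time.utc.iso8601].join("\t")
end

# Hypothetical helper: rewrite the flat file so it holds exactly one
# (the latest) line per host, sorted by host name.
def record_status(path, host, status, time)
  lines = File.exist?(path) ? File.readlines(path, chomp: true) : []
  # Drop any previous entry for this host before adding the new one.
  lines.reject! { |line| line.split("\t").first == host }
  lines << status_line(host, status, time)
  File.write(path, lines.sort.join("\n") + "\n")
end
```

The file is rewritten in full on every call, which keeps it trivially parseable by whatever fetches and renders it. Concurrent puppet runs would need locking, which this sketch leaves out.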

Yes, be my guest :) The current reporter is in modules/puppetmaster/lib/puppet/reports, it's pretty straightforward.

Note, though, that we're in the process of adding some better icinga puppet reporting as well, so it might be best to just get that working for labs instances.

Getting reporting into wikitech working may be as easy as setting report_server in /etc/puppet/puppet.conf to point to the labs master server when ::puppet::self::* is applied to a labs node.

A slightly more complex approach would be to add an endpoint on a secure server that can talk to wikitech directly, and use it to proxy reports sent by ::puppet::self::master nodes. The reporter used by the labs masters (modules/puppetmaster/lib/puppet/reports/labsstatus.rb) seems to need only the project name (e.g. deployment-prep) and hostname (e.g. i-0000010b.eqiad.wmflabs), along with the puppet run status and timestamp, to update wikitech. The advantage of this method is that the ::puppet::self::master would still have access to the full report from each host that it serves, allowing additional custom reporting as desired.
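To make the proxy idea concrete, here is a minimal Ruby sketch of the small message a self-hosted master might send to such an endpoint. It covers only the four values mentioned above. The JSON encoding, the key names and the function name `build_status_payload` are assumptions for illustration, and no such endpoint exists as far as this thread shows.

```ruby
require 'json'
require 'time'

# The four values the text says are needed; the key names are assumptions.
REQUIRED_FIELDS = %w[project hostname status timestamp].freeze

# Hypothetical: build the JSON body a ::puppet::self::master could send
# to the proxy endpoint after each puppet run.
def build_status_payload(project:, hostname:, status:, time:)
  payload = {
    'project'   => project,
    'hostname'  => hostname,
    'status'    => status,
    'timestamp' => time.utc.iso8601
  }
  # Refuse to send a partial report rather than let the proxy guess.
  missing = REQUIRED_FIELDS.select { |key| payload[key].to_s.empty? }
  raise ArgumentError, "missing fields: #{missing.join(', ')}" unless missing.empty?

  JSON.generate(payload)
end
```

Keeping the payload this small means the proxy never sees the full report, which stays on the self-hosted master for any custom reporting.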

> Getting reporting into wikitech working may be as easy as setting
> report_server in /etc/puppet/puppet.conf to point to the labs master server
> when ::puppet::self::* is applied to a labs node.

It would shock me if that worked because, y'know, auth.

But, as I said, best not to get too deep into this as this will (I hope) be handled using proper monitoring tools shortly.

(In reply to Andrew Bogott from comment #5)

> > Getting reporting into wikitech working may be as easy as setting
> > report_server in /etc/puppet/puppet.conf to point to the labs master server
> > when ::puppet::self::* is applied to a labs node.
>
> It would shock me if that worked because, y'know, auth.

Confirmed:

Error: Could not send report: SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed: [self signed certificate in certificate chain for /CN=Puppet CA: virt1000.wikimedia.org]

> But, as I said, best not to get too deep into this as this will (I hope) be
> handled using proper monitoring tools shortly.

Will the icinga monitoring that you've been working on be usable in labs? Or are you referring to the work that Yuvi is doing towards collecting information on the puppet runs with diamond? I'd be happy with either, but it would be nice to know which gerrit patches to follow. :)

The reason I filed this bug was to use OpenStackManager as a dashboard of puppet runs. Over the last few weeks, Yuvi has had puppet runs reported to graphite and monitored using Shinken: http://shinken.wmflabs.org/problems which fulfills the use case I have.