
analytics1012 lacks Hadoop metric groups in ganglia
Closed, ResolvedPublic

Description

Although other worker nodes in the cluster (e.g., analytics1016 [1])
show Hadoop metric groups like:

  • Hadoop.DataNode metrics
  • Hadoop.NodeManager metrics
  • Hadoop.NodeManager.jvm_memory metrics

ganglia does not show them for analytics1012:

http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&c=Analytics+cluster+eqiad&h=analytics1012.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=medium&metric_group=NOGROUPS

Shouldn't those metrics also show up for analytics1012?

[1] http://ganglia.wikimedia.org/latest/?r=hour&cs=&ce=&c=Analytics+cluster+eqiad&h=analytics1016.eqiad.wmnet&tab=m&vn=&hide-hf=false&mc=2&z=medium&metric_group=NOGROUPS


Version: unspecified
Severity: normal

Details

Reference
bz63472

Event Timeline

bzimport raised the priority of this task to High. (Nov 22 2014, 3:10 AM)
bzimport set Reference to bz63472.
bzimport added a subscriber: Unknown Object (MLST).

bingle-admin wrote:

Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1523

otto wrote:

YES! Found it. /etc/hosts had a bad IP listed on analytics1012 for itself. Fixed and things look much better now!
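
A check along the following lines might catch this kind of misconfiguration earlier. It is only a minimal sketch: the function names `parse_hosts` and `mismatched_self_entries` are hypothetical, the addresses in the usage note are made-up documentation-range values rather than the real ones from analytics1012, and the caller is assumed to supply the host's actual interface addresses.

```python
import ipaddress


def parse_hosts(text):
    """Map each hostname or alias in hosts-file text to the IPs listed for it."""
    entries = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            entries.setdefault(name, []).append(ip)
    return entries


def mismatched_self_entries(hosts_text, hostname, actual_ips):
    """Return IPs listed for `hostname` that are not among the host's real addresses.

    `actual_ips` is an assumption of this sketch: the caller must gather the
    addresses actually configured on the machine's interfaces.
    """
    actual = {ipaddress.ip_address(ip) for ip in actual_ips}
    listed = parse_hosts(hosts_text).get(hostname, [])
    return [ip for ip in listed if ipaddress.ip_address(ip) not in actual]
```

For example, with the hypothetical hosts line `192.0.2.99 analytics1012.eqiad.wmnet analytics1012` and an actual interface address of `192.0.2.12`, calling `mismatched_self_entries(text, "analytics1012", ["192.0.2.12"])` would flag `192.0.2.99` as a stale self-entry.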

Fixed by Andrew. Ganglia now shows the metric groups and corresponding metrics.

Stupid me. Looked at the wrong host. Sorry.

Reopening: analytics1012 still lacks the metric groups listed in this
bug's description.