Last Updated: $Date: 2002/10/16 18:35:13 $
The Ganglia Metric Tool (gmetric) allows you to easily monitor any arbitrary host metrics that you like, expanding on the core metrics that gmond measures by default.
If you want help with the gmetric syntax, simply use the "help" commandline option:
prompt> gmetric --help
gmetric 2.5.0

Purpose:
  The Ganglia Metric Client (gmetric) announces a metric
  value to all Ganglia Monitoring Daemons (gmonds) that are listening
  on the cluster multicast channel.

Usage: ganglia-monitor-core [OPTIONS]...
  -h         --help                  Print help and exit
  -V         --version               Print version and exit
  -nSTRING   --name=STRING           Name of the metric
  -vSTRING   --value=STRING          Value of the metric
  -tSTRING   --type=STRING           Either string|int8|uint8|int16|uint16|int32|uint32|float|double
  -uSTRING   --units=STRING          Unit of measure for the value e.g. Kilobytes, Celcius
  -sSTRING   --slope=STRING          Either zero|positive|negative|both (default='both')
  -xINT      --tmax=INT              The maximum time in seconds between gmetric calls (default=60)
  -cSTRING   --mcast_channel=STRING  Multicast channel to send/receive on (default='239.2.11.71')
  -pINT      --mcast_port=INT        Multicast port to send/receive on (default=8649)
  -iSTRING   --mcast_if=STRING       Network interface to multicast on e.g. 'eth1' (default='kernel decides')
  -lINT      --mcast_ttl=INT         Multicast Time-To-Live (TTL) (default=1)
The gmetric tool formats a special multicast message and sends it to all gmonds that are listening.
All metrics in Ganglia have a name, a value, a type and, optionally, units. For example, say I wanted to measure the temperature of my CPU (something gmond doesn't do). I could multicast this metric with name="temperature", value="63", type="int16" and units="Celsius".
Assume I have a program called cputemp which prints the temperature of the CPU as text:
prompt> cputemp
63
I could easily send this data to all listening gmonds by running:
prompt> gmetric --name temperature --value `cputemp` --type int16 \
        --units Celsius
Check the exit value of gmetric to see if it successfully sent the data: 0 on success and -1 on failure.
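The exit-value check can be sketched as a small shell function. This is only a sketch: send_metric is a hypothetical helper name, not part of Ganglia, and it uses only the gmetric options listed in the help text. A shell normally reports an exit value of -1 as 255, so the sketch tests for any non-zero status rather than for -1 itself.

```shell
#!/bin/sh
# send_metric NAME VALUE TYPE UNITS
# Hypothetical helper (not part of Ganglia): wraps gmetric and reports failure.
send_metric() {
    if gmetric --name "$1" --value "$2" --type "$3" --units "$4"; then
        echo "sent $1=$2"
    else
        status=$?   # exit status of the failed gmetric call
        echo "gmetric failed for $1 (exit status $status)" >&2
        return 1
    fi
}
```

With the cputemp program from the example above, this could be called as: send_metric temperature "`cputemp`" int16 Celsius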
To constantly sample this temperature metric, you just need to add this command to your cron table.
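As a rough sketch of the cron setup, the command can be put in a small wrapper script that cron runs once a minute. The script path, the function name report_cputemp and the one-minute schedule are my assumptions; cputemp is the example program from above. A one-minute schedule seems a natural fit for the default --tmax of 60 seconds shown in the help text, but treat that pairing as a guess rather than a rule.

```shell
#!/bin/sh
# Hypothetical wrapper, saved for example as /usr/local/bin/report_cputemp.sh
# Assumed crontab entry (runs the wrapper every minute):
#   * * * * * /usr/local/bin/report_cputemp.sh

# Read the temperature once and announce it with gmetric.
report_cputemp() {
    temp=$(cputemp) || return 1   # give up if the probe itself fails
    gmetric --name temperature --value "$temp" --type int16 --units Celsius
}

# Only report when both tools are installed on this host.
if command -v cputemp >/dev/null 2>&1 && command -v gmetric >/dev/null 2>&1; then
    report_cputemp
else
    echo "cputemp or gmetric not found, nothing reported" >&2
fi
```

Anything the wrapper writes to standard error ends up in the mail that cron sends to the owner of the cron table, which is a cheap way to notice a host that has stopped reporting.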
The Ganglia Cluster Status Tool (gstat) is a commandline utility that allows you to get a status report for your cluster. With time, it should become a more flexible way to query a gmond running locally or remotely.
To get the commandline options simply run...
prompt> gstat --help
gstat 2.5.0

Purpose:
  The Ganglia Status Client (gstat) connects with a
  Ganglia Monitoring Daemon (gmond) and output a load-balanced list
  of cluster hosts

Usage: gstat [OPTIONS]...
  -h         --help               Print help and exit
  -V         --version            Print version and exit
  -a         --all                List all hosts. Not just hosts running gexec (default=off)
  -d         --dead               Print only the hosts which are dead (default=off)
  -m         --mpifile            Print a load-balanced mpifile (default=off)
  -1         --single_line        Print host and information all on one line (default=off)
  -l         --list               Print ONLY the host list (default=off)
  -iSTRING   --gmond_ip=STRING    Specify the ip address of the gmond to query (default='127.0.0.1')
  -pINT      --gmond_port=INT     Specify the gmond port to query (default=8649)
Running gstat without any parameters will cause it to print a load-balanced (least-loaded host first) list of all the hosts running gmond, along with the process, load, and CPU information. If you want to see which hosts are down in your cluster, use the --dead gstat option. You can also have gstat produce a dynamic load-balanced mpimachine file with the --mpifile option.
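The --dead option lends itself to a simple scripted check. The sketch below is hypothetical: check_dead is a name I made up, and it assumes that combining --dead with --list prints nothing but one dead hostname per line, which the help text suggests but which I have not confirmed against real output.

```shell
#!/bin/sh
# check_dead: hypothetical helper that reports dead hosts.
# ASSUMPTION: "gstat --dead --list" prints one dead hostname per line
# and nothing else; adjust the parsing if your gstat prints more.
check_dead() {
    dead=$(gstat --dead --list) || return 2   # could not query gmond
    if [ -n "$dead" ]; then
        echo "dead hosts:"
        echo "$dead"
        return 1
    fi
    echo "all hosts up"
}
```

The return value is 0 when no dead hosts are listed, 1 when at least one is, and 2 when gstat itself fails, so the function can be used directly in an if statement or a cron job.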
Get a load-balanced list of hosts that are up...
prompt> gstat
CLUSTER INFORMATION
Name: unspecified
Hosts: 97
Gexec Hosts: 73
Dead Hosts: 0
Localtime: Mon Apr 22 16:58:43 2002
CLUSTER HOSTS
Hostname LOAD CPU Gexec
CPUs (Procs/Total) [ 1, 5, 15min] [ User, Nice, System, Idle]
mm92.millennium.berkeley.edu
4 ( 1/ 97) [ 1.10, 1.19, 0.99] [ 5.9, 0.0, 0.5, 100.0] ON
mm98.Millennium.Berkeley.EDU
4 ( 0/ 80) [ 1.16, 1.67, 1.25] [ 4.1, 0.0, 0.2, 98.5] ON
mm91.Millennium.Berkeley.EDU
4 ( 1/ 87) [ 1.67, 1.78, 1.69] [ 25.0, 0.0, 0.7, 74.9] ON
mm75.millennium.berkeley.edu
4 ( 3/ 103) [ 1.85, 2.54, 1.83] [ 72.6, 0.0, 0.2, 50.3] ON
mm67.millennium.Berkeley.EDU
4 ( 4/ 112) [ 1.89, 2.08, 1.38] [ 81.4, 0.0, 0.1, 38.5] ON
mm87.millennium.berkeley.edu
4 ( 4/ 112) [ 1.95, 1.67, 1.27] [ 3.2, 0.0, 0.4, 96.4] ON
mm83.millennium.Berkeley.EDU
4 ( 1/ 120) [ 2.00, 2.59, 2.24] [ 25.0, 0.0, 0.0, 75.0] ON
mm10.millennium.Berkeley.EDU
2 ( 0/ 77) [ 0.00, 0.06, 0.07] [ 0.2, 0.0, 0.0, 99.9] ON
...
To create a dynamic load-balanced mpifile list...
prompt> gstat --mpifile
mm56.Millennium.Berkeley.EDU:4
mm44.Millennium.Berkeley.EDU:4
mm31.Millennium.Berkeley.EDU:2
mm43.Millennium.Berkeley.EDU:4
mm15.Millennium.Berkeley.EDU:2
...
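Since the mpifile output is one host:cpus pair per line, it is easy to trim it down to just the hosts a job needs. The following is a minimal sketch; select_hosts is a hypothetical helper, and it assumes the list arrives least-loaded host first, as described for the default gstat output.

```shell
#!/bin/sh
# select_hosts SLOTS
# Hypothetical helper: reads "host:cpus" lines on standard input and prints
# hosts from the top of the list until at least SLOTS CPUs are covered.
# Lines that are not exactly "host:cpus" are skipped.
select_hosts() {
    awk -F: -v need="$1" 'NF == 2 && got < need { print; got += $2 }'
}
```

For example, a job that needs 6 CPUs could build its machine file with: gstat --mpifile | select_hosts 6 > machines (the file name machines is just an example).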