As system administrators, we need some essential information about our servers as a minimum requirement for the job, so we can identify patterns and learn about trends in our workload. And when we need to use DTrace, MDB, or understand an FMA ereport… we do not have much time. Actually, I want to post some notes about that last one in a future blog entry.
I think the big problem with these tools is that we sysadmins do not use them daily the way a developer does, so we are not debugging with them all the time. Which is good! So I think the solution is to develop some tools and scripts to make our life easier (like the DTrace Toolkit, the NFS Block Size Monitor, dcmds for MDB) and to have some scripts that parse ereports in an easy way. Well, but that is for future posts…
In the old days, managing Solaris servers, I used Orca to gather the crucial, must-have server performance information. It was pretty simple, extensible, and made for Solaris. The first thing to do was install Orca on the Solaris servers and “see the big picture”. But those were the old days…
In the transition to OpenSolaris I tried to use Zabbix for this job and, as you can imagine, it was not so good. Not because of Zabbix, which is a really fantastic tool (when used for the right job). Actually, I was thinking of using Zabbix for other things too, but it added too much complexity, and the better approach was to divide the administration tasks into specific areas. We do have a homebrew administration system for our storage business, so for standard information like CPU, network, memory, etc., we just needed a tool to be the right replacement for the handy old Orca. The answer was: Ganglia (Wikipedia: ganglion).
Well, there is no package for OpenSolaris (OpenSolaris???)… so I will put here some notes on how I installed it on OpenSolaris, and some tips so you do not waste much time on it.
Simple things first… you will need to install SUNWapr13.

# pkg install SUNWapr13

Second, you will need libconfuse (what a name)… Here I needed to compile it with a specific configure option to create the shared library. Without it, the standard build produced only the static library (.a), not the shared one. There is a note about it on the Ganglia website:

# cd confuse-2.7/
# ./configure --enable-shared
# make && make install
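
Since the whole point of --enable-shared is to end up with a .so file, it may be worth checking for one before moving on. The function below is only a sketch: has_shared_lib is a hypothetical helper name (not part of libconfuse or Ganglia), and /usr/local/lib in the usage line is an assumption based on the default install prefix.

```shell
#!/bin/sh
# has_shared_lib: hypothetical helper, not part of libconfuse or Ganglia.
# Usage: has_shared_lib <libdir> <libname>
# Succeeds and prints the first match if <libdir> holds lib<libname>.so*;
# fails (with a message on stderr) if only a static .a, or nothing, is there.
has_shared_lib() {
    libdir=$1
    libname=$2
    for f in "$libdir"/lib"$libname".so*; do
        # When the glob matches nothing, $f is the literal pattern
        # and the -e test below is false.
        if [ -e "$f" ]; then
            echo "shared: $f"
            return 0
        fi
    done
    echo "no shared lib${libname} found in $libdir" >&2
    return 1
}
```

With the function loaded in your shell, something like "has_shared_lib /usr/local/lib confuse" should report the shared object if the build went as expected.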

After that you can compile the Ganglia software. The tip here is to pass a specific LDFLAGS to configure. Without it, the software failed at run time. I tried the --with-libapr option with the absolute path to apr-1-config, without luck. So, as we need things working…:

# LDFLAGS="-R/usr/apr/1.3/lib/" ./configure --with-libconfuse=/usr/local/ --enable-gexec --sysconfdir=/etc/ganglia
# make && make install

The above configure line will configure and install the Ganglia monitor software under “/usr/”. This is one of the few packages whose installation does not default to “/usr/local”. Not so good… if you want to change that, you can do it on the configure line (the usual --prefix option).
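
A quick way to see whether the -R run path did its job is to look for unresolved libraries in the ldd output of the installed binary. This is a sketch under assumptions: missing_libs is a hypothetical helper name, and it simply assumes that ldd marks unresolved dependencies with the words "not found".

```shell
#!/bin/sh
# missing_libs: hypothetical helper, not part of Ganglia.
# Reads ldd output on stdin and prints the name (first field) of every
# library the runtime linker could not resolve, i.e. every line that
# contains "not found". Prints nothing when all dependencies resolve.
missing_libs() {
    awk '/not found/ { print $1 }'
}
```

For example, "ldd /usr/sbin/gmond | missing_libs" should print nothing on a good build; if libapr shows up in the list, the run path was probably not recorded in the binary.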

I’m assuming that you want just the monitor part on your OpenSolaris machines, without gmetad, because you have it on another system (you only need gmetad on the system where you will centralize the data). If you do want gmetad on an OpenSolaris system, you will need to append the option “--with-gmetad” to the configure line.
You can create the configuration file gmond needs using gmond itself:

# gmond --default_config > /etc/ganglia/gmond.conf

And to get it up and running, the configuration can be as simple as this:
– In the gmond cluster section, change the lines as you wish…

 cluster {
  name = "Servers"
  owner = "Company"
  latlong = "unspecified"
  url = "unspecified"
 }

I used the udp_send_channel, so it was just a matter of uncommenting the bind_hostname line and setting the gmetad host:

 udp_send_channel {
  bind_hostname = yes
  host = gmetadserver
  port = 8649
  ttl = 1
 }

That’s it. This same file can be used for all your servers, and obviously you can customize it however you want! But with these few settings, you will have all your hosts reporting all the essential performance metrics (just like the old Orca ;-).
At the beginning of the gmond.conf file there are some handy generic parameters:

 globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  allow_extra_data = yes
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
  send_metadata_interval = 0 /*secs */
 }

I created a ganglia user to run this software on OpenSolaris, so we can use all the RBAC features with it. But you can use a standard user such as nobody, for example, and it should work just fine. You can change the daemonize option to no so the gmond process stays in the foreground. Or you can just start gmond with an option like “-d 2”, and the process will automatically stay in the foreground and print many useful debug messages. You can test gmond on your machines using a simple telnet command like:

# telnet localhost 8649

That command should produce a lot of information in XML format. Cool!
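
If the raw XML is too much to read, a small filter can pull out just the host names. This is only a sketch: list_hosts is a hypothetical helper name, and it assumes each host appears as a <HOST ...> element with NAME as its first attribute.

```shell
#!/bin/sh
# list_hosts: hypothetical helper, not part of Ganglia.
# Reads a gmond XML dump on stdin and prints the NAME attribute of each
# <HOST ...> element, one per line. Assumes NAME is the first attribute.
# Lines that are not HOST elements (METRIC lines, the telnet banner)
# do not match the pattern and are dropped.
list_hosts() {
    sed -n 's/.*<HOST NAME="\([^"]*\)".*/\1/p'
}
```

Something like "telnet localhost 8649 | list_hosts" should then print one line per host that this gmond knows about.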
Finally, Ganglia has a powerful gexec feature that I’m not going to cover in this post, but you can enable it just by changing the gexec line to “yes”.
Once Ganglia is running on your servers, you can use gstat on the gmetad server to see them:

# gstat -a | head -7
       Name: Servers
      Hosts: 380
Gexec Hosts: 0
 Dead Hosts: 0
  Localtime: Wed Aug 25 11:56:42 2010

We do not have an open repository yet, but we are thinking of creating one soon. So, if you want to create a package, here is the .ips manifest I created for Ganglia Monitor, with the binaries and libraries it needs:

set            value="Ganglia"
set name=pkg.description     value="Ganglia Monitor"
dir mode=0755 owner=root group=bin  path=/lib
dir mode=0755 owner=root group=bin  path=/lib/svc
dir mode=0755 owner=root group=bin  path=/lib/svc/method
dir mode=0755 owner=root group=sys  path=/usr
dir mode=0755 owner=root group=bin  path=/usr/lib
dir mode=0755 owner=root group=sys  path=/usr/lib/ganglia
dir mode=0755 owner=root group=sys  path=/usr/lib/ganglia/python_modules
dir mode=0755 owner=root group=bin  path=/usr/sbin
dir mode=0755 owner=root group=bin  path=/usr/bin
dir mode=0755 owner=root group=sys  path=/etc
dir mode=0755 owner=root group=root path=/etc/ganglia
dir mode=0755 owner=root group=root path=/etc/ganglia/conf.d
dir mode=0755 owner=root group=sys  path=/var
dir mode=0755 owner=root group=sys  path=/var/svc
dir mode=0755 owner=root group=sys  path=/var/svc/manifest
dir mode=0755 owner=root group=sys  path=/var/svc/manifest/network
file usr/sbin/gmond mode=0755 owner=root group=root path=/usr/sbin/gmond
file usr/bin/gstat mode=0755 owner=root group=root path=/usr/bin/gstat
file usr/bin/gmetric mode=0755 owner=root group=root path=/usr/bin/gmetric
file lib/svc/method/ganglia mode=0755 owner=root group=root path=/lib/svc/method/ganglia
file etc/ganglia/gmond.conf mode=0644 owner=root group=root path=/etc/ganglia/gmond.conf
file etc/ganglia/conf.d/modpython.conf mode=0644 owner=root group=root path=/etc/ganglia/conf.d/modpython.conf
file var/svc/manifest/network/ganglia.xml mode=0644 owner=root group=root path=/var/svc/manifest/network/ganglia.xml
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ganglia/ mode=0644 owner=root group=bin path=/usr/lib/ganglia/
file usr/lib/ mode=0644 owner=root group=bin path=/usr/lib/
file usr/lib/libganglia.a mode=0644 owner=root group=bin path=/usr/lib/libganglia.a
file usr/lib/ mode=0644 owner=root group=bin path=/usr/lib/
link mode=0555 owner=root group=bin path=/usr/lib/
link mode=0555 owner=root group=bin path=/usr/lib/

And here you can get the XML file for the SMF service, and the start method as well. I hope they are useful for you…