Oracle Solaris CDOM/LDOM Monitoring

Skip the Prerequisites, Web and LPAR2RRD tabs if you are configuring a Virtual Appliance, Docker or a Container.

Follow the installation procedure for your operating system platform.
LPAR2RRD monitors the performance of Oracle Solaris boxes and presents it graphically.
It is aware of Oracle Solaris SPARC and x86 virtualization and presents data hierarchically.

Monitoring is implemented through an OS agent running on each Oracle Solaris host (LDOM/CDOM/Global Zone/Zone).
You must use LPAR2RRD agent and server version 6.00 or later.

Solaris OS agent LDOM schema

Working modes

  1. Install OS agents on all Control Domains (CDOM) only
  2. Install OS agents on all LDOMs and Global Zones
  3. Install OS agents on all LDOMs, Global Zones and Zones

Mode 1 gives you all CDOM data and a limited performance data set for all of its LDOMs (CPU/Mem/Net).
Mode 2 brings you more details about each LDOM.
Mode 3 additionally monitors all Zones from the OS point of view.

Installation summary

  1. Ensure your network allows TCP connections initiated from the OS agents to the LPAR2RRD server on port 8162
  2. Make sure the LPAR2RRD daemon is running on the LPAR2RRD server
  3. Install the OS agent on all LDOMs, CDOMs and Global Zones
  4. Optionally install the OS agent on all Zones to get additional OS-based metrics
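Step 1 can be checked from an agent host before anything is installed. The following is a minimal sketch, assuming bash is available on that host (the /dev/tcp redirection is a bash feature, not a real device file). The function name check_port is illustrative and not part of LPAR2RRD.

```shell
#!/usr/bin/env bash
# check_port HOST PORT (hypothetical helper, not shipped with LPAR2RRD)
# Returns 0 if a TCP connection to HOST:PORT can be opened, non-zero otherwise.
# Relies on bash's built-in /dev/tcp redirection. There is no explicit timeout,
# so a silently dropped connection can block until the OS gives up.
check_port() {
    host=$1
    port=$2
    # Open the socket on fd 3 inside a subshell; the fd is closed when it exits.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Example (replace the placeholder with the hostname of your LPAR2RRD server):
#   check_port <LPAR2RRD-SERVER> 8162 && echo "port 8162 reachable"
```

A refused connection returns immediately, while a firewall that drops packets usually shows up as a long pause before the function fails.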

OS agent install on a LDOM/CDOM

  • Create the user lpar2rrd (it needs the solaris.ldoms.read authorization, assigned below)

  • Installation under root:
    # gunzip lpar2rrd-agent-6.00-0.solaris-sparc.tar.gz
    # tar xf lpar2rrd-agent-6.00-0.solaris-sparc.tar
    # pkgadd -d .
      The following packages are available:
      1  lpar2rrd-agent     LPAR2RRD OS agent 6.00
      ...
    
    Upgrade (remove the original package first, then install the new one):
    # pkgrm lpar2rrd-agent
    # pkgadd -d .
    
  • Assign the LDOM/CDOM read authorization solaris.ldoms.read to the user (lpar2rrd) that will run the agent:
    # usermod -A solaris.ldoms.read lpar2rrd
    
    Verify that the rights are in place: "/sbin/ldm ls -p" should not return "Authorization failed"
    # su - lpar2rrd
    $ /sbin/ldm ls -p
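
The authorization check above can also be scripted. This is a minimal sketch that only looks for the "Authorization failed" message in captured output; the function name ldm_output_ok is an assumption made for illustration.

```shell
#!/usr/bin/env bash
# ldm_output_ok TEXT (hypothetical helper)
# Returns 1 if TEXT (the captured output of "/sbin/ldm ls -p") contains the
# "Authorization failed" message, 0 otherwise.
ldm_output_ok() {
    case $1 in
        *"Authorization failed"*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example, run as the lpar2rrd user:
#   out=$(/sbin/ldm ls -p 2>&1)
#   ldm_output_ok "$out" || echo "lpar2rrd lacks solaris.ldoms.read" >&2
```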
    

OS agent install on Zone/Global Zone

    Use any unprivileged user (preferably lpar2rrd) to install and run the agent.
    Use the same Solaris package as in the LDOM example above.
    On Solaris x86, use the x86 package: lpar2rrd-agent-6.00-0.solaris-i86pc.tar

Testing connection

  • Test connection to the LPAR2RRD server
    $ /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <LPAR2RRD-SERVER>
      ...
      OS agent working for server: <LPAR2RRD-SERVER>
      store file for sending is /var/tmp/lpar2rrd-agent-<LPAR2RRD-SERVER>-lpar2rrd.txt
    
    This means that data has been sent to the server and all is fine.
    Here is an example where the agent is not able to send data:
    $ /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <LPAR2RRD-SERVER>
      ...
      OS agent working for server: <LPAR2RRD-SERVER>
      store file for sending is /var/tmp/lpar2rrd-agent-<LPAR2RRD-SERVER>-lpar2rrd.txt
      Agent timed out after : 50 seconds /opt/lpar2rrd-agent/lpar2rrd-agent.pl:265
    
    This means that the agent could not contact the server.
    Check communication (whether firewalls are open), DNS resolution of the server, etc.
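
The two outcomes shown above can be told apart automatically. The sketch below classifies captured agent output using only the two messages quoted in this section; the function name agent_test_status and its three result words are assumptions, and other messages the agent may print are reported as "unknown".

```shell
#!/usr/bin/env bash
# agent_test_status TEXT (hypothetical helper)
# Prints "timeout" if TEXT contains "Agent timed out", "ok" if it contains
# "OS agent working for server" without a timeout, and "unknown" otherwise.
agent_test_status() {
    case $1 in
        *"Agent timed out"*)             echo timeout ;;
        *"OS agent working for server"*) echo ok ;;
        *)                               echo unknown ;;
    esac
}

# Example:
#   out=$(/usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl -d <LPAR2RRD-SERVER> 2>&1)
#   agent_test_status "$out"
```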

Schedule the OS agent in the lpar2rrd user's crontab on Solaris

  • CDOM: use a 5-minute schedule
    # su - lpar2rrd
    $ crontab -e
    0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD-SERVER> > /var/tmp/lpar2rrd-agent.out 2>&1
    
    Replace <LPAR2RRD-SERVER> with the hostname of your LPAR2RRD server.

  • LDOM, Global Zone and Zone: use a 1-minute schedule
    # su - lpar2rrd
    $ crontab -e
    * * * * * /usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl <LPAR2RRD-SERVER> > /var/tmp/lpar2rrd-agent.out 2>&1
    
    Replace <LPAR2RRD-SERVER> with the hostname of your LPAR2RRD server.

  • You might need to add the lpar2rrd user to /etc/cron.d/cron.allow (Solaris) if the 'crontab -e' command fails.
    Do this as the root user.
    # echo "lpar2rrd" >> /etc/cron.d/cron.allow
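
The crontab entries above can be generated and added without creating duplicates. This is a minimal sketch, assuming a POSIX grep (on older Solaris releases that is /usr/xpg4/bin/grep rather than /usr/bin/grep). The function names agent_cron_line and add_line_once, and the role keyword "cdom", are illustrative only.

```shell
#!/usr/bin/env bash
# agent_cron_line ROLE SERVER (hypothetical helper)
# Prints the crontab entry for the given host role: "cdom" gets the 5-minute
# schedule, anything else (LDOM, Global Zone, Zone) the 1-minute schedule.
agent_cron_line() {
    role=$1
    server=$2
    if [ "$role" = cdom ]; then
        minutes='0,5,10,15,20,25,30,35,40,45,50,55'
    else
        minutes='*'
    fi
    cmd="/usr/bin/perl /opt/lpar2rrd-agent/lpar2rrd-agent.pl $server"
    printf '%s * * * * %s > /var/tmp/lpar2rrd-agent.out 2>&1\n' "$minutes" "$cmd"
}

# add_line_once LINE (hypothetical helper)
# Copies a crontab from stdin to stdout and appends LINE only if no
# identical line is already present.
add_line_once() {
    line=$1
    current=$(cat)
    [ -n "$current" ] && printf '%s\n' "$current"
    if ! printf '%s\n' "$current" | grep -Fxq -- "$line"; then
        printf '%s\n' "$line"
    fi
}

# Example, run as the lpar2rrd user (temporary file names are arbitrary):
#   crontab -l > /tmp/cron.old.$$ 2>/dev/null
#   add_line_once "$(agent_cron_line cdom <LPAR2RRD-SERVER>)" < /tmp/cron.old.$$ > /tmp/cron.new.$$
#   crontab /tmp/cron.new.$$
```

Installing the result from a file, rather than piping it into crontab, avoids relying on how a particular crontab implementation reads standard input.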
    
You will see your Solaris boxes in the UI under the Solaris folder within an hour (press Ctrl-F5 in the web browser to refresh).

Install LPAR2RRD server (all under lpar2rrd user)

  • Definitely use the latest available version (6.10+), in which a lot of issues are fixed.

  • Install it:
    # su - lpar2rrd
    $ tar xvf lpar2rrd-7.XX.tar
    $ cd lpar2rrd-7.XX
    $ ./install.sh
    $ cd /home/lpar2rrd/lpar2rrd
    
  • Make sure all Perl modules are in place
    cd /home/lpar2rrd/lpar2rrd
    . etc/lpar2rrd.cfg; $PERL bin/perl_modules_check.pl
    
    If "LWP::Protocol::https" is missing, check the documentation to fix it.
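
A single module can also be spot-checked by asking Perl to load it. The sketch below is not a replacement for bin/perl_modules_check.pl; the function name perl_module_present is an assumption, and the optional second argument lets you pass the $PERL value set by etc/lpar2rrd.cfg.

```shell
#!/usr/bin/env bash
# perl_module_present MODULE [PERL] (hypothetical helper)
# Returns 0 if PERL (default: perl from PATH) can load MODULE.
perl_module_present() {
    module=$1
    perl_bin=${2:-perl}
    # -MMODULE loads the module; "-e 1" is a no-op program.
    "$perl_bin" -M"$module" -e 1 >/dev/null 2>&1
}

# Example:
#   perl_module_present LWP::Protocol::https || echo "LWP::Protocol::https is missing"
```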

  • Enable Apache authorisation
    su - lpar2rrd
    umask 022
    cd /home/lpar2rrd/lpar2rrd
    cp html/.htaccess www
    cp html/.htaccess lpar2rrd-cgi
    
  • Schedule it to run from the lpar2rrd crontab (the entry might already exist there)
    $ crontab -l | grep load.sh
    $
    
    Add it if it does not exist (as in the empty output above)
    $ crontab -e
    
    # LPAR2RRD UI
    0,30 * * * * /home/lpar2rrd/lpar2rrd/load.sh > /home/lpar2rrd/lpar2rrd/load.out 2>&1 
    
    Make sure there is just one such entry in the crontab.
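
The "just one entry" rule can be checked by counting the active load.sh lines. This is a minimal sketch, assuming a POSIX grep; the function name count_load_entries is illustrative.

```shell
#!/usr/bin/env bash
# count_load_entries (hypothetical helper)
# Reads a crontab from stdin and prints how many non-comment lines mention load.sh.
count_load_entries() {
    # grep -c exits non-zero when the count is 0, so "|| true" keeps the
    # function's exit status at 0 while the printed count stays correct.
    grep -v '^[[:space:]]*#' | grep -c 'load\.sh' || true
}

# Example, run as the lpar2rrd user:
#   n=$(crontab -l | count_load_entries)
#   [ "$n" -eq 1 ] || echo "expected exactly one load.sh entry, found $n"
```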

  • You might need to add the lpar2rrd user to /etc/cron.allow (Linux) or /var/adm/cron/cron.allow (AIX) if the 'crontab -e' command fails.
    Do this as the root user.
    # echo "lpar2rrd" >> /etc/cron.allow
    
  • Initial start from cmd line:
    $ cd /home/lpar2rrd/lpar2rrd
    $ ./load.sh
    
  • Go to the web UI: http://<your web server>/lpar2rrd/
    Use Ctrl-F5 to refresh the web browser cache.

Troubleshooting

  • If you have any problems with the UI then check:
    (note that the path to Apache logs might be different, search apache logs in /var)
    tail /var/log/httpd/error_log             # Apache error log
    tail /var/log/httpd/access_log            # Apache access log
    tail /var/tmp/lpar2rrd-realt-error.log    # LPAR2RRD CGI-BIN log
    tail /var/tmp/systemd-private*/tmp/lpar2rrd-realt-error.log # LPAR2RRD CGI-BIN log when Linux has private temp enabled
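
Because the CGI-BIN log can live in either of the two locations above, a small helper can list whichever copies exist. This is a sketch only; the function name find_realt_logs and its optional base directory argument are assumptions.

```shell
#!/usr/bin/env bash
# find_realt_logs [BASE] (hypothetical helper)
# Prints every lpar2rrd-realt-error.log found directly under BASE (default
# /var/tmp) or under BASE/systemd-private*/tmp, one path per line.
find_realt_logs() {
    base=${1:-/var/tmp}
    for f in "$base"/lpar2rrd-realt-error.log \
             "$base"/systemd-private*/tmp/lpar2rrd-realt-error.log; do
        # An unmatched glob stays literal, so keep only paths that exist.
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Example:
#   find_realt_logs | while read -r log; do echo "== $log"; tail "$log"; done
```

Reading the systemd private temp directories normally requires root, so an empty result as an unprivileged user does not prove that the log is absent.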
    
  • Test of CGI-BIN setup
    umask 022
    cd /home/lpar2rrd/lpar2rrd/
    cp bin/test-healthcheck-cgi.sh lpar2rrd-cgi/
    
    Go to the web browser: http://<your web server>/lpar2rrd/test.html
    You should see your Apache, LPAR2RRD, and operating system variables. If not, check the Apache logs for related errors.