Nagios on Ubuntu

If you follow these instructions, here’s what you’ll end up with:

 Nagios and the plugins will be installed underneath /usr/local/nagios
  • Nagios will be configured to monitor a few aspects of your local system (CPU load, disk usage, etc.)
  • The Nagios web interface will be accessible at http://localhost/nagios/
Required Packages
Make sure you’ve installed the following packages on your Ubuntu installation before continuing.
  • Apache 2
  • PHP
  • GCC compiler and development libraries
  • GD development libraries

You can use apt-get to install these packages by running the following commands:
sudo apt-get install apache2

sudo apt-get install libapache2-mod-php5

sudo apt-get install build-essential

1) Create Account Information
Become the root user.
sudo -s

Create a new nagios user account and give it a password.
/usr/sbin/useradd -m -s /bin/bash nagios

passwd nagios

Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.
/usr/sbin/groupadd nagcmd

/usr/sbin/usermod -a -G nagcmd nagios

/usr/sbin/usermod -a -G nagcmd www-data

2) Download Nagios and the Plugins
Create a directory for storing the downloads.
mkdir ~/downloads

cd ~/downloads

Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions).
wget http://sourceforge.net/projects/nagios/files/nagios-3.x/nagios-3.4.4/nagios-3.4.4.tar.gz

wget http://sourceforge.net/projects/nagiosplug/files/nagiosplug/1.4.16/nagios-plugins-1.4.16.tar.gz

3) Compile and Install Nagios
Extract the Nagios source code tarball.
cd ~/downloads

tar xzf nagios-3.4.4.tar.gz

cd nagios

Run the Nagios configure script, passing the name of the group you created earlier like so:
./configure --with-command-group=nagcmd

Compile the Nagios source code.
make all

Install binaries, init script, sample config files and set permissions on the external command directory.
make install

make install-init

make install-config

make install-commandmode

Don’t start Nagios yet – there’s still more that needs to be done…
4) Customize Configuration
Sample configuration files have now been installed in the /usr/local/nagios/etc directory. These sample files should work fine for getting started with Nagios. You’ll need to make just one change before you proceed…
Edit the /usr/local/nagios/etc/objects/contacts.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you’d like to use for receiving alerts.
vi /usr/local/nagios/etc/objects/contacts.cfg

5) Configure the Web Interface
Install the Nagios web config file in the Apache conf.d directory.
make install-webconf

Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account – you’ll need it later.
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin

Restart Apache to make the new settings take effect.
/etc/init.d/apache2 reload

 6) Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball.
cd ~/downloads

tar xzf nagios-plugins-1.4.16.tar.gz

cd nagios-plugins-1.4.16

Compile and install the plugins.
./configure --with-nagios-user=nagios --with-nagios-group=nagios

make

make install

7) Start Nagios
Configure Nagios to automatically start when the system boots.
ln -s /etc/init.d/nagios /etc/rcS.d/S99nagios

Verify the sample Nagios configuration files.
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

If there are no errors, start Nagios.
/etc/init.d/nagios start

8) Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You’ll be prompted for the username (nagiosadmin) and password you specified earlier.
http://localhost/nagios/

Click on the “Service Detail” navbar link to see details of what’s being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
9) Other Modifications
If you want to receive email notifications for Nagios alerts, you need to install the Postfix package.
sudo apt-get install postfix

You’ll have to edit the Nagios email notification commands found in /usr/local/nagios/etc/objects/commands.cfg and change any ‘/bin/mail’ references to ‘/usr/bin/mail’. Once you do that you’ll need to restart Nagios to make the configuration changes live.
sudo /etc/init.d/nagios restart

Get Nagios Frontends - Enhance your Nagios experience with additional frontends.
CONFIGURATION
################################################################################
#
# SAMPLE NOTIFICATION COMMANDS
#
# These are some example notification commands. They may or may not work on
# your system without modification. As an example, some systems will require
# you to use "/usr/bin/mailx" instead of "/usr/bin/mail" in the commands below.
#
################################################################################
################################################################################
#
# SAMPLE HOST CHECK COMMANDS
#
################################################################################
# This command checks to see if a host is "alive" by pinging it
# The check must result in a 100% packet loss or 5 second (5000ms) round trip
# average time to produce a critical error.
# Note: Five ICMP echo packets are sent (determined by the '-p 5' argument)

# 'check-host-alive' command definition
define command{
command_name check-host-alive
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}

################################################################################
#
# SAMPLE SERVICE CHECK COMMANDS
#
# These are some example service check commands. They may or may not work on
# your system, as they must be modified for your plugins. See the HTML
# documentation on the plugins for examples of how to configure command definitions.
#
# NOTE: The following 'check_local_...' functions are designed to monitor
# various metrics on the host that Nagios is running on (i.e. this one).
################################################################################

# 'check_local_disk' command definition
define command{
command_name check_local_disk
command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
}

# 'check_local_load' command definition
define command{
command_name check_local_load
command_line $USER1$/check_load -w $ARG1$ -c $ARG2$
}

# 'check_local_procs' command definition
define command{
command_name check_local_procs
command_line $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
}

# 'check_local_users' command definition
define command{
command_name check_local_users
command_line $USER1$/check_users -w $ARG1$ -c $ARG2$
}

# 'check_local_swap' command definition
define command{
command_name check_local_swap
command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$
}

# 'check_local_mrtgtraf' command definition
define command{
command_name check_local_mrtgtraf
command_line $USER1$/check_mrtgtraf -F $ARG1$ -a $ARG2$ -w $ARG3$ -c $ARG4$ -e $ARG5$
}

################################################################################
# NOTE: The following 'check_...' commands are used to monitor services on
# both local and remote hosts.
################################################################################

# 'check_ftp' command definition
define command{
command_name check_ftp
command_line $USER1$/check_ftp -H $HOSTADDRESS$ $ARG1$
}

# 'check_hpjd' command definition
define command{
command_name check_hpjd
command_line $USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$
}

# 'check_snmp' command definition
define command{
command_name check_snmp
command_line $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$
}


# 'check_http' command definition
define command{
command_name check_http
command_line $USER1$/check_http -I $HOSTADDRESS$ $ARG1$
}

# 'check_ssh' command definition
define command{
command_name check_ssh
command_line $USER1$/check_ssh $ARG1$ $HOSTADDRESS$
}


# 'check_dhcp' command definition
define command{
command_name check_dhcp
command_line $USER1$/check_dhcp $ARG1$
}

# 'check_ping' command definition
define command{
command_name check_ping
command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
}

# 'check_pop' command definition
define command{
command_name check_pop
command_line $USER1$/check_pop -H $HOSTADDRESS$ $ARG1$
}

# 'check_imap' command definition
define command{
command_name check_imap
command_line $USER1$/check_imap -H $HOSTADDRESS$ $ARG1$
}

# 'check_smtp' command definition
define command{
command_name check_smtp
command_line $USER1$/check_smtp -H $HOSTADDRESS$ $ARG1$
}

# 'check_tcp' command definition
define command{
command_name check_tcp
command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
}

# 'check_udp' command definition
define command{
command_name check_udp
command_line $USER1$/check_udp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$
}

# 'check_nt' command definition
define command{
command_name check_nt
command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$
}

###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS LINUX LOCALHOST
#
###############################################################################
###############################################################################
# Define a service to "ping" the local machine

define service{
use local-service ; Name of service template to use
host_name localhost
service_description PING
check_command check_ping!100.0,20%!500.0,60%
}


# Define a service to check the disk space of the root partition
# on the local machine. Warning if <> 20 users, critical
# if > 50 users.

define service{
use local-service ; Name of service template to use
host_name localhost
service_description Current Users
check_command check_local_users!20!50
}


# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 users.

define service{
use local-service ; Name of service template to use
host_name localhost
service_description Total Processes
check_command check_local_procs!250!400!RSZDT
}

# Define a service to check the load on the local machine.

define service{
use local-service ; Name of service template to use
host_name localhost
service_description Current Load
check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
}

# Define a service to check the swap usage the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free

define service{
use local-service ; Name of service template to use
host_name localhost
service_description Swap Usage
check_command check_local_swap!20!10
}

# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.

define service{
use local-service ; Name of service template to use
host_name localhost
service_description SSH
check_command check_ssh
notifications_enabled 0
}

# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.

define service{
use local-service ; Name of service template to use
host_name localhost
service_description HTTP
check_command check_http
notifications_enabled 0
}

###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS WINDOWS SERVER
#
###############################################################################
###############################################################################

# Create a service for monitoring the version of NSCLient++ that is installed
# Change the host_name to match the name of the host you defined above

define service{
use generic-service
host_name vista
service_description NSClient++ Version
check_command check_nt!CLIENTVERSION
}

# Create a service for monitoring the uptime of the server
# Change the host_name to match the name of the host you defined above

define service{
use generic-service
host_name vista
service_description Uptime
check_command check_nt!UPTIME
}

# Create a service for monitoring CPU load
# Change the host_name to match the name of the host you defined above

define service{
use generic-service
host_name vista
service_description CPU Load
check_command check_nt!CPULOAD!-l 5,80,90
}

# Create a service for monitoring memory usage
# Change the host_name to match the name of the host you defined above

define service{
use generic-service
host_name vista
service_description Memory Usage
check_command check_nt!MEMUSE!-w 80 -c 90
}

# Create a service for monitoring C:\ disk usage
# Change the host_name to match the name of the host you defined above

define service{
use generic-service
host_name vista
service_description C:\ Drive Space
check_command check_nt!USEDDISKSPACE!-l c -w 80 -c 90
}

# Create a service for monitoring the W3SVC service
# Change the host_name to match the name of the host you defined above

define service{
use generic-service
host_name vista
service_description W3SVC
check_command check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
}

# Create a service for monitoring the Explorer.exe process
# Change the host_name to match the name of the host you defined above

define service{
use generic-service
host_name vista
service_description Explorer
check_command check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
}
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS NETWORK PRINTER
#
###############################################################################
###############################################################################

# Create a service for monitoring the status of the printer
# Change the host_name to match the name of the host you defined above
# If the printer has an SNMP community string other than "public", change the check_command directive to reflect that

define service{
use generic-service ; Inherit values from a template
host_name hplj2605dn ; The name of the host the service is associated with
service_description Printer Status ; The service description
check_command check_hpjd!-C public ; The command used to monitor the service
normal_check_interval 10 ; Check the service every 10 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined
}

# Create a service for "pinging" the printer occassionally. Useful for monitoring RTA, packet loss, etc.

define service{
use generic-service
host_name hplj2605dn
service_description PING
check_command check_ping!3000.0,80%!5000.0,100%
normal_check_interval 10
retry_check_interval 1
}
###############################################################################
###############################################################################
#
# SERVICE DEFINITIONS ROUTER - SWITCH
#
###############################################################################
###############################################################################

# Create a service to PING to switch

define service{
use generic-service ; Inherit values from a template
host_name cisco ; The name of the host the service is associated with
service_description PING ; The service description
check_command check_ping!200.0,20%!600.0,60% ; The command used to monitor the service
normal_check_interval 5 ; Check the service every 5 minutes under normal conditions
retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined
}


# Monitor uptime via SNMP

# define service{
# use generic-service ; Inherit values from a template
# host_name cisco
# service_description Uptime
# check_command check_snmp!-C public -o sysUpTime.0
# }



# Monitor Port 1 status via SNMP

# define service{
# use generic-service ; Inherit values from a template
# host_name cisco
# service_description Port 1 Link Status
# check_command check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB
# }cisco
#


# Monitor bandwidth via MRTG logs

# define service{
# use generic-service ; Inherit values from a template
# host_name 
# service_description Port 1 Bandwidth Usage
# check_command check_local_mrtgtraf!/var/lib/mrtg/192.168.1.254_1.log!AVG!1000000,1000000!5000000,5000000!10
# }
#
Configuring email notifications is outside the scope of this documentation. Refer to your system documentation, search the web, or look to the Nagios Support PortalNagios Community Wiki for specific instructions on configuring your Ubuntu system to send email messages to external addresses.
Once you get Nagios installed and running properly, you’ll no doubt want to start monitoring more than just your local machine. Check out the following docs for how to go about monitoring other things…