If you follow these instructions, here’s what you’ll end up with:
Nagios and the plugins will be installed underneath /usr/local/nagios
- Nagios will be configured to monitor a few aspects of your local system (CPU load, disk usage, etc.)
- The Nagios web interface will be accessible at http://localhost/nagios/
Required Packages
Make sure you’ve installed the following packages on your Ubuntu installation before continuing.
- Apache 2
- PHP
- GCC compiler and development libraries
- GD development libraries
You can use apt-get to install these packages by running the following commands:
sudo apt-get install apache2 sudo apt-get install libapache2-mod-php5 sudo apt-get install build-essential
1) Create Account Information
Become the root user.
sudo -s
Create a new nagios user account and give it a password.
/usr/sbin/useradd -m -s /bin/bash nagios passwd nagios
Create a new nagcmd group for allowing external commands to be submitted through the web interface. Add both the nagios user and the apache user to the group.
/usr/sbin/groupadd nagcmd /usr/sbin/usermod -a -G nagcmd nagios /usr/sbin/usermod -a -G nagcmd www-data
2) Download Nagios and the Plugins
Create a directory for storing the downloads.
mkdir ~/downloads cd ~/downloads
Download the source code tarballs of both Nagios and the Nagios plugins (visit http://www.nagios.org/download/ for links to the latest versions).
wget http://sourceforge.net/projects/nagios/files/nagios-3.x/nagios-3.4.4/nagios-3.4.4.tar.gz wget http://sourceforge.net/projects/nagiosplug/files/nagiosplug/1.4.16/nagios-plugins-1.4.16.tar.gz
3) Compile and Install Nagios
Extract the Nagios source code tarball.
cd ~/downloads tar xzf nagios-3.4.4.tar.gz cd nagios
Run the Nagios configure script, passing the name of the group you created earlier like so:
./configure --with-command-group=nagcmd
Compile the Nagios source code.
make all
Install binaries, init script, sample config files and set permissions on the external command directory.
make install make install-init make install-config make install-commandmode
Don’t start Nagios yet – there’s still more that needs to be done…
4) Customize Configuration
Sample configuration files have now been installed in the /usr/local/nagios/etc directory. These sample files should work fine for getting started with Nagios. You’ll need to make just one change before you proceed…
Edit the /usr/local/nagios/etc/objects/contacts.cfg config file with your favorite editor and change the email address associated with the nagiosadmin contact definition to the address you’d like to use for receiving alerts.
vi /usr/local/nagios/etc/objects/contacts.cfg
5) Configure the Web Interface
Install the Nagios web config file in the Apache conf.d directory.
make install-webconf
Create a nagiosadmin account for logging into the Nagios web interface. Remember the password you assign to this account – you’ll need it later.
htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Restart Apache to make the new settings take effect.
/etc/init.d/apache2 reload
6) Compile and Install the Nagios Plugins
Extract the Nagios plugins source code tarball.
cd ~/downloads tar xzf nagios-plugins-1.4.16.tar.gz cd nagios-plugins-1.4.16
Compile and install the plugins.
./configure --with-nagios-user=nagios --with-nagios-group=nagios make make install
7) Start Nagios
Configure Nagios to automatically start when the system boots.
ln -s /etc/init.d/nagios /etc/rcS.d/S99nagios
Verify the sample Nagios configuration files.
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
If there are no errors, start Nagios.
/etc/init.d/nagios start
8) Login to the Web Interface
You should now be able to access the Nagios web interface at the URL below. You’ll be prompted for the username (nagiosadmin) and password you specified earlier.
http://localhost/nagios/
Click on the “Service Detail” navbar link to see details of what’s being monitored on your local machine. It will take a few minutes for Nagios to check all the services associated with your machine, as the checks are spread out over time.
9) Other Modifications
If you want to receive email notifications for Nagios alerts, you need to install the Postfix package.
sudo apt-get install postfix
You’ll have to edit the Nagios email notification commands found in /usr/local/nagios/etc/objects/commands.cfg and change any ‘/bin/mail’ references to ‘/usr/bin/mail’. Once you do that you’ll need to restart Nagios to make the configuration changes live.
sudo /etc/init.d/nagios restart
Get Nagios Frontends - Enhance your Nagios experience with additional frontends.
CONFIGURATION ################################################################################ # # SAMPLE NOTIFICATION COMMANDS # # These are some example notification commands. They may or may not work on # your system without modification. As an example, some systems will require # you to use "/usr/bin/mailx" instead of "/usr/bin/mail" in the commands below. # ################################################################################ ################################################################################ # # SAMPLE HOST CHECK COMMANDS # ################################################################################ # This command checks to see if a host is "alive" by pinging it # The check must result in a 100% packet loss or 5 second (5000ms) round trip # average time to produce a critical error. # Note: Five ICMP echo packets are sent (determined by the '-p 5' argument) # 'check-host-alive' command definition define command{ command_name check-host-alive command_line $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5 } ################################################################################ # # SAMPLE SERVICE CHECK COMMANDS # # These are some example service check commands. They may or may not work on # your system, as they must be modified for your plugins. See the HTML # documentation on the plugins for examples of how to configure command definitions. # # NOTE: The following 'check_local_...' functions are designed to monitor # various metrics on the host that Nagios is running on (i.e. this one). ################################################################################ # 'check_local_disk' command definition define command{ command_name check_local_disk command_line $USER1$/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$ } # 'check_local_load' command definition define command{ command_name check_local_load command_line $USER1$/check_load -w $ARG1$ -c $ARG2$ } # 'check_local_procs' command definition define command{ command_name check_local_procs command_line $USER1$/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$ } # 'check_local_users' command definition define command{ command_name check_local_users command_line $USER1$/check_users -w $ARG1$ -c $ARG2$ } # 'check_local_swap' command definition define command{ command_name check_local_swap command_line $USER1$/check_swap -w $ARG1$ -c $ARG2$ } # 'check_local_mrtgtraf' command definition define command{ command_name check_local_mrtgtraf command_line $USER1$/check_mrtgtraf -F $ARG1$ -a $ARG2$ -w $ARG3$ -c $ARG4$ -e $ARG5$ } ################################################################################ # NOTE: The following 'check_...' commands are used to monitor services on # both local and remote hosts. ################################################################################ # 'check_ftp' command definition define command{ command_name check_ftp command_line $USER1$/check_ftp -H $HOSTADDRESS$ $ARG1$ } # 'check_hpjd' command definition define command{ command_name check_hpjd command_line $USER1$/check_hpjd -H $HOSTADDRESS$ $ARG1$ } # 'check_snmp' command definition define command{ command_name check_snmp command_line $USER1$/check_snmp -H $HOSTADDRESS$ $ARG1$ } # 'check_http' command definition define command{ command_name check_http command_line $USER1$/check_http -I $HOSTADDRESS$ $ARG1$ } # 'check_ssh' command definition define command{ command_name check_ssh command_line $USER1$/check_ssh $ARG1$ $HOSTADDRESS$ } # 'check_dhcp' command definition define command{ command_name check_dhcp command_line $USER1$/check_dhcp $ARG1$ } # 'check_ping' command definition define command{ command_name check_ping command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5 } # 'check_pop' command definition define command{ command_name check_pop command_line $USER1$/check_pop -H $HOSTADDRESS$ $ARG1$ } # 'check_imap' command definition define command{ command_name check_imap command_line $USER1$/check_imap -H $HOSTADDRESS$ $ARG1$ } # 'check_smtp' command definition define command{ command_name check_smtp command_line $USER1$/check_smtp -H $HOSTADDRESS$ $ARG1$ } # 'check_tcp' command definition define command{ command_name check_tcp command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$ } # 'check_udp' command definition define command{ command_name check_udp command_line $USER1$/check_udp -H $HOSTADDRESS$ -p $ARG1$ $ARG2$ } # 'check_nt' command definition define command{ command_name check_nt command_line $USER1$/check_nt -H $HOSTADDRESS$ -p 12489 -v $ARG1$ $ARG2$ } ############################################################################### ############################################################################### # # SERVICE DEFINITIONS LINUX LOCALHOST # ############################################################################### ############################################################################### # Define a service to "ping" the local machine define service{ use local-service ; Name of service template to use host_name localhost service_description PING check_command check_ping!100.0,20%!500.0,60% } # Define a service to check the disk space of the root partition # on the local machine. Warning if <> 20 users, critical # if > 50 users. define service{ use local-service ; Name of service template to use host_name localhost service_description Current Users check_command check_local_users!20!50 } # Define a service to check the number of currently running procs # on the local machine. Warning if > 250 processes, critical if # > 400 users. define service{ use local-service ; Name of service template to use host_name localhost service_description Total Processes check_command check_local_procs!250!400!RSZDT } # Define a service to check the load on the local machine. define service{ use local-service ; Name of service template to use host_name localhost service_description Current Load check_command check_local_load!5.0,4.0,3.0!10.0,6.0,4.0 } # Define a service to check the swap usage the local machine. # Critical if less than 10% of swap is free, warning if less than 20% is free define service{ use local-service ; Name of service template to use host_name localhost service_description Swap Usage check_command check_local_swap!20!10 } # Define a service to check SSH on the local machine. # Disable notifications for this service by default, as not all users may have SSH enabled. define service{ use local-service ; Name of service template to use host_name localhost service_description SSH check_command check_ssh notifications_enabled 0 } # Define a service to check HTTP on the local machine. # Disable notifications for this service by default, as not all users may have HTTP enabled. define service{ use local-service ; Name of service template to use host_name localhost service_description HTTP check_command check_http notifications_enabled 0 } ############################################################################### ############################################################################### # # SERVICE DEFINITIONS WINDOWS SERVER # ############################################################################### ############################################################################### # Create a service for monitoring the version of NSCLient++ that is installed # Change the host_name to match the name of the host you defined above define service{ use generic-service host_name vista service_description NSClient++ Version check_command check_nt!CLIENTVERSION } # Create a service for monitoring the uptime of the server # Change the host_name to match the name of the host you defined above define service{ use generic-service host_name vista service_description Uptime check_command check_nt!UPTIME } # Create a service for monitoring CPU load # Change the host_name to match the name of the host you defined above define service{ use generic-service host_name vista service_description CPU Load check_command check_nt!CPULOAD!-l 5,80,90 } # Create a service for monitoring memory usage # Change the host_name to match the name of the host you defined above define service{ use generic-service host_name vista service_description Memory Usage check_command check_nt!MEMUSE!-w 80 -c 90 } # Create a service for monitoring C:\ disk usage # Change the host_name to match the name of the host you defined above define service{ use generic-service host_name vista service_description C:\ Drive Space check_command check_nt!USEDDISKSPACE!-l c -w 80 -c 90 } # Create a service for monitoring the W3SVC service # Change the host_name to match the name of the host you defined above define service{ use generic-service host_name vista service_description W3SVC check_command check_nt!SERVICESTATE!-d SHOWALL -l W3SVC } # Create a service for monitoring the Explorer.exe process # Change the host_name to match the name of the host you defined above define service{ use generic-service host_name vista service_description Explorer check_command check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe } ############################################################################### ############################################################################### # # SERVICE DEFINITIONS NETWORK PRINTER # ############################################################################### ############################################################################### # Create a service for monitoring the status of the printer # Change the host_name to match the name of the host you defined above # If the printer has an SNMP community string other than "public", change the check_command directive to reflect that define service{ use generic-service ; Inherit values from a template host_name hplj2605dn ; The name of the host the service is associated with service_description Printer Status ; The service description check_command check_hpjd!-C public ; The command used to monitor the service normal_check_interval 10 ; Check the service every 10 minutes under normal conditions retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined } # Create a service for "pinging" the printer occassionally. Useful for monitoring RTA, packet loss, etc. define service{ use generic-service host_name hplj2605dn service_description PING check_command check_ping!3000.0,80%!5000.0,100% normal_check_interval 10 retry_check_interval 1 } ############################################################################### ############################################################################### # # SERVICE DEFINITIONS ROUTER - SWITCH # ############################################################################### ############################################################################### # Create a service to PING to switch define service{ use generic-service ; Inherit values from a template host_name cisco ; The name of the host the service is associated with service_description PING ; The service description check_command check_ping!200.0,20%!600.0,60% ; The command used to monitor the service normal_check_interval 5 ; Check the service every 5 minutes under normal conditions retry_check_interval 1 ; Re-check the service every minute until its final/hard state is determined } # Monitor uptime via SNMP # define service{ # use generic-service ; Inherit values from a template # host_name cisco # service_description Uptime # check_command check_snmp!-C public -o sysUpTime.0 # } # Monitor Port 1 status via SNMP # define service{ # use generic-service ; Inherit values from a template # host_name cisco # service_description Port 1 Link Status # check_command check_snmp!-C public -o ifOperStatus.1 -r 1 -m RFC1213-MIB # }cisco # # Monitor bandwidth via MRTG logs # define service{ # use generic-service ; Inherit values from a template # host_name # service_description Port 1 Bandwidth Usage # check_command check_local_mrtgtraf!/var/lib/mrtg/192.168.1.254_1.log!AVG!1000000,1000000!5000000,5000000!10 # } #
Configuring email notifications is outside the scope of this documentation. Refer to your system documentation, search the web, or look to the Nagios Support PortalNagios Community Wiki for specific instructions on configuring your Ubuntu system to send email messages to external addresses.
Once you get Nagios installed and running properly, you’ll no doubt want to start monitoring more than just your local machine. Check out the following docs for how to go about monitoring other things…