Nagios using NRPE (CentOS 5.*)

Copied from a wiki I wrote at the office, so it is a bit messy, sorry... :D. I had just finished tinkering with the office server.

How to set up Nagios, a monitoring server

Nagios is a monitoring tool. This article explains how to install Nagios on the server and how to use Nagios plugins such as NRPE and check_logfiles on the Nagios client.

Contents

  • 1 Installation
    • 1.1 Preparing Packages
    • 1.2 Install Nagios core package
    • 1.3 Install Perl packages
    • 1.4 Install fping package
    • 1.5 Install Nagios-plugins and Nagios-plugins-nrpe packages
  • 2 Setup Nagios Web access User
  • 3 Configuring Nagios core Server
    • 3.1 Make Nagios config files for host,group,template,service and commands
    • 3.2 Write the config files
      • 3.2.1 Host config file ( nrpe.host.cfg )
      • 3.2.2 Host Group config file ( nrpe.group.cfg )
      • 3.2.3 Service config file ( nrpe.service.cfg )
      • 3.2.4 Template config file ( nrpe.template.cfg )
      • 3.2.5 Commands config file ( nrpe.command.cfg )
    • 3.3 Modify Nagios config file
  • 4 Test Nagios server config
  • 5 Client Side Installation
    • 5.1 Install NRPE and Nagios plugins packages
    • 5.2 Configure NRPE
    • 5.3 Install check_logfiles plugins
      • 5.3.1 Configure and Install
      • 5.3.2 Make a config file for check_logfiles plugin
      • 5.3.3 Generate log report for check_logfiles
      • 5.3.4 Add the command to /etc/nagios/nrpe.cfg
      • 5.3.5 Define check_logfiles service at HOST
    • 5.4 Open 5666 port for NRPE
    • 5.5 Edit hosts.allow
  • 6 Run Nagios Server and NRPE at Client
  • 7 Access the Nagios web

Installation


Preparing Packages

Make a directory to store the packages
mkdir Nagios-packets
cd Nagios-packets
Download all the packages needed for the Nagios server
/usr/bin/wget -c http://labs.consol.de/wp-content/uploads/2010/10/check_logfiles-3.4.2.2.tar.gz
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-3.2.3-3.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-devel-3.2.3-3.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-1.4.15-2.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-nrpe-2.12-1.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/fping-2.4-1.b2.3.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch.rpm
/usr/bin/wget -c http://packages.sw.be/perl-Socket6/perl-Socket6-0.23-1.el5.rfx.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/perl-Crypt-DES-2.05-3.2.el5.rf.x86_64.rpm
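Before installing, it can help to confirm that every download actually completed. The following is a minimal sketch, not part of any Nagios package: the function name missing_packages is my own, and it only checks that each named file exists and is not empty.

```shell
# Hypothetical helper: print each expected file that is missing or empty.
# Usage: missing_packages DIR FILE...
missing_packages() {
    dir=$1
    shift
    for f in "$@"; do
        # -s is true only when the file exists and its size is greater than zero
        [ -s "$dir/$f" ] || printf '%s\n' "$f"
    done
}

# Example call from inside the download directory (file names from the list above):
# missing_packages . nagios-3.2.3-3.el5.rf.x86_64.rpm fping-2.4-1.b2.3.el5.rf.x86_64.rpm
```

If the function prints nothing, all named files are present. Any name it prints should be downloaded again (wget -c resumes a partial download).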


Install Nagios core package

yum --nogpgcheck localinstall nagios-*
Dependencies Resolved
=======================================================================================================================================================
 Package                              Arch                    Version                               Repository                                     Size
=======================================================================================================================================================
Installing:
nagios                               x86_64                  3.2.3-3.el5.rf                            /nagios-3.2.3-3.el5.rf.x86_64              7.3 M
Installing for dependencies:
apr                                  x86_64                  1.2.7-11.el5_5.3                          Centos-5.5-x86_64-updates                  118 k
apr-util                             x86_64                  1.2.7-11.el5_5.1                          Centos-5.5-x86_64-updates                   79 k
aspell                               x86_64                  12:0.60.3-7.1                             core-0                                     946 k
aspell-en                            x86_64                  50:6.0-2.1                                core-0                                     1.6 M
curl                                 x86_64                  7.15.5-9.el5                              core-0                                     230 k
fontconfig                           x86_64                  2.4.1-7.el5                               core-0                                     175 k
freetype                             x86_64                  2.2.1-28.el5_5.1                          Centos-5.5-x86_64-updates                  311 k
gd                                   x86_64                  2.0.33-9.4.el5_4.2                        core-0                                     155 k
gmp                                  x86_64                  4.1.4-10.el5                              core-0                                     201 k
httpd                                x86_64                  2.2.3-43.el5.centos.3                     Centos-5.5-x86_64-updates                  1.2 M
libX11                               x86_64                  1.0.3-11.el5                              core-0                                     798 k
libXau                               x86_64                  1.0.1-3.1                                 core-0                                      18 k
libXdmcp                             x86_64                  1.0.1-2.1                                 core-0                                      19 k
libXpm                               x86_64                  3.5.5-3                                   core-0                                      44 k
libjpeg                              x86_64                  6b-37                                     core-0                                     139 k
libpng                               x86_64                  2:1.2.10-7.1.el5_5.3                      Centos-5.5-x86_64-updates                  234 k
libtool-ltdl                         x86_64                  1.5.22-7.el5_4                            core-0                                      38 k
php                                  x86_64                  5.1.6-27.el5                              core-0                                     2.3 M
php-cli                              x86_64                  5.1.6-27.el5                              core-0                                     2.2 M
php-common                           x86_64                  5.1.6-27.el5                              core-0                                     153 k
pkgconfig                            x86_64                  1:0.21-2.el5                              core-0                                      61 k
postgresql-libs                      x86_64                  8.1.22-1.el5_5.1                          Centos-5.5-x86_64-updates                  196 k
xorg-x11-filesystem                  noarch                  7.1-2.fc6                                 core-0                                     5.4 k
Transaction Summary
=======================================================================================================================================================
Install      24 Package(s)
Upgrade       0 Package(s)
Total size: 18 M
Total download size: 11 M
Is this ok [y/N]: y
yum automatically resolves the dependencies for the Nagios core package.


Install Perl packages

yum --nogpgcheck localinstall perl-*
Dependencies Resolved
=======================================================================================================================================================
Package                           Arch                    Version                             Repository                                           Size
=======================================================================================================================================================
Installing:
perl-Crypt-DES                    x86_64                  2.05-3.2.el5.rf                     /perl-Crypt-DES-2.05-3.2.el5.rf.x86_64               71 k
perl-Net-SNMP                     noarch                  5.2.0-1.2.el5.rf                    /perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch              340 k
perl-Socket6                      x86_64                  0.23-1.el5.rfx                      /perl-Socket6-0.23-1.el5.rfx.x86_64                 110 k
Installing for dependencies:
perl-Digest-HMAC                  noarch                  1.01-15                             core-0                                               12 k
perl-Digest-SHA1                  x86_64                  2.11-1.2.1                          core-0                                               49 k
Transaction Summary
=======================================================================================================================================================
Install       5 Package(s)
Upgrade       0 Package(s)
Total size: 582 k
Total download size: 61 k
Is this ok [y/N]: y 



Install fping package

yum --nogpgcheck localinstall fping-*


Install Nagios-plugins and Nagios-plugins-nrpe packages

yum --nogpgcheck localinstall nagios-plugins-*


Setup Nagios Web access User

Make the htpasswd.users file to store the user name and password
touch /etc/nagios/htpasswd.users
Change ownership so that Apache can read it
chown root:apache /etc/nagios/htpasswd.users
chmod 640 /etc/nagios/htpasswd.users
Create a user for Nagios web access
htpasswd -c /etc/nagios/htpasswd.users <nagios admin user>
htpasswd prompts for a password; remember it.


Configuring Nagios core Server

On CentOS 5.* x86_64 the Nagios config files are stored at
/etc/nagios/
/etc/nagios/objects
The plugins are stored at
/usr/lib64/nagios/plugins
The CGIs are stored at
/usr/lib64/nagios/cgi


Make Nagios config files for host,group,template,service and commands

cd /etc/nagios/objects/
mkdir nrpe
cd nrpe
touch {nrpe.service.cfg,nrpe.template.cfg,nrpe.host.cfg,nrpe.group.cfg,nrpe.command.cfg}
  • nrpe.service.cfg stores the service configuration for the hosts
  • nrpe.template.cfg stores the template configuration for the hosts
  • nrpe.host.cfg stores the host names and addresses
  • nrpe.group.cfg stores the host group names and their member hosts
  • nrpe.command.cfg stores additional commands for nagios-plugins
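Because nagios.cfg is later pointed at /etc/nagios/objects/nrpe with cfg_dir, the five files have to live inside that directory. A minimal sketch of the same step as a reusable function follows; the name make_nrpe_skeleton is hypothetical.

```shell
# Hypothetical helper: create the nrpe directory and the five empty config
# files inside it. Usage: make_nrpe_skeleton /etc/nagios/objects
make_nrpe_skeleton() {
    objects_dir=$1
    mkdir -p "$objects_dir/nrpe" || return 1
    for name in service template host group command; do
        touch "$objects_dir/nrpe/nrpe.$name.cfg" || return 1
    done
}
```

Passing the objects directory as an argument makes it easy to try the function on a scratch directory before running it as root on the real path.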


Write the config files



Host config file ( nrpe.host.cfg )

vi nrpe.host.cfg

#start nrpe.host.cfg
define host{
   use  id41-host-server ;template
   host_name ID41-ND016
   alias ID41-ND016
   address 10.62.41.16
}
#end nrpe.host.cfg
Save the file: press <Esc>, type :wq, and press Enter.
Notes
  • use = the template this host uses, here id41-host-server (example)
  • host_name = name of the host
  • alias = host alias
  • address = IP address of the host

Host Group config file ( nrpe.group.cfg )

vi nrpe.group.cfg
#start nrpe.group.cfg
define hostgroup{
   hostgroup_name  SRV-ID41
   alias  Server at ID41
   members  ID41-ND016
}
#end nrpe.group.cfg
Save the file: press <Esc>, type :wq, and press Enter.
Notes
  • hostgroup_name = name of the host group
  • alias = host group alias
  • members = names of the hosts that belong to the group

Service config file ( nrpe.service.cfg )

These services use the NRPE plugin, which is explained later. For direct monitoring without NRPE, see /etc/nagios/objects/localhost.cfg as a reference.
vi nrpe.service.cfg
#start nrpe.service.cfg
# Define a service to check the disk space of the root partition
# on the local machine.  Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
   use                             id41-generic-service         ; Name of service template to use
   #host_name                      ;use host name ,and mark the hosts group
   hostgroup_name                  SRV-ID41
   service_description             Root Partition
   check_command                   check_nrpe!check_disk
              }
# Define a service to check the number of currently running procs
# on the local machine.  Warning if > 250 processes, critical if
# > 400 processes.
define service{
  use                             id41-generic-service         ; Name of service template to use
  hostgroup_name                  SRV-ID41
  service_description             Total Processes
  check_command                   check_nrpe!check_procs 
  }
# Define a service to check the load on the local machine. 
define service{
  use                             id41-generic-service         ; Name of service template to use
  hostgroup_name                  SRV-ID41
  service_description             Current Load
  check_command                   check_nrpe!check_load
  }
# Define a service to check the swap usage the local machine. 
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
  use                             id41-generic-service         ; Name of service template to use
  hostgroup_name                  SRV-ID41
  service_description             Swap Usage
  check_command                   check_nrpe!check_swap
  }
# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
  use                             id41-generic-service         ; Name of service template to use
  hostgroup_name                  SRV-ID41
  service_description             SSH
  check_command                   check_nrpe!check_ssh
  notifications_enabled           0
  }
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
  use                             id41-generic-service         ; Name of service template to use
  hostgroup_name                  SRV-ID41
  service_description             HTTP
  check_command                   check_nrpe!check_http
  notifications_enabled           0
  }
#end nrpe.service.cfg
Save the file: press <Esc>, type :wq, and press Enter.


Template config file ( nrpe.template.cfg )

vi nrpe.template.cfg
#start nrpe.template.cfg
define host{
   name                            id41-host-server    ; The name of this host template
   use                             generic-host    ; This template inherits other values from the generic-host template
   check_period                    24x7            ; By default, Linux hosts are checked round the clock
   check_interval                  5               ; Actively check the host every 5 minutes
   retry_interval                  1               ; Schedule host check retries at 1 minute intervals
   max_check_attempts              10              ; Check each Linux host 10 times (max)
   check_command                   check-host-alive ; Default command to check Linux hosts
   notification_period             24x7            ; Notify at any time (changed from workhours)
   #statusmap_image                 redhat.png
   notification_interval           120             ; Resend notifications every 2 hours
   notification_options            d,u,r           ; Only send notifications for specific host states
   contact_groups                  admins          ; Notifications get sent to the admins by default
   register                        0               ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
   }
# Generic service definition template - This is NOT a real service, just a template!
define service{
   name                            id41-generic-service         ; The 'name' of this service template
   active_checks_enabled           1            ; Active service checks are enabled
   passive_checks_enabled          1            ; Passive service checks are enabled/accepted
   parallelize_check               1            ; Active service checks should be parallelized (disabling this can lead to major performance problems)
   obsess_over_service             1            ; We should obsess over this service (if necessary)
   check_freshness                 0            ; Default is to NOT check service 'freshness'
   notifications_enabled           1            ; Service notifications are enabled
   event_handler_enabled           1            ; Service event handler is enabled
   flap_detection_enabled          1            ; Flap detection is enabled
   failure_prediction_enabled      1            ; Failure prediction is enabled
   process_perf_data               1            ; Process performance data
   retain_status_information       1            ; Retain status information across program restarts
   retain_nonstatus_information    1            ; Retain non-status information across program restarts
   is_volatile                     0            ; The service is not volatile
   check_period                    24x7         ; The service can be checked at any time of the day
   max_check_attempts              3            ; Re-check the service up to 3 times in order to determine its final (hard) state
   normal_check_interval           10           ; Check the service every 10 minutes under normal conditions
   retry_check_interval            2            ; Re-check the service every two minutes until a hard state can be determined
   contact_groups                  admins       ; Notifications get sent out to everyone in the 'admins' group
   notification_options            w,u,c,r      ; Send notifications about warning, unknown, critical, and recovery events
   notification_interval           60           ; Re-notify about service problems every hour
   notification_period             24x7         ; Notifications can be sent out at any time
   register                        0            ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
   }
# Service template for checks that use check_logfiles
define service{
   name                            id41-queue-service
   is_volatile                     1
   check_period                    24x7
   max_check_attempts              1
   contact_groups                  admins
   notification_options            w,c,u
   notification_interval           30
   notification_period             24x7
   register                        0            ; Template only, not a real service
   }
#end nrpe.template.cfg
Save the file: press <Esc>, type :wq, and press Enter.


Commands config file ( nrpe.command.cfg )

I add two custom commands: one checks the MySQL status, the other runs checks through NRPE (used later to read the log files).
vi nrpe.command.cfg
#start nrpe.command.cfg
## mysql check
define command{
       command_name    check_mysql
       command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p 3306
}
## nrpe check
define command{
       command_name    check_nrpe
       command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
#end nrpe.command.cfg
Save the file: press <Esc>, type :wq, and press Enter.
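To see what the check_nrpe command definition does with a service entry such as check_nrpe!check_disk, it helps to expand the macros by hand. The sketch below only imitates the substitution for illustration; it is not how Nagios is implemented. It assumes $USER1$ resolves to the plugin directory named earlier (/usr/lib64/nagios/plugins), which is normally set in the Nagios resource file, and the function name is my own.

```shell
# Illustration only: mimic how the check_nrpe command_line above is filled in.
# Usage: expand_nrpe_check 'check_nrpe!check_disk' HOST_ADDRESS [PLUGIN_DIR]
expand_nrpe_check() {
    check_command=$1
    host_address=$2
    user1=${3:-/usr/lib64/nagios/plugins}   # assumed value of $USER1$
    arg1=${check_command#*!}                # text after the first "!"
    printf '%s\n' '$USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$' |
        sed -e 's|\$USER1\$|'"$user1"'|' \
            -e 's|\$HOSTADDRESS\$|'"$host_address"'|' \
            -e 's|\$ARG1\$|'"$arg1"'|'
}

expand_nrpe_check 'check_nrpe!check_disk' 10.62.41.16
# prints: /usr/lib64/nagios/plugins/check_nrpe -H 10.62.41.16 -c check_disk
```

The printed line is the command the Nagios server would run for that host. Running it by hand on the server is a quick way to debug a service that stays in an unknown state.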


Modify Nagios config file

Edit the Nagios config file at /etc/nagios/nagios.cfg so that Nagios reads the config stored in the /etc/nagios/objects/nrpe folder. Add the following line:
cfg_dir=/etc/nagios/objects/nrpe
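If this step is scripted, the line should be added only once, even when the script runs again. A minimal sketch, assuming a helper name (ensure_line) of my own choosing:

```shell
# Hypothetical helper: append LINE to FILE unless an identical line exists.
# Usage: ensure_line FILE LINE
ensure_line() {
    file=$1
    line=$2
    # -x matches whole lines, -F treats the text as a fixed string, -q is quiet
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

On the server this would be called as ensure_line /etc/nagios/nagios.cfg 'cfg_dir=/etc/nagios/objects/nrpe'. Calling it a second time leaves the file unchanged.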


Test Nagios server config

Before starting the server, check the configuration files. Type this at a terminal:
nagios -v /etc/nagios/nagios.cfg
The result should be OK, which means there is no misconfiguration on the Nagios server:
Nagios Core 3.2.3
Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-03-2010
License: GPL
Website: http://www.nagios.org
Reading configuration data...
  Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config directory '/etc/nagios/objects/nrpe'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.template.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.host.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.group.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.commands.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.service.cfg'...
  Read object config files okay...
Running pre-flight check on configuration data...
Checking services...
      Checked 14 services.
  Checking hosts...
      Checked 2 hosts.
  Checking host groups...
      Checked 2 host groups.
  Checking service groups...
      Checked 0 service groups.
  Checking contacts...
      Checked 1 contacts.
  Checking contact groups...
      Checked 1 contact groups.
  Checking service escalations...
      Checked 0 service escalations.
  Checking service dependencies...
      Checked 0 service dependencies.
  Checking host escalations...
      Checked 0 host escalations.
  Checking host dependencies...
      Checked 0 host dependencies.
  Checking commands...
      Checked 26 commands.
  Checking time periods...
      Checked 5 time periods.
  Checking for circular paths between hosts...
  Checking for circular host and service dependencies...
  Checking global event handlers...
  Checking obsessive compulsive processor commands...
  Checking misc settings...
  Total Warnings: 0
  Total Errors:   0
  Things look okay - No serious problems were detected during the pre-flight check
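The two lines that matter most in this output are Total Warnings and Total Errors. As a sketch, a small filter can pull them out so a script can decide whether to go on; preflight_summary is a hypothetical name, and the function only parses text in the format shown above.

```shell
# Hypothetical helper: read "nagios -v" output on stdin, print the totals,
# and return non-zero unless the error total is present and equal to 0.
preflight_summary() {
    awk '
        /Total Warnings:/ { warnings = $3 }
        /Total Errors:/   { errors = $3 }
        END {
            printf "warnings=%s errors=%s\n", warnings, errors
            exit ((errors == "0") ? 0 : 1)
        }'
}
```

A possible use is nagios -v /etc/nagios/nagios.cfg | preflight_summary && service nagios restart, so that Nagios is restarted only when the pre-flight check reports no errors.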


Client Side Installation

Because we are using the NRPE add-on, the client needs the Nagios plugins installed: NRPE executes the plugins on the client side. For more detail, see the NRPE documentation.


Install NRPE and Nagios plugins packages

mkdir nrpe-client
cd nrpe-client
/usr/bin/wget -c http://labs.consol.de/wp-content/uploads/2010/10/check_logfiles-3.4.2.2.tar.gz
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-1.4.15-2.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-nrpe-2.12-1.el5.rf.x86_64.rpm
yum --nogpgcheck localinstall nagios-*


Configure NRPE

The NRPE config file is stored at /etc/nagios/nrpe.cfg. It controls which Nagios plugins NRPE may run on the client side. Edit nrpe.cfg:
vi /etc/nagios/nrpe.cfg
Add the IP address of the Nagios server
allowed_hosts=127.0.0.1,<Nagios server IP address>
Add the commands
command[check_ping]=/usr/lib64/nagios/plugins/check_ping -H 127.0.0.1 -w 100.0,20% -c 500.0,60%
command[check_procs]=/usr/lib64/nagios/plugins/check_procs -w 250 -c 400 -s RSZDT
command[check_ssh]=/usr/lib64/nagios/plugins/check_ssh -H 127.0.0.1
command[check_swap]=/usr/lib64/nagios/plugins/check_swap -w 20 -c 10
command[check_http]=/usr/lib64/nagios/plugins/check_http -I 127.0.0.1
command[check_mysql]=/usr/lib64/nagios/plugins/check_tcp -H 127.0.0.1 -p 3306
Save the file: press <Esc>, type :wq, and press Enter.
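The services defined on the server call check_disk and check_load through NRPE, but neither appears in the command list above. check_load may already be present in the stock nrpe.cfg (an assumption worth verifying on your client), while check_disk would need its own command[...] line. The sketch below cross-checks the two files; the function name is my own, and it assumes you have a copy of both files on one machine.

```shell
# Hypothetical helper: list commands that the server requests through
# check_nrpe but that the client's nrpe.cfg does not define.
# Usage: missing_nrpe_commands SERVICE_CFG NRPE_CFG
missing_nrpe_commands() {
    service_cfg=$1
    nrpe_cfg=$2
    # Extract the name after "check_nrpe!" from every check_command line
    sed -n 's/^[[:space:]]*check_command[[:space:]][[:space:]]*check_nrpe!\([A-Za-z0-9_-]*\).*/\1/p' \
        "$service_cfg" | sort -u |
    while read -r name; do
        # A defined command looks like: command[name]=/path/to/plugin ...
        grep -q "^command\[$name]=" "$nrpe_cfg" || printf '%s\n' "$name"
    done
}
```

For every name it prints, add a matching line to nrpe.cfg. For example, command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p / would match the thresholds described in the Root Partition service comment (warning below 20% free, critical below 10%).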


Install check_logfiles plugins

This article uses a Nagios plugin named check_logfiles. The plugin reads log files, and if a log file contains a pattern defined in its configuration file, Nagios sends an email with critical or warning status. For more information about check_logfiles, see http://labs.consol.de/nagios/check_logfiles/


Configure and Install

Extract the check_logfiles package
tar -xf check_logfiles-3.4.2.2.tar.gz
cd check_logfiles-3.4.2.2
./configure --prefix=/etc/nagios/ --libexec=/usr/lib64/nagios/plugins/
make && make install 


Make a config file for check_logfiles plugin

touch /etc/nagios/logfiles.cfg
vi /etc/nagios/logfiles.cfg
@searches = (
   {
    tag => 'queue',
    logfile => '/tmp/queue.log',       #pointing to log files
    criticalpatterns => [
    'user-queue: 50000',
    'user-queue: 65000',
    'user-queue: 80000',
    'warning: user-queue is .*'
    ],
   },
  );
Save the file: press <Esc>, type :wq, and press Enter.
This configuration checks the log file at /tmp/queue.log. The check fires if queue.log contains user-queue: 50000, 65000, or 80000
example
user-queue: 80000 
or if queue.log contains a line with warning: user-queue is
example
warning: user-queue is 123.000
All four patterns are listed under criticalpatterns, so every match is reported as CRITICAL, and Nagios can then send a notification email.
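To get a feel for which log lines would trigger the check, the patterns can be tried with grep first. This is only a rough approximation: check_logfiles uses Perl regular expressions and remembers how far it has already read, while the sketch below rescans the whole file with grep -E. The function name queue_matches is hypothetical.

```shell
# Rough approximation: show which lines of LOGFILE match the patterns
# from logfiles.cfg. Usage: queue_matches LOGFILE
queue_matches() {
    grep -E -e 'user-queue: (50000|65000|80000)' \
            -e 'warning: user-queue is .*' "$1"
}
```

Run against /tmp/queue.log, it prints the matching lines and returns a non-zero status when nothing matches.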


Generate log report for check_logfiles

touch /tmp/check_logfiles.trace 


Add the command to /etc/nagios/nrpe.cfg

Add this line to /etc/nagios/nrpe.cfg; the -f argument points to the config file
vi /etc/nagios/nrpe.cfg

command[check_queue]=/usr/lib64/nagios/plugins/check_logfiles -f /etc/nagios/logfiles.cfg
Save the file: press <Esc>, type :wq, and press Enter.
With the -f option, check_logfiles reads its configuration from logfiles.cfg


Define check_logfiles service at HOST

Add this definition to /etc/nagios/objects/nrpe/nrpe.service.cfg on the Nagios server
vi /etc/nagios/objects/nrpe/nrpe.service.cfg
define service{
  use                             id41-generic-service ; Name of service template to use
  hostgroup_name                  SRV-ID41
  service_description             User Queue
  check_command                   check_nrpe!check_queue
  notifications_enabled           0
        }
Save the file: press <Esc>, type :wq, and press Enter.


Open 5666 port for NRPE

NRPE communicates over port 5666, so we have to open it
iptables -I INPUT -p tcp -s 0/0 --dport 5666 -j ACCEPT
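The rule above accepts connections to port 5666 from any source (-s 0/0). Since only the Nagios server needs to reach NRPE, a narrower rule is possible. The sketch below just prints such a rule for review instead of running it; the function name and the example address are made up.

```shell
# Hypothetical helper: print (not run) the rule limited to one source address.
# Usage: nrpe_firewall_rule NAGIOS_SERVER_IP
nrpe_firewall_rule() {
    printf 'iptables -I INPUT -p tcp -s %s --dport 5666 -j ACCEPT\n' "$1"
}

# 10.62.41.10 is a made-up address for the Nagios server.
nrpe_firewall_rule 10.62.41.10
```

After reviewing the printed command, run it as root on the client. On CentOS 5, service iptables save writes the current rules to disk so they survive a reboot.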


Edit hosts.allow

Edit /etc/hosts.allow so the server can access the client
vi /etc/hosts.allow

ALL: <Nagios server IP address>
Save the file: press <Esc>, type :wq, and press Enter.


Run Nagios Server and NRPE at Client

On the server, start Nagios core
service nagios start
On the client, start the NRPE daemon
service nrpe start
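Once both daemons are running, a quick test from the Nagios server shows whether NRPE on the client answers. Called with only -H, check_nrpe asks the daemon to identify itself. The wrapper below is a minimal sketch: probe_nrpe is a name I made up, and the plugin path is assumed from the plugin directory mentioned earlier.

```shell
# Path assumed from the plugin directory named earlier; override if needed.
CHECK_NRPE=${CHECK_NRPE:-/usr/lib64/nagios/plugins/check_nrpe}

# Hypothetical helper: ask the NRPE daemon on HOST to identify itself.
# Usage: probe_nrpe HOST
probe_nrpe() {
    host=$1
    if out=$("$CHECK_NRPE" -H "$host" 2>&1); then
        printf 'OK %s: %s\n' "$host" "$out"
    else
        printf 'FAIL %s: %s\n' "$host" "$out"
        return 1
    fi
}
```

For example, probe_nrpe 10.62.41.16 should report the NRPE version when the firewall, hosts.allow, and allowed_hosts settings all let the server in. A FAIL line usually points at one of those three.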


Access the Nagios web

http://ip_or_hostname_Nagios_server/nagios/
Log in with the Nagios user name and password defined in step 2, Setup Nagios Web access User.
For a better look you can use the Nagios VShell, which you can find at exchange.nagios.org
