Copied from a wiki I wrote at the office after tinkering with the office server, so it is a bit messy, sorry... :D
How to set up Nagios as a monitoring server
Nagios is a monitoring tool. This article explains how to install Nagios on the server and how to use Nagios plugins such as NRPE and check_logfiles on the Nagios client.
Contents
- 1 Installation
- 1.1 Preparing Packages
- 1.2 Install Nagios core package
- 1.3 Install Perl packages
- 1.4 Install fping package
- 1.5 Install Nagios-plugins and Nagios-plugins-nrpe packages
- 2 Setup Nagios Web access User
- 3 Configuring Nagios core Server
- 3.1 Make Nagios config files for host,group,template,service and commands
- 3.2 Write the config files
- 3.2.1 Host config file ( nrpe.host.cfg )
- 3.2.2 Host Group config file ( nrpe.group.cfg )
- 3.2.3 Service config file ( nrpe.service.cfg )
- 3.2.4 Template config file ( nrpe.template.cfg )
- 3.2.5 Commands config file ( nrpe.command.cfg )
- 3.3 Modify Nagios config file
- 4 Test Nagios server config
- 5 Client Side Installation
- 5.1 Install NRPE and Nagios plugins packages
- 5.2 Configure NRPE
- 5.3 Install check_logfiles plugins
- 5.3.1 Configure and Install
- 5.3.2 Make a config file for check_logfiles plugin
- 5.3.3 Generate log report for check_logfiles
- 5.3.4 Add the command to /etc/nagios/nrpe.cfg
- 5.3.5 Define check_logfiles service at HOST
- 5.4 Open 5666 port for NRPE
- 5.5 Edit hosts.allow
- 6 Run Nagios Server and NRPE at Client
- 7 Access the Nagios web
Installation
Preparing Packages
Make a directory to store the packages
mkdir Nagios-packets
cd Nagios-packets
Download all the packages needed for the Nagios server
/usr/bin/wget -c http://labs.consol.de/wp-content/uploads/2010/10/check_logfiles-3.4.2.2.tar.gz
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-3.2.3-3.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-devel-3.2.3-3.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-1.4.15-2.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-nrpe-2.12-1.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/fping-2.4-1.b2.3.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch.rpm
/usr/bin/wget -c http://packages.sw.be/perl-Socket6/perl-Socket6-0.23-1.el5.rfx.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/perl-Crypt-DES-2.05-3.2.el5.rf.x86_64.rpm
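The repeated wget calls can also be driven from a list. This is only a minimal sketch, assuming the URLs above have been saved one per line in a file; urls.txt and the download_all function are names I made up for illustration, they are not part of the original setup.

```shell
#!/bin/sh
# Download every URL listed in a file, one URL per line.
# Blank lines and lines starting with '#' are skipped.
download_all() {
    list=$1
    while IFS= read -r url || [ -n "$url" ]; do
        case "$url" in
            ''|'#'*) continue ;;
        esac
        # -c resumes a partial download, as in the commands above
        wget -c "$url" || echo "download failed: $url" >&2
    done < "$list"
}

# Usage (urls.txt is a hypothetical file holding the URLs above):
# download_all urls.txt
```

A failed download is reported on stderr and the loop continues, so one dead mirror does not stop the rest.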
Install Nagios core package
yum --nogpgcheck localinstall nagios-*
Dependencies Resolved
=======================================================================================================================================================
Package Arch Version Repository Size
=======================================================================================================================================================
Installing:
nagios x86_64 3.2.3-3.el5.rf /nagios-3.2.3-3.el5.rf.x86_64 7.3 M
Installing for dependencies:
apr x86_64 1.2.7-11.el5_5.3 Centos-5.5-x86_64-updates 118 k
apr-util x86_64 1.2.7-11.el5_5.1 Centos-5.5-x86_64-updates 79 k
aspell x86_64 12:0.60.3-7.1 core-0 946 k
aspell-en x86_64 50:6.0-2.1 core-0 1.6 M
curl x86_64 7.15.5-9.el5 core-0 230 k
fontconfig x86_64 2.4.1-7.el5 core-0 175 k
freetype x86_64 2.2.1-28.el5_5.1 Centos-5.5-x86_64-updates 311 k
gd x86_64 2.0.33-9.4.el5_4.2 core-0 155 k
gmp x86_64 4.1.4-10.el5 core-0 201 k
httpd x86_64 2.2.3-43.el5.centos.3 Centos-5.5-x86_64-updates 1.2 M
libX11 x86_64 1.0.3-11.el5 core-0 798 k
libXau x86_64 1.0.1-3.1 core-0 18 k
libXdmcp x86_64 1.0.1-2.1 core-0 19 k
libXpm x86_64 3.5.5-3 core-0 44 k
libjpeg x86_64 6b-37 core-0 139 k
libpng x86_64 2:1.2.10-7.1.el5_5.3 Centos-5.5-x86_64-updates 234 k
libtool-ltdl x86_64 1.5.22-7.el5_4 core-0 38 k
php x86_64 5.1.6-27.el5 core-0 2.3 M
php-cli x86_64 5.1.6-27.el5 core-0 2.2 M
php-common x86_64 5.1.6-27.el5 core-0 153 k
pkgconfig x86_64 1:0.21-2.el5 core-0 61 k
postgresql-libs x86_64 8.1.22-1.el5_5.1 Centos-5.5-x86_64-updates 196 k
xorg-x11-filesystem noarch 7.1-2.fc6 core-0 5.4 k
Transaction Summary
=======================================================================================================================================================
Install 24 Package(s)
Upgrade 0 Package(s)
Total size: 18 M
Total download size: 11 M
Is this ok [y/N]: y
yum automatically resolves the dependencies of the Nagios core package
Install Perl packages
yum --nogpgcheck localinstall perl-*
Dependencies Resolved
=======================================================================================================================================================
Package Arch Version Repository Size
=======================================================================================================================================================
Installing:
perl-Crypt-DES x86_64 2.05-3.2.el5.rf /perl-Crypt-DES-2.05-3.2.el5.rf.x86_64 71 k
perl-Net-SNMP noarch 5.2.0-1.2.el5.rf /perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch 340 k
perl-Socket6 x86_64 0.23-1.el5.rfx /perl-Socket6-0.23-1.el5.rfx.x86_64 110 k
Installing for dependencies:
perl-Digest-HMAC noarch 1.01-15 core-0 12 k
perl-Digest-SHA1 x86_64 2.11-1.2.1 core-0 49 k
Transaction Summary
=======================================================================================================================================================
Install 5 Package(s)
Upgrade 0 Package(s)
Total size: 582 k
Total download size: 61 k
Is this ok [y/N]: y
Install fping package
yum --nogpgcheck localinstall fping-*
Install Nagios-plugins and Nagios-plugins-nrpe packages
yum --nogpgcheck localinstall nagios-plugins-*
Setup Nagios Web access User
Create the htpasswd.users file to store the username and password
touch /etc/nagios/htpasswd.users
Change the ownership and permissions so that Apache can read it
chown root:apache /etc/nagios/htpasswd.users
chmod 640 /etc/nagios/htpasswd.users
Create a user for Nagios web access
htpasswd -c /etc/nagios/htpasswd.users <nagios user admin>
htpasswd prompts for the password; remember it, you will need it for the web login
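Note that htpasswd -c recreates the password file, so running it a second time removes users added earlier. The sketch below uses -c only when the file is missing or empty. The function name add_web_user is my own. As far as I know, the sample cgi.cfg grants access to a user named nagiosadmin, so a different user name may also have to be added to the authorized_for_* lines there; check your own cgi.cfg.

```shell
#!/bin/sh
# Add (or update) a web user without wiping existing users.
# htpasswd -c recreates the file, so it is only used when the file
# is missing or empty.
# $1 = password file, $2 = user name
add_web_user() {
    users_file=$1; user=$2
    if [ -s "$users_file" ]; then
        htpasswd "$users_file" "$user"
    else
        htpasswd -c "$users_file" "$user"
    fi
}

# Usage (as root; nagiosadmin is an assumed user name, see the note above):
# add_web_user /etc/nagios/htpasswd.users nagiosadmin
```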
Configuring Nagios core Server
On CentOS 5.x x86_64, the Nagios config files are stored at
/etc/nagios/
/etc/nagios/objects
The plugins are stored at
/usr/lib64/nagios/plugins
The CGI programs are stored at
/usr/lib64/nagios/cgi
Make Nagios config files for host,group,template,service and commands
cd /etc/nagios/objects/
mkdir nrpe
cd nrpe
touch {nrpe.service.cfg,nrpe.template.cfg,nrpe.host.cfg,nrpe.group.cfg,nrpe.command.cfg}
- nrpe.service.cfg stores the service configuration for the hosts
- nrpe.template.cfg stores the host and service templates
- nrpe.host.cfg stores the host names and addresses
- nrpe.group.cfg stores the host group names and their members
- nrpe.command.cfg stores additional commands for nagios-plugins
Write the config files
Host config file ( nrpe.host.cfg )
vi nrpe.host.cfg
#start nrpe.host.cfg
define host{
    use         id41-host-server    ; template
    host_name   ID41-ND016
    alias       ID41-ND016
    address     10.62.41.16
}
#end nrpe.host.cfg
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
Notes
- use = the template this host inherits from (id41-host-server in this example)
- host_name = name of the host
- alias = alias of the host
- address = IP address of the host
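With more than a handful of hosts, typing each definition by hand gets tedious. Here is a small sketch that prints a host definition in the same layout as above. The function name host_def and the second host in the usage line are made up for illustration.

```shell
#!/bin/sh
# Print a Nagios host definition for one host.
# $1 = host name (also used as alias), $2 = IP address,
# $3 = template name (defaults to the id41-host-server template of this article)
host_def() {
    name=$1; address=$2; template=${3:-id41-host-server}
    printf 'define host{\n'
    printf '    use         %s\n' "$template"
    printf '    host_name   %s\n' "$name"
    printf '    alias       %s\n' "$name"
    printf '    address     %s\n' "$address"
    printf '}\n'
}

# Usage: append a second, made-up host to the host file
# host_def ID41-ND017 10.62.41.17 >> /etc/nagios/objects/nrpe/nrpe.host.cfg
```

A host added this way still has to be listed under members in nrpe.group.cfg before the group services apply to it.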
Host Group config file ( nrpe.group.cfg )
vi nrpe.group.cfg
#start nrpe.group.cfg
define hostgroup{
    hostgroup_name  SRV-ID41
    alias           Server at ID41
    members         ID41-ND016
}
#end nrpe.group.cfg
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
Notes
- hostgroup_name = name of the host group
- alias = alias of the host group
- members = names of the hosts that belong to the group, separated by commas
Service config file ( nrpe.service.cfg )
These services use the NRPE plugin, which is explained later. For direct monitoring without the NRPE plugin, see /etc/nagios/objects/localhost.cfg as a reference
vi nrpe.service.cfg
#start nrpe.service.cfg
# Define a service to check the disk space of the root partition
# on the local machine. Warning if < 20% free, critical if
# < 10% free space on partition.
define service{
    use                     id41-generic-service    ; Name of service template to use
    #host_name                                      ; Use host_name instead of hostgroup_name to target a single host
    hostgroup_name          SRV-ID41
    service_description     Root Partition
    check_command           check_nrpe!check_disk
}
# Define a service to check the number of currently running procs
# on the local machine. Warning if > 250 processes, critical if
# > 400 processes.
define service{
    use                     id41-generic-service    ; Name of service template to use
    hostgroup_name          SRV-ID41
    service_description     Total Processes
    check_command           check_nrpe!check_procs
}
# Define a service to check the load on the local machine.
define service{
    use                     id41-generic-service    ; Name of service template to use
    hostgroup_name          SRV-ID41
    service_description     Current Load
    check_command           check_nrpe!check_load
}
# Define a service to check the swap usage of the local machine.
# Critical if less than 10% of swap is free, warning if less than 20% is free
define service{
    use                     id41-generic-service    ; Name of service template to use
    hostgroup_name          SRV-ID41
    service_description     Swap Usage
    check_command           check_nrpe!check_swap
}
# Define a service to check SSH on the local machine.
# Disable notifications for this service by default, as not all users may have SSH enabled.
define service{
    use                     id41-generic-service    ; Name of service template to use
    hostgroup_name          SRV-ID41
    service_description     SSH
    check_command           check_nrpe!check_ssh
    notifications_enabled   0
}
# Define a service to check HTTP on the local machine.
# Disable notifications for this service by default, as not all users may have HTTP enabled.
define service{
    use                     id41-generic-service    ; Name of service template to use
    hostgroup_name          SRV-ID41
    service_description     HTTP
    check_command           check_nrpe!check_http
    notifications_enabled   0
}
#end nrpe.service.cfg
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
Template config file ( nrpe.template.cfg )
vi nrpe.template.cfg
#start nrpe.template.cfg
define host{
    name                    id41-host-server    ; The name of this host template
    use                     generic-host        ; This template inherits other values from the generic-host template
    check_period            24x7                ; By default, Linux hosts are checked round the clock
    check_interval          5                   ; Actively check the host every 5 minutes
    retry_interval          1                   ; Schedule host check retries at 1 minute intervals
    max_check_attempts      10                  ; Check each Linux host 10 times (max)
    check_command           check-host-alive    ; Default command to check Linux hosts
    notification_period     24x7                ; Notify at any time (changed from workhours)
                                                ; Note that the notification_period variable is being overridden from
                                                ; the value that is inherited from the generic-host template!
    #statusmap_image        redhat.png
    notification_interval   120                 ; Resend notifications every 2 hours
    notification_options    d,u,r               ; Only send notifications for specific host states
    contact_groups          admins              ; Notifications get sent to the admins by default
    register                0                   ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
}
# Generic service definition template - This is NOT a real service, just a template!
define service{
    name                            id41-generic-service    ; The 'name' of this service template
    active_checks_enabled           1       ; Active service checks are enabled
    passive_checks_enabled          1       ; Passive service checks are enabled/accepted
    parallelize_check               1       ; Active service checks should be parallelized (disabling this can lead to major performance problems)
    obsess_over_service             1       ; We should obsess over this service (if necessary)
    check_freshness                 0       ; Default is to NOT check service 'freshness'
    notifications_enabled           1       ; Service notifications are enabled
    event_handler_enabled           1       ; Service event handler is enabled
    flap_detection_enabled          1       ; Flap detection is enabled
    failure_prediction_enabled      1       ; Failure prediction is enabled
    process_perf_data               1       ; Process performance data
    retain_status_information       1       ; Retain status information across program restarts
    retain_nonstatus_information    1       ; Retain non-status information across program restarts
    is_volatile                     0       ; The service is not volatile
    check_period                    24x7    ; The service can be checked at any time of the day
    max_check_attempts              3       ; Re-check the service up to 3 times in order to determine its final (hard) state
    normal_check_interval           10      ; Check the service every 10 minutes under normal conditions
    retry_check_interval            2       ; Re-check the service every two minutes until a hard state can be determined
    contact_groups                  admins  ; Notifications get sent out to everyone in the 'admins' group
    notification_options            w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events
    notification_interval           60      ; Re-notify about service problems every hour
    notification_period             24x7    ; Notifications can be sent out at any time
    register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
}
# Template for services that use check_logfiles
define service{
    name                    id41-queue-service
    is_volatile             1
    check_period            24x7
    max_check_attempts      1
    contact_groups          admins
    notification_options    w,c,u
    notification_interval   30
    notification_period     24x7
    register                0       ; Template only, not a real service
}
#end nrpe.template.cfg
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
Commands config file ( nrpe.command.cfg )
I will add two extra commands: one to check the MySQL status and one to run checks on the client through NRPE
vi nrpe.command.cfg
#start nrpe.command.cfg
## mysql check
define command{
    command_name    check_mysql
    command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p 3306
}
## nrpe check
define command{
    command_name    check_nrpe
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
#end nrpe.command.cfg
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
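To see how a service line such as check_nrpe!check_load turns into a real command, here is a rough sketch that mimics the macro substitution for the check_nrpe command only. It is not how Nagios itself is implemented. The text after the ! becomes $ARG1$, $HOSTADDRESS$ is the address of the host, and $USER1$ is the plugin directory (set in resource.cfg, assumed here to be /usr/lib64/nagios/plugins). The function name expand_check_nrpe is made up.

```shell
#!/bin/sh
# Mimic how Nagios fills in the check_nrpe command_line.
# $1 = check_command value from a service, e.g. check_nrpe!check_load
# $2 = host address ($HOSTADDRESS$)
# $3 = plugin directory ($USER1$), assumed to be /usr/lib64/nagios/plugins
expand_check_nrpe() {
    check_command=$1; address=$2; user1=${3:-/usr/lib64/nagios/plugins}
    command_name=${check_command%%!*}   # text before the first '!'
    arg1=${check_command#*!}            # text after the first '!'
    if [ "$command_name" != check_nrpe ] || [ "$arg1" = "$check_command" ]; then
        echo "expected check_nrpe!<nrpe command>" >&2
        return 1
    fi
    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$address" "$arg1"
}

expand_check_nrpe 'check_nrpe!check_load' 10.62.41.16
```

The printed line is the command the Nagios server would run for the Current Load service on ID41-ND016, and it can be pasted into a shell on the server to test the check by hand.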
Modify Nagios config file
Edit the main Nagios config file, /etc/nagios/nagios.cfg, and add the following line so that Nagios reads the config files stored in the /etc/nagios/objects/nrpe folder
cfg_dir=/etc/nagios/objects/nrpe
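If you prefer not to open the editor, the line can be appended from the shell. A minimal sketch; ensure_line is a helper name of my own. It appends the line only when the exact line is not in the file yet, so running it twice does not duplicate the entry.

```shell
#!/bin/sh
# Append a line to a file only if the exact line is not already there.
# $1 = file, $2 = line
ensure_line() {
    file=$1; line=$2
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Usage (as root on the Nagios server):
# ensure_line /etc/nagios/nagios.cfg 'cfg_dir=/etc/nagios/objects/nrpe'
```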
Test Nagios server config
Before starting the server, check the configuration files by typing this at the terminal
nagios -v /etc/nagios/nagios.cfg
The result should report 0 errors, which means the Nagios server configuration is valid
Nagios Core 3.2.3
Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-03-2010
License: GPL
Website: http://www.nagios.org
Reading configuration data...
Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config directory '/etc/nagios/objects/nrpe'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.template.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.host.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.group.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.commands.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.service.cfg'...
Read object config files okay...
Running pre-flight check on configuration data...
Checking services...
Checked 14 services.
Checking hosts...
Checked 2 hosts.
Checking host groups...
Checked 2 host groups.
Checking service groups...
Checked 0 service groups.
Checking contacts...
Checked 1 contacts.
Checking contact groups...
Checked 1 contact groups.
Checking service escalations...
Checked 0 service escalations.
Checking service dependencies...
Checked 0 service dependencies.
Checking host escalations...
Checked 0 host escalations.
Checking host dependencies...
Checked 0 host dependencies.
Checking commands...
Checked 26 commands.
Checking time periods...
Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 0
Total Errors: 0
Things look okay - No serious problems were detected during the pre-flight check
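The check is worth repeating after every config change. The sketch below restarts Nagios only when the pre-flight check passes. It assumes that nagios -v exits with a non-zero status when it finds errors, which is what I have seen but you should confirm on your version. The function name restart_if_valid is made up.

```shell
#!/bin/sh
# Restart Nagios only when the pre-flight check passes.
# $1 = main config file (defaults to the path used in this article)
restart_if_valid() {
    cfg=${1:-/etc/nagios/nagios.cfg}
    if nagios -v "$cfg" > /dev/null; then
        service nagios restart
    else
        echo "config check failed, Nagios not restarted; run: nagios -v $cfg" >&2
        return 1
    fi
}

# Usage (as root on the Nagios server):
# restart_if_valid
```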
Client Side Installation
Because we are using the NRPE add-on, the client must have the Nagios plugins installed. The NRPE daemon executes the plugins on the client side and returns the results to the Nagios server. For more detail about NRPE, see the NRPE documentation
Install NRPE and Nagios plugins packages
mkdir nrpe-client
cd nrpe-client
/usr/bin/wget -c http://labs.consol.de/wp-content/uploads/2010/10/check_logfiles-3.4.2.2.tar.gz
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-1.4.15-2.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-nrpe-2.12-1.el5.rf.x86_64.rpm
yum --nogpgcheck localinstall nagios-*
Configure NRPE
The NRPE config file is stored at /etc/nagios/nrpe.cfg. To run the Nagios plugins on the client side, edit nrpe.cfg
vi /etc/nagios/nrpe.cfg
Add the IP address of the Nagios server
allowed_hosts = 127.0.0.1,< add ip address server >
Add the commands
command[check_ping]=/usr/lib64/nagios/plugins/check_ping -H 127.0.0.1 -w 100.0,20% -c 500.0,60%
command[check_procs]=/usr/lib64/nagios/plugins/check_procs -w 250 -c 400 -s RSZDT
command[check_ssh]=/usr/lib64/nagios/plugins/check_ssh -H 127.0.0.1
command[check_swap]=/usr/lib64/nagios/plugins/check_swap -w 20% -c 10%
command[check_http]=/usr/lib64/nagios/plugins/check_http -I 127.0.0.1
command[check_mysql]=/usr/lib64/nagios/plugins/check_tcp -H 127.0.0.1 -p 3306
# check_disk is referenced by nrpe.service.cfg; the thresholds follow the comment there (warning < 20% free, critical < 10% free)
command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w 20% -c 10% -p /
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
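Each command in nrpe.cfg is just a plugin command line. NRPE hands the plugin's output and exit code back to the Nagios server, and by the Nagios plugin convention the exit code means 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN. Here is a small sketch for trying a command line by hand on the client before wiring it into Nagios; state_name and run_check are helper names of my own.

```shell
#!/bin/sh
# Map a plugin exit code to the Nagios state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Run a plugin command line, print "<STATE>: <plugin output>",
# and return the plugin's own exit code.
run_check() {
    output=$("$@") && code=0 || code=$?
    printf '%s: %s\n' "$(state_name "$code")" "$output"
    return "$code"
}

# Usage on the client, with one of the command lines from nrpe.cfg:
# run_check /usr/lib64/nagios/plugins/check_ssh -H 127.0.0.1
```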
Install check_logfiles plugins
This article uses a Nagios plugin named check_logfiles. The plugin reads log files; if a log file contains a pattern defined in the configuration file, the check returns a critical or warning status and Nagios sends an email. For more information about check_logfiles, see http://labs.consol.de/nagios/check_logfiles/
Configure and Install
Extract the check_logfiles package, then configure, build, and install it
tar -xf check_logfiles-3.4.2.2.tar.gz
cd check_logfiles-3.4.2.2
./configure --prefix=/etc/nagios/ --libexec=/usr/lib64/nagios/plugins/
make && make install
Make a config file for check_logfiles plugin
touch /etc/nagios/logfiles.cfg
vi /etc/nagios/logfiles.cfg
@searches = (
    {
        tag => 'queue',
        logfile => '/tmp/queue.log',    # path to the log file
        criticalpatterns => [
            'user-queue: 50000',
            'user-queue: 65000',
            'user-queue: 80000',
            'warning: user-queue is .*'
        ],
    },
);
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
This configuration checks the log file at /tmp/queue.log. If queue.log contains user-queue: 50000, user-queue: 65000, or user-queue: 80000, the check becomes critical and Nagios can send an email
example
user-queue: 80000
The same happens if queue.log contains a line matching warning: user-queue is, because that pattern is also listed under criticalpatterns
example
warning: user-queue is 123.000
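Before relying on the patterns, it helps to try them against a few sample lines. The sketch below uses grep -E as a stand-in for the matching that check_logfiles does; this is an approximation that works for simple patterns like these, not the plugin's own code. The function name count_critical and the sample log are made up. As far as I understand, check_logfiles also remembers how far it has read and only scans lines added since the previous run, which this sketch does not imitate.

```shell
#!/bin/sh
# Count the lines of a log file that match the criticalpatterns above.
# grep -E stands in for check_logfiles' own matching here.
# $1 = log file
count_critical() {
    grep -cE 'user-queue: (50000|65000|80000)|warning: user-queue is .*' "$1"
}

# Build a small sample log and count the matching lines.
sample=$(mktemp)
cat > "$sample" <<'EOF'
user-queue: 1200
user-queue: 80000
warning: user-queue is 123.000
EOF
count_critical "$sample"    # prints 2
rm -f "$sample"
```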
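Generate log report for check_logfiles
Create the trace file. While this file exists, check_logfiles writes trace (debug) information about each run to it
touch /tmp/check_logfiles.trace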
Add the command to /etc/nagios/nrpe.cfg
Open the file and add this line
vi /etc/nagios/nrpe.cfg
command[check_queue]=/usr/lib64/nagios/plugins/check_logfiles -f /etc/nagios/logfiles.cfg
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
The -f option points to the config file, so check_logfiles reads its configuration from logfiles.cfg
Define check_logfiles service at HOST
On the Nagios server, add this service definition to /etc/nagios/objects/nrpe/nrpe.service.cfg
vi /etc/nagios/objects/nrpe/nrpe.service.cfg
define service{
    use                     id41-generic-service    ; Name of service template to use
    hostgroup_name          SRV-ID41
    service_description     User Queue
    check_command           check_nrpe!check_queue
    notifications_enabled   0
}
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
Open 5666 port for NRPE
NRPE listens on TCP port 5666, so open that port in the client's firewall
iptables -I INPUT -p tcp -s 0/0 --dport 5666 -j ACCEPT
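The rule above uses -s 0/0, which accepts connections from any address. A stricter sketch is to allow only the Nagios server and then save the rules so they survive a reboot. It assumes the CentOS 5 iptables init script, which provides service iptables save. The function name open_nrpe_port and the address 192.0.2.10 are placeholders.

```shell
#!/bin/sh
# Allow NRPE connections (TCP 5666) from one source address only,
# then save the running rules so they are restored at boot.
# $1 = IP address of the Nagios server
open_nrpe_port() {
    server_ip=$1
    if [ -z "$server_ip" ]; then
        echo "usage: open_nrpe_port <nagios server ip>" >&2
        return 1
    fi
    iptables -I INPUT -p tcp -s "$server_ip" --dport 5666 -j ACCEPT &&
        service iptables save
}

# Usage (as root on the client; 192.0.2.10 is a placeholder address):
# open_nrpe_port 192.0.2.10
```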
Edit hosts.allow
Edit /etc/hosts.allow so that the Nagios server can access the client
vi /etc/hosts.allow
ALL: < ip nagios server >
Save the file and exit vi: press <Esc>, then type :wq and press Enter.
Run Nagios Server and NRPE at Client
On the server, run Nagios core
service nagios start
On the client, run the NRPE daemon
service nrpe start
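A service started this way does not come back by itself after a reboot. The sketch below starts a service and also enables it at boot with chkconfig. The web interface also needs Apache running on the server; I am assuming the httpd package that yum pulled in as a dependency, with its usual init script name httpd. The function name start_and_enable is made up.

```shell
#!/bin/sh
# Start each named service now and enable it at boot.
start_and_enable() {
    for svc in "$@"; do
        service "$svc" start && chkconfig "$svc" on || return 1
    done
}

# Usage (as root):
# start_and_enable nagios httpd    # on the Nagios server
# start_and_enable nrpe            # on each client
#
# Then, from the server, check that the client's NRPE daemon answers;
# without -c, check_nrpe should reply with the NRPE version:
# /usr/lib64/nagios/plugins/check_nrpe -H 10.62.41.16
```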
Access the Nagios web
http://ip_or_hostname_Nagios_server/nagios/
Log in with the Nagios user and password created in step 2, Setup Nagios Web access User
For a better look, you can use Nagios V-Shell, which you can find at exchange.nagios.org
Nagios is a tool for monitoring, this article will explain how to install Nagios at server and using Nagios-plugins such as nrpe and check_logfiles plugins at nagios client.
Contents
|
Installation
Preparing Packets
Make Directory for store the packets
mkdir Nagios-packets cd Nagios-packets
Download all necessaries packet for Nagios server
/usr/bin/wget -c http://labs.consol.de/wp-content/uploads/2010/10/check_logfiles-3.4.2.2.tar.gz /usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-3.2.3-3.el5.rf.x86_64.rpm /usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-devel-3.2.3-3.el5.rf.x86_64.rpm /usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-1.4.15-2.el5.rf.x86_64.rpm /usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-nrpe-2.12-1.el5.rf.x86_64.rpm /usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/fping-2.4-1.b2.3.el5.rf.x86_64.rpm /usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch.rpm /usr/bin/wget -c http://packages.sw.be/perl-Socket6/perl-Socket6-0.23-1.el5.rfx.x86_64.rpm /usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/perl-Crypt-DES-2.05-3.2.el5.rf.x86_64.rpm
Install Nagios core packet
yum --nogpgcheck localinstall nagios-*
Dependencies Resolved ======================================================================================================================================================= Package Arch Version Repository Size ======================================================================================================================================================= Installing: nagios x86_64 3.2.3-3.el5.rf /nagios-3.2.3-3.el5.rf.x86_64 7.3 M Installing for dependencies: apr x86_64 1.2.7-11.el5_5.3 Centos-5.5-x86_64-updates 118 k apr-util x86_64 1.2.7-11.el5_5.1 Centos-5.5-x86_64-updates 79 k aspell x86_64 12:0.60.3-7.1 core-0 946 k aspell-en x86_64 50:6.0-2.1 core-0 1.6 M curl x86_64 7.15.5-9.el5 core-0 230 k fontconfig x86_64 2.4.1-7.el5 core-0 175 k freetype x86_64 2.2.1-28.el5_5.1 Centos-5.5-x86_64-updates 311 k gd x86_64 2.0.33-9.4.el5_4.2 core-0 155 k gmp x86_64 4.1.4-10.el5 core-0 201 k httpd x86_64 2.2.3-43.el5.centos.3 Centos-5.5-x86_64-updates 1.2 M libX11 x86_64 1.0.3-11.el5 core-0 798 k libXau x86_64 1.0.1-3.1 core-0 18 k libXdmcp x86_64 1.0.1-2.1 core-0 19 k libXpm x86_64 3.5.5-3 core-0 44 k libjpeg x86_64 6b-37 core-0 139 k libpng x86_64 2:1.2.10-7.1.el5_5.3 Centos-5.5-x86_64-updates 234 k libtool-ltdl x86_64 1.5.22-7.el5_4 core-0 38 k php x86_64 5.1.6-27.el5 core-0 2.3 M php-cli x86_64 5.1.6-27.el5 core-0 2.2 M php-common x86_64 5.1.6-27.el5 core-0 153 k pkgconfig x86_64 1:0.21-2.el5 core-0 61 k postgresql-libs x86_64 8.1.22-1.el5_5.1 Centos-5.5-x86_64-updates 196 k xorg-x11-filesystem noarch 7.1-2.fc6 core-0 5.4 k Transaction Summary ======================================================================================================================================================= Install 24 Package(s) Upgrade 0 Package(s) Total size: 18 M Total download size: 11 M Is this ok [y/N]: y
yum will automatic resolved the dependencies for Nagios-core packet
Install Perl packet
yum --nogpgcheck localinstall perl-*
Dependencies Resolved ======================================================================================================================================================= Package Arch Version Repository Size ======================================================================================================================================================= Installing: perl-Crypt-DES x86_64 2.05-3.2.el5.rf /perl-Crypt-DES-2.05-3.2.el5.rf.x86_64 71 k perl-Net-SNMP noarch 5.2.0-1.2.el5.rf /perl-Net-SNMP-5.2.0-1.2.el5.rf.noarch 340 k perl-Socket6 x86_64 0.23-1.el5.rfx /perl-Socket6-0.23-1.el5.rfx.x86_64 110 k Installing for dependencies: perl-Digest-HMAC noarch 1.01-15 core-0 12 k perl-Digest-SHA1 x86_64 2.11-1.2.1 core-0 49 k Transaction Summary ======================================================================================================================================================= Install 5 Package(s) Upgrade 0 Package(s) Total size: 582 k Total download size: 61 k Is this ok [y/N]: y
Install fping packet
yum --nogpgcheck localinstall fping-*
Install Nagios-plugins and Nagios-plugins-nrpe packet
yum --nogpgcheck localinstall nagios-plugins-*
Setup Nagios Web access User
Make htpasswd.usre file for store username and password
touch /etc/nagios/htpasswd.users
Change ownership so the apache can access it
chown root:apache /etc/nagios/htpasswd.users chmod 640 /etc/nagios/htpasswd.users
Create a user for Nagios web access
htpasswd -c /etc/nagios/htpasswd.users <nagios user admin> <remember the password>
Configuring Nagios core Server
Nagios config file in Centos5.* x86_64 stored at
/etc/nagios/ /etc/nagios/objects
And the Plugins stored at
/usr/lib64/nagios/plugins
CGI stored at
/usr/lib64/nagios/cgi
Make Nagios config files for host,group,template,service and commands
cd /etc/nagios/objects/ mkdir nrpe touch {nrpe.service.cfg,nrpe.template.cfg,nrpe.host.cfg,nrpe.group.cfg,nrpe.command.cfg}
- nrpe.service.cfg store service configuration for host
- nrpe.template.cfg store template configuration for host
- nrpe.host.cfg store host name and address
- nrpe.group.cfg store host group name and host member
- nrpe.command.cfg store additional command for nagios-plugins
Write the config files
Host config file ( nrpe.host.cfg )
vi nrpe.host.cfg #start nrpe.host.cfg define host{ use id41-host-server ;template host_name ID41-ND016 alias ID41-ND016 address 10.62.41.16 } #end nrpe.host.cfg
save file by type
<esc> Shift + <:> wq
note
- use = this host using id41-host-server template (example)
- host_name = name of host
- alias = host alias
- address = ip address host
Host Group config file ( nrpe.group.cfg )
vi nrpe.group.cfg
#start nrpe.group.cfg define hostgroup{ hostgroup_name SRV-ID41 alias Server at ID41 members ID41-ND016 } #end nrpe.group.cfg
save file by type
<esc> Shift + <:> wq
note
- hostgroup_name = name of group host
- alias = host alias
- members = name of host,will be a group member
Service config file ( nrpe.service.cfg )
This service will using the NRPE plugins, it will explains later, for direct monitoring without use nrpe plugins you can see at /etc/nagios/objects/localhost.cfg as a reference
vi nrpe.service.cfg
#start nrpe.service.cfg # Define a service to check the disk space of the root partition # on the local machine. Warning if < 20% free, critical if # < 10% free space on partition. define service{ use id41-generic-service ; Name of service template to use #host_name ;use host name ,and mark the hosts group hostgroup_name SRV-ID41 service_description Root Partition check_command check_nrpe!check_disk } # Define a service to check the number of currently running procs # on the local machine. Warning if > 250 processes, critical if # > 400 users. define service{ use id41-generic-service ; Name of service template to use hostgroup_name SRV-ID41 service_description Total Processes check_command check_nrpe!check_procs } # Define a service to check the load on the local machine. define service{ use id41-generic-service ; Name of service template to use hostgroup_name SRV-ID41 service_description Current Load check_command check_nrpe!check_load } # Define a service to check the swap usage the local machine. # Critical if less than 10% of swap is free, warning if less than 20% is free define service{ use id41-generic-service ; Name of service template to use hostgroup_name SRV-ID41 service_description Swap Usage check_command check_nrpe!check_swap } # Define a service to check SSH on the local machine. # Disable notifications for this service by default, as not all users may have SSH enabled. define service{ use id41-generic-service ; Name of service template to use hostgroup_name SRV-ID41 service_description SSH check_command check_nrpe!check_ssh notifications_enabled 0 } # Define a service to check HTTP on the local machine. # Disable notifications for this service by default, as not all users may have HTTP enabled. define service{ use id41-generic-service ; Name of service template to use hostgroup_name SRV-ID41 service_description HTTP check_command check_nrpe!check_http notifications_enabled 0 } #end nrpe.service.cfg
save file by type
<esc> Shift + <:> wq
Template config file ( nrpe.template.cfg )
vi nrpe.template.cfg
#start nrpe.template.cfg define host{ name id41-host-server ; The name of this host template use generic-host ; This template inherits other values from the generic-host template check_period 24x7 ; By default, Linux hosts are checked round the clock check_interval 5 ; Actively check the host every 5 minutes retry_interval 1 ; Schedule host check retries at 1 minute intervals max_check_attempts 10 ; Check each Linux host 10 times (max) check_command check-host-alive ; Default command to check Linux hosts notification_period 24x7 ;workhours ; Linux admins hate to be woken up, so we only notify during the day ; Note that the notification_period variable is being overridden from ; the value that is inherited from the generic-host template! #statusmap_image redhat.png notification_interval 120 ; Resend notifications every 2 hours notification_options d,u,r ; Only send notifications for specific host states contact_groups admins ; Notifications get sent to the admins by default register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE! } # Generic service definition template - This is NOT a real service, just a template! 
define service{ name id41-generic-service ; The 'name' of this service template active_checks_enabled 1 ; Active service checks are enabled passive_checks_enabled 1 ; Passive service checks are enabled/accepted parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems) obsess_over_service 1 ; We should obsess over this service (if necessary) check_freshness 0 ; Default is to NOT check service 'freshness' notifications_enabled 1 ; Service notifications are enabled event_handler_enabled 1 ; Service event handler is enabled flap_detection_enabled 1 ; Flap detection is enabled failure_prediction_enabled 1 ; Failure prediction is enabled process_perf_data 1 ; Process performance data retain_status_information 1 ; Retain status information across program restarts retain_nonstatus_information 1 ; Retain non-status information across program restarts is_volatile 0 ; The service is not volatile check_period 24x7 ; The service can be checked at any time of the day max_check_attempts 3 ; Re-check the service up to 3 times in order to determine its final (hard) state normal_check_interval 10 ; Check the service every 10 minutes under normal conditions retry_check_interval 2 ; Re-check the service every two minutes until a hard state can be determined contact_groups admins ; Notifications get sent out to everyone in the 'admins' group notification_options w,u,c,r ; Send notifications about warning, unknown, critical, and recovery events notification_interval 60 ; Re-notify about service problems every hour notification_period 24x7 ; Notifications can be sent out at any time register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE! } make template service using check_logfiles define service{ name id41-queue-service is_volatile 1 check_period 24x7 max_check_attempts 1 contact_groups admins notification_options w,c,u notification_interval 30 notification_period 24x7 } #end nrpe.template.cfg
save file by type
<esc> Shift + <:> wq
Commands config file ( nrpe.command.cfg )
I will add two modification commands for check the mysql status and read the log files
vi nrpe.command.cfg
#start nrpe.command.cfg ## mysql check define command{ command_name check_mysql command_line $USER1$/check_tcp -H $HOSTADDRESS$ -p 3306 } ## nrpe check define command{ command_name check_nrpe command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ } #end nrpe.command.cfg
Save the file and exit vi by typing
<Esc> :wq
Modify Nagios config file
Edit the Nagios config file at /etc/nagios/nagios.cfg so that Nagios reads the config files stored in the /etc/nagios/objects/nrpe folder. Add the following line:
cfg_dir=/etc/nagios/objects/nrpe
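If this step is scripted instead of done by hand in vi, the line should only be appended when it is not already there, so that running the script twice does not duplicate it. A minimal sketch, assuming a POSIX shell; the function name ensure_cfg_dir is made up for this illustration:

```shell
# Hypothetical helper (not part of Nagios): append "cfg_dir=<dir>" to a
# Nagios main config file unless exactly that line already exists.
ensure_cfg_dir() {
    cfg_file=$1
    cfg_dir=$2
    # -x matches the whole line, -F treats the text literally, -q stays silent
    grep -qxF "cfg_dir=$cfg_dir" "$cfg_file" ||
        printf 'cfg_dir=%s\n' "$cfg_dir" >> "$cfg_file"
}

# Example call (run as root on the Nagios server):
# ensure_cfg_dir /etc/nagios/nagios.cfg /etc/nagios/objects/nrpe
```

The sketch assumes the config file ends with a newline; otherwise the appended text would be joined to the last existing line.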
Test Nagios server config
Before starting the server, it is better to check the configuration files. Type this at the terminal:
nagios -v /etc/nagios/nagios.cfg
The result should be OK, which means there is no misconfiguration on the Nagios server:
Nagios Core 3.2.3
Copyright (c) 2009-2010 Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 10-03-2010
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
Processing object config file '/etc/nagios/objects/commands.cfg'...
Processing object config file '/etc/nagios/objects/contacts.cfg'...
Processing object config file '/etc/nagios/objects/timeperiods.cfg'...
Processing object config file '/etc/nagios/objects/templates.cfg'...
Processing object config file '/etc/nagios/objects/localhost.cfg'...
Processing object config directory '/etc/nagios/objects/nrpe'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.template.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.host.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.group.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.commands.cfg'...
Processing object config file '/etc/nagios/objects/nrpe/nrpe.service.cfg'...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking services...
        Checked 14 services.
Checking hosts...
        Checked 2 hosts.
Checking host groups...
        Checked 2 host groups.
Checking service groups...
        Checked 0 service groups.
Checking contacts...
        Checked 1 contacts.
Checking contact groups...
        Checked 1 contact groups.
Checking service escalations...
        Checked 0 service escalations.
Checking service dependencies...
        Checked 0 service dependencies.
Checking host escalations...
        Checked 0 host escalations.
Checking host dependencies...
        Checked 0 host dependencies.
Checking commands...
        Checked 26 commands.
Checking time periods...
        Checked 5 time periods.
Checking for circular paths between hosts...
Checking for circular host and service dependencies...
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors: 0

Things look okay - No serious problems were detected during the pre-flight check
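The part of this output that matters most is the summary near the end: "Total Errors" must be 0 before the server is started. As a rough sketch, the check can be automated by reading the output and looking only at that line. The function name preflight_ok is made up for this illustration, and it assumes the summary line starts with "Total Errors:" as in the output above:

```shell
# Hypothetical helper (not part of Nagios): read `nagios -v` output on stdin
# and succeed only when a "Total Errors:" line is present and reports 0.
preflight_ok() {
    awk '/^Total Errors:/ { found = 1; errors = $3 }
         END { exit (found && errors + 0 == 0) ? 0 : 1 }'
}

# Example call on the Nagios server:
# nagios -v /etc/nagios/nagios.cfg | preflight_ok && echo "config is clean"
```

Warnings are deliberately ignored here; a stricter variant could test the "Total Warnings" line in the same way.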
Client Side Installation
Because we are using the NRPE add-on, the client must have the Nagios plugins installed: NRPE executes the plugins on the client side. For more detail, see the NRPE documentation.
Install NRPE and Nagios plugins packet
mkdir nrpe-client
cd nrpe-client
/usr/bin/wget -c http://labs.consol.de/wp-content/uploads/2010/10/check_logfiles-3.4.2.2.tar.gz
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-plugins-1.4.15-2.el5.rf.x86_64.rpm
/usr/bin/wget -c http://apt.sw.be/redhat/el5/en/x86_64/rpmforge/RPMS/nagios-nrpe-2.12-1.el5.rf.x86_64.rpm
yum --nogpgcheck localinstall nagios-*
Configure NRPE
The NRPE config file is stored at /etc/nagios/nrpe.cfg. To run the Nagios plugins on the client side, edit nrpe.cfg:
vi /etc/nagios/nrpe.cfg
Add the Nagios server IP address:
allowed_hosts=127.0.0.1,<Nagios server IP address>
Add the commands:
command[check_ping]=/usr/lib64/nagios/plugins/check_ping -H 127.0.0.1 -w 100.0,20% -c 500.0,60%
command[check_procs]=/usr/lib64/nagios/plugins/check_procs -w 250 -c 400 -s RSZDT
command[check_ssh]=/usr/lib64/nagios/plugins/check_ssh -H 127.0.0.1
command[check_swap]=/usr/lib64/nagios/plugins/check_swap -w 20 -c 10
command[check_http]=/usr/lib64/nagios/plugins/check_http -I 127.0.0.1
command[check_mysql]=/usr/lib64/nagios/plugins/check_tcp -H 127.0.0.1 -p 3306
Save the file and exit vi by typing
<Esc> :wq
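The name inside the brackets of each command[...] line is what the Nagios server later passes to check_nrpe with -c, so a typo there only shows up as a failed check. As a quick sanity test, the defined names can be listed straight from the file. This is only a sketch; the function name list_nrpe_commands is made up for this illustration:

```shell
# Hypothetical helper (not part of NRPE): print the command names defined in
# an NRPE config file, i.e. the text between the brackets of each
# "command[name]=..." line. Comments and other settings are skipped.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Example call on the client:
# list_nrpe_commands /etc/nagios/nrpe.cfg
```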
Install check_logfiles plugins
In this article I use a Nagios plugin named check_logfiles. The plugin reads log files; if a log file contains a pattern defined in the configuration file, Nagios sends an email with critical or warning status. For more information about check_logfiles, see http://labs.consol.de/nagios/check_logfiles/
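The "critical or warning status" comes from the exit code of the plugin: by the Nagios plugin convention, 0 means OK, 1 WARNING, 2 CRITICAL and 3 UNKNOWN. When a plugin is run by hand on the client, a small helper like the following can translate the exit code. The function name status_name is made up for this illustration:

```shell
# Hypothetical helper: map a Nagios plugin exit code to its state name.
status_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "invalid exit code: $1" >&2; return 1 ;;
    esac
}

# Example on the client, using one of the plugins configured earlier:
# /usr/lib64/nagios/plugins/check_swap -w 20 -c 10; status_name $?
```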
Configure and Install
Extract the check_logfiles package, then configure, build and install it:
tar -xf check_logfiles-3.4.2.2.tar.gz
cd check_logfiles-3.4.2.2
./configure --prefix=/etc/nagios/ --libexec=/usr/lib64/nagios/plugins/
make && make install
Make a config file for check_logfiles plugin
touch /etc/nagios/logfiles.cfg
vi /etc/nagios/logfiles.cfg
@searches = (
  {
    tag => 'queue',
    logfile => '/tmp/queue.log',   # path to the log file
    criticalpatterns => [
      'user-queue: 50000',
      'user-queue: 65000',
      'user-queue: 80000',
      'warning: user-queue is .*'
    ],
  },
);
Save the file and exit vi by typing
<Esc> :wq
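This configuration checks the log file at /tmp/queue.log. The check fires if queue.log contains user-queue: 50000, user-queue: 65000 or user-queue: 80000, for example:
user-queue: 80000
It also fires if queue.log contains a line matching warning: user-queue is .*, and Nagios then sends an email alert, for example:
warning: user-queue is 123.000
Because all four patterns are listed under criticalpatterns, every match is reported with CRITICAL status.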
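Before wiring the plugin into NRPE, it can help to preview which lines of the log the patterns would catch. The sketch below uses grep -E as a stand-in; for patterns this simple it should behave like the regular expressions in logfiles.cfg, but it is not how check_logfiles itself works (the plugin only examines lines added since its previous run, while this scans the whole file). The function name preview_queue_matches is made up for this illustration:

```shell
# Hypothetical preview (not part of check_logfiles): print every line of the
# given log file that matches one of the four patterns from logfiles.cfg.
# grep exits with status 1 when nothing matches.
preview_queue_matches() {
    grep -E \
        -e 'user-queue: 50000' \
        -e 'user-queue: 65000' \
        -e 'user-queue: 80000' \
        -e 'warning: user-queue is .*' \
        "$1"
}

# Example call on the client:
# preview_queue_matches /tmp/queue.log
```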
Generate log report for check_logfiles
touch /tmp/check_logfiles.trace
Add the command to /etc/nagios/nrpe.cfg
Open the file and add this line:
vi /etc/nagios/nrpe.cfg
command[check_queue]=/usr/lib64/nagios/plugins/check_logfiles -f /etc/nagios/logfiles.cfg
Save the file and exit vi by typing
<Esc> :wq
The -f option points check_logfiles to its configuration file, logfiles.cfg.
Define check_logfiles service at HOST
On the Nagios server, add this service definition to /etc/nagios/objects/nrpe/nrpe.service.cfg:
vi /etc/nagios/objects/nrpe/nrpe.service.cfg
define service{
        use                     id41-generic-service    ; Name of service template to use
        hostgroup_name          SRV-ID41
        service_description     User Queue
        check_command           check_nrpe!check_queue
        notifications_enabled   0
        }
Save the file and exit vi by typing
<Esc> :wq
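In check_command, the text after the "!" becomes $ARG1$ in the check_nrpe command defined in nrpe.command.cfg, so check_nrpe!check_queue asks the client's NRPE daemon to run its check_queue command. The sketch below only prints the command line that results from that substitution; the function name expand_check_nrpe, the plugin directory used for $USER1$ and the IP address are assumptions for illustration:

```shell
# Hypothetical illustration of how Nagios fills in the check_nrpe command:
#   $USER1$       -> plugin directory (first argument)
#   $HOSTADDRESS$ -> address of the monitored host (second argument)
#   $ARG1$        -> text after the "!" in check_command (third argument)
expand_check_nrpe() {
    printf '%s/check_nrpe -H %s -c %s\n' "$1" "$2" "$3"
}

# Example with made-up values:
# expand_check_nrpe /usr/lib64/nagios/plugins 192.0.2.10 check_queue
```

Running the printed command by hand on the Nagios server is a simple way to test the client before Nagios schedules the check.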
Open 5666 port for NRPE
NRPE uses port 5666 to communicate, so we need to open it on the client:
iptables -I INPUT -p tcp -s 0/0 --dport 5666 -j ACCEPT
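The rule above uses -s 0/0, which accepts connections to port 5666 from any address. If only the Nagios server should reach NRPE, the source can be narrowed to its IP address. The sketch below prints such a rule instead of applying it, so it can be reviewed first; the function name nrpe_rule_for and the example address are made up for this illustration:

```shell
# Hypothetical helper: print (not run) an iptables rule that accepts NRPE
# traffic on TCP port 5666 only from the given source address.
nrpe_rule_for() {
    printf 'iptables -I INPUT -p tcp -s %s --dport 5666 -j ACCEPT\n' "$1"
}

# Example with a made-up Nagios server address:
# nrpe_rule_for 192.0.2.5
```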
Edit host.allow
Edit the hosts.allow file at /etc/hosts.allow so the Nagios server can access the client:
vi /etc/hosts.allow
ALL: <Nagios server IP address>
Save the file and exit vi by typing
<Esc> :wq
Run Nagios Server and NRPE at Client
On the Nagios server, start Nagios core:
service nagios start
On the client, start the NRPE daemon:
service nrpe start
Access the Nagios web
http://ip_or_hostname_Nagios_server/nagios/
Log in with the Nagios user name and password defined in step 2, Setup Nagios Web access User.
For a better look, you can use Nagios V-Shell, which you can find at exchange.nagios.org.