NSCABottleneck

From NagiosCommunity

The problem

Using obsessing in distributed setups often leads to huge latency on the sending hosts. This happens because Nagios does nothing else while it executes the OCSP/OCHP command. Like every plugin, executing an OC*P command spawns a shell, runs an executable, and so on. All of this takes time, and during it Nagios does no other work; it only queues the pending jobs. With an OCSP/OCHP setup, latencies of up to hundreds of seconds are therefore quite common. The following workaround avoids this Nagios weakness.

The solution

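To solve this bottleneck we use PerfCommands instead. Nagios writes the performance data files using only its own internal functions, so there is no need to start an external command for each check result. An external command runs only once per processing interval, to send the collected file.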

Configuration

The receiving server does not need any special configuration; it only needs the normal NSCA daemon. On the sending server we only need this simple shell script, /usr/lib/nagios/scripts/submit_nsca_data.sh:

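#!/bin/sh
# $1 is "host" or "service", matching the perfdata file names in /dev/shm.
TYPE=$1
DATE=`date +%s`
# Rotate the live file so Nagios starts writing a fresh one.
mv /dev/shm/$TYPE /dev/shm/$TYPE.$DATE
# Send every rotated file, including leftovers from earlier runs.
for i in /dev/shm/$TYPE.*; do
   [ -f "$i" ] || continue
   cat "$i" | /usr/sbin/send_nsca -H MASTERs_IP_ADDRESS -c /etc/nagios/send_nsca.cfg
   rm "$i"
done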
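
The rotate-then-send pattern used by the script can be tried out without a Nagios installation. The following is a minimal sketch of that pattern, not a replacement for the script: `spool_and_send` and its arguments are hypothetical names, and the sender command is passed in as a parameter so that a stand-in such as `cat` can take the place of `send_nsca` during a dry run. Unlike the script, this sketch keeps a rotated file if the sender fails.

```shell
#!/bin/sh
# Hypothetical helper illustrating the rotate-then-send pattern.
# Usage: spool_and_send SPOOL_DIR KIND SENDER [SENDER_ARGS...]
spool_and_send() {
    dir=$1
    kind=$2
    shift 2
    # Rotate the live file first, so new results go to a fresh file.
    if [ -f "$dir/$kind" ]; then
        mv "$dir/$kind" "$dir/$kind.$(date +%s)"
    fi
    # Feed each rotated file to the sender on stdin; delete it on success.
    for f in "$dir/$kind".*; do
        [ -f "$f" ] || continue
        "$@" < "$f" && rm "$f"
    done
}
```

Under these assumptions, a real run might look like `spool_and_send /dev/shm service /usr/sbin/send_nsca -H MASTERs_IP_ADDRESS -c /etc/nagios/send_nsca.cfg`, while `spool_and_send /tmp/spool service cat` simply prints the spooled lines.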


To let Nagios collect and send performance values every 10 seconds, the following changes in /etc/nagios/nagios.cfg are required:

process_performance_data=1
perfdata_timeout=60
host_perfdata_file=/dev/shm/host
service_perfdata_file=/dev/shm/service
host_perfdata_file_template=$HOSTNAME$\t$HOSTSTATEID$\t$HOSTOUTPUT$|$HOSTPERFDATA$
service_perfdata_file_template=$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATEID$\t$SERVICEOUTPUT$|$SERVICEPERFDATA$
host_perfdata_file_mode=a
service_perfdata_file_mode=a
host_perfdata_file_processing_interval=10
service_perfdata_file_processing_interval=10
host_perfdata_file_processing_command=process-host-perfdata-file
service_perfdata_file_processing_command=process-service-perfdata-file
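
The two templates produce the tab-separated lines that send_nsca reads from standard input: host name, service description (for services only), state ID, and plugin output. As an illustration only, the hypothetical function below renders one service line in the layout of service_perfdata_file_template; the function name and the sample values in the usage note are made up.

```shell
#!/bin/sh
# Hypothetical illustration: build one line in the layout of
# service_perfdata_file_template (tab-separated fields, with the
# plugin output and performance data joined by "|" in the last field).
format_service_line() {
    # $1 host name, $2 service description, $3 state ID,
    # $4 plugin output, $5 performance data
    printf '%s\t%s\t%s\t%s|%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

Calling `format_service_line web01 HTTP 0 "HTTP OK" "time=0.1s"` prints the five values as one line, with tabs between the first four fields.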

Additionally, these two commands have to be defined:

define command{
  command_name    process-host-perfdata-file
  command_line    /usr/lib/nagios/scripts/submit_nsca_data.sh host
}

define command{
  command_name    process-service-perfdata-file
  command_line    /usr/lib/nagios/scripts/submit_nsca_data.sh service
}

Afterwards Nagios needs a restart.

That's all.

Drawbacks

  • Since NSCA does not forward any timestamps, all check results will have the wrong "Last Check Time" on the receiving server.
  • PerfCommands cannot be used for other purposes, because the performance data files and processing commands are taken up by the NSCA forwarding.