Traffic monitoring with pnp4nagios

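Cacti is very strong at traffic monitoring, but its monitoring of everything else is somewhat weaker than that of Nagios. Most IT companies that run an open-source monitoring system use Nagios, and Nagios has a companion graphing add-on for this job: pnp4nagios. Like most traffic-graphing tools, such setups usually collect their data over SNMP and store it in rrdtool format. SNMP is certainly powerful, but Nagios users are more likely to be running NRPE already, and the two do not conflict when used together with Nagios. So is there no way to feed pnp4nagios graphs through check_nrpe? Of course there is.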

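The idea is to read the byte counters reported by ifconfig at two points in time, subtract them, and divide by the elapsed time to get the average rate, then print the result in the format pnp4nagios expects. The script is as follows: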

#!/bin/sh
########################################################
#                                                      #
#         www.361way.com                               #
# Usage: check_traffic -i Interface -w warn -c crit    #
#                                                      #
########################################################
while getopts ":i:c:w:h" optname
  do
    case "$optname" in
      "i")
        INT=$OPTARG
        ;;
      "c")
        CIRT=$OPTARG
        ;;
      "w")
        WARN=$OPTARG
        ;;
      "h")
        echo "Usage: check_traffic -i Interface -w warn -c crit"
        # Exit 3 = UNKNOWN, so a bad invocation is never reported as OK
        exit 3
        ;;
      "?")
        echo "Unknown option $OPTARG"
        exit 3
        ;;
      ":")
        echo "No argument value for option $OPTARG"
        exit 3
        ;;
      *)
      # Should not occur
        echo "Unknown error while processing options"
        exit 3
        ;;
    esac
  done
# Exit 3 (UNKNOWN) when no usable interface is given
[ -z "$INT" ] && echo "Please specify a device with -i" && exit 3
if ! ifconfig "$INT" >/dev/null 2>&1; then
        echo "error: no device $INT"
        exit 3
fi
DEVICE=$INT
# Default thresholds in bit/s
[ -z "$WARN" ] && WARN=1048576
[ -z "$CIRT" ] && CIRT=2097152
DIR=/App/nagios/tmp
FILE=$DIR/.network-$DEVICE.tmp
[ -e $DIR ] || mkdir -p $DIR
chown -R nagios:nagios $DIR
[ -e $FILE ] || >$FILE
if [ ! -s "$FILE" ];then
        # First run: store "timestamp<TAB>rx_bytes<TAB>tx_bytes" as the baseline.
        # The parsing assumes the older ifconfig output with "RX bytes:N ... TX bytes:N" on one line.
        printf '%s\t' "$(date +%s)" >$FILE
        printf '%s\t' "$(ifconfig $DEVICE | grep "RX bytes" | awk '{print $2}' | awk -F: '{print $NF}')" >>$FILE
        echo `ifconfig $DEVICE | grep "RX bytes" | awk '{print $6}' | awk -F: '{print $NF}'`>>$FILE
        echo "This is first run"
else
        New_Time=`date +%s`
        New_In=`ifconfig $DEVICE | grep "RX bytes" | awk '{print $2}' | awk -F: '{print $NF}'`
        New_Out=`ifconfig $DEVICE | grep "RX bytes" | awk '{print $6}' | awk -F: '{print $NF}'`
        Old_Time=`cat $FILE | awk '{print $1}'`
        Old_In=`cat $FILE | awk '{print $2}'`
        Old_Out=`cat $FILE | awk '{print $3}'`
        Diff_Time=`echo "$New_Time-$Old_Time"|bc`
        [ $Diff_Time -le 5 ] && echo "Less than 5s since the last sample" && exit
        Diff_In=`echo "scale=0;($New_In-$Old_In)*8/$Diff_Time"|bc`
        Diff_Out=`echo "scale=0;($New_Out-$Old_Out)*8/$Diff_Time"|bc`
        [ $Diff_In -le 0 ] && Diff_In=`cat $FILE | awk '{print $4}'`
        [ $Diff_Out -le 0 ] && Diff_Out=`cat $FILE | awk '{print $5}'`
        echo "$New_Time $New_In $New_Out $Diff_In $Diff_Out" >$FILE
        # Perfdata metrics are separated by spaces: label=value;warn;crit;min;max
        PERF="In=${Diff_In};${WARN};${CIRT};0;0 Out=${Diff_Out};${WARN};${CIRT};0;0"
        # Thresholds are checked against inbound traffic only
        if [ $Diff_In -ge $CIRT ];then
                echo "CRITICAL - $Diff_In|$PERF"
                exit 2
        fi
        if [ $Diff_In -ge $WARN ];then
                echo "WARNING - $Diff_In|$PERF"
                exit 1
        fi
        echo "OK - $Diff_In|$PERF"
        exit 0
fi
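
The rate calculation and the perfdata line can be separated from the rest of the plugin and tried on their own. The following is a minimal sketch, not part of the original plugin: the function names rate_bps and perfdata are hypothetical, and the field layout (min and max both 0) simply mirrors what the script above prints.

```shell
#!/bin/sh
# rate_bps OLD_BYTES NEW_BYTES SECONDS
# Prints the average rate in bit/s between two counter samples
# (integer division, so fractions are truncated).
rate_bps() {
    old=$1
    new=$2
    secs=$3
    # Refuse a zero interval or a counter that went backwards (reset or wrap)
    if [ "$secs" -le 0 ] || [ "$new" -lt "$old" ]; then
        return 1
    fi
    echo $(( (new - old) * 8 / secs ))
}

# perfdata IN_BPS OUT_BPS WARN CRIT
# Prints two space-separated metrics in label=value;warn;crit;min;max form.
perfdata() {
    echo "In=$1;$3;$4;0;0 Out=$2;$3;$4;0;0"
}
```

For example, `rate_bps 1000 3000 10` covers 2000 bytes in 10 seconds, that is 16000 bits in 10 seconds, and prints 1600. Returning a non-zero status on a counter reset lets the caller decide what to do, instead of silently reusing the previous rate.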

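We can also work from the output of cat /proc/net/dev instead. The script below can be adapted for that; the principle is the same as computing the average traffic from ifconfig above: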

#!/bin/bash
echo -n "which nic? "
read eth
echo "the nic is $eth"
echo -n "how many seconds: "
read sec
echo "duration is $sec seconds, please wait..."
# Print field N (2 = RX bytes, 10 = TX bytes) of the interface's line in
# /proc/net/dev. The colon after the name is replaced by a space first, so
# the field positions hold whether or not a space follows the colon.
counter() {
    awk -v nic="$eth" -v n="$1" '{sub(/:/, " ")} $1 == nic {print $n}' /proc/net/dev
}
infirst=$(counter 2)
outfirst=$(counter 10)
sumfirst=$((infirst+outfirst))
sleep "$sec"
inend=$(counter 2)
outend=$(counter 10)
sumend=$((inend+outend))
sum=$((sumend-sumfirst))
echo "$sec seconds total: $sum bytes"
aver=$((sum/sec))
echo "average: $aver bytes/sec"

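Note: the second script reports its results in bytes (B). Traffic tools such as iptraf report in bits (b), for example kb/s or Mb/s, so their figures are 8 times larger. To match what IDC providers usually mean by traffic, the first script converts its results to bits.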
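
The factor of 8 is easy to get wrong, so here is a small sketch of the conversion. The function name human_bps is hypothetical, and the use of decimal prefixes (1 kb/s = 1000 bit/s) is an assumption; the default thresholds in the first script are powers of two instead.

```shell
#!/bin/sh
# human_bps BYTES_PER_SEC
# Converts a byte rate to a bit rate and prints it with a decimal prefix.
# Integer division truncates, so 1.9 Mb/s is printed as "1 Mb/s".
human_bps() {
    bits=$(( $1 * 8 ))
    if [ "$bits" -ge 1000000 ]; then
        echo "$(( bits / 1000000 )) Mb/s"
    elif [ "$bits" -ge 1000 ]; then
        echo "$(( bits / 1000 )) kb/s"
    else
        echo "$bits b/s"
    fi
}
```

With this, `human_bps 125000` prints "1 Mb/s": 125000 bytes per second is the same traffic as 1000000 bits per second.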

The pnp4nagios template used for traffic monitoring is:

<?php
$colors = array(
       'red' => '#FF0000',
       'green' => '#00FF00',
       'blue' => '#0000FF',
       'yellow' => '#FFFF00',
       'black' => '#000000',
       'deepred' => '#330000',
        );
$def[1] =  "DEF:var1=$rrdfile:$DS[1]:AVERAGE " ;
$def[1] .= "DEF:var2=$rrdfile:$DS[2]:AVERAGE " ;
$def[1] .= "HRULE:$WARN[1]#FFFF00 ";
$def[1] .= "HRULE:$CRIT[1]#FF0000 ";
$def[1] .= "AREA:var1$colors[green]:\"In \" " ;
$def[1] .= "GPRINT:var1:LAST:\"%6.2lf last\" " ;
$def[1] .= "GPRINT:var1:AVERAGE:\"%6.2lf avg\" " ;
$def[1] .= "GPRINT:var1:MAX:\"%6.2lf max\\n\" ";
$def[1] .= "LINE:var2$colors[blue]:\"Out \" " ;
$def[1] .= "GPRINT:var2:LAST:\"%6.2lf last\" " ;
$def[1] .= "GPRINT:var2:AVERAGE:\"%6.2lf avg\" " ;
$def[1] .= "GPRINT:var2:MAX:\"%6.2lf max\\n\" " ;
/*
$def[1] .= "CDEF:total=var1,var2,+ " ;
$def[1] .= "LINE1:total$colors[black]:\"Total \" " ;
*/
?>

Finally, the resulting monitoring graph looks like this:

[Figure: the resulting pnp4nagios traffic graph]

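In addition, http://exchange.nagios.org/ lists a plugin that also claims to work through check_nrpe and achieve the same effect, although it uses a different template. See its page on Nagios Exchange: http://exchange.nagios.org/directory/Plugins/Network-Connections%2C-Stats-and-Bandwidth/check_iftraffic_nrpe/details . Give it a try if you are interested.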
