Traffic monitoring with pnp4nagios

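Cacti is very strong at traffic monitoring, but it is weaker than Nagios in other areas of monitoring. Most IT companies that run an open-source monitoring system use Nagios, and Nagios has a matching traffic-graphing add-on: pnp4nagios. Like most traffic-graphing software, such setups usually collect their data over SNMP and store it in rrdtool format. SNMP is certainly powerful, but Nagios users more commonly work with NRPE. The two do not conflict when used alongside Nagios. So is there really no way to produce pnp4nagios graphs through check_nrpe? Of course there is.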

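Take the traffic counters reported by ifconfig at two points in time, subtract one from the other, divide by the elapsed time to get the average rate, and print the result in the format pnp4nagios expects. The script is as follows: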

#!/bin/bash
########################################################
#                                                      #
#         www.361way.com                               #
# Usage: check_traffic -i Interface -w warn -c crit    #
#                                                      #
########################################################
# bash is required because the script relies on "echo -en".
# Parse the command-line options.
while getopts ":i:c:w:h" optname
  do
    case "$optname" in
      "i")
        INT=$OPTARG
        ;;
      "c")
        CIRT=$OPTARG
        ;;
      "w")
        WARN=$OPTARG
        ;;
      "h")
        echo "Usage: check_traffic -i Interface -w warn -c crit"
        exit
        ;;
      "?")
        echo "Unknown option $OPTARG"
        exit 3
        ;;
      ":")
        echo "No argument value for option $OPTARG"
        exit 3
        ;;
      *)
      # Should not occur
        echo "Unknown error while processing options"
        exit 3
        ;;
    esac
  done
# Exit code 3 means UNKNOWN to Nagios.
[ -z "$INT" ] && echo "Please input Device!" && exit 3
ifconfig $INT >/dev/null 2>&1
[ $? -ne 0 ] && echo "error: no device $INT" && exit 3 || DEVICE=$INT
# Default thresholds in bit/s (1 Mbit/s and 2 Mbit/s).
[ -z "$WARN" ] && WARN=1048576
[ -z "$CIRT" ] && CIRT=2097152
DIR=/App/nagios/tmp
FILE=$DIR/.network-$DEVICE.tmp
[ -e $DIR ] || mkdir -p $DIR
chown -R nagios.nagios $DIR
[ -e $FILE ] || >$FILE
if [ `cat $FILE | wc -c` -eq 0 ];then
        # First run: store the timestamp, RX bytes and TX bytes.
        echo -en `date +%s`"\t" >$FILE
        echo -en `ifconfig $DEVICE | grep "RX bytes" | awk '{print $2}' | awk -F: '{print $NF}'`"\t" >>$FILE
        echo `ifconfig $DEVICE | grep "RX bytes" | awk '{print $6}' | awk -F: '{print $NF}'`>>$FILE
        echo "This is first run"
else
        New_Time=`date +%s`
        New_In=`ifconfig $DEVICE | grep "RX bytes" | awk '{print $2}' | awk -F: '{print $NF}'`
        New_Out=`ifconfig $DEVICE | grep "RX bytes" | awk '{print $6}' | awk -F: '{print $NF}'`
        Old_Time=`cat $FILE | awk '{print $1}'`
        Old_In=`cat $FILE | awk '{print $2}'`
        Old_Out=`cat $FILE | awk '{print $3}'`
        Diff_Time=`echo "$New_Time-$Old_Time"|bc`
        [ $Diff_Time -le 5 ] && echo "less 5s" && exit
        # Average rate in bit/s: byte difference * 8 / elapsed seconds.
        Diff_In=`echo "scale=0;($New_In-$Old_In)*8/$Diff_Time"|bc`
        Diff_Out=`echo "scale=0;($New_Out-$Old_Out)*8/$Diff_Time"|bc`
        # If the difference is not positive (e.g. counter reset), reuse the previous rate.
        [ $Diff_In -le 0 ] && Diff_In=`cat $FILE | awk '{print $4}'`
        [ $Diff_Out -le 0 ] && Diff_Out=`cat $FILE | awk '{print $5}'`
        echo "$New_Time $New_In $New_Out $Diff_In $Diff_Out" >$FILE
        # Only inbound traffic is compared against the thresholds.
        # The perfdata entries after "|" are separated by a space.
        if [ $Diff_In -gt $CIRT -o $Diff_In -eq $CIRT ];then
                echo -e "CRIT - $Diff_In|In=${Diff_In};${WARN};${CIRT};0;0 Out=${Diff_Out};${WARN};${CIRT};0;0"
                exit 2
        fi
        if [ $Diff_In -gt $WARN -o $Diff_In -eq $WARN ];then
                echo -e "WARN - $Diff_In|In=${Diff_In};${WARN};${CIRT};0;0 Out=${Diff_Out};${WARN};${CIRT};0;0"
                exit 1
        fi
        if [ $Diff_In -lt $WARN ];then
                echo -e "OK - $Diff_In|In=${Diff_In};${WARN};${CIRT};0;0 Out=${Diff_Out};${WARN};${CIRT};0;0"
                exit 0
        fi
fi
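
The two core steps of the script above, computing the average rate and mapping it to a Nagios status line with perfdata, can be isolated as a minimal sketch. The function names `rate_bps` and `status_line` are hypothetical and not part of the original plugin; the sketch assumes bash and, like the script, judges the state on inbound traffic only.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative only.

# rate_bps OLD_BYTES NEW_BYTES SECONDS
# Prints the average rate in bit/s over the interval (integer division).
rate_bps() {
    echo $(( ($2 - $1) * 8 / $3 ))
}

# status_line RX_BPS TX_BPS WARN CRIT
# Prints a status line with perfdata and returns the Nagios exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL).
status_line() {
    local rx=$1 tx=$2 warn=$3 crit=$4 state=OK code=0
    if [ "$rx" -ge "$crit" ]; then
        state=CRITICAL; code=2
    elif [ "$rx" -ge "$warn" ]; then
        state=WARNING; code=1
    fi
    # perfdata: label=value;warn;crit;min;max, entries separated by a space
    echo "$state - $rx|In=$rx;$warn;$crit;0;0 Out=$tx;$warn;$crit;0;0"
    return $code
}
```

For example, a counter that grows from 1000 to 2251000 bytes in 10 seconds gives `rate_bps 1000 2251000 10` = 1800000 bit/s, which `status_line` reports as WARNING with the default thresholds 1048576 and 2097152.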

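Alternatively, the same figures can be derived from the output of cat /proc/net/dev. The script below can be adapted for that; the principle is the same as computing the average traffic from ifconfig above: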

#!/bin/bash
echo -n "which nic?"
read eth
echo "the nic is "$eth
echo -n "how many seconds:"
read sec
echo "duration is "$sec" seconds, wait please..."
# Turn the colon after the interface name into a space so that the fields
# are the same whether or not a space follows it: $2 = RX bytes, $10 = TX bytes.
infirst=$(awk -v nic="$eth" '{sub(/:/," ")} $1==nic {print $2}' /proc/net/dev)
outfirst=$(awk -v nic="$eth" '{sub(/:/," ")} $1==nic {print $10}' /proc/net/dev)
sumfirst=$(($infirst+$outfirst))
sleep $sec"s"
inend=$(awk -v nic="$eth" '{sub(/:/," ")} $1==nic {print $2}' /proc/net/dev)
outend=$(awk -v nic="$eth" '{sub(/:/," ")} $1==nic {print $10}' /proc/net/dev)
sumend=$(($inend+$outend))
sum=$(($sumend-$sumfirst))
echo $sec" seconds total :"$sum"bytes"
aver=$(($sum/$sec))
echo "average :"$aver"bytes/sec"
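
For use inside a plugin, the interactive prompts are not needed. The following is a minimal sketch of a non-interactive reader for /proc/net/dev; the function name `nic_bytes` and its optional second argument (a path to a saved copy of the file, handy for testing) are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print "RX_BYTES TX_BYTES" for one interface.
# Reads /proc/net/dev by default; a second argument may name another file
# in the same format.
nic_bytes() {
    local nic=$1 file=${2:-/proc/net/dev}
    # Replace the colon after the interface name with a space so the fields
    # split the same way whether or not a space follows the colon.
    awk -v nic="$nic" '{ sub(/:/, " ") } $1 == nic { print $2, $10 }' "$file"
}

# Example (commented out): average inbound bytes/s on eth0 over 5 seconds.
# read rx1 tx1 < <(nic_bytes eth0); sleep 5; read rx2 tx2 < <(nic_bytes eth0)
# echo $(( (rx2 - rx1) / 5 ))
```

Matching on the exact interface name, rather than on a pattern, avoids picking up eth10 when eth1 is requested.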

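Note: the second script reports its results in bytes (B). Traffic tools such as iptraf report in bits (b), for example kb/s or Mb/s, so the two sets of figures differ by a factor of 8. To match the way IDC providers usually quote traffic, the first script converts its result to bits.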
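
As a small illustration of the factor of 8, the sketch below converts a rate in bytes/s into a readable bit rate. The function name `human_bits` is hypothetical, and the use of decimal prefixes (1 kb/s = 1000 b/s) is an assumption; the thresholds in the first script are powers of two instead.

```shell
#!/bin/bash
# Hypothetical helper: convert bytes/s to a human-readable bit rate.
# Assumes decimal prefixes: 1 kb/s = 1000 b/s, 1 Mb/s = 1000000 b/s.
human_bits() {
    local bps=$(( $1 * 8 ))   # bytes/s -> bit/s
    if [ "$bps" -ge 1000000 ]; then
        LC_ALL=C awk -v b="$bps" 'BEGIN { printf "%.2f Mb/s\n", b / 1000000 }'
    elif [ "$bps" -ge 1000 ]; then
        LC_ALL=C awk -v b="$bps" 'BEGIN { printf "%.2f kb/s\n", b / 1000 }'
    else
        echo "$bps b/s"
    fi
}
```

With these assumptions, `human_bits 125000` prints `1.00 Mb/s`, since 125000 bytes/s is 1000000 bit/s.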

The pnp4nagios template used for the traffic graph is:

<?php
$colors = array(
       'red' => '#FF0000',
       'green' => '#00FF00',
       'blue' => '#0000FF',
       'yellow' => '#FFFF00',
       'black' => '#000000',
       'deepred' => '#330000',
        );
$def[1] =  "DEF:var1=$rrdfile:$DS[1]:AVERAGE " ;
$def[1] .= "DEF:var2=$rrdfile:$DS[2]:AVERAGE " ;
$def[1] .= "HRULE:$WARN[1]#FFFF00 ";
$def[1] .= "HRULE:$CRIT[1]#FF0000 ";
$def[1] .= "AREA:var1$colors[green]:\"In \" " ;
$def[1] .= "GPRINT:var1:LAST:\"%6.2lf last\" " ;
$def[1] .= "GPRINT:var1:AVERAGE:\"%6.2lf avg\" " ;
$def[1] .= "GPRINT:var1:MAX:\"%6.2lf max\\n\" ";
$def[1] .= "LINE:var2$colors[blue]:\"Out \" " ;
$def[1] .= "GPRINT:var2:LAST:\"%6.2lf last\" " ;
$def[1] .= "GPRINT:var2:AVERAGE:\"%6.2lf avg\" " ;
$def[1] .= "GPRINT:var2:MAX:\"%6.2lf Total\\n\" " ;
/*
$def[1] .= "CDEF:total=var1,var2,+ " ;
$def[1] .= "LINE1:total$colors[black]:\"Total \" " ;
*/
?>

Finally, the resulting graph looks like this:

[Figure: pnp4nagios graph of the In and Out traffic]

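In addition, http://exchange.nagios.org/ lists another plugin that is also said to work through check_nrpe and to achieve the same effect, although it uses a different template. See its page on Nagios Exchange for details: http://exchange.nagios.org/directory/Plugins/Network-Connections%2C-Stats-and-Bandwidth/check_iftraffic_nrpe/details . Give it a try if you are interested.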
