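Recently I needed to add I/O monitoring for all of our hosts. I first looked through the plugins on Nagios Exchange, but none of them met my needs, so I decided to search the web before falling back to writing my own. Sure enough, I found a fairly good script that does the check with iostat, a tool from the third-party sysstat package. (That matched my own plan: had I written it myself, I would have used the same tool.) The script is as follows: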
#!/bin/bash
# Nagios plugin: report the peak %util of one disk, sampled with iostat (sysstat).
# Uses bash features (getopts with [[ ]]), so it needs bash rather than plain sh.

iostat=$(which iostat 2>/dev/null)

help() {
    echo -e "This plugin shows the I/O usage rate of the specified disk, using the external iostat program.\n\texample\n\t./io -d sda2 -w 10 -c 20"
    exit 3    # Nagios UNKNOWN
}

# Ensuring we have the needed tools:
if [ -z "$iostat" ] || [ ! -x "$iostat" ]; then
    echo "ERROR: iostat command not found. Please install sysstat."
    exit 3
fi

# Getting parameters:
while getopts "d:w:c:h" OPT; do
    case $OPT in
        d) disk=$OPTARG;;
        w) warning=$OPTARG;;
        c) critical=$OPTARG;;
        h) help;;
    esac
done

# Setting the warn and crit levels:
crit_util=$critical
warn_util=$warning

# Checking parameters:
[ ! -b "/dev/$disk" ] && echo "ERROR: Device incorrectly specified" && help
{ [ -z "$warn_util" ] || [ -z "$crit_util" ]; } && echo "ERROR: You must specify all warning and critical levels" && help
[[ "$warn_util" -ge "$crit_util" ]] && echo "ERROR: critical levels must be higher than warning levels" && help

# Doing the actual check: take 10 one-second samples and keep the highest %util.
# $NF (the last column) is %util; the original read $12, which only fits older sysstat layouts.
util=$($iostat -dx 1 10 "$disk" | grep "$disk" | awk '{print $NF}' | sort -nr | head -n 1)

# Comparing the result and setting the correct level
# (awk does the comparison because %util is a decimal):
if echo "$util $crit_util" | awk '!($1>=$2){exit 1}'; then
    msg="CRITICAL"
    status=2
elif echo "$util $warn_util" | awk '!($1>=$2){exit 1}'; then
    msg="WARNING"
    status=1
else
    msg="OK"
    status=0
fi

# Printing the results:
echo "$msg - I/O stats util_rate=$util"

# Bye!
exit $status
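The threshold comparison in the script goes through awk because iostat reports %util as a decimal, while the shell's own `-ge` test only handles integers. Here is a minimal sketch of that one step in isolation. `classify_util` is a hypothetical helper name used for illustration; it does not appear in the original script.

```shell
#!/bin/bash
# classify_util is a hypothetical helper (not part of the original plugin).
# Prints the Nagios state name and returns its exit code:
#   0 = OK, 1 = WARNING, 2 = CRITICAL.
classify_util() {
    local util=$1 warn=$2 crit=$3
    # awk compares the two fields numerically, so decimals such as 12.50 work.
    # It exits 1 when util is below the threshold and 0 otherwise.
    if echo "$util $crit" | awk '!($1>=$2){exit 1}'; then
        echo "CRITICAL"
        return 2
    elif echo "$util $warn" | awk '!($1>=$2){exit 1}'; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}
```

For example, `classify_util 12.50 10 20` prints WARNING, because 12.50 is at or above the warning level of 10 but below the critical level of 20.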
However, to check every disk this script has to be configured once per device, which is tedious, so I changed it to:
#!/bin/bash
# Nagios plugin: report the busiest of sda and its partitions, sampled with iostat (sysstat).
# Uses bash features (getopts with [[ ]]), so it needs bash rather than plain sh.

iostat=$(which iostat 2>/dev/null)

help() {
    echo -e "This plugin shows the I/O usage rate of sda and its partitions, using the external iostat program.\n\texample\n\t./io -w 10 -c 20"
    exit 3    # Nagios UNKNOWN
}

# Ensuring we have the needed tools:
if [ -z "$iostat" ] || [ ! -x "$iostat" ]; then
    echo "ERROR: iostat command not found. Please install sysstat."
    exit 3
fi

# Getting parameters:
while getopts "w:c:h" OPT; do
    case $OPT in
        w) warning=$OPTARG;;
        c) critical=$OPTARG;;
        h) help;;
    esac
done

# Setting the warn and crit levels:
crit_util=$critical
warn_util=$warning

# Checking parameters (the -d device check is no longer needed):
#[ ! -b "/dev/$disk" ] && echo "ERROR: Device incorrectly specified" && help
{ [ -z "$warn_util" ] || [ -z "$crit_util" ]; } && echo "ERROR: You must specify all warning and critical levels" && help
[[ "$warn_util" -ge "$crit_util" ]] && echo "ERROR: critical levels must be higher than warning levels" && help

# Doing the actual check: take 3 one-second samples of every device and partition,
# print "%util device" for each row, keep the sda rows, and pick the highest %util.
getio=$($iostat -dx 1 3 -p ALL | awk '{print $NF,$1}' | grep sda | sort -nr | head -n 1)
util=$(echo "$getio" | awk '{print $1}')
getdisk=$(echo "$getio" | awk '{print $2}')

# Comparing the result and setting the correct level
# (awk does the comparison because %util is a decimal):
if echo "$util $crit_util" | awk '!($1>=$2){exit 1}'; then
    msg="CRITICAL"
    status=2
elif echo "$util $warn_util" | awk '!($1>=$2){exit 1}'; then
    msg="WARNING"
    status=1
else
    msg="OK"
    status=0
fi

# Printing the results:
echo "$msg - $getdisk I/O stats util_rate=$util"

# Bye!
exit $status
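This version checks sda and all of its partitions (for example /dev/sda1 and /dev/sda2) and reports in the result which partition, or the whole disk, has the highest I/O utilization.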
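To see how the check picks the busiest device, the selection pipeline can be run against canned text instead of live iostat output. This is only a sketch. The sample rows are made up for illustration and have just three columns (real `iostat -dx` output has more, with %util as the last one), and `busiest` is a hypothetical helper name.

```shell
#!/bin/bash
# busiest is a hypothetical helper: it reads iostat-style rows on stdin and
# prints "<%util> <device>" for the busiest device whose row matches $1.
busiest() {
    awk '{print $NF, $1}' | grep "$1" | sort -nr | head -n 1
}

# Made-up sample rows: device name first, %util last.
sample='Device r/s %util
sda 5.00 12.50
sda1 1.00 3.20
sda2 4.00 45.10
sdb 9.00 80.00'

result=$(printf '%s\n' "$sample" | busiest sda)
util=${result%% *}    # text before the first space: the %util value
disk=${result#* }     # text after the first space: the device name
echo "$disk I/O stats util_rate=$util"
```

With this sample the sdb row is filtered out by `grep sda`, and sda2 wins with 45.10. Putting %util first in each row is what lets `sort -nr` order the rows by utilization.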