SmokePing Alert Configuration

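By default SmokePing sends alert emails through sendmail, and it can also call an external program to raise alerts. The alert settings are somewhat complex, but they are flexible and well thought out.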

I. Configuring mail support

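By default, sendmail does not send through mail servers such as 163 or QQ, but a small change to the code adds support for them. First install the Authen::SASL module (needed for SMTP authentication), then edit smokeping/lib/Smokeping.pm as follows: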



# Add at the top of the file
use Authen::SASL;
# Locate the sendmail function. Recent versions already define it much like this; if yours differs, change it to the following
sub sendmail ($$$){
    my $from = shift;
    my $to = shift;
    $to = $1 if $to =~ /<(.*?)>/;
    my $body = shift;
    if ($cfg->{General}{mailhost} and
        my $smtp = Net::SMTP->new([split /\s*,\s*/, $cfg->{General}{mailhost}],Timeout=>5) ){
        # Authenticate with the mailusr/mailpwd values from the config file
        $smtp->auth(split(/\s*,\s*/, $cfg->{General}{mailusr}), split(/\s*,\s*/, $cfg->{General}{mailpwd}));
        $smtp->mail($from);
        $smtp->to(split(/\s*,\s*/, $to));
        $smtp->data();
        $smtp->datasend($body);
        $smtp->dataend();
        $smtp->quit;
    } elsif ($cfg->{General}{sendmail} or -x "/usr/lib/sendmail"){
        # Fall back to the local sendmail binary
        open (M, "|-") || exec (($cfg->{General}{sendmail} || "/usr/lib/sendmail"),"-f",$from,$to);
        print M $body;
        close M;
    } else {
        warn "ERROR: not sending mail to $to, as all methodes failed\n";
    }
}
# Find '_vars =>' and add mailusr and mailpwd to the list; otherwise SmokePing will not start.
General configuration values valid for the whole SmokePing setup.
DOC
_vars =>
[ qw(owner imgcache imgurl datadir dyndir pagedir piddir sendmail offset
smokemail cgiurl mailhost mailusr mailpwd snpphost contact display_name
syslogfacility syslogpriority concurrentprobes changeprocessnames tmail
changecgiprogramname linkstyle precreateperms ) ],

After these changes, edit the configuration file /opt/smokeping/etc/config and add the mail server settings:



mailhost = smtp.361way.com
mailusr = monitor@361way.com
mailpwd = xxxxxxxxx

II. Configuring the Alerts section

1. Defining custom alert rules

The following defines several alert rules, such as bigloss and someloss:



*** Alerts ***
to = admin@361way.com
from = monitor@361way.com
+bigloss
type = loss
# in percent
pattern = ==0%,==0%,==0%,==0%,>0%,>0%,>0%
comment = suddenly there is packet loss
+someloss
type = loss
# in percent
pattern = >0%,*12*,>0%,*12*,>0%
comment = loss 3 times  in a row
+startloss
type = loss
# in percent
pattern = ==S,>0%,>0%,>0%
comment = loss at startup
+rttdetect
type = rtt
# in milli seconds
pattern = <10,<10,<10,<10,<10,<100,>100,>100,>100
comment = routing messed up again ?
+hostdown
type = loss
# in percent
pattern = ==0%,==0%,==0%, ==U
comment = no reply
+lossdetect
type = loss
# in percent
pattern = ==0%,==0%,==0%,==0%,>20%,>20%,>20%
comment = suddenly there is packet loss
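
Three of these rules are explained below:

  • someloss: alert when packet loss (of any amount) is seen three times, with up to 12 checks of any value between one lossy check and the next;
  • rttbad (not defined in the listing above): alert when latency exceeds 50 ms on two consecutive checks;
  • rttdetect: alert when the first five checks are all under 10 ms, the sixth is under 100 ms, and the next three consecutive checks are all over 100 ms.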
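
To make the rttdetect pattern concrete, here is a minimal shell sketch of the same check applied to nine round-trip times in whole milliseconds, oldest first. It is only an illustration under those assumptions, not SmokePing's own matcher, and the function name matches_rttdetect is hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration of the rttdetect pattern
#   <10,<10,<10,<10,<10,<100,>100,>100,>100
# Arguments: nine RTT samples in whole milliseconds, oldest first.
matches_rttdetect() {
    # The pattern has exactly nine entries
    [ "$#" -eq 9 ] || return 1
    i=0
    for rtt in "$@"; do
        i=$((i + 1))
        if [ "$i" -le 5 ]; then
            # entries 1-5: under 10 ms
            [ "$rtt" -lt 10 ] || return 1
        elif [ "$i" -eq 6 ]; then
            # entry 6: under 100 ms
            [ "$rtt" -lt 100 ] || return 1
        else
            # entries 7-9: over 100 ms
            [ "$rtt" -gt 100 ] || return 1
        fi
    done
    return 0
}

# Example: a stable link that suddenly becomes slow
if matches_rttdetect 3 4 5 4 3 40 150 180 200; then
    echo "rttdetect would fire"
fi
```

A single sample that breaks the sequence, for example 80 ms in the eighth position, is enough to stop the rule from matching.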

2. Applying the rules

Add the relevant rules to the target definition, as follows:
++ 361way
menu = 361way_host
title =361way.com
host = www.361way.com
alerts = someloss,hostdown

As soon as a host triggers an alert, an email is sent to the mailbox configured earlier, similar to the image below:



<img src="https://www.361way.com/wp-content/uploads/2016/05/smokeping-alert.png" width="594" height="316" title="smokeping-alert" alt="smokeping-alert" />



The content of the alert email shown above can also be customized. This again means editing Smokeping.pm, in the following part:



                    my $default_mail = <<DOC;
Subject: [SmokeAlert] <##ALERT##> <##WHAT##> on <##LINE##>
<##STAMP##>
Alert "<##ALERT##>" <##WHAT##> for <##URL##>
Pattern
-------
<##PAT##>
Data (old --> now)
------------------
<##LOSS##>
<##RTT##>
Comment
-------
<##COMMENT##>
DOC

3. Pattern matching

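Pattern matching is one of the most important parts of writing alert rules. The <a href="http://oss.oetiker.ch/smokeping/doc/smokeping_config.en.html" target="_blank" rel="noopener">official documentation covers it</a>, but I found <a href="https://datahunter.org/smokeping" target="_blank" rel="noopener">a summary by a writer in Hong Kong</a> that explains it better, so it is excerpted here untranslated.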



<a href="https://www.361way.com/wp-content/uploads/2016/05/smokeping-pattern.png" target="_blank" rel="noopener"><img src="https://www.361way.com/wp-content/uploads/2016/05/smokeping-pattern.png" width="836" height="469" title="smokeping-pattern" alt="smokeping-pattern" /></a>

III. Special cases

1. Calling an external program

An example of calling an external alert program, such as an IM or SMS sender:



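*** Alerts ***
to = |/usr/local/smokeping/bin/alert.sh
from = joe@somehost

The "to" option normally takes an email address, but if the value after "=" starts with "|" followed by the path to your own script, SmokePing calls that script to raise the alert. The script receives five or six arguments: name-of-alert, target, loss-pattern, rtt-pattern, hostname, and optionally raise. Use whichever of them your alert needs.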

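As a starting point for a script of your own, the sketch below only unpacks the arguments listed above and writes them out in a readable form. The function name format_alert and the log path /tmp/smokeping-alert.log are assumptions made for illustration; replace the last step with whatever IM or SMS command you actually use.

```shell
#!/bin/sh
# Hypothetical helper: formats the arguments SmokePing passes to an alert program.
# Order: name-of-alert, target, loss-pattern, rtt-pattern, hostname, [raise]
format_alert() {
    alertname=$1
    target=$2
    losspattern=$3
    rtt=$4
    hostname=$5
    raise=${6:-}    # the sixth argument is optional
    printf 'alert=%s target=%s host=%s\n' "$alertname" "$target" "$hostname"
    printf 'loss=%s\n' "$losspattern"
    printf 'rtt=%s\n' "$rtt"
    if [ -n "$raise" ]; then
        printf 'raise=%s\n' "$raise"
    fi
}

# Assumed log location, used only in this sketch
log_file="/tmp/smokeping-alert.log"

# Only act when called with the expected arguments
if [ "$#" -ge 5 ]; then
    format_alert "$@" >> "$log_file"
fi
```

Logging the raw arguments first makes it easier to see exactly what the script is called with before wiring in a real notification channel.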


Here is an example found on <a href="http://bitrefinery.com/blog/smokeping-traceroute-alerts/" target="_blank" rel="noopener">an English-language site</a>:



The config entry:



to = |/etc/smokeping/config.d/trace_alert.sh 2> /tmp/trace.log

The script:

#!/bin/sh
########################################################
# Script to email a mtr report on alert from Smokeping #
########################################################
# Arguments passed in by SmokePing
alertname=$1
target=$2
losspattern=$3
rtt=$4
hostname=$5
email="monitoring@email.com"
smokename="BR-NYC-"
# Use a "Clear" subject when the loss pattern reports no loss
if [ "$losspattern" = "loss: 0%" ];
then
subject="Clear-${smokename}-Alert: ${target} host: ${hostname}"
else
subject="${smokename}Alert: ${target} - ${hostname}"
fi
# Build the report file
echo "MTR Report for hostname: ${hostname}" > /tmp/mtr.txt
echo "" >> /tmp/mtr.txt
echo "sudo mtr -n --report ${hostname}"
sudo /usr/sbin/mtr -n --report "${hostname}" >> /tmp/mtr.txt
echo "" >> /tmp/mtr.txt
echo "Name of Alert: $alertname" >> /tmp/mtr.txt
echo "Target: $target" >> /tmp/mtr.txt
echo "Loss Pattern: $losspattern" >> /tmp/mtr.txt
echo "RTT Pattern: $rtt" >> /tmp/mtr.txt
echo "Hostname: $hostname" >> /tmp/mtr.txt
echo "" >> /tmp/mtr.txt
echo "Full mtr command is: sudo /usr/sbin/mtr -n --report ${hostname}" >> /tmp/mtr.txt
echo "subject: $subject"
# Send the report only if the file is non-empty
if [ -s /tmp/mtr.txt ]; then
mailx -s "${subject}" "${email}" < /tmp/mtr.txt
fi

2. Sending alerts for specific hosts to specific recipients

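The to option sets the mailbox that receives all alerts. To send the alerts for a particular node to a particular mailbox, add alertee = testmonitor@139.com to that node. For details, see the following part of the /opt/smokeping/lib/Smokeping.pm source: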

foreach my $addr (map {$_ ? (split /\s*,\s*/,$_) : ()} $cfg->{Alerts}{to},$tree->{alertee},$alert->{to}){

