nagios系列(六)之nagios实现对服务器cpu温度的监控
1、安装硬件传感器监控软件sensors
yum install -y lm_sensors*
2、运行sensors-detect进行传感器检测
##一路回车即可
Do you want to overwrite /etc/sysconfig/lm_sensors? (YES/no):
Starting lm_sensors: loading module coretemp [ OK ]
Unloading i2c-dev... OK
3、运行sensors看是否能读取数据,如下像下面这样表示正常
# sensors
coretemp-isa-0000
Adapter: ISA adapter
ERROR: Can't get value of subfeature temp1_input: Can't read
Physical id 0: +0.0°C (high = +100.0°C, crit = +100.0°C)
ERROR: Can't get value of subfeature temp2_input: Can't read
Core 0: +0.0°C (high = +100.0°C, crit = +100.0°C)
ERROR: Can't get value of subfeature temp3_input: Can't read
Core 1: +0.0°C (high = +100.0°C, crit = +100.0°C)
coretemp-isa-0002
Adapter: ISA adapter
ERROR: Can't get value of subfeature temp1_input: Can't read
Physical id 1: +0.0°C (high = +100.0°C, crit = +100.0°C)
ERROR: Can't get value of subfeature temp2_input: Can't read
Core 0: +0.0°C (high = +100.0°C, crit = +100.0°C)
ERROR: Can't get value of subfeature temp3_input: Can't read
Core 1: +0.0°C (high = +100.0°C, crit = +100.0°C)
4、添加监控脚本vim /usr/local/nagios/libexec/check_cputemp
#!/bin/sh
#########check_cputemp###########
#date : May 2013
#Licence GPLv2
#by Barlow
#/usr/local/nagios/libexec/check_cputemp
#you can use NRPE to define service in nagios
#check_nrpe!check_cputemp
# Plugin return statements
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3
print_help_msg(){
$Echo "Usage: $0 -h to get help."
}
print_full_help_msg(){
$Echo "Usage:"
$Echo "$0 [ -v ] -m sensors -w cpuT -c cpuT"
$Echo "Sepicify the method to use the temperature data sensors."
$Echo "And the corresponding Critical value must greater than Warning value."
$Echo "Example:"
$Echo "${0} -m sensors -w 40 -c 50"
}
print_err_msg(){
$Echo "Error."
print_full_help_msg
}
to_debug(){
if [ "$Debug" = "true" ]; then
$Echo "$*" >> /var/log/check_sys_temperature.log.$$ 2>&1
fi
}
unset LANG
Echo="echo -e"
if [ $# -lt 1 ]; then
print_help_msg
exit 3
else
while getopts :vhm:w:c: OPTION
do
case $OPTION
in
v)
#$Echo "Verbose mode."
Debug=true
;;
m)
method=$OPTARG
;;
w)
WARNING=$OPTARG
;;
c)
CRITICAL=$OPTARG ;;
h)
print_full_help_msg
exit 3
;;
?)
$Echo "Error: Illegal Option."
print_help_msg
exit 3
;;
esac
done
if [ "$method" = "sensors" ]; then
use_sensors="true"
to_debug use_sensors
else
$Echo "Error. Must to sepcify the method to use sensors."
print_full_help_msg
exit 3
fi
to_debug All Values are \" Warning: "$WARNING" and Critical: "$CRITICAL" \".
fi
#########lm_sensors##################
if [ "$use_sensors" = "true" ]; then
sensorsCheckOut=`which sensors 2>&1`
if [ $? -ne 0 ];then
echo $sensorsCheckOut
echo Maybe you need to check your sensors.
exit 3
fi
to_debug Use $sensorsCheckOut to check system temperature
TEMP1=`sensors | head -3 | tail -1 | gawk '{print $3}' | grep -o [0-9][0-9]`
TEMP2=`sensors | head -4 | tail -1 | gawk '{print $3}' | grep -o [0-9][0-9]`
TEMP3=`sensors | head -5 | tail -1 | gawk '{print $3}' | grep -o [0-9][0-9]`
TEMP4=`sensors | head -6 | tail -1 | gawk '{print $3}' | grep -o [0-9][0-9]`
##温度的取数根据你cpu的核数确定,我的是四核,所以取TEMP1-4个CPU温度数并计算平均值
SUM=$(( $TEMP1 + $TEMP2 + $TEMP3 + $TEMP4 ))
TEMP=$(($SUM/4))
if [ -z "$TEMP" ] ; then
$Echo "No Data been get here. Please confirm your ARGS and re-check it with Verbose mode, then to check the log."
exit 3
fi
to_debug temperature data is $TEMP
else
$Echo "Error. Must to sepcify the method to use sensors"
print_full_help_msg
exit 3
fi
######### Comparaison with the warnings and criticals thresholds given by user############
CPU_TEMP=$TEMP
#if [ "$WARNING" != "0" ] || [ "$CRITICAL" != "0" ]; then
if [ "$CPU_TEMP" -gt "$CRITICAL" ] && [ "$CRITICAL" != "0" ]; then
STATE="$STATE_CRITICAL"
STATE_MESSAGE="CRITICAL"
to_debug $STATE , Message is $STATE_MESSAGE
elif [ "$CPU_TEMP" -gt "$WARNING" ] && [ "$WARNING" != "0" ]; then
STATE="$STATE_WARNING"
STATE_MESSAGE="WARNING"
to_debug $STATE , Message is $STATE_MESSAGE
else
STATE="$STATE_OK"
STATE_MESSAGE="OK"
to_debug $STATE , Message is $STATE_MESSAGE
fi
##返回值中注意要包含性能数据,即采用|分隔的后半部数据,且数据单位不能包含中文,否则使用PNP等绘图软件无法正常绘图。
echo "The TEMPERATURE "$STATE_MESSAGE" "-" The CPU's Temperature is "$CPU_TEMP" ℃ ! | 温度=`echo $CPU_TEMP`Celsius;$WARNING;$CRITICAL"
exit $STATE
5、赋予脚本执行权限:
chmod +x /usr/local/nagios/libexec/check_cputemp
6、配置vim /usr/local/nagios/etc/nrpe.cfg,添加如下一行:
echo "command[check_cputemp]=/usr/local/nagios/libexec/check_cputemp -m sensors -w 38 -c 45" >>/usr/local/nagios/etc/nrpe.cfg
重新启动客户端nrpe服务
-w 表示警告值,-c表示关键(紧急)值,自行根据实际情况调整
注意:以上六步均在被监控机上完成。
在客户端测试是否ok,虚拟机测试不成功,需要在物理机上实现
# /usr/local/nagios/libexec/check_cputemp -m sensors -w 38 -c 45
The TEMPERATURE OK - The CPU's Temperature is 14 ℃ ! | 温度=14Celsius;38;45
服务端执行测试:
/usr/local/nagios/libexec/check_nrpe -H 192.168.8.93 -c check_cputemp
7、在Nagios服务端配置服务:
define service{
use generic-service
host_name
需要被监控的hostname
service_description CPU Temperature
check_command check_nrpe!check_cputemp
}
保存后重启nagios服务
nagios系列(六)之nagios实现对服务器cpu温度的监控的更多相关文章
- nagios系列(四)之nagios主动方式监控tcp常用的80/3306等端口监控web/syncd/mysql及url服务
nagios主动方式监控tcp服务web/syncd/mysql及url cd /usr/local/nagios/libexec/ [root@node4 libexec]# ./check_tcp ...
- nagios系列(五)之nagios图形显示的配置及自定义插件检测密码是否修改详解
nagios图形显示的配置 在服务端安装相关软件 #1.图形显示管理的依赖库 yum install cairo pango zlib zlib-devel freetype freetype-dev ...
- nagios系列(三)之nagios被动监控模式之添加系统负载load、swap、磁盘iostat及memory内存监控详解
环境: nagios server:192.168.8.42 host_name:node4.chinasoft.com nagios client:192.168.8.41 host_name:no ...
- nagios系列(八)之nagios通过nsclient监控windows主机
nagios通过nsclient监控windows主机 1.下载NSClient -0.3.8-Win32.rar安装在需要被监控的windows主机中 可以设置密码,此处密码留空 2.通过在nagi ...
- nagios系列(二)之nagios客户端的安装及配置
1.添加nagios用户 echo "------ step 1: add nagios user------" #create user group /usr/sbin/user ...
- Nagios系列1,选择
Zabbix和Nagios哪个更好 zabbix: 1.分布式监控,适合于构建分布式监控系统,具有node,proxy 2种分布式模式 2.自动化功能,自动发现,自动注册主机,自动添加模板,自动添加分 ...
- Nagios学习笔记二:Nagios概述
1.简介 Nagios是插件式的结构,它本身没有任何监控功能,所有的监控都是通过插件进行的,因此其是高度模块化和富于弹性的.Nagios监控的对象可分为两类:主机和服务.主机通常指的是物理主机,如服务 ...
- Nagios详解(基础、安装、配置文件解析及监控实例)
一.Nagios基础 1.简介Nagios是一款开源网络监视工具.可监控网络服务(SMTP.POP3.HTTP.NNTP.ICMP.SNMP.FTP.SSH.PING---).监控主机资源.根据需求设 ...
- Netty4.x中文教程系列(六) 从头开始Bootstrap
Netty4.x中文教程系列(六) 从头开始Bootstrap 其实自从中文教程系列(五)一直不知道自己到底想些什么.加上忙着工作上出现了一些问题.本来想就这么放弃维护了.没想到有朋友和我说百度搜索推 ...
随机推荐
- Luogu 1941 【NOIP2014】飞扬的小鸟 (动态规划)
Luogu 1941 [NOIP2014]飞扬的小鸟 (动态规划) Description Flappy Bird 是一款风靡一时的休闲手机游戏.玩家需要不断控制点击手机屏幕的频率来调节小鸟的飞行高度 ...
- NOIP2018普及组模拟赛
向老师给的模拟赛,还没普及组难... 题目在洛谷团队里. 第一试三道水题,我46分钟就打完了,然后就AK了. 第二试一看,除了第二题要思考一段时间之外,还是比较水的,但是我得了Rank倒1,115分. ...
- A1047. Student List for Course
Zhejiang University has 40000 students and provides 2500 courses. Now given the registered course li ...
- (转)git中关于fetch的使用
将远程仓库的分支及分支最新版本代码拉取到本地: 命令:git fetch 该命令执行后,不会将拉取的分支的最新代码合并到当前分支,仅仅是拉取/下载下来到本地仓库中. 首先,我们使用git branch ...
- 1.C和C++的区别
C和C++的区别 C语言语法简单,但使用不易 C++语法非常庞大复杂,但使用方便,更注重的是它的编程思想(面向对象). 一.第一个C++程序 1.文件扩展名 C++源文件扩展名 .cpp,C ...
- request请求地址
1.String contextPath = httpServletRequest.getServletContext().getContextPath(); /项目名称 2.String conte ...
- GBDT
一.决策树分类 决策树分为两大类,分类树和回归树 分类树用于分类标签值,如晴天/阴天/雾/雨.用户性别.网页是否是垃圾页面 回归树用于预测实数值,如明天的温度.用户的年龄 两者的区别: 分类树的结果不 ...
- mac安装神器brew
安装方法:命令行输入 /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/ma ...
- 转--Python标准库之一句话概括
作者原文链接 想掌握Python标准库,读它的官方文档很重要.本文并非此文档的复制版,而是对每一个库的一句话概括以及它的主要函数,由此用什么库心里就会有数了. 文本处理 string: 提供了字符集: ...
- C#使用Font Awesome字体
这个类是一个开源类,我做了一些功能优化1.如果没有安装Font Awesome字体,可能需要直接去exe路径下使用对应名称字体.2.可以直接返回\uFxxx类型字体,方便winform按钮使用,不然的 ...