Monitoring MySQL/Redis/MongoDB Status with the Open-Falcon Monitoring System
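Background:
Open-Falcon is an enterprise-grade monitoring system for internet companies, open-sourced by Xiaomi's operations team. For installation and usage instructions, see the official site: http://open-falcon.org/. It is a fairly comprehensive monitoring solution and exposes a range of APIs: push data in the required format and it draws the graphs, and it also supports alerting, clustering, and more.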
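To make "data in the required format" concrete, here is a minimal sketch of one data point as the agent's push API expects it. The field names follow the push format used by the scripts in this post; the helper name `build_metric` and the sample value are hypothetical and only for illustration.

```python
import json
import time


def build_metric(endpoint, metric, value, counter_type="GAUGE", step=60, tags=""):
    """Build one data point for the Open-Falcon agent push API (hypothetical helper)."""
    return {
        "endpoint": endpoint,          # name the data is filed under
        "metric": metric,              # metric name
        "timestamp": int(time.time()), # Unix time of the sample, in seconds
        "step": step,                  # reporting interval, in seconds
        "value": value,                # the sampled value
        "counterType": counter_type,   # "GAUGE" or "COUNTER"
        "tags": tags,                  # optional "k=v,k2=v2" string
    }


# The push API takes a JSON array of data points. The value 0 is made up.
payload = [build_metric("mysql-21:3306", "Undo_Log_Length", 0)]
body = json.dumps(payload)

# Sending it needs the `requests` package and a reachable agent, for example:
# import requests
# requests.post("http://192.168.200.86:1988/v1/push", data=body)
```

Building the payload separately from sending it keeps the collection logic easy to check without a running agent.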
Monitoring:
1) MySQL collection script (mysql_monitor.py)
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from __future__ import division
    import MySQLdb
    import datetime
    import time
    import fileinput
    import requests
    import json
    import re


    class MySQLMonitorInfo(object):
        """Collects status data from a single MySQL instance."""

        def __init__(self, host, port, user, password):
            self.host = host
            self.port = port
            self.user = user
            self.password = password

        def stat_info(self):
            """Return SHOW GLOBAL STATUS as a dict (empty dict on failure)."""
            try:
                m = MySQLdb.connect(host=self.host, user=self.user, passwd=self.password, port=self.port, charset='utf8')
                query = "SHOW GLOBAL STATUS"
                cursor = m.cursor()
                cursor.execute(query)
                Str_string = cursor.fetchall()
                Status_dict = {}
                for Str_key, Str_value in Str_string:
                    Status_dict[Str_key] = Str_value
                cursor.close()
                m.close()
                return Status_dict
            except Exception as e:
                print(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
                print(e)
                Status_dict = {}
                return Status_dict

        def engine_info(self):
            """Return the InnoDB history list length parsed from the engine status text."""
            try:
                m = MySQLdb.connect(host=self.host, user=self.user, passwd=self.password, port=self.port, charset='utf8')
                _engine_regex = re.compile(r'(History list length) ([0-9]+\.?[0-9]*)\n')
                query = "SHOW ENGINE INNODB STATUS"
                cursor = m.cursor()
                cursor.execute(query)
                Str_string = cursor.fetchone()
                # The row is (Type, Name, Status); only the status text is needed
                a, b, c = Str_string
                cursor.close()
                m.close()
                return dict(_engine_regex.findall(c))
            except Exception as e:
                print(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
                print(e)
                return dict(History_list_length=0)
    if __name__ == '__main__':
        open_falcon_api = 'http://192.168.200.86:1988/v1/push'
        db_list = []
        for line in fileinput.input():
            # Skip blank lines so that split(',') below does not fail
            if line.strip():
                db_list.append(line.strip())
        for db_info in db_list:
            # host,port,user,password,endpoint,metric = db_info.split(',')
            host, port, user, password, endpoint = db_info.split(',')
            timestamp = int(time.time())
            step = 60
            # tags = "port=%s" % port
            tags = ""
            conn = MySQLMonitorInfo(host, int(port), user, password)
            stat_info = conn.stat_info()
            engine_info = conn.engine_info()
            mysql_stat_list = []
            monitor_keys = [
                ('Com_select', 'COUNTER'),
                ('Qcache_hits', 'COUNTER'),
                ('Com_insert', 'COUNTER'),
                ('Com_update', 'COUNTER'),
                ('Com_delete', 'COUNTER'),
                ('Com_replace', 'COUNTER'),
                ('MySQL_QPS', 'COUNTER'),
                ('MySQL_TPS', 'COUNTER'),
                ('ReadWrite_ratio', 'GAUGE'),
                ('Innodb_buffer_pool_read_requests', 'COUNTER'),
                ('Innodb_buffer_pool_reads', 'COUNTER'),
                ('Innodb_buffer_read_hit_ratio', 'GAUGE'),
                ('Innodb_buffer_pool_pages_flushed', 'COUNTER'),
                ('Innodb_buffer_pool_pages_free', 'GAUGE'),
                ('Innodb_buffer_pool_pages_dirty', 'GAUGE'),
                ('Innodb_buffer_pool_pages_data', 'GAUGE'),
                ('Bytes_received', 'COUNTER'),
                ('Bytes_sent', 'COUNTER'),
                ('Innodb_rows_deleted', 'COUNTER'),
                ('Innodb_rows_inserted', 'COUNTER'),
                ('Innodb_rows_read', 'COUNTER'),
                ('Innodb_rows_updated', 'COUNTER'),
                ('Innodb_os_log_fsyncs', 'COUNTER'),
                ('Innodb_os_log_written', 'COUNTER'),
                ('Created_tmp_disk_tables', 'COUNTER'),
                ('Created_tmp_tables', 'COUNTER'),
                ('Connections', 'COUNTER'),
                ('Innodb_log_waits', 'COUNTER'),
                ('Slow_queries', 'COUNTER'),
                ('Binlog_cache_disk_use', 'COUNTER')
            ]
            for _key, falcon_type in monitor_keys:
                if _key == 'MySQL_QPS':
                    _value = int(stat_info.get('Com_select', 0)) + int(stat_info.get('Qcache_hits', 0))
                elif _key == 'MySQL_TPS':
                    _value = int(stat_info.get('Com_insert', 0)) + int(stat_info.get('Com_update', 0)) + int(stat_info.get('Com_delete', 0)) + int(stat_info.get('Com_replace', 0))
                elif _key == 'Innodb_buffer_read_hit_ratio':
                    try:
                        _value = round((int(stat_info.get('Innodb_buffer_pool_read_requests', 0)) - int(stat_info.get('Innodb_buffer_pool_reads', 0))) / int(stat_info.get('Innodb_buffer_pool_read_requests', 0)) * 100, 3)
                    except ZeroDivisionError:
                        _value = 0
                elif _key == 'ReadWrite_ratio':
                    try:
                        _value = round((int(stat_info.get('Com_select', 0)) + int(stat_info.get('Qcache_hits', 0))) / (int(stat_info.get('Com_insert', 0)) + int(stat_info.get('Com_update', 0)) + int(stat_info.get('Com_delete', 0)) + int(stat_info.get('Com_replace', 0))), 2)
                    except ZeroDivisionError:
                        _value = 0
                else:
                    _value = int(stat_info.get(_key, 0))
                falcon_format = {
                    'Metric': '%s' % (_key),
                    'Endpoint': endpoint,
                    'Timestamp': timestamp,
                    'Step': step,
                    'Value': _value,
                    'CounterType': falcon_type,
                    'TAGS': tags
                }
                mysql_stat_list.append(falcon_format)
            # _key : History list length, reported as Undo_Log_Length
            for _key, _value in engine_info.items():
                _key = "Undo_Log_Length"
                falcon_format = {
                    'Metric': '%s' % (_key),
                    'Endpoint': endpoint,
                    'Timestamp': timestamp,
                    'Step': step,
                    'Value': int(_value),
                    'CounterType': "GAUGE",
                    'TAGS': tags
                }
                mysql_stat_list.append(falcon_format)
            print(json.dumps(mysql_stat_list, sort_keys=True, indent=4))
            requests.post(open_falcon_api, data=json.dumps(mysql_stat_list))
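Metric notes: a COUNTER metric is shown as a per-second rate (the change in the raw counter between two samples divided by the elapsed time); a GAUGE metric is shown as the raw value that was pushed.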
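Metric | Type | Description |
---|---|---|
Undo_Log_Length | GAUGE | Number of undo transactions not yet purged (InnoDB history list length) |
Com_select | COUNTER | SELECT statements per second (excludes query cache hits) |
Com_insert | COUNTER | INSERT statements per second |
Com_update | COUNTER | UPDATE statements per second |
Com_delete | COUNTER | DELETE statements per second |
Com_replace | COUNTER | REPLACE statements per second |
MySQL_QPS | COUNTER | QPS (Com_select + Qcache_hits) |
MySQL_TPS | COUNTER | TPS (Com_insert + Com_update + Com_delete + Com_replace) |
ReadWrite_ratio | GAUGE | Read/write ratio (reads / writes, computed from the cumulative counters) |
Innodb_buffer_pool_read_requests | COUNTER | InnoDB buffer pool read requests per second |
Innodb_buffer_pool_reads | COUNTER | Reads per second that had to go to disk |
Innodb_buffer_read_hit_ratio | GAUGE | InnoDB buffer pool hit ratio (%) |
Innodb_buffer_pool_pages_flushed | COUNTER | InnoDB buffer pool pages flushed to disk per second |
Innodb_buffer_pool_pages_free | GAUGE | Number of free pages in the InnoDB buffer pool |
Innodb_buffer_pool_pages_dirty | GAUGE | Number of dirty pages in the InnoDB buffer pool |
Innodb_buffer_pool_pages_data | GAUGE | Number of data pages in the InnoDB buffer pool |
Bytes_received | COUNTER | Bytes received per second |
Bytes_sent | COUNTER | Bytes sent per second |
Innodb_rows_deleted | COUNTER | Rows deleted from InnoDB tables per second |
Innodb_rows_inserted | COUNTER | Rows inserted into InnoDB tables per second |
Innodb_rows_read | COUNTER | Rows read from InnoDB tables per second |
Innodb_rows_updated | COUNTER | Rows updated in InnoDB tables per second |
Innodb_os_log_fsyncs | COUNTER | Redo log fsync calls per second |
Innodb_os_log_written | COUNTER | Bytes written to the redo log per second |
Created_tmp_disk_tables | COUNTER | On-disk temporary tables created per second |
Created_tmp_tables | COUNTER | Internal temporary tables created per second (in memory or on disk) |
Connections | COUNTER | Connection attempts per second |
Innodb_log_waits | COUNTER | Waits per second caused by the InnoDB log buffer being too small |
Slow_queries | COUNTER | Slow queries per second |
Binlog_cache_disk_use | COUNTER | Transactions per second that overflowed the binlog cache |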
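The COUNTER rows above are per-second rates derived from cumulative totals. The sketch below shows that arithmetic for two successive samples. It illustrates the idea only and is not Open-Falcon's own code: the function name, the sample numbers, and the choice to return `None` when a counter goes backwards (for example after a MySQL restart) are all assumptions.

```python
def counter_rate(prev_value, prev_ts, curr_value, curr_ts):
    """Per-second rate between two samples of a cumulative counter.

    Returns None when no rate can be computed: no time has elapsed, or the
    counter went backwards (assumed here to mean the server was restarted).
    """
    elapsed = curr_ts - prev_ts
    delta = curr_value - prev_value
    if elapsed <= 0 or delta < 0:
        return None
    return delta / float(elapsed)


# Hypothetical Com_select samples taken 60 seconds apart (step = 60)
rate = counter_rate(1200000, 1700000000, 1203000, 1700000060)
print(rate)  # 50.0
```

This is why the script pushes the raw cumulative value for COUNTER metrics and leaves the rate calculation to the monitoring system.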
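Usage: the script reads the list of databases from a configuration file and collects from each one in turn. The configuration file (mysqldb_list.txt) has the following format:
IP,Port,User,Password,endpoint
    192.168.2.21,3306,root,123,mysql-21:3306
    192.168.2.88,3306,root,123,mysql-88:3306
Finally, run:
    python mysql_monitor.py mysqldb_list.txt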
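The script splits each configuration line on commas without any validation. A slightly more defensive parser might look like the sketch below. The names `MySQLTarget` and `parse_targets` are hypothetical, and skipping `#` comment lines is an extra that the original script does not have. Note that a password containing a comma cannot be expressed in this file format.

```python
from collections import namedtuple

# One monitored instance, in the field order used by mysqldb_list.txt
MySQLTarget = namedtuple("MySQLTarget", "host port user password endpoint")


def parse_targets(lines):
    """Parse 'IP,Port,User,Password,endpoint' lines into MySQLTarget tuples."""
    targets = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Ignore blank lines and comment lines (comment support is an assumption)
        if not line or line.startswith("#"):
            continue
        fields = line.split(",")
        if len(fields) != 5:
            raise ValueError("line %d: expected 5 fields, got %d" % (lineno, len(fields)))
        host, port, user, password, endpoint = fields
        targets.append(MySQLTarget(host, int(port), user, password, endpoint))
    return targets


sample = [
    "192.168.2.21,3306,root,123,mysql-21:3306\n",
    "\n",
    "192.168.2.88,3306,root,123,mysql-88:3306\n",
]
targets = parse_targets(sample)
print(targets[0].endpoint)  # mysql-21:3306
```

Failing loudly on a malformed line, with its line number, is easier to debug than the unpacking error the one-line `split(',')` would raise.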
2) Redis collection script (redis_monitor.py)
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    import json
    import time
    import redis
    import requests
    import fileinput
    import datetime


    class RedisMonitorInfo(object):
        """Collects status data from a single Redis instance."""

        def __init__(self, host, port, password):
            self.host = host
            self.port = port
            self.password = password

        def stat_info(self):
            """Return the output of INFO as a dict (empty dict on failure)."""
            try:
                r = redis.Redis(host=self.host, port=self.port, password=self.password)
                stat_info = r.info()
                return stat_info
            except Exception as e:
                print(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
                print(e)
                return dict()

        def cmdstat_info(self):
            """Return the output of INFO Commandstats as a dict (empty dict on failure)."""
            try:
                r = redis.Redis(host=self.host, port=self.port, password=self.password)
                cmdstat_info = r.info('Commandstats')
                return cmdstat_info
            except Exception as e:
                print(datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S"))
                print(e)
                return dict()
    if __name__ == '__main__':
        open_falcon_api = 'http://192.168.200.86:1988/v1/push'
        db_list = []
        for line in fileinput.input():
            # Skip blank lines so that split(',') below does not fail
            if line.strip():
                db_list.append(line.strip())
        for db_info in db_list:
            # host,port,password,endpoint,metric = db_info.split(',')
            host, port, password, endpoint = db_info.split(',')
            timestamp = int(time.time())
            step = 60
            falcon_type = 'COUNTER'
            # tags = "port=%s" % port
            tags = ""
            conn = RedisMonitorInfo(host, int(port), password)
            # Call count of each command; as COUNTERs these become calls per second
            redis_cmdstat_dict = {}
            redis_cmdstat_list = []
            cmdstat_info = conn.cmdstat_info()
            for cmdkey in cmdstat_info:
                redis_cmdstat_dict[cmdkey] = cmdstat_info[cmdkey]['calls']
            for _key, _value in redis_cmdstat_dict.items():
                falcon_format = {
                    'Metric': '%s' % (_key),
                    'Endpoint': endpoint,
                    'Timestamp': timestamp,
                    'Step': step,
                    'Value': int(_value),
                    'CounterType': falcon_type,
                    'TAGS': tags
                }
                redis_cmdstat_list.append(falcon_format)
            # General Redis status; add or remove items as needed. String values must be converted to int
            redis_stat_list = []
            monitor_keys = [
                ('connected_clients', 'GAUGE'),
                ('blocked_clients', 'GAUGE'),
                ('used_memory', 'GAUGE'),
                ('used_memory_rss', 'GAUGE'),
                ('mem_fragmentation_ratio', 'GAUGE'),
                ('total_commands_processed', 'COUNTER'),
                ('rejected_connections', 'COUNTER'),
                ('expired_keys', 'COUNTER'),
                ('evicted_keys', 'COUNTER'),
                ('keyspace_hits', 'COUNTER'),
                ('keyspace_misses', 'COUNTER'),
                ('keyspace_hit_ratio', 'GAUGE'),
                ('keys_num', 'GAUGE'),
            ]
            stat_info = conn.stat_info()
            for _key, falcon_type in monitor_keys:
                # Compute the hit ratio as a percentage
                if _key == 'keyspace_hit_ratio':
                    try:
                        _value = round(float(stat_info.get('keyspace_hits', 0)) / (int(stat_info.get('keyspace_hits', 0)) + int(stat_info.get('keyspace_misses', 0))) * 100, 2)
                    except ZeroDivisionError:
                        _value = 0
                # The fragmentation ratio is a float
                elif _key == 'mem_fragmentation_ratio':
                    _value = float(stat_info.get(_key, 0))
                # Sum the number of keys over db0..db15
                elif _key == 'keys_num':
                    _value = 0
                    for i in range(16):
                        db_key = 'db' + str(i)
                        _num = stat_info.get(db_key)
                        if _num:
                            _value += int(_num.get('keys'))
                # Everything else is collected as an int
                else:
                    try:
                        _value = int(stat_info[_key])
                    except (KeyError, ValueError, TypeError):
                        continue
                falcon_format = {
                    'Metric': '%s' % (_key),
                    'Endpoint': endpoint,
                    'Timestamp': timestamp,
                    'Step': step,
                    'Value': _value,
                    'CounterType': falcon_type,
                    'TAGS': tags
                }
                redis_stat_list.append(falcon_format)
            load_data = redis_stat_list + redis_cmdstat_list
            print(json.dumps(load_data, sort_keys=True, indent=4))
            requests.post(open_falcon_api, data=json.dumps(load_data))
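Metric notes: a COUNTER metric is shown as a per-second rate (the change in the raw counter between two samples divided by the elapsed time); a GAUGE metric is shown as the raw value that was pushed.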
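Metric | Type | Description |
---|---|---|
connected_clients | GAUGE | Number of connected clients |
blocked_clients | GAUGE | Number of blocked clients |
used_memory | GAUGE | Total memory allocated by Redis |
used_memory_rss | GAUGE | Total memory allocated to Redis by the OS |
mem_fragmentation_ratio | GAUGE | Memory fragmentation ratio, used_memory_rss/used_memory |
total_commands_processed | COUNTER | Commands executed per second, a fairly accurate QPS |
rejected_connections | COUNTER | Rejected connections per second |
expired_keys | COUNTER | Expired keys per second |
evicted_keys | COUNTER | Evicted keys per second |
keyspace_hits | COUNTER | Key lookups that hit, per second |
keyspace_misses | COUNTER | Key lookups that missed, per second |
keyspace_hit_ratio | GAUGE | Key hit ratio (%) |
keys_num | GAUGE | Number of keys |
cmdstat_* | COUNTER | Executions per second of each command |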
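Two rows in this table, keyspace_hit_ratio and keys_num, are derived rather than read directly from INFO. The sketch below isolates both calculations so they can be checked against a sample dictionary without a Redis server. The function names and sample numbers are hypothetical. The `dbN` entries are assumed to be dictionaries with a `keys` field, which is the shape the script above relies on.

```python
def keyspace_hit_ratio(info):
    """Hit ratio in percent: hits / (hits + misses) * 100, or 0.0 with no lookups."""
    hits = int(info.get("keyspace_hits", 0))
    misses = int(info.get("keyspace_misses", 0))
    total = hits + misses
    if total == 0:
        return 0.0
    return round(hits * 100.0 / total, 2)


def total_keys(info):
    """Sum the 'keys' field of every dbN entry, however many databases exist."""
    total = 0
    for name, value in info.items():
        if name.startswith("db") and name[2:].isdigit() and isinstance(value, dict):
            total += int(value.get("keys", 0))
    return total


# Made-up sample in the shape of a parsed INFO reply
sample_info = {
    "keyspace_hits": 900,
    "keyspace_misses": 100,
    "db0": {"keys": 10, "expires": 2, "avg_ttl": 0},
    "db3": {"keys": 5, "expires": 0, "avg_ttl": 0},
}
print(keyspace_hit_ratio(sample_info))  # 90.0
print(total_keys(sample_info))          # 15
```

Scanning for `dbN` names instead of looping over a fixed `range(16)` avoids undercounting on servers configured with more than 16 databases.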
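Usage: the script reads the list of databases from a configuration file and collects from each one in turn. The configuration file (redisdb_list.txt) has the following format:
IP,Port,Password,endpoint
    192.168.1.56,7021,zhoujy,redis-56:7021
    192.168.1.55,7021,zhoujy,redis-55:7021
Finally, run:
    python redis_monitor.py redisdb_list.txt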
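Each run of the script pushes a single sample with step = 60, so it has to be run once every 60 seconds for the graphs to stay continuous. A cron entry is the usual way to do that. Where cron is not an option, a small loop such as the sketch below can do the job. The name `run_every` and its parameters are hypothetical and not part of the original scripts.

```python
import time


def run_every(step, collect, iterations=None, sleep=time.sleep, clock=time.time):
    """Call collect() once every `step` seconds.

    The next run time is computed from the start time rather than from the
    end of the previous run, so slow collections do not accumulate drift.
    `iterations` limits the number of runs (None means run forever).
    """
    next_run = clock()
    count = 0
    while iterations is None or count < iterations:
        collect()
        count += 1
        next_run += step
        delay = next_run - clock()
        if delay > 0:
            sleep(delay)
    return count


# Example (runs forever), assuming collect_once() wraps the collection logic:
# run_every(60, collect_once)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised with fake time in a test.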
3) MongoDB collection script (mongodb_monitor.py)
...to be added later
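4) Other related monitoring (requires the agent to be installed), for example the following metrics:
Alert item | Trigger condition | Notes |
---|---|---|
load.1min | all(#3)>10 | The Redis server is overloaded and its processing capacity drops |
cpu.idle | all(#3)<10 | CPU idle is too low and processing capacity drops |
df.bytes.free.percent | all(#3)<20 | Less than 20% of disk space is free, which affects RDB and AOF persistence on replicas |
mem.memfree.percent | all(#3)<15 | Less than 15% of memory is free; Redis risks the OOM killer and swapping |
mem.swapfree.percent | all(#3)<80 | More than 20% of swap is in use; risk of degraded Redis performance or OOM |
net.if.out.bytes | all(#3)>94371840 | Outbound network traffic exceeds 90 MB, which affects Redis response times |
net.if.in.bytes | all(#3)>94371840 | Inbound network traffic exceeds 90 MB, which affects Redis response times |
disk.io.util | all(#3)>90 | Disk I/O may be saturated, which affects persistence on replicas and can block writes |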
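A trigger condition such as all(#3)>10 fires only when each of the three most recent samples exceeds the threshold, so a single spike does not raise an alert. The sketch below mimics that rule. It is an illustration rather than Open-Falcon's implementation: the function name and sample values are hypothetical, and returning False when fewer than n samples exist is an assumption.

```python
import operator

# Comparison operators that can appear in a trigger condition
OPS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def all_last(values, n, op, threshold):
    """True when the most recent n samples all satisfy `value op threshold`."""
    if len(values) < n:
        return False  # assumption: not enough history means no alert
    compare = OPS[op]
    return all(compare(v, threshold) for v in values[-n:])


load_1min = [4.2, 11.0, 12.5, 10.8]      # made-up samples, oldest first
print(all_last(load_1min, 3, ">", 10))   # True
cpu_idle = [8.0, 12.0, 9.0]
print(all_last(cpu_idle, 3, "<", 10))    # False, the middle sample is 12.0

# The network threshold in the table is 90 MB expressed in bytes
print(90 * 1024 * 1024)                  # 94371840
```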
Related documents:
https://github.com/iambocai/falcon-monit-scripts (redis monitor)
https://github.com/ZhuoRoger/redismon (redis monitor)