Prometheus monitoring items

This post records a set of Prometheus monitoring items (alerting rules). The exporter is node_exporter.

vim rules.yml

groups:
- name: node
  rules:
  - alert: server_status
    expr: up{job="node"} == 0
    for: 15s
    labels:
      severity: 'critical'
    annotations:
      summary: "node_exporter is down"

- name: cluster
  rules:
  - alert: CPU
    expr: (1 - rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100 > 90
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "CPU utilization above 90%, {{ .Labels.name }} current value: {{ $value }}%"
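
  # Sketch (an assumption, not part of the original rule set): the CPU rule above
  # is evaluated for every CPU core separately, so a single busy core can fire it.
  # Averaging the idle rate per instance yields one host-level utilization value.
  # The alert name CPU_avg and the use of the instance label are illustrative choices.
  - alert: CPU_avg
    expr: (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m]))) * 100 > 90
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "Average CPU utilization above 90% on {{ $labels.instance }}, current value: {{ $value }}%"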

#  - alert: LOAD1
#    expr: node_load5 > Logical_CPU_core_total*0.3 or node_load1 > Logical_CPU_core_total*0.4 or node_load15 > Logical_CPU_core_total*0.2
#    for: 5s
#    labels:
#      severity: 'critical'
#    annotations:
#      summary: "Load too high, current value: {{ $value }}"

  - alert: LOAD1
    expr: node_load1 > Logical_CPU_core_total*3
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "load1 > CPU cores * 3, current value: {{ $value }}"

  - alert: LOAD5
    expr: node_load5 > Logical_CPU_core_total*2
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "load5 > CPU cores * 2, current value: {{ $value }}"

  - alert: LOAD15
    expr: node_load15 > Logical_CPU_core_total*2
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "load15 > CPU cores * 2, current value: {{ $value }}"
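
  # Sketch under an assumption: Logical_CPU_core_total does not follow node_exporter's
  # node_ metric naming, so it presumably comes from a custom source. If that metric
  # is unavailable, the logical core count can be derived from node_exporter itself
  # by counting the per-core idle series of each instance.
  # The alert name LOAD1_per_core is hypothetical.
  - alert: LOAD1_per_core
    expr: node_load1 > on (instance) (3 * count by (instance) (node_cpu_seconds_total{mode="idle"}))
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "load1 > CPU cores * 3 on {{ $labels.instance }}, current value: {{ $value }}"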

  - alert: space_root
    expr: (1 - node_filesystem_avail_bytes{fstype=~"xfs|ext4",mountpoint="/"} / node_filesystem_size_bytes{fstype=~"xfs|ext4",mountpoint="/"}) * 100 > 80
    for: 5s
    labels:
      severity: 'critical'
    annotations:
      summary: "Disk usage on / above 80%, current value: {{ $value }}%"

  - alert: space_data
    expr: (1 - node_filesystem_avail_bytes{fstype=~"xfs|ext4",mountpoint="/data"} / node_filesystem_size_bytes{fstype=~"xfs|ext4",mountpoint="/data"}) * 100 > 80
    for: 5s
    labels:
      severity: 'critical'
    annotations:
      summary: "Disk usage on /data above 80%, current value: {{ $value }}%"
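
  # Sketch (an assumption, not part of the original rule set): space_root and
  # space_data differ only in the mountpoint matcher. Dropping that matcher gives
  # one rule covering every xfs/ext4 filesystem, with the mountpoint reported via
  # the label. The alert name space_any_mount is hypothetical; if adopted, it would
  # overlap with the two rules above for / and /data.
  - alert: space_any_mount
    expr: (1 - node_filesystem_avail_bytes{fstype=~"xfs|ext4"} / node_filesystem_size_bytes{fstype=~"xfs|ext4"}) * 100 > 80
    for: 5s
    labels:
      severity: 'critical'
    annotations:
      summary: "Disk usage on {{ $labels.mountpoint }} ({{ $labels.instance }}) above 80%, current value: {{ $value }}%"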

  - alert: upload_rate
    expr: rate(node_network_transmit_bytes_total{device="eth0"}[1m]) / 1048576 > 10
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "Upload rate above 10 MiB/s, current value: {{ $value }} MiB/s"

  - alert: download_rate
    expr: rate(node_network_receive_bytes_total{device="eth0"}[1m]) / 1048576 > 10
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "Download rate above 10 MiB/s, current value: {{ $value }} MiB/s"

  - alert: inode_size
    expr: (1 - node_filesystem_files_free{fstype=~"xfs|ext4",mountpoint="/"} / node_filesystem_files{fstype=~"xfs|ext4",mountpoint="/"}) * 100 > 50
    for: 5s
    labels:
      severity: 'critical'
    annotations:
      summary: "Inode usage on / above 50%, current value: {{ $value }}%"

  - alert: Memory_usage
    expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 80
    for: 5s
    labels:
      severity: 'warning'
    annotations:
      summary: "Memory usage above 80%, current value: {{ $value }}%"

  - alert: iowait
    expr: (avg by (instance) (rate(node_cpu_seconds_total{mode="iowait"}[5m])) * 100) > 50
    for: 5s
    labels:
      severity: 'critical'
    annotations:
      summary: "CPU iowait above 50%, current value: {{ $value }}%"

  - alert: procs_zombie
    expr: procs_zombie > 20
    for: 5s
    labels:
      severity: 'critical'
    annotations:
      summary: "procs_zombie above 20, current value: {{ $value }}"
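
  # Sketch under assumptions: procs_zombie does not follow node_exporter's node_
  # metric naming, so it presumably comes from a custom source. node_exporter's
  # processes collector (disabled by default, enabled with --collector.processes)
  # exposes node_processes_state with a state label, where "Z" denotes zombies.
  # The alert name zombie_processes is hypothetical.
  - alert: zombie_processes
    expr: node_processes_state{state="Z"} > 20
    for: 5s
    labels:
      severity: 'critical'
    annotations:
      summary: "Zombie processes above 20 on {{ $labels.instance }}, current value: {{ $value }}"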

  - alert: logined_users
    expr: logined_users_total > 25
    for: 5s
    labels:
      severity: 'critical'
    annotations:
      summary: "logined_users above 25, current value: {{ $value }}"
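
The rules in rules.yml only take effect once Prometheus loads the file, and the server_status rule relies on a scrape job named node. The fragment below is a minimal sketch of the matching prometheus.yml, not the author's actual configuration: the target address and the evaluation interval are assumed placeholder values.

```yaml
# prometheus.yml (fragment); the target address and interval are assumptions
global:
  evaluation_interval: 15s          # how often the alert rules are evaluated

rule_files:
  - "rules.yml"                     # the rule file edited above

scrape_configs:
  - job_name: "node"                # must match job="node" in the server_status rule
    static_configs:
      - targets: ["localhost:9100"] # 9100 is node_exporter's default port
```

Before reloading Prometheus, the rule file can be validated with `promtool check rules rules.yml`.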
