Monitoring tools that everyone's currently using
Although a lot of new tools have arrived since 2011, it's clear that older open source tools like Nagios, and Nagios alternatives like Zabbix and Icinga, still dominate the market, with 70% of the companies we spoke to still using these tools for their core monitoring & alerting.
Around 70% of the companies used more than one monitoring tool, with most using an average of two. Nagios/Graphite configurations were most common, with many also using New Relic. However, only two of the companies we spoke to actually paid for New Relic, with most of the companies using the free version as they found the paid version too expensive.
In the "other" category, there were a lot of different tools with no particular one standing out. Types of tools that fell into this category were SaaS monitoring tools such as Librato & Datadog, used by several smaller start-ups, or many older open source tools like Cacti or Munin. Some AWS users rely on CloudWatch, and there were even a few custom built solutions.

Graph 1: Percentage of companies with monitoring tools deployed.
If we look at tool usage versus the number of servers the companies manage (< 20 being new startup services, and all the way to > 1000 servers for the large online services), you can see that the proportion of older open source tools like Nagios, or paid on-premise tools goes up as the service gets larger, whereas the smaller, newer services are more likely to use developer focused tools like Graphite, LogStash and New Relic.
This makes sense, as many of the larger services are older (> 5 years old) so have legacy monitoring infrastructure, and also have the resources to hire a dedicated operations team who tend to bring in the tools their most familiar with, namely Nagios or Nagios alternatives. They also have more money to pay for monitoring tools like Splunk (which everyone would love to have if they could afford it) or AppDynamics.
The newer smaller services tend not to have any DevOps/Operations people in their company, so developers tend to use simpler-to-install SaaS monitoring tools, or tools that help them such as Graphite or LogStash. There seems to be a tipping point between 50-100 servers when the company has the resources to bring in a DevOps/Operations person or team and they start bringing in the infrastructure monitoring tools like Nagios to provide the coverage they need.

Graph 2: Tool Usage vs. Number of Servers Managed
Key Trends:
1. Many people found the newer services lacked the flexibility of open source solutions with their ability to customize them to their requirements, and didn't like the idea of learning a proprietary system with its own plugin design and features. So they built their own "kit car".
2. While the services became larger, the trend was the move towards microservices, with different cross-functional development teams building, deploying and supporting their own parts of the service.
3. There are some simpler things that can be done to reduce spammy alerts with potential of predictive & more intelligent alerting using machine learning.
[excerpt from Outlyer]
Monitoring tools that everyone's currently using的更多相关文章
- PostgreSQL Performance Monitoring Tools
PostgreSQL Performance Monitoring Tools https://github.com/CloudServer/postgresql-perf-tools This pa ...
- 4. Traffic monitoring tools (流量监控工具 10个)
4. Traffic monitoring tools (流量监控工具 10个)EttercapNtop SolarWinds已经创建并销售了针对系统管理员的数十种专用工具. 安全相关工具包括许多网络 ...
- Top 10 Free Wireless Network hacking/monitoring tools for ethical hackers and businesses
There are lots of free tools available online to get easy access to the WiFi networks intended to he ...
- Top 12 Best Free Network Monitoring Tools (12种免费网络监控工具)
1) Fiddler Fiddler(几乎)是适用于任何平台和任何操作系统的最好的免费网络工具,并提供了一些广受欢迎的关键特性.如:性能测试.捕捉记录HTTP/HTTPs请求响应.进行web调试等很多 ...
- Java Monitoring&Troubleshooting Tools
JDK Tools and Utilities Monitoring Tools You can use the following tools to monitor JVM performance ...
- troubleshooting tools in JDK 7--转载
This chapter describes in detail the troubleshooting tools that are available in JDK 7. In addition, ...
- Java Performance Optimization Tools and Techniques for Turbocharged Apps--reference
Java Performance Optimization by: Pierre-Hugues Charbonneau reference:http://refcardz.dzone.com/refc ...
- Flink监控:Monitoring Apache Flink Applications
This post originally appeared on the Apache Flink blog. It was reproduced here under the Apache Lice ...
- MySQL Performance Tuning: Tips, Scripts and Tools
With MySQL, common configuration mistakes can cause serious performance problems. In fact, if you mi ...
随机推荐
- 为什么可以Ping通IP地址,但Ping不通域名?
能否ping通IP地址,与能否解析域名是两回事不能ping通IP地址,说明对方禁止ICMP报文或对方没有开机等解析域名只是将域名翻译成IP地址,不论该IP地址是否能够正常访问 问题是ping域名的时候 ...
- COUNT(DISTINCT a.TransportOrderID)的用法
DECLARE @StartDate DATETIME= '2017-12-20 00:00:00';DECLARE @EndDate DATETIME= '2017-12-26 00:00:00'; ...
- css开发素材网址
1.border-collapse 为表格设置合并边框模型 2.border-spacing border-spacing 属性设置相邻单元格的边框间的距离 backface-visibility:h ...
- java:system根据输入的内容,然后输出(字节流)
把输入的内容输出来:根据system.in的内容System.out.println输出出来 都是字节流,的形式: //限制读取的字符长度 //字节流 InputStream ips = System ...
- 机器学习(三)—线性回归、逻辑回归、Softmax回归 的区别
1.什么是回归? 是一种监督学习方式,用于预测输入变量和输出变量之间的关系,等价于函数拟合,选择一条函数曲线使其更好的拟合已知数据且更好的预测未知数据. 2.线性回归 于一个一般的线性模型而言,其 ...
- 分布式_理论_03_2PC
一.前言 五.参考资料 1.分布式理论(三)—— 一致性协议之 2PC 2.分布式理论(三) - 2PC协议
- loj 6085.「美团 CodeM 资格赛」优惠券
题目: 一个有门禁的大楼,初始时里面没有人. 现在有一些人在进出大楼,每个人都有一个唯一的编号.现在有他们进出大楼的记录,但是有些被污染了,只能知道这里有一条记录,具体并不能知道. 一个人只有进大楼, ...
- 洛谷 P3048 [USACO12FEB]牛的IDCow IDs
题目描述 Being a secret computer geek, Farmer John labels all of his cows with binary numbers. However, ...
- codevs 2503 失恋28天-缝补礼物
题目描述 Description 话说上回他给女孩送了n件礼物,由于是廉价的所以全部都坏掉了,女孩很在意这些礼物,所以决定自己缝补,但是人生苦短啊,女孩时间有限,她总共有m分钟能去缝补礼物.由于损坏程 ...
- JSplitPane的简单实现
import java.awt.Color; import javax.swing.ImageIcon; import javax.swing.JFrame; import javax.swing.J ...