Official site: https://opentelemetry.io/about/

OpenTelemetry is the next major version of the OpenTracing and OpenCensus projects

The leadership of OpenCensus and OpenTracing have come together to create OpenTelemetry, and it will supersede both projects. You can read more in this post about the OpenTelemetry roadmap.

OpenTelemetry is an open source observability framework. It is a CNCF Sandbox member, formed through a merger of the OpenTracing and OpenCensus projects. The goal of OpenTelemetry is to provide a general-purpose API, SDK, and related tools required for the instrumentation of cloud-native software, frameworks, and libraries.

What is Observability?

The term observability stems from the discipline of control theory and refers to how well a system can be understood on the basis of the telemetry that it produces.

In software, observability typically refers to telemetry produced by services and is divided into three major verticals:

  • Tracing, aka distributed tracing, provides insight into the full lifecycles, aka traces, of requests to the system, allowing you to pinpoint failures and performance issues.
  • Metrics provide quantitative information about processes running inside the system, including counters, gauges, and histograms.
  • Logging provides insight into application-specific messages emitted by processes.
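To make the three verticals concrete, here is a minimal sketch in plain Python. It is not the OpenTelemetry data model or API: every class and field name below (Span, Counter, Gauge, Histogram, LogRecord) is a hypothetical stand-in chosen for illustration, showing only what kind of data each vertical carries.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    """One timed operation inside a trace (hypothetical shape)."""
    trace_id: str                      # shared by every span of one request
    name: str
    parent_id: Optional[str] = None    # None marks the root span
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    start: float = field(default_factory=time.time)
    end: Optional[float] = None

    def finish(self) -> None:
        self.end = time.time()


class Counter:
    """Metric that only ever increases, e.g. requests served."""
    def __init__(self) -> None:
        self.value = 0

    def add(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Metric holding the latest observed value, e.g. queue depth."""
    def __init__(self) -> None:
        self.value = 0.0

    def set(self, value: float) -> None:
        self.value = value


class Histogram:
    """Counts observations per bucket; bounds are inclusive upper limits."""
    def __init__(self, bounds: List[float]) -> None:
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # last slot is overflow

    def record(self, value: float) -> None:
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                self.counts[i] += 1
                return
        self.counts[-1] += 1


@dataclass
class LogRecord:
    """Application-specific message, optionally tied to a trace."""
    message: str
    trace_id: Optional[str] = None


# One request produces data in all three verticals.
root = Span(trace_id=uuid.uuid4().hex, name="GET /checkout")
child = Span(trace_id=root.trace_id, name="db.query", parent_id=root.span_id)
child.finish()
root.finish()

requests_served = Counter()
requests_served.add()
latency_ms = Histogram([10, 100, 1000])
latency_ms.record((root.end - root.start) * 1000)
log = LogRecord("checkout completed", trace_id=root.trace_id)
```

The shared trace_id is what lets a backend join the spans of one request, and it is also a natural key for attaching log messages to the trace that produced them.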

OpenTelemetry is an effort to combine all three verticals into a single set of system components and language-specific telemetry libraries. It is meant to replace both the OpenTracing project, which focused exclusively on tracing, and the OpenCensus project, which focused on tracing and metrics.

OpenTelemetry will not initially support logging, though we aim to incorporate this over time.

Where can I read the OpenTelemetry specification?

The spec is available in the open-telemetry/specification repo on GitHub.

Observability, Outputs, and High-Quality Telemetry

Observability is a fashionable word with some admirably nerdy and academic origins. In control theory, “observability” measures how well we can understand the internals of a given system using only its external outputs. If you’ve ever deployed or operated a modern, microservice-based software application, you have no doubt struggled to understand its performance and behavior, and that’s because those “outputs” are usually meager at best. We can’t understand a complex system if it’s a black box. And the only way to light up those black boxes is with high-quality telemetry: distributed traces, metrics, logs, and more.

So how can we get our hands, and our tools, on precise, low-overhead telemetry from the entirety of a modern software stack? One way would be to carefully instrument every microservice, piece by piece, and layer by layer. This would literally work, but it’s also a complete non-starter: we’d spend as much time on the measurement as we would on the software itself! We need telemetry as a built-in feature of our services.

The OpenTelemetry project is designed to make this vision a reality for our industry, but before we describe it in more detail, we should first cover the history and context around OpenTracing and OpenCensus.

OpenTracing and OpenCensus

In practice, there are several flavors (or “verticals” in the diagram) of telemetry data, and then several integration points (or “layers” in the diagram) available for each. Broadly, the cloud-native telemetry landscape is dominated by distributed traces, timeseries metrics, and logs; and end-users typically integrate with a thin instrumentation API or via straightforward structured data formats that describe those traces, metrics, or logs.

For several years now, there has been a well-recognized need for industry-wide collaboration in order to amortize the shared cost of software instrumentation. OpenTracing and OpenCensus have led the way in that effort, and while each project made different architectural choices, the biggest problem with either project has been the fact that there were two of them. And, further, that the two projects weren’t working together and striving for mutual compatibility.

In many ways, it’s most accurate to think of OpenTelemetry as the next major version of both OpenTracing and OpenCensus. Like any version upgrade, we will try to make it easy for both new and existing end-users, but we recognize that the main benefit to the ecosystem is the consolidation itself – not some specific and shiny new feature – and we are prioritizing our own efforts accordingly.

This model still works with the emergence of OpenTelemetry, which is aimed at the instrumentation space. End users will be able to instrument their applications or frameworks with the OpenTelemetry SDK and use Jaeger as the backend for tracing data.
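The split between instrumentation and backend can be sketched as follows. This is a plain-Python illustration, not the OpenTelemetry SDK or a Jaeger client: Tracer, start_span, and the exporter callback are hypothetical names, and the in-memory list merely stands in for a tracing backend. The point is that application code depends only on the tracer, so the backend can be swapped without touching the instrumentation.

```python
import time
from contextlib import contextmanager
from typing import Any, Callable, Dict, Iterator, List

SpanData = Dict[str, Any]
Exporter = Callable[[SpanData], None]  # hypothetical: anything that accepts a finished span


class Tracer:
    """Instrumentation-facing object; knows nothing about the backend."""

    def __init__(self, exporter: Exporter) -> None:
        self._exporter = exporter

    @contextmanager
    def start_span(self, name: str) -> Iterator[SpanData]:
        span: SpanData = {"name": name, "attributes": {}}
        started = time.monotonic()
        try:
            yield span
        finally:
            # Runs even if the instrumented code raises or returns early.
            span["duration_s"] = time.monotonic() - started
            self._exporter(span)


def checkout(tracer: Tracer, item_count: int) -> int:
    """Application code: instrumented once, independent of the backend."""
    with tracer.start_span("checkout") as span:
        span["attributes"]["item_count"] = item_count
        return item_count * 2


# Stand-in backend: a list that collects finished spans.
received: List[SpanData] = []
tracer = Tracer(exporter=received.append)
checkout(tracer, 3)
```

Pointing the same application at a different backend would mean passing a different exporter to Tracer; checkout itself would not change.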

Then there is the question about the future of Jaeger tracers (client libraries), which do occupy the same problem space as OpenTelemetry. In the short term, Jaeger client libraries can be changed to implement the OpenTelemetry API. This may be necessary in order to support the new style of instrumentation while keeping the existing functionality that is specific to Jaeger (such as adaptive sampling).

In the long term, we will seriously consider freezing development of Jaeger client libraries and porting their unique features to OpenTelemetry default implementations, either upstream or as plugins. Developing and maintaining client libraries in multiple languages is a significant investment of project resources, which would be better spent on building new backend features.

What about OpenCensus Agent/Collector?

The “batteries included” approach did not always work well even for OpenCensus libraries, because they still needed to be configured with a specific exporter plugin in order to send data to concrete tracing backends, like Jaeger or Zipkin. To address that issue, the OpenCensus project started development of two backend components called agent and collector, playing nearly identical roles to Jaeger’s agent and collector:

  • agent is a sidecar / host agent that receives telemetry from the client library in a standardized format and forwards it to the collector;
  • collector translates the data into the format understood by a specific tracing backend and sends it there. OpenCensus Collector is also able to perform tail-based sampling.
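The two roles above can be sketched as a small pipeline. This is an illustrative model only, not the OpenCensus or Jaeger implementation: the Agent and Collector classes, the span dictionary layout, the to_backend_format translation, and the "keep any trace containing an error" policy are all assumptions made for the example. It shows the defining property of tail-based sampling: the keep-or-drop decision is made per trace, after all of its spans have arrived, rather than when the trace starts.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List

Span = Dict[str, Any]


def to_backend_format(span: Span) -> Span:
    """Translate the standardized span into a (hypothetical) backend format."""
    return {
        "traceID": span["trace_id"],
        "operationName": span["name"],
        "error": bool(span.get("error", False)),
    }


class Collector:
    """Buffers spans per trace, samples whole traces, then writes them out."""

    def __init__(
        self,
        write: Callable[[Span], None],
        keep_trace: Callable[[List[Span]], bool],
    ) -> None:
        self._write = write
        self._keep_trace = keep_trace
        self._buffer: DefaultDict[str, List[Span]] = defaultdict(list)

    def receive(self, span: Span) -> None:
        self._buffer[span["trace_id"]].append(span)

    def flush(self) -> None:
        # Tail-based sampling: decide once the full trace is visible.
        for spans in self._buffer.values():
            if self._keep_trace(spans):
                for span in spans:
                    self._write(to_backend_format(span))
        self._buffer.clear()


class Agent:
    """Sidecar / host agent: accepts spans locally and forwards them."""

    def __init__(self, collector: Collector) -> None:
        self._collector = collector

    def report(self, span: Span) -> None:
        self._collector.receive(span)


def has_error(spans: List[Span]) -> bool:
    """Example policy (an assumption): keep traces with at least one failed span."""
    return any(s.get("error") for s in spans)


backend: List[Span] = []
agent = Agent(Collector(write=backend.append, keep_trace=has_error))
agent.report({"trace_id": "a", "name": "GET /pay"})
agent.report({"trace_id": "a", "name": "db.query", "error": True})
agent.report({"trace_id": "b", "name": "GET /health"})
agent._collector.flush()
```

After the flush, both spans of trace "a" reach the backend, including the one that did not fail, while trace "b" is dropped entirely. A head-based sampler could not make this choice, because the error is not known when the trace begins.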

These two components have a much larger overlap with the functionality of the respective Jaeger backend components. However, they are still limited to the problem space of data gathering, rather than trace storage or post-processing. This means that in the future we might also strongly consider deprecating the Jaeger agent and collector components and instead deploying the respective OpenTelemetry components. The main open question is whether the OpenTelemetry components will be able to support additional features provided by the Jaeger components, such as adaptive sampling.

Conclusion

As I stated at the beginning, the OpenTelemetry project is good news for the Jaeger project, as they are very much complementary in terms of the problem domains each is trying to address. In the areas where there is an overlap, namely client libraries, agent, and collector, we are planning to collaborate with OpenTelemetry and ideally deprecate the respective Jaeger components so that we don’t have to waste time maintaining redundant software.

If you are interested in working on observability and reliability challenges at Uber, including our open source tracing platform Jaeger, we are hiring.

