GitHub link: https://github.com/jaegertracing/jaeger

Jaeger: open source, end-to-end distributed tracing

Monitor and troubleshoot transactions in complex distributed systems

A Cloud Native Computing Foundation incubating project.

Uber published a blog post, Evolving Distributed Tracing at Uber, that explains the history of Jaeger and the reasons behind its architectural choices. Yuri Shkuro, the creator of Jaeger, also published a book, Mastering Distributed Tracing, that covers many aspects of Jaeger's design and operation in depth, as well as distributed tracing in general.

Why Jaeger?

As on-the-ground microservice practitioners are quickly realizing, the majority of operational problems that arise when moving to a distributed architecture are ultimately grounded in two areas: networking and observability. Networking and debugging a set of intertwined distributed services is simply an orders-of-magnitude larger problem than doing the same for a single monolithic application.

Problems that Jaeger addresses

Jaeger is used for monitoring and troubleshooting microservices-based distributed systems, including:

Distributed context propagation
Distributed transaction monitoring
Root cause analysis
Service dependency analysis
Performance / latency optimization

Kubernetes and OpenShift

Kubernetes templates: https://github.com/jaegertracing/jaeger-kubernetes
Kubernetes Operator: https://github.com/jaegertracing/jaeger-operator
OpenShift templates: https://github.com/jaegertracing/jaeger-openshift

Features

Discover architecture of the whole system via data-driven dependency diagram.
View request timeline and errors; understand how the app works.
Find sources of latency and lack of concurrency.
Highly contextualized logging.
Use baggage propagation to:
- Diagnose inter-request contention (queueing).
- Attribute time spent in a service.
Use open source libraries with OpenTracing integration to get vendor-neutral instrumentation for free.

Features

OpenTracing compatible data model and instrumentation libraries
- in Go, Java, Node, Python and C++
Uses consistent upfront sampling with individual per service/endpoint probabilities
Multiple storage backends: Cassandra, Elasticsearch, memory.
Adaptive sampling (coming soon)
Post-collection data processing pipeline (coming soon)
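
The "consistent upfront sampling" item can be illustrated with a minimal sketch. It assumes the sampling decision is made once, at the root span of a trace, using a probability looked up per endpoint, and that every downstream span simply inherits that decision. The class name `PerOperationSampler` and its parameters are hypothetical and are not the Jaeger client API.

```python
import random
from typing import Dict, Optional


class PerOperationSampler:
    """Illustrative head-based sampler with per-endpoint probabilities."""

    def __init__(self, default_probability: float,
                 per_operation: Dict[str, float],
                 rng: Optional[random.Random] = None) -> None:
        self.default_probability = default_probability
        self.per_operation = per_operation
        self.rng = rng or random.Random()

    def probability_for(self, operation: str) -> float:
        # Fall back to the service-wide default for unlisted endpoints.
        return self.per_operation.get(operation, self.default_probability)

    def decide(self, operation: str,
               parent_sampled: Optional[bool] = None) -> bool:
        # A span that has a parent never makes its own decision: it inherits
        # the one made upfront, so a trace is kept or dropped as a whole.
        if parent_sampled is not None:
            return parent_sampled
        # Root span: random() is in [0, 1), so probability 0.0 never samples
        # and probability 1.0 always samples.
        return self.rng.random() < self.probability_for(operation)
```

For example, `PerOperationSampler(0.001, {"POST /checkout": 1.0})` would keep every checkout trace while sampling all other endpoints at 1 in 1000.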

Technical Specs

Backend components implemented in Go
React/JavaScript UI
Supported storage backends:
- Cassandra 3.4+
- Elasticsearch 5.x, 6.x, 7.x
- Kafka
- memory storage

Span

A span represents a logical unit of work in Jaeger that has an operation name, the start time of the operation, and the duration. Spans may be nested and ordered to model causal relationships.

Trace

A trace is a data/execution path through the system, and can be thought of as a directed acyclic graph of spans.
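
The span and trace definitions above can be modeled with a small data structure. This is a sketch under simplifying assumptions: every span has at most one parent (so the graph is a tree, a special case of a directed acyclic graph), times are integers in microseconds, and all class and field names are illustrative rather than Jaeger's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Span:
    span_id: str
    operation_name: str
    start_time_us: int
    duration_us: int
    parent_id: Optional[str] = None  # None marks the root span


@dataclass
class Trace:
    trace_id: str
    spans: List[Span] = field(default_factory=list)

    def root(self) -> Span:
        roots = [s for s in self.spans if s.parent_id is None]
        if len(roots) != 1:
            raise ValueError(f"expected exactly one root span, found {len(roots)}")
        return roots[0]

    def children_of(self, span_id: str) -> List[Span]:
        # Nesting comes from parent_id; ordering comes from the start time.
        children = [s for s in self.spans if s.parent_id == span_id]
        return sorted(children, key=lambda s: s.start_time_us)


def example_trace() -> Trace:
    """A made-up checkout request with two child operations."""
    return Trace("t1", [
        Span("a", "HTTP GET /checkout", 0, 900),
        Span("c", "charge-card", 400, 450, parent_id="a"),
        Span("b", "load-cart", 50, 300, parent_id="a"),
        Span("d", "SELECT cart_items", 80, 120, parent_id="b"),
    ])
```

Walking `children_of` from the root reconstructs the request timeline that a tracing UI would display.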

Query

Query is a service that retrieves traces from storage and hosts a UI to display them.

Reference:

the OpenTracing standard

Components

Jaeger can be deployed either as an all-in-one binary, where all Jaeger backend components run in a single process, or as a scalable distributed system, discussed below. There are two main deployment options:

Collectors write directly to storage.
Collectors write to Kafka as a preliminary buffer.

Illustration of direct-to-storage architecture

Illustration of architecture with Kafka as intermediate buffer

This section details the constituent parts of Jaeger and how they relate to each other. It is arranged by the order in which spans from your application interact with them.

Jaeger client libraries

Jaeger clients are language specific implementations of the OpenTracing API. They can be used to instrument applications for distributed tracing either manually or with a variety of existing open source frameworks, such as Flask, Dropwizard, gRPC, and many more, that are already integrated with OpenTracing.

An instrumented service creates spans when receiving new requests and attaches context information (trace id, span id, and baggage) to outgoing requests. Only ids and baggage are propagated with requests; all other information that composes a span, such as the operation name and logs, is not propagated. Instead, sampled spans are transmitted out of process asynchronously, in the background, to Jaeger Agents.
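
The propagation step described above can be sketched as a pair of inject/extract helpers. The sketch assumes Jaeger's HTTP text format, in which the `uber-trace-id` header carries `{trace-id}:{span-id}:{parent-span-id}:{flags}` in hex and each baggage item travels in a header prefixed with `uberctx-`. The `SpanContext` class and both functions are illustrative, not a client library API, and the sketch skips the URL-encoding of baggage values that a real client would apply.

```python
from dataclasses import dataclass, field
from typing import Dict

TRACE_HEADER = "uber-trace-id"
BAGGAGE_PREFIX = "uberctx-"
SAMPLED_FLAG = 0x01


@dataclass
class SpanContext:
    trace_id: int
    span_id: int
    parent_id: int = 0
    sampled: bool = False
    baggage: Dict[str, str] = field(default_factory=dict)


def inject(ctx: SpanContext, headers: Dict[str, str]) -> None:
    """Write only ids, flags and baggage into outgoing request headers."""
    flags = SAMPLED_FLAG if ctx.sampled else 0
    headers[TRACE_HEADER] = (
        f"{ctx.trace_id:x}:{ctx.span_id:x}:{ctx.parent_id:x}:{flags:x}"
    )
    for key, value in ctx.baggage.items():
        headers[BAGGAGE_PREFIX + key] = value


def extract(headers: Dict[str, str]) -> SpanContext:
    """Rebuild the context on the receiving side of a request."""
    lowered = {k.lower(): v for k, v in headers.items()}
    trace_id, span_id, parent_id, flags = lowered[TRACE_HEADER].split(":")
    baggage = {
        k[len(BAGGAGE_PREFIX):]: v
        for k, v in lowered.items()
        if k.startswith(BAGGAGE_PREFIX)
    }
    return SpanContext(
        trace_id=int(trace_id, 16),
        span_id=int(span_id, 16),
        parent_id=int(parent_id, 16),
        sampled=bool(int(flags, 16) & SAMPLED_FLAG),
        baggage=baggage,
    )
```

Note that the operation name, logs and timings never appear in the headers; they stay in the process that created the span and are reported separately.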

The instrumentation has very little overhead, and is designed to be always enabled in production.

Note that while all traces are generated, only a few are sampled. Sampling a trace marks the trace for further processing and storage. By default, Jaeger client samples 0.1% of traces (1 in 1000), and has the ability to retrieve sampling strategies from the agent.
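
One way to implement a probabilistic decision such as the 1-in-1000 default is to derive it from the trace id itself, so that every service seeing the same trace id reaches the same answer without coordination. The following is a minimal sketch of that idea, assuming 64-bit random trace ids; the function names and the 63-bit mask are choices made for this illustration, not a statement of how the Jaeger clients are implemented.

```python
MAX_RANDOM = (1 << 63) - 1  # mask applied to the trace id before comparison


def sampling_boundary(rate: float) -> int:
    """Translate a probability in [0, 1] into a threshold on trace ids."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    return int(rate * MAX_RANDOM)


def is_sampled(trace_id: int, rate: float) -> bool:
    # Uniformly random trace ids fall below the boundary with
    # probability approximately equal to `rate`.
    return (trace_id & MAX_RANDOM) < sampling_boundary(rate)
```

With `rate=0.001`, roughly one trace id in a thousand falls below the boundary, and the answer for a given trace id never changes between calls.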

Agent

The Jaeger agent is a network daemon that listens for spans sent over UDP, which it batches and sends to the collector. It is designed to be deployed to all hosts as an infrastructure component. The agent abstracts the routing and discovery of the collectors away from the client.
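
The batching behavior of the agent can be sketched independently of the network layer. The class below is a hypothetical illustration, not the agent's code: it buffers incoming span payloads and hands them to a `send` callback (standing in for the call to the collector) when either a size limit or an age limit is reached. The limits and the explicit `now` argument are assumptions chosen to keep the example deterministic.

```python
from typing import Callable, List


class SpanBatcher:
    """Buffers span payloads and forwards them in batches."""

    def __init__(self, send: Callable[[List[bytes]], None],
                 max_batch: int = 100, max_delay_s: float = 1.0) -> None:
        self._send = send
        self._max_batch = max_batch
        self._max_delay_s = max_delay_s
        self._pending: List[bytes] = []
        self._oldest = 0.0  # arrival time of the first span in the buffer

    def add(self, span: bytes, now: float) -> None:
        if not self._pending:
            self._oldest = now
        self._pending.append(span)
        if len(self._pending) >= self._max_batch:
            self.flush()

    def tick(self, now: float) -> None:
        # Called periodically so a partially filled batch is not held forever.
        if self._pending and now - self._oldest >= self._max_delay_s:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            batch, self._pending = self._pending, []
            self._send(batch)
```

In a real daemon, `add` would be driven by a UDP receive loop and `tick` by a timer; both are left out here.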

Collector

The Jaeger collector receives traces from Jaeger agents and runs them through a processing pipeline. Currently our pipeline validates traces, indexes them, performs any transformations, and finally stores them.
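
The pipeline stages named above (validate, index, transform, store) can be sketched as a chain of small functions. Everything here is an assumption made for illustration: spans are plain dictionaries, the required field names are invented, the "transformation" is just whitespace trimming, and `InMemoryStore` stands in for a real storage backend.

```python
from typing import Any, Dict, List, Set

SpanDict = Dict[str, Any]
REQUIRED_FIELDS = ("trace_id", "span_id", "service", "operation_name", "duration_us")


def validate(span: SpanDict) -> SpanDict:
    missing = [name for name in REQUIRED_FIELDS if name not in span]
    if missing:
        raise ValueError(f"span is missing fields: {missing}")
    if span["duration_us"] < 0:
        raise ValueError("duration must not be negative")
    return span


def transform(span: SpanDict) -> SpanDict:
    # Placeholder transformation: normalize the operation name.
    cleaned = dict(span)
    cleaned["operation_name"] = span["operation_name"].strip()
    return cleaned


class InMemoryStore:
    def __init__(self) -> None:
        self.spans_by_trace: Dict[str, List[SpanDict]] = {}
        self.traces_by_service: Dict[str, Set[str]] = {}

    def index(self, span: SpanDict) -> None:
        # The index answers "which traces touched this service?".
        self.traces_by_service.setdefault(span["service"], set()).add(span["trace_id"])

    def write(self, span: SpanDict) -> None:
        self.spans_by_trace.setdefault(span["trace_id"], []).append(span)


def process(span: SpanDict, store: InMemoryStore) -> None:
    """Run one span through the stages in the order the text lists them."""
    span = validate(span)
    store.index(span)
    store.write(transform(span))
```

An invalid span raises before anything is indexed or stored, so the store only ever contains spans that passed validation.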

Jaeger's storage is a pluggable component that currently supports Cassandra, Elasticsearch, and Kafka.

Ingester

Ingester is a service that reads from a Kafka topic and writes to another storage backend (Cassandra or Elasticsearch).
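
The ingester's role as a bridge between the buffer and storage can be sketched with standard-library pieces. In this illustration a `queue.Queue` stands in for the Kafka topic, messages are assumed to be JSON-encoded spans, and `write_span` stands in for the storage backend's write call; none of these reflect the real ingester's interfaces or wire format.

```python
import json
import queue
from typing import Any, Callable, Dict


def drain_topic(topic: "queue.Queue[bytes]",
                write_span: Callable[[Dict[str, Any]], None]) -> int:
    """Move every queued message into storage; return the number written."""
    written = 0
    while True:
        try:
            message = topic.get_nowait()
        except queue.Empty:
            return written
        # Decode the buffered message and hand it to the storage writer.
        write_span(json.loads(message))
        written += 1
```

Because the collector only has to append to the buffer, a slow storage backend delays ingestion rather than blocking the collector.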

Monitoring Jaeger

Jaeger itself is a distributed, microservices-based system. If you run it in production, you will likely want to set up adequate monitoring for its different components, e.g. to ensure that the backend is not saturated by too much tracing data.

Metrics

By default Jaeger microservices expose metrics in Prometheus format. It is controlled by the following command line options:

--metrics-backend controls how the measurements are exposed. The default value is prometheus, another option is expvar, the Go standard mechanism for exposing process level statistics.
--metrics-http-route specifies the name of the HTTP endpoint used to scrape the metrics (/metrics by default).

Each Jaeger component exposes the metrics scraping endpoint on one of the HTTP ports it already serves:

Component	Port
jaeger-agent	14271
jaeger-collector	14269
jaeger-query	16687
jaeger-ingester	14270
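
The port table above is enough to build scrape targets by hand. The sketch below assumes the components run on `localhost` with the default `/metrics` route, and includes a deliberately simple parser for the Prometheus text format that ignores comment lines and does not handle optional timestamps. The metric name used in the usage note is made up for illustration.

```python
from typing import Dict

# Ports taken from the table above.
METRICS_PORTS: Dict[str, int] = {
    "jaeger-agent": 14271,
    "jaeger-collector": 14269,
    "jaeger-query": 16687,
    "jaeger-ingester": 14270,
}


def metrics_url(component: str, host: str = "localhost",
                route: str = "/metrics") -> str:
    """Build the scrape URL for one component."""
    return f"http://{host}:{METRICS_PORTS[component]}{route}"


def parse_prometheus_text(text: str) -> Dict[str, float]:
    """Parse 'name{labels} value' lines; skips '# HELP' and '# TYPE' lines."""
    samples: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last space-separated token on the line.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples
```

For instance, `metrics_url("jaeger-collector")` yields `http://localhost:14269/metrics`, and feeding the parser a line such as `example_spans_total{svc="cart"} 42` produces a single sample keyed by the name and its labels.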

Logging

Jaeger components log only to standard out, using the structured logging library go.uber.org/zap, configured to write log lines as JSON-encoded strings, for example:

{"level":"info","ts":1517621222.261759,"caller":"healthcheck/handler.go:99","msg":"Health Check server started","http-port":14269,"status":"unavailable"}

The log level can be adjusted via the --log-level command line switch; the default level is info.
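
Because each log line is a self-contained JSON object, it can be processed with any JSON parser. The following sketch parses the sample line shown above, converts its `ts` field (assumed here to be Unix epoch seconds) into a UTC datetime, and filters by severity. The helper names and the numeric ranking of levels are assumptions of this example.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict

# zap level names, ranked from least to most severe.
LEVEL_RANK = {"debug": 0, "info": 1, "warn": 2, "error": 3,
              "dpanic": 4, "panic": 5, "fatal": 6}

SAMPLE_LINE = (
    '{"level":"info","ts":1517621222.261759,'
    '"caller":"healthcheck/handler.go:99",'
    '"msg":"Health Check server started",'
    '"http-port":14269,"status":"unavailable"}'
)


def parse_log_line(line: str) -> Dict[str, Any]:
    record = json.loads(line)
    # Add a timezone-aware datetime next to the raw epoch value.
    record["time"] = datetime.fromtimestamp(record["ts"], tz=timezone.utc)
    return record


def is_at_least(record: Dict[str, Any], minimum: str) -> bool:
    return LEVEL_RANK[record["level"]] >= LEVEL_RANK[minimum]
```

Extra fields such as `http-port` and `status` come through as ordinary dictionary keys, which is what makes structured logs easy to query.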


