Exploring Kafka High Availability

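It is common knowledge that a Kafka topic relies on its replication factor (--replication-factor) and its partition count (--partitions) for high availability: replicas keep copies of each partition on several brokers, and partitions spread the topic across the cluster.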

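Discovering the problem

During recent operations work, however, on a 3-broker Kafka cluster where topics had 3 replicas and 3 partitions, a single broker went down and the whole cluster became unusable: consumers could no longer receive messages. Why?

After digging through a lot of material, the culprit turned out to be Kafka's own internal topic, __consumer_offsets.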

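Problem analysis

In newer Kafka versions, consumer offsets are stored in an internal Kafka topic called __consumer_offsets. Kafka creates this topic automatically, so its replication factor is not the one you pass when creating your own topics. It comes from the broker configuration (offsets.topic.replication.factor), which is often left at 1, for example in the sample server.properties shipped with Kafka.

Describing the topic then typically shows: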

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:1 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 1 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 2 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 3 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 4 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 5 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 6 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 7 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 8 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 9 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 10 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 11 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 12 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 13 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 14 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 15 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 16 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 17 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 18 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 19 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 20 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 21 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 22 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 23 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 24 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 25 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 26 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 27 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 28 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 29 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 30 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 31 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 32 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 33 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 34 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 35 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 36 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 37 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 38 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 39 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 40 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 41 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 42 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 43 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 44 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 45 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 46 Leader: 1 Replicas: 1 Isr: 1
Topic: __consumer_offsets Partition: 47 Leader: 2 Replicas: 2 Isr: 2
Topic: __consumer_offsets Partition: 48 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 49 Leader: 1 Replicas: 1 Isr: 1

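That is 50 partitions with 1 replica each.

The 50 partitions are spread across the 3 brokers. If any one broker goes down, every partition it hosts becomes unavailable, because there is no other replica to take over. Consumer groups whose offsets are stored in those partitions can no longer read or commit their offsets, so they cannot consume normally.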
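To make the link between a consumer group and a __consumer_offsets partition concrete, here is a minimal sketch. As far as I know, the broker picks the partition by taking the Java hashCode of the group id, making it non-negative, and taking it modulo the partition count (50 here). The function names and the sample group id `test` are my own, and the formula is an assumption worth checking against your Kafka version.

```python
def java_string_hash(s: str) -> int:
    """Emulate Java's String.hashCode(): signed 32-bit, computed over UTF-16 code units."""
    data = s.encode("utf-16-be")
    h = 0
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value to a signed one
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id: str, num_partitions: int = 50) -> int:
    """Assumed mapping: abs(hashCode(group_id)) % num_partitions."""
    h = java_string_hash(group_id)
    # Integer.MIN_VALUE has no positive counterpart; treat it as 0
    h = 0 if h == -0x80000000 else abs(h)
    return h % num_partitions


if __name__ == "__main__":
    # "test" is a hypothetical group id used only for illustration
    print(offsets_partition_for("test"))  # prints 48
```

If that mapping holds, a group named `test` lands on partition 48, which in the layout above lives only on broker 3, so that group stalls for as long as broker 3 is down.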

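Solving the problem

Method 1

Kafka is already up and serving traffic, and __consumer_offsets already exists, so changing the broker configuration no longer affects it. The replica assignment has to be changed dynamically. (The commands here use --zookeeper, as supported by the Kafka version in use; recent Kafka releases replaced that flag with --bootstrap-server.)

First, prepare the partition replica plan as a JSON file: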

vim /data/vfan/consumer.json

{
"version": 1,
"partitions": [
{
"topic": "__consumer_offsets",
"partition": 0,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 1,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 2,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 3,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 4,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 5,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 6,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 7,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 8,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 9,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 10,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 11,
"replicas": [
3,
1,
2
]
},
{
"topic": "__consumer_offsets",
"partition": 12,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 13,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 14,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 15,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 16,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 17,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 18,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 19,
"replicas": [
3,
2,
1
]
},

{
"topic": "__consumer_offsets",
"partition": 20,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 21,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 22,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 23,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 24,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 25,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 26,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 27,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 28,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 29,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 30,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 31,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 32,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 33,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 34,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 35,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 36,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 37,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 38,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 39,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 40,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 41,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 42,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 43,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 44,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 45,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 46,
"replicas": [
3,
2,
1
]
},
{
"topic": "__consumer_offsets",
"partition": 47,
"replicas": [
1,
2,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 48,
"replicas": [
2,
1,
3
]
},
{
"topic": "__consumer_offsets",
"partition": 49,
"replicas": [
3,
2,
1
]
}
]
}
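The broker ids in each replicas list can be changed freely, but a list must not contain the same broker twice. The first broker in each list is the preferred leader for that partition.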
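Writing 50 entries by hand is tedious and error-prone, so the file can also be generated. The following is a minimal sketch, not the way the file above was produced: `build_reassignment` is a name I made up, and it simply rotates the broker list per partition so that preferred leaders are spread evenly. The resulting order therefore differs slightly from the hand-written plan.

```python
import json
from typing import Dict, List


def build_reassignment(topic: str, num_partitions: int, brokers: List[int]) -> Dict:
    """Place every partition on all given brokers, rotating the list so the
    first replica (the preferred leader) is spread evenly across brokers."""
    if len(set(brokers)) != len(brokers):
        raise ValueError("broker ids must be unique")
    partitions = []
    for p in range(num_partitions):
        shift = p % len(brokers)
        replicas = brokers[shift:] + brokers[:shift]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    plan = build_reassignment("__consumer_offsets", 50, [1, 2, 3])
    # Print to stdout; redirect to a file such as /data/vfan/consumer.json
    print(json.dumps(plan, indent=2))
```

Redirecting the output to /data/vfan/consumer.json gives a file in the same format as the one shown above.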

Apply the change

./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /data/vfan/consumer.json --execute

Verify that the change has completed

./kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file /data/vfan/consumer.json --verify

Check the result after the change

./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 1 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 2 Leader: 3 Replicas: 3,2,1 Isr: 2,3,1
Topic: __consumer_offsets Partition: 3 Leader: 1 Replicas: 1,2,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 4 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 5 Leader: 3 Replicas: 3,2,1 Isr: 2,3,1
Topic: __consumer_offsets Partition: 6 Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 7 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 8 Leader: 3 Replicas: 3,2,1 Isr: 2,1,3
Topic: __consumer_offsets Partition: 9 Leader: 1 Replicas: 1,2,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 10 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 11 Leader: 3 Replicas: 3,1,2 Isr: 2,3,1
Topic: __consumer_offsets Partition: 12 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 13 Leader: 1 Replicas: 1,2,3 Isr: 1,2,3
Topic: __consumer_offsets Partition: 14 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 15 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 16 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 17 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 18 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 19 Leader: 3 Replicas: 3,2,1 Isr: 1,3,2
Topic: __consumer_offsets Partition: 20 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 21 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 22 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 23 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
Topic: __consumer_offsets Partition: 24 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 25 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 26 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 27 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 28 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 29 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 30 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 31 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 32 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 33 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 34 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 35 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
Topic: __consumer_offsets Partition: 36 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 37 Leader: 3 Replicas: 3,2,1 Isr: 1,3,2
Topic: __consumer_offsets Partition: 38 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 39 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 40 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 41 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
Topic: __consumer_offsets Partition: 42 Leader: 2 Replicas: 2,1,3 Isr: 3,1,2
Topic: __consumer_offsets Partition: 43 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 44 Leader: 1 Replicas: 1,2,3 Isr: 2,3,1
Topic: __consumer_offsets Partition: 45 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 46 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3
Topic: __consumer_offsets Partition: 47 Leader: 1 Replicas: 1,2,3 Isr: 2,1,3
Topic: __consumer_offsets Partition: 48 Leader: 2 Replicas: 2,1,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 49 Leader: 3 Replicas: 3,2,1 Isr: 1,2,3

Every partition now has three replicas spread across the three brokers, so the topic is highly available.
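Rather than eyeballing 50 lines, the describe output can be checked mechanically. The sketch below is my own helper, not a Kafka tool. It parses the per-partition lines and lists the partitions that would have no in-sync replica left if a given broker failed. It assumes the line format shown above.

```python
import re
from typing import Dict, List

# Matches the per-partition lines of `kafka-topics.sh --describe` as shown above
LINE_RE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(-?\d+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]*)"
)


def parse_describe(output: str) -> Dict[int, List[int]]:
    """Map partition number -> list of in-sync replica broker ids."""
    isr_by_partition = {}
    for line in output.splitlines():
        m = LINE_RE.search(line)
        if m:
            isr = [int(b) for b in m.group(4).split(",") if b]
            isr_by_partition[int(m.group(1))] = isr
    return isr_by_partition


def offline_if_broker_fails(isr_by_partition: Dict[int, List[int]], broker_id: int) -> List[int]:
    """Partitions whose in-sync replicas all live on the failed broker."""
    return sorted(p for p, isr in isr_by_partition.items() if not set(isr) - {broker_id})


if __name__ == "__main__":
    before = """Topic: __consumer_offsets Partition: 0 Leader: 3 Replicas: 3 Isr: 3
Topic: __consumer_offsets Partition: 1 Leader: 1 Replicas: 1 Isr: 1"""
    after = """Topic: __consumer_offsets Partition: 0 Leader: 1 Replicas: 1,2,3 Isr: 3,2,1
Topic: __consumer_offsets Partition: 1 Leader: 2 Replicas: 2,1,3 Isr: 1,2,3"""
    print(offline_if_broker_fails(parse_describe(before), 3))  # prints [0]
    print(offline_if_broker_fails(parse_describe(after), 3))   # prints []
```

With a replication factor of 1, losing broker 3 takes partition 0 offline. After the reassignment the same failure leaves every partition with in-sync replicas on the other two brokers.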

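Method 2

Before the Kafka service is started for the first time, set the default partition and replica parameters that apply when topics are created automatically. These values are only used at creation time, so this works on a fresh cluster where __consumer_offsets does not exist yet.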

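# Partition count used when a topic that does not exist is created automatically
num.partitions=3
# Replication factor used when a topic that does not exist is created automatically
default.replication.factor=3
# Replication factor of Kafka's internal __consumer_offsets topic (the sample server.properties shipped with Kafka sets it to 1)
offsets.topic.replication.factor=3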
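A quick sanity check before starting the brokers can catch a forgotten setting. The sketch below is a hypothetical helper of my own. It reads the text of a server.properties file and reports which of the two replication settings are missing or below the wanted value. It deliberately treats a missing key as a finding instead of guessing the broker default.

```python
from typing import Dict, List, Optional, Tuple

REPLICATION_KEYS = ("default.replication.factor", "offsets.topic.replication.factor")


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # or ! comment lines."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def replication_findings(props: Dict[str, str], wanted: int = 3) -> List[Tuple[str, Optional[str]]]:
    """Return (key, value) pairs that are unset (value None) or below `wanted`."""
    findings = []
    for key in REPLICATION_KEYS:
        value = props.get(key)
        if value is None or int(value) < wanted:
            findings.append((key, value))
    return findings


if __name__ == "__main__":
    sample = "num.partitions=3\noffsets.topic.replication.factor=1\n"
    for key, value in replication_findings(parse_properties(sample)):
        print(key, "is", "not set" if value is None else value)
```

An empty result means both settings are explicitly at or above the wanted replication factor.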

Once this is configured, start ZooKeeper and Kafka, then test producing and consuming:

## Produce
./kafka-console-producer.sh --broker-list localhost:9092 --topic test

## Consume; --from-beginning means reading from the start of the topic
./kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test --from-beginning

Check the partition and replica counts of the automatically created topics

## test
./kafka-topics.sh --zookeeper localhost:2181 --describe --topic test
Topic:test PartitionCount:3 ReplicationFactor:3 Configs:

## __consumer_offsets
./kafka-topics.sh --zookeeper localhost:2181 --describe --topic __consumer_offsets
Topic:__consumer_offsets PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer

The automatically created topics are now highly available as well.
