storm-kafka Tutorial

I. How It Works

This article is based on: https://github.com/apache/storm/tree/master/external/storm-kafka#brokerhosts

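(I) Key steps for using storm-kafka

1. Create ZkHosts

When Storm reads messages for a topic from Kafka, it needs to know how many partitions the topic has and which Kafka node (broker) hosts each of them. ZkHosts provides this information.

For how Kafka's metadata is laid out in ZooKeeper, see: http://blog.csdn.net/jinhong_lu/article/details/46653087

ZkHosts has two constructors: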

public ZkHosts(String brokerZkStr, String brokerZkPath)

public ZkHosts(String brokerZkStr)

By default the broker information is stored under /brokers in ZooKeeper, in which case the second constructor is enough:

new ZkHosts("123.58.172.117:2181,123.58.172.98:2181,123.58.172.111:2181,123.58.172.114:2181,123.58.172.116:2181")

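If the broker information is stored under /kafka/brokers, pass that path as brokerZkPath (the argument is the broker root itself, whose default is /brokers):

new ZkHosts("123.58.172.117:2181,123.58.172.98:2181,123.58.172.111:2181,123.58.172.114:2181,123.58.172.116:2181", "/kafka/brokers")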

Or simply append the chroot to the connection string:

new ZkHosts("123.58.172.117:2181,123.58.172.98:2181,123.58.172.111:2181,123.58.172.114:2181,123.58.172.116:2181/kafka")
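
To make the chroot form less magical, here is a minimal stdlib-only sketch of how a ZooKeeper connection string with a /chroot suffix combines with the broker root path. It is an illustration, not storm-kafka code: the class ZkPathSketch, its method names, and the host names zk1 and zk2 are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: mimics how a chroot suffix on the connection string
// combines with brokerZkPath. All names here are hypothetical.
final class ZkPathSketch {

    /** Returns the host:port entries, ignoring any chroot suffix. */
    static List<String> servers(String connect) {
        int slash = connect.indexOf('/');
        String hosts = slash < 0 ? connect : connect.substring(0, slash);
        return Arrays.asList(hosts.split(","));
    }

    /** Returns the chroot suffix, or "" when the string has none. */
    static String chroot(String connect) {
        int slash = connect.indexOf('/');
        return slash < 0 ? "" : connect.substring(slash);
    }

    /** Absolute znode that is actually read: chroot followed by brokerZkPath. */
    static String effectiveBrokerRoot(String connect, String brokerZkPath) {
        return chroot(connect) + brokerZkPath;
    }

    public static void main(String[] args) {
        String zk = "zk1:2181,zk2:2181"; // placeholder hosts
        System.out.println(effectiveBrokerRoot(zk, "/brokers"));
        System.out.println(effectiveBrokerRoot(zk, "/kafka/brokers"));
        System.out.println(effectiveBrokerRoot(zk + "/kafka", "/brokers"));
    }
}
```

The first line printed is /brokers; the second and third are both /kafka/brokers, which is why an explicit broker path and a chroot suffix with the default path end up at the same znode.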

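By default the Kafka partition information is re-read every 60 seconds; the interval can be changed through the refreshFreqSecs field of the ZkHosts instance.

Besides reading partition information through ZkHosts, storm-kafka also lets you specify it statically, for example: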


Broker brokerForPartition0 = new Broker("localhost");//localhost:9092
Broker brokerForPartition1 = new Broker("localhost", 9092);//localhost:9092 but we specified the port explicitly
Broker brokerForPartition2 = new Broker("localhost:9092");//localhost:9092 specified as one string.
GlobalPartitionInformation partitionInfo = new GlobalPartitionInformation();
partitionInfo.addPartition(0, brokerForPartition0);//mapping from partition 0 to brokerForPartition0
partitionInfo.addPartition(1, brokerForPartition1);//mapping from partition 1 to brokerForPartition1
partitionInfo.addPartition(2, brokerForPartition2);//mapping from partition 2 to brokerForPartition2
StaticHosts hosts = new StaticHosts(partitionInfo);

As this shows, the job of ZkHosts is to tell Storm which Kafka node to read each partition of a topic from.

2. Create KafkaConfig

(1) KafkaConfig has two constructors

public KafkaConfig(BrokerHosts hosts, String topic)

public KafkaConfig(BrokerHosts hosts, String topic, String clientId)

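BrokerHosts is the instance created above, topic is the name of the topic to subscribe to, and clientId specifies where the offset of the current topic consumer is stored. This id should be unique; otherwise multiple topologies will conflict with each other.

In fact, Trident does not store its offsets at this location; see below.

In practice KafkaConfig is used through one of two extensions, one for core Storm and one for Trident.

(2) Core Storm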

SpoutConfig is an extension of KafkaConfig that supports additional fields with ZooKeeper connection info and for controlling behavior specific to KafkaSpout. The zkRoot will be used as the root under which your consumer's offsets are stored. The id should uniquely identify your spout.

public SpoutConfig(BrokerHosts hosts, String topic, String zkRoot, String id);

public SpoutConfig(BrokerHosts hosts, String topic, String id);
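
To see why the id has to be unique, the following stdlib-only sketch builds the znode path under which a spout's offset for one partition would be kept. The layout zkRoot/id/partition_N is an assumption about how KafkaSpout arranges its offsets and should be checked against the storm-kafka version you run; the class OffsetPathSketch, the method offsetPath, and the sample values are hypothetical.

```java
// Illustration only: assumes offsets live at <zkRoot>/<id>/partition_<n>.
// This is not storm-kafka code; verify the layout for your version.
final class OffsetPathSketch {

    /** Joins zkRoot, spout id and partition number into one znode path. */
    static String offsetPath(String zkRoot, String id, int partition) {
        return zkRoot + "/" + id + "/partition_" + partition;
    }

    public static void main(String[] args) {
        // Two topologies sharing a zkRoot stay apart only because their ids differ.
        System.out.println(offsetPath("/kafka-offsets", "topology-a", 0));
        System.out.println(offsetPath("/kafka-offsets", "topology-b", 0));
    }
}
```

Under that assumption, two spouts that share both zkRoot and id would write to the same path and overwrite each other's offsets.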

In addition to these parameters, SpoutConfig contains the following fields that control how KafkaSpout behaves:
