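Event mechanism:

  The Watcher mechanism is one of ZooKeeper's most important features. Watches can be bound to the nodes we create in ZooKeeper to listen for events such as node data changes, node deletion, and changes to a node's children. This event mechanism is the basis for building distributed locks, cluster management, and similar features on top of ZooKeeper.

  Watcher semantics: when the watched data changes, ZooKeeper generates a watcher event and sends it to the client, but the client is notified only once. If the node changes again later, the client that set the watcher receives nothing further (a watcher is a one-shot trigger). A permanent watch can be approximated by re-registering the watcher every time it fires.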
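  The one-shot behaviour and the re-registration trick can be sketched without a running ZooKeeper ensemble. The class below is a hypothetical in-memory stand-in, not ZooKeeper's API: OneShotWatchDemo, watch, change, and watchForever are names invented for this illustration. A plain watch is dropped after its first notification, while a watch that re-arms itself inside its callback keeps receiving changes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for a server that keeps one-shot watches per path. */
final class OneShotWatchDemo {
    private final Map<String, List<Consumer<String>>> watches = new HashMap<>();

    /** Registers a watch that fires at most once for the given path. */
    void watch(String path, Consumer<String> callback) {
        watches.computeIfAbsent(path, p -> new ArrayList<>()).add(callback);
    }

    /** Simulates a change: the watches are removed first, then notified. */
    void change(String path) {
        List<Consumer<String>> fired = watches.remove(path);
        if (fired != null) {
            for (Consumer<String> callback : fired) {
                callback.accept(path);
            }
        }
    }

    /** Registers a watch that re-registers itself each time it fires. */
    void watchForever(String path, Consumer<String> callback) {
        watch(path, p -> {
            callback.accept(p);
            watchForever(p, callback); // re-arm for the next change
        });
    }

    /** Returns {notifications seen by a plain watch, notifications seen by a re-arming watch}. */
    static int[] run(int changes) {
        OneShotWatchDemo server = new OneShotWatchDemo();
        int[] counts = new int[2];
        server.watch("/zk-demo", p -> counts[0]++);
        server.watchForever("/zk-demo", p -> counts[1]++);
        for (int i = 0; i < changes; i++) {
            server.change("/zk-demo");
        }
        return counts;
    }
}
```

  With a real client there is a window between the notification and the re-registration, so a change that lands inside that window can be missed; the sketch does not model that.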

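How to register a watcher:

  ZooKeeper's Watcher mechanism can be broken into three stages: the client registers the Watcher, the server processes the Watcher, and the client invokes the Watcher callback. A client can register a watcher in three ways: getData, exists, and getChildren. The code below serves as the example.

  How is an event triggered? Write (transactional) operations such as create, delete, and setData fire the watches registered on the affected node. A simple implementation follows: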

    import java.io.IOException;
    import java.util.concurrent.CountDownLatch;

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    public class WatcherDemo {

        // The timeout values were lost from the original listing; these are assumed placeholders.
        private static final int SESSION_TIMEOUT_MS = 5000;
        private static final long PAUSE_MS = 1000;

        public static void main(String[] args) throws IOException, InterruptedException, KeeperException {
            final CountDownLatch countDownLatch = new CountDownLatch(1);
            final ZooKeeper zooKeeper =
                    new ZooKeeper("192.168.254.135:2181," +
                            "192.168.254.136:2181,192.168.254.137:2181",
                            SESSION_TIMEOUT_MS, new Watcher() {
                        @Override
                        public void process(WatchedEvent event) {
                            System.out.println("default watcher event: " + event.getType());
                            if (Event.KeeperState.SyncConnected == event.getState()) {
                                // the server has answered, so the connection is established
                                countDownLatch.countDown();
                            }
                        }
                    });
            countDownLatch.await();

            zooKeeper.create("/zk-wuzz", "".getBytes(),
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

            // a watch can be bound through exists, getData, or getChildren;
            // here it is bound through exists
            Stat stat = zooKeeper.exists("/zk-wuzz", new Watcher() {
                @Override
                public void process(WatchedEvent event) {
                    System.out.println(event.getType() + "->" + event.getPath());
                    try {
                        // bind the watch again; passing true re-registers the default watcher
                        zooKeeper.exists(event.getPath(), true);
                    } catch (KeeperException e) {
                        e.printStackTrace();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            });
            // trigger the watch with a write operation
            stat = zooKeeper.setData("/zk-wuzz", "".getBytes(), stat.getVersion());

            Thread.sleep(PAUSE_MS);

            zooKeeper.delete("/zk-wuzz", stat.getVersion());

            System.in.read();
        }
    }

  That is a simple use of a Watcher. Next, let's take a brief look at how the Watcher flow is implemented.

Watcher event types:

    // org.apache.zookeeper.Watcher.Event.EventType
    public enum EventType {
        None(-1),                // received when the client's connection state changes
        NodeCreated(1),          // a node was created
        NodeDeleted(2),          // a node was deleted
        NodeDataChanged(3),      // a node's data changed
        NodeChildrenChanged(4);  // a child node was created or deleted
        // ... constructor and accessors omitted
    }

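How the event mechanism works:

  After connecting, the client registers a watcher and keeps it locally: the client-side ZKWatchManager stores the registration, and the request tells the server that watch is true. The server then uses its WatchManager to bind the watcher to the corresponding path. (The original post shows a diagram of this flow here.)

Sending the request:

  Let's walk through the Watcher flow at the source level. Since the demo registers its watcher through exists, we use exists as the entry point. First, the initialization of the ZooKeeper API: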

    public ZooKeeper(String connectString, int sessionTimeout, Watcher watcher,
            boolean canBeReadOnly)
        throws IOException
    {
        LOG.info("Initiating client connection, connectString=" + connectString
                + " sessionTimeout=" + sessionTimeout + " watcher=" + watcher);
        // the default watcher is stored in ZKWatchManager here
        watchManager.defaultWatcher = watcher;

        ConnectStringParser connectStringParser = new ConnectStringParser(
                connectString);
        HostProvider hostProvider = new StaticHostProvider(connectStringParser.getServerAddresses());
        // create the ClientCnxn and call cnxn.start()
        cnxn = new ClientCnxn(connectStringParser.getChrootPath(),
                hostProvider, sessionTimeout, this, watchManager,
                getClientCnxnSocket(), canBeReadOnly);
        cnxn.start();
    }

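  When creating a ZooKeeper client instance, we pass a default Watcher to the constructor via new Watcher(). It serves as the default Watcher for the whole ZooKeeper session and is kept in defaultWatcher of the client-side ZKWatchManager. The constructor also initializes ClientCnxn and calls its start method: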

    public ClientCnxn(String chrootPath, HostProvider hostProvider, int sessionTimeout, ZooKeeper zooKeeper,
            ClientWatchManager watcher, ClientCnxnSocket clientCnxnSocket,
            long sessionId, byte[] sessionPasswd, boolean canBeReadOnly) {
        this.zooKeeper = zooKeeper;
        this.watcher = watcher;
        this.sessionId = sessionId;
        this.sessionPasswd = sessionPasswd;
        this.sessionTimeout = sessionTimeout; // session timeout
        this.hostProvider = hostProvider;
        this.chrootPath = chrootPath;
        // connect timeout
        connectTimeout = sessionTimeout / hostProvider.size();
        readTimeout = sessionTimeout * 2 / 3; // read timeout
        readOnly = canBeReadOnly;
        // thread that sends requests and reads responses
        sendThread = new SendThread(clientCnxnSocket);
        // thread that runs the watcher callbacks
        eventThread = new EventThread();

    }

    // start both threads
    public void start() {
        sendThread.start();
        eventThread.start();
    }

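  ClientCnxn is the main class through which the ZooKeeper client communicates with the server and handles event notifications. It contains two inner thread classes:

  1. SendThread: handles data communication between client and server, including the transfer of event information
  2. EventThread: invokes the registered Watchers on the client side to deliver notifications

  Next comes the registration itself through getData, exists, or getChildren. Taking exists as the example: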

    public Stat exists(final String path, Watcher watcher)
        throws KeeperException, InterruptedException
    {
        final String clientPath = path;
        PathUtils.validatePath(clientPath);
        // important: this registration is used later when the callback runs
        WatchRegistration wcb = null;
        if (watcher != null) { // wrap the watcher if one was supplied
            wcb = new ExistsWatchRegistration(watcher, clientPath);
        }

        final String serverPath = prependChroot(clientPath);
        // Comparable to the request object of a hand-written RPC framework.
        // The request carries two things: the opcode ZooDefs.OpCode.exists
        // and the watch flag (true in this example).
        RequestHeader h = new RequestHeader();
        h.setType(ZooDefs.OpCode.exists);
        ExistsRequest request = new ExistsRequest();
        request.setPath(serverPath);
        request.setWatch(watcher != null);
        SetDataResponse response = new SetDataResponse();
        // submit the request through the client's network layer
        ReplyHeader r = cnxn.submitRequest(h, request, response, wcb);
        if (r.getErr() != 0) {
            if (r.getErr() == KeeperException.Code.NONODE.intValue()) {
                return null;
            }
            throw KeeperException.create(KeeperException.Code.get(r.getErr()),
                    clientPath);
        }

        return response.getStat().getCzxid() == -1 ? null : response.getStat();
    }

  This method really does just two things: it creates an ExistsWatchRegistration and wraps the call in a network request object, ExistsRequest. It then sends the request through cnxn.submitRequest:

    public ReplyHeader submitRequest(RequestHeader h, Record request,
            Record response, WatchRegistration watchRegistration)
            throws InterruptedException {
        ReplyHeader r = new ReplyHeader(); // reply header
        // build the packet and put it on the queue
        Packet packet = queuePacket(h, r, request, response, null, null, null,
                null, watchRegistration);
        // block until the request has finished
        synchronized (packet) {
            while (!packet.finished) {
                packet.wait();
            }
        }
        return r;
    }
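  The blocking pattern in submitRequest is a plain monitor wait: the calling thread waits on the packet until another thread marks it finished and notifies. A minimal sketch of that hand-off, assuming a simplified DemoPacket class and a finish method that are invented here for illustration and are not part of ZooKeeper:

```java
/** Hypothetical, simplified stand-in for ClientCnxn.Packet. */
final class DemoPacket {
    boolean finished; // guarded by the packet's own monitor
    String reply;
}

final class BlockingSubmitDemo {
    /** Blocks the caller until another thread marks the packet as finished. */
    static String submit(DemoPacket packet) throws InterruptedException {
        synchronized (packet) {
            while (!packet.finished) { // the loop guards against spurious wakeups
                packet.wait();
            }
            return packet.reply;
        }
    }

    /** Called by the I/O thread once the response has arrived. */
    static void finish(DemoPacket packet, String reply) {
        synchronized (packet) {
            packet.reply = reply;
            packet.finished = true;
            packet.notifyAll();
        }
    }

    /** Runs one round trip: a second thread finishes the packet while the caller waits. */
    static String demo() {
        DemoPacket packet = new DemoPacket();
        Thread io = new Thread(() -> finish(packet, "stat-of-/zk-demo"));
        io.start();
        try {
            String reply = submit(packet);
            io.join();
            return reply;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

  Checking the finished flag inside the synchronized block means the caller cannot miss the notification, even if the I/O thread completes before the caller starts waiting.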

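  This confirms the packet-wrapping step shown in the earlier flow diagram. The caller then blocks in wait() until the whole request has been processed. Let's follow queuePacket: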

    Packet queuePacket(RequestHeader h, ReplyHeader r, Record request,
            Record response, AsyncCallback cb, String clientPath,
            String serverPath, Object ctx, WatchRegistration watchRegistration)
    {
        Packet packet = null;
        // outgoingQueue holds the requests waiting to be sent. Note that the
        // packet has no Xid yet: it is generated at send time, in the
        // ClientCnxnSocket::doIO() implementation, where the packet is
        // actually written.
        synchronized (outgoingQueue) {
            packet = new Packet(h, r, request, response, watchRegistration);
            packet.cb = cb;
            packet.ctx = ctx;
            packet.clientPath = clientPath;
            packet.serverPath = serverPath;
            if (!state.isAlive() || closing) {
                conLossPacket(packet);
            } else {
                // If the client is asking to close the session then
                // mark as closing
                if (h.getType() == OpCode.closeSession) {
                    closing = true;
                }
                // enqueue the request packet
                outgoingQueue.add(packet);
            }
        }
        // wake up the selector
        sendThread.getClientCnxnSocket().wakeupCnxn();
        return packet;
    }

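  A synchronized block guards against concurrent access here. A Packet is built and added to outgoingQueue (a LinkedList protected by that lock), and finally sendThread.getClientCnxnSocket().wakeupCnxn() wakes the selector. So far the packet has only been queued, so where is outgoingQueue consumed? And where does the cnxn in cnxn.submitRequest(h, request, response, wcb), which queued the packet, come from? In the ZooKeeper constructor we initialized a ClientCnxn and started two threads: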

    public void start() {
        sendThread.start();
        eventThread.start();
    }

  In this scenario the wrapped packet needs to be sent out, which is clearly the job of SendThread. Let's look at its run method:

    public void run() {
        clientCnxnSocket.introduce(this, sessionId);
        clientCnxnSocket.updateNow();
        // heartbeat bookkeeping
        clientCnxnSocket.updateLastSendAndHeard();
        int to;
        long lastPingRwServer = Time.currentElapsedTime();
        final int MAX_SEND_PING_INTERVAL = 10000; //10 seconds
        InetSocketAddress serverAddress = null;
        while (state.isAlive()) {
            // ... various connection-state checks omitted
            // perform the network I/O
            clientCnxnSocket.doTransport(to, pendingQueue, outgoingQueue, ClientCnxn.this);
        }
        cleanup();
        clientCnxnSocket.close();
        if (state.isAlive()) {
            eventThread.queueEvent(new WatchedEvent(Event.EventType.None,
                    Event.KeeperState.Disconnected, null));
        }
    }

  Most of the logic in this step checks the connection state and keeps the heartbeat going; it ends by calling clientCnxnSocket.doTransport:

    void doTransport(int waitTimeOut, List<Packet> pendingQueue, LinkedList<Packet> outgoingQueue,
            ClientCnxn cnxn)
            throws IOException, InterruptedException {
        selector.select(waitTimeOut);
        Set<SelectionKey> selected;
        synchronized (this) { // fetch the selected keys
            selected = selector.selectedKeys();
        }
        updateNow(); // refresh the cached current time
        for (SelectionKey k : selected) { // get the channel
            SocketChannel sc = ((SocketChannel) k.channel());
            // readyOps: the set of operations that are ready on this key,
            // i.e. the events that are ready on the channel.
            // SelectionKey.OP_CONNECT: the connection to the server can be completed.
            // A non-zero AND of the two means the connect event is ready.
            if ((k.readyOps() & SelectionKey.OP_CONNECT) != 0) {
                if (sc.finishConnect()) {
                    updateLastSendAndHeard();
                    sendThread.primeConnection();
                }
            // the channel is ready for reading or writing
            } else if ((k.readyOps() & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
                // perform the I/O
                doIO(pendingQueue, outgoingQueue, cnxn);
            }
        }
        if (sendThread.getZkState().isConnected()) {
            synchronized (outgoingQueue) {
                if (findSendablePacket(outgoingQueue,
                        cnxn.sendThread.clientTunneledAuthenticationInProgress()) != null) {
                    enableWrite();
                }
            }
        }
        selected.clear();
    }

  This code should look familiar to many readers: it is the Java NIO API. In the current scenario we take the SelectionKey.OP_WRITE path, that is, doIO(pendingQueue, outgoingQueue, cnxn):

    void doIO(List<Packet> pendingQueue, LinkedList<Packet> outgoingQueue, ClientCnxn cnxn)
            throws InterruptedException, IOException {
        SocketChannel sock = (SocketChannel) sockKey.channel();
        if (sock == null) {
            throw new IOException("Socket is null!");
        }
        // readable branch
        // ... omitted: for now the exists request has to be written out
        // writable branch
        if (sockKey.isWritable()) {
            synchronized (outgoingQueue) { // lock the queue
                // find the next packet that can be sent
                Packet p = findSendablePacket(outgoingQueue,
                        cnxn.sendThread.clientTunneledAuthenticationInProgress());

                if (p != null) {
                    updateLastSend(); // heartbeat bookkeeping
                    // If we already started writing p, p.bb will already exist
                    if (p.bb == null) {
                        if ((p.requestHeader != null) &&
                                (p.requestHeader.getType() != OpCode.ping) &&
                                (p.requestHeader.getType() != OpCode.auth)) {
                            p.requestHeader.setXid(cnxn.getXid());
                        }
                        p.createBB();
                    }
                    // write the data to the channel
                    sock.write(p.bb);
                    // ... omitted
                }
            }
        }
    }

    public void createBB() {
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            BinaryOutputArchive boa = BinaryOutputArchive.getArchive(baos);
            boa.writeInt(-1, "len"); // We'll fill this in later
            if (requestHeader != null) {
                requestHeader.serialize(boa, "header");
            }
            if (request instanceof ConnectRequest) {
                request.serialize(boa, "connect");
                // append "am-I-allowed-to-be-readonly" flag
                boa.writeBool(readOnly, "readOnly");
            } else if (request != null) {
                request.serialize(boa, "request");
            }
            baos.close();
            this.bb = ByteBuffer.wrap(baos.toByteArray());
            this.bb.putInt(this.bb.capacity() - 4);
            this.bb.rewind();
        } catch (IOException e) {
            LOG.warn("Ignoring unexpected exception", e);
        }
    }

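  Serialization is done with the jute framework. At this point the operation has been sent to the server; once the server receives the request, it moves on to the next processing step.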
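  createBB writes a placeholder length first and patches it once the payload size is known. The same length-prefix framing can be sketched with nothing but java.nio; LengthPrefixDemo and its methods are hypothetical names for this illustration, and the payload is a raw byte array rather than a jute record.

```java
import java.nio.ByteBuffer;

final class LengthPrefixDemo {
    /** Builds a frame: a 4-byte big-endian length followed by the payload. */
    static ByteBuffer frame(byte[] payload) {
        ByteBuffer bb = ByteBuffer.allocate(4 + payload.length);
        bb.putInt(-1);                // placeholder, patched below
        bb.put(payload);
        bb.rewind();
        bb.putInt(bb.capacity() - 4); // overwrite the placeholder with the payload length
        bb.rewind();                  // position 0: ready to be written to a channel
        return bb;
    }

    /** Reads the length prefix without moving the buffer position. */
    static int declaredLength(ByteBuffer frame) {
        return frame.getInt(0);
    }
}
```

  The receiver can then read exactly four bytes to learn how many more bytes belong to the request.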

How the server receives and processes the request:

  The server side has a class called NIOServerCnxn. During server initialization, QuorumPeerMain.runFromConfig contains:

    ServerCnxnFactory cnxnFactory = ServerCnxnFactory.createFactory();

  The cnxnFactory created here is the factory for the server's network request handlers, namely NIOServerCnxnFactory. quorumPeer.start() is eventually called, which starts the run method of NIOServerCnxnFactory. Let's step into it:

    public void run() {
        // loop while the server socket is open
        while (!ss.socket().isClosed()) {
            try {
                // select with a timeout
                selector.select(1000);
                Set<SelectionKey> selected;
                synchronized (this) { // as on the client: fetch the selected keys
                    selected = selector.selectedKeys();
                }
                ArrayList<SelectionKey> selectedList = new ArrayList<SelectionKey>(
                        selected);
                Collections.shuffle(selectedList);
                for (SelectionKey k : selectedList) { // iterate over the ready keys
                    if ((k.readyOps() & SelectionKey.OP_ACCEPT) != 0) { // a new connection is ready to be accepted
                        SocketChannel sc = ((ServerSocketChannel) k
                                .channel()).accept();
                        InetAddress ia = sc.socket().getInetAddress();
                        int cnxncount = getClientCnxnCount(ia);
                        if (maxClientCnxns > 0 && cnxncount >= maxClientCnxns) {
                            LOG.warn("Too many connections from " + ia
                                    + " - max is " + maxClientCnxns);
                            sc.close();
                        } else {
                            LOG.info("Accepted socket connection from "
                                    + sc.socket().getRemoteSocketAddress());
                            sc.configureBlocking(false);
                            SelectionKey sk = sc.register(selector,
                                    SelectionKey.OP_READ);
                            NIOServerCnxn cnxn = createConnection(sc, sk);
                            sk.attach(cnxn);
                            addCnxn(cnxn);
                        }
                    // a read or write event is ready
                    } else if ((k.readyOps() & (SelectionKey.OP_READ | SelectionKey.OP_WRITE)) != 0) {
                        NIOServerCnxn c = (NIOServerCnxn) k.attachment();
                        c.doIO(k);
                    } else {
                        // ... omitted
                    }

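  By now this should be clear: it is a selector polling loop. The client sent the request and the server has to handle it, so from the server's point of view a read event has arrived and the code takes the c.doIO(k) path. Let's see what that does: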

    void doIO(SelectionKey k) throws InterruptedException {
        try {
            if (isSocketOpen() == false) {
                LOG.warn("trying to do i/o on a null socket for session:0x"
                        + Long.toHexString(sessionId));

                return;
            }
            // readable
            if (k.isReadable()) {
                int rc = sock.read(incomingBuffer);
                if (rc < 0) {
                    throw new EndOfStreamException(
                            "Unable to read additional data from client sessionid 0x"
                            + Long.toHexString(sessionId)
                            + ", likely client has closed socket");
                }
                // remaining() is the free space left in the buffer;
                // 0 means the buffer is full, so this read is complete
                if (incomingBuffer.remaining() == 0) {
                    boolean isPayload;
                    if (incomingBuffer == lenBuffer) { // start of next request
                        incomingBuffer.flip();
                        isPayload = readLength(k);
                        incomingBuffer.clear();
                    } else {
                        // continuation
                        isPayload = true;
                    }
                    if (isPayload) { // not the case for 4letterword
                        readPayload();
                    }
                    else {
                        // four letter words take care
                        // need not do anything else
                        return;
                    }
                }
            }
            // ... omitted
    }

  The method again checks which kind of event it is dealing with. We focus on isReadable: the request data is read from the channel and then handed to readPayload():

    private void readPayload() throws IOException, InterruptedException {
        if (incomingBuffer.remaining() != 0) { // have we read length bytes?
            int rc = sock.read(incomingBuffer); // sock is non-blocking, so ok
            if (rc < 0) {
                throw new EndOfStreamException(
                        "Unable to read additional data from client sessionid 0x"
                        + Long.toHexString(sessionId)
                        + ", likely client has closed socket");
            }
        }
        // the buffer is full, so the whole payload has arrived
        if (incomingBuffer.remaining() == 0) { // have we read length bytes?
            packetReceived();
            incomingBuffer.flip();
            if (!initialized) {
                readConnectRequest();
            } else {
                readRequest();
            }
            lenBuffer.clear();
            incomingBuffer = lenBuffer;
        }
    }
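  doIO and readPayload together implement a two-phase read: first fill a 4-byte length buffer, then switch to a payload buffer of exactly that size, then switch back. The sketch below reproduces that buffer swap over byte arrays instead of a socket; FrameReaderDemo and feed are hypothetical names, and the real server additionally handles four-letter commands and connect requests, which are left out here.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical two-phase reader: first the 4-byte length, then a payload buffer of that size. */
final class FrameReaderDemo {
    private final ByteBuffer lenBuffer = ByteBuffer.allocate(4);
    private ByteBuffer incomingBuffer = lenBuffer;
    private final List<byte[]> payloads = new ArrayList<>();

    /** Feeds bytes as they arrive; chunk boundaries need not match frame boundaries. */
    void feed(byte[] chunk) {
        ByteBuffer in = ByteBuffer.wrap(chunk);
        while (true) {
            // copy as much as fits into the buffer currently being filled
            while (in.hasRemaining() && incomingBuffer.hasRemaining()) {
                incomingBuffer.put(in.get());
            }
            if (incomingBuffer.hasRemaining()) {
                return; // not full yet, wait for more bytes
            }
            incomingBuffer.flip();
            if (incomingBuffer == lenBuffer) {
                int len = lenBuffer.getInt();
                if (len < 0) {
                    throw new IllegalArgumentException("negative length: " + len);
                }
                incomingBuffer = ByteBuffer.allocate(len); // switch to the payload phase
            } else {
                byte[] payload = new byte[incomingBuffer.remaining()];
                incomingBuffer.get(payload);
                payloads.add(payload);
                lenBuffer.clear();
                incomingBuffer = lenBuffer; // back to the length phase
            }
        }
    }

    List<byte[]> payloads() {
        return payloads;
    }

    /** Feeds two frames split across three unevenly sized chunks. */
    static List<byte[]> demo() {
        FrameReaderDemo reader = new FrameReaderDemo();
        reader.feed(new byte[] {0, 0});
        reader.feed(new byte[] {0, 3, 7});
        reader.feed(new byte[] {8, 9, 0, 0, 0, 1, 42});
        return reader.payloads();
    }
}
```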

  Once the buffer is full, meaning the whole payload has been read, readRequest() is called to handle the request:

    private void readRequest() throws IOException {
        zkServer.processPacket(this, incomingBuffer);
    }

  which in turn calls zkServer.processPacket:

    public void processPacket(ServerCnxn cnxn, ByteBuffer incomingBuffer) throws IOException {
        // We have the request, now process and setup for next

        InputStream bais = new ByteBufferInputStream(incomingBuffer);
        BinaryInputArchive bia = BinaryInputArchive.getArchive(bais);
        RequestHeader h = new RequestHeader();
        h.deserialize(bia, "header");
        incomingBuffer = incomingBuffer.slice();
        if (h.getType() == OpCode.auth) {
            // ... omitted
        } else {
            if (h.getType() == OpCode.sasl) {
                // ... omitted
            }
            else { // exists set h.setType(ZooDefs.OpCode.exists) earlier, so this branch is taken
                Request si = new Request(cnxn, cnxn.getSessionId(), h.getXid(),
                        h.getType(), incomingBuffer, cnxn.getAuthInfo());
                si.setOwner(ServerCnxn.me);
                submitRequest(si);
            }
        }
        cnxn.incrOutstandingRequests(h);
    }

  Following the current call chain we end up in the inner else branch, which calls submitRequest(si):

    public void submitRequest(Request si) {
        if (firstProcessor == null) {
            synchronized (this) {
                try {
                    // Since all requests are passed to the request
                    // processor it should wait for setting up the request
                    // processor chain. The state will be updated to RUNNING
                    // after the setup.
                    while (state == State.INITIAL) {
                        wait();
                    }
                } catch (InterruptedException e) {
                    LOG.warn("Unexpected interruption", e);
                }
                if (firstProcessor == null || state != State.RUNNING) {
                    throw new RuntimeException("Not started");
                }
            }
        }
        try {
            touch(si.cnxn);
            boolean validpacket = Request.isValid(si.type);
            if (validpacket) { // hand the request to the processor chain
                firstProcessor.processRequest(si);
                if (si.cnxn != null) {
                    incInProcess();
                }
            // ... omitted
    }

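  We have now reached the server's processing chain. Where is this chain initialized? The whole call path uses the chain-of-responsibility design pattern, and every role and deployment mode in ZooKeeper has its own chain. Let's first see where it is initialized. Searching this class (ZooKeeperServer) turns up the following methods: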

    protected void setupRequestProcessors() {
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        RequestProcessor syncProcessor = new SyncRequestProcessor(this,
                finalProcessor);
        ((SyncRequestProcessor)syncProcessor).start();
        firstProcessor = new PrepRequestProcessor(this, syncProcessor);
        ((PrepRequestProcessor)firstProcessor).start();
    }

    public synchronized void startup() {
        if (sessionTracker == null) {
            createSessionTracker();
        }
        startSessionTracker();
        setupRequestProcessors();

        registerJMX();

        setState(State.RUNNING);
        notifyAll();
    }

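  The code shows that setupRequestProcessors builds the chain and that it is entered through startup(). As we saw when tracing leader election, startup is called from the run method of QuorumPeer during server initialization; it is worth comparing that with the standalone flow. There are four different implementations, one per role.

  Let's look at the chain for each role. Standalone (single-server) deployment: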

    protected void setupRequestProcessors() {
        // PrepRequestProcessor -> SyncRequestProcessor -> FinalRequestProcessor
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        RequestProcessor syncProcessor = new SyncRequestProcessor(this,
                finalProcessor);
        ((SyncRequestProcessor)syncProcessor).start();
        firstProcessor = new PrepRequestProcessor(this, syncProcessor);
        ((PrepRequestProcessor)firstProcessor).start();
    }

  Cluster deployment, Leader:

    protected void setupRequestProcessors() {
        // PrepRequestProcessor -> ProposalRequestProcessor -> CommitProcessor
        // -> ToBeAppliedRequestProcessor -> FinalRequestProcessor
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        RequestProcessor toBeAppliedProcessor = new Leader.ToBeAppliedRequestProcessor(
                finalProcessor, getLeader().toBeApplied);
        // commit handling
        commitProcessor = new CommitProcessor(toBeAppliedProcessor,
                Long.toString(getServerId()), false,
                getZooKeeperServerListener());
        commitProcessor.start();
        // proposal (transaction) handling
        ProposalRequestProcessor proposalProcessor = new ProposalRequestProcessor(this,
                commitProcessor);
        proposalProcessor.initialize();
        firstProcessor = new PrepRequestProcessor(this, proposalProcessor);
        ((PrepRequestProcessor)firstProcessor).start();
    }

  Cluster deployment, Follower:

    protected void setupRequestProcessors() {
        // FollowerRequestProcessor -> CommitProcessor -> FinalRequestProcessor
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        commitProcessor = new CommitProcessor(finalProcessor,
                Long.toString(getServerId()), true,
                getZooKeeperServerListener());
        commitProcessor.start();
        firstProcessor = new FollowerRequestProcessor(this, commitProcessor);
        ((FollowerRequestProcessor) firstProcessor).start();
        // sync and ACK handling
        syncProcessor = new SyncRequestProcessor(this,
                new SendAckRequestProcessor((Learner)getFollower()));
        syncProcessor.start();
    }

  Cluster deployment, Observer:

    protected void setupRequestProcessors() {
        RequestProcessor finalProcessor = new FinalRequestProcessor(this);
        commitProcessor = new CommitProcessor(finalProcessor,
                Long.toString(getServerId()), true,
                getZooKeeperServerListener());
        commitProcessor.start();
        firstProcessor = new ObserverRequestProcessor(this, commitProcessor);
        ((ObserverRequestProcessor) firstProcessor).start();
        if (syncRequestProcessorEnabled) {
            syncProcessor = new SyncRequestProcessor(this, null);
            syncProcessor.start();
        }
    }

  Each cluster role has its own class that overrides setupRequestProcessors. Here we follow the standalone flow. Back in the submitRequest method:

    public void submitRequest(Request si) {
        // firstProcessor cannot be null at this point
        try {
            touch(si.cnxn);
            boolean validpacket = Request.isValid(si.type);
            if (validpacket) {
                firstProcessor.processRequest(si);
                if (si.cnxn != null) {
                    incInProcess();
                }
            // ... omitted
    }

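  Following the standalone chain, the order is PrepRequestProcessor -> SyncRequestProcessor -> FinalRequestProcessor. The main job of each of the three processors is:

  • PrepRequestProcessor: usually sits at the head of the RequestProcessor chain. As we will see shortly, for exists it only performs a session check
  • SyncRequestProcessor: logs requests to disk; simply put, it is the persistence processor
  • FinalRequestProcessor: actually applies any transaction associated with a request and serves any queries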
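  The chain is wired back to front: the last processor is created first and handed to its predecessor as nextProcessor. A minimal sketch of that wiring, assuming a hypothetical DemoProcessor interface and string requests in place of ZooKeeper's RequestProcessor and Request types:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical one-method stand-in for RequestProcessor. */
interface DemoProcessor {
    void processRequest(String request);
}

final class ChainDemo {
    /** Builds prep -> sync -> final back to front and records the order of the visits. */
    static List<String> run(String request) {
        List<String> trace = new ArrayList<>();
        DemoProcessor finalProcessor = r -> trace.add("final:" + r);
        DemoProcessor syncProcessor = r -> {
            trace.add("sync:" + r);
            finalProcessor.processRequest(r); // hand over to the next link
        };
        DemoProcessor firstProcessor = r -> {
            trace.add("prep:" + r);
            syncProcessor.processRequest(r);
        };
        firstProcessor.processRequest(request);
        return trace;
    }
}
```

  Because every link only knows its successor, a different role can swap in another first processor or insert extra links without touching the rest of the chain.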

  First we enter PrepRequestProcessor.processRequest:

    public void processRequest(Request request) {
        // request.addRQRec(">prep="+zks.outstandingChanges.size());
        submittedRequests.add(request);
    }

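  Oddly, processRequest only adds the request to submittedRequests. From what we have seen so far, this suggests yet another asynchronous hand-off. submittedRequests is a blocking queue, LinkedBlockingQueue submittedRequests = new LinkedBlockingQueue(); and PrepRequestProcessor extends a thread class, so we go straight to the run method of this class: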

    public void run() {
        try {
            while (true) {
                Request request = submittedRequests.take();
                long traceMask = ZooTrace.CLIENT_REQUEST_TRACE_MASK;
                if (request.type == OpCode.ping) {
                    traceMask = ZooTrace.CLIENT_PING_TRACE_MASK;
                }
                if (LOG.isTraceEnabled()) {
                    ZooTrace.logRequest(LOG, traceMask, 'P', request, "");
                }
                if (Request.requestOfDeath == request) {
                    break;
                }
                // pre-process the request
                pRequest(request);
            }
        } catch (RequestProcessorException e) {
            if (e.getCause() instanceof XidRolloverException) {
                LOG.info(e.getCause().getMessage());
            }
            handleException(this.getName(), e);
        } catch (Exception e) {
            handleException(this.getName(), e);
        }
        LOG.info("PrepRequestProcessor exited loop!");
    }

    protected void pRequest(Request request) throws RequestProcessorException {
        // LOG.info("Prep>>> cxid = " + request.cxid + " type = " +
        // request.type + " id = 0x" + Long.toHexString(request.sessionId));
        request.hdr = null;
        request.txn = null;

        try {
            switch (request.type) {
            // ... omitted
            case OpCode.sync:
            case OpCode.exists: // our example takes this branch
            case OpCode.getData:
            case OpCode.getACL:
            case OpCode.getChildren:
            case OpCode.getChildren2:
            case OpCode.ping:
            case OpCode.setWatches:
                zks.sessionTracker.checkSession(request.sessionId,
                        request.getOwner());
                break;
            // ... omitted
        request.zxid = zks.getZxid();
        nextProcessor.processRequest(request);
    }
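  The pattern above, where processRequest only enqueues and a dedicated thread drains the queue until it sees a sentinel request, can be sketched in isolation. QueueingProcessorDemo and its nested Req type are hypothetical; the only parts taken from the source are the LinkedBlockingQueue, the blocking take(), and the identity comparison against a "request of death".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

final class QueueingProcessorDemo extends Thread {
    /** Hypothetical minimal request type. */
    static final class Req {
        final String op;
        Req(String op) { this.op = op; }
    }

    /** Sentinel compared by identity, like Request.requestOfDeath. */
    private static final Req REQUEST_OF_DEATH = new Req("death");

    private final LinkedBlockingQueue<Req> submittedRequests = new LinkedBlockingQueue<>();
    private final List<String> handled = new ArrayList<>();

    /** Returns immediately; the actual work happens on this processor's thread. */
    void processRequest(Req request) {
        submittedRequests.add(request);
    }

    void shutdown() {
        submittedRequests.add(REQUEST_OF_DEATH);
    }

    @Override
    public void run() {
        try {
            while (true) {
                Req request = submittedRequests.take(); // blocks while the queue is empty
                if (request == REQUEST_OF_DEATH) {
                    break;
                }
                handled.add("checked:" + request.op);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Submits two requests, shuts the processor down, and returns what it handled. */
    static List<String> demo() {
        QueueingProcessorDemo processor = new QueueingProcessorDemo();
        processor.start();
        processor.processRequest(new Req("exists"));
        processor.processRequest(new Req("getData"));
        processor.shutdown();
        try {
            processor.join(); // join makes the handled list safely visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return processor.handled;
    }
}
```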

  The request type decides what is done. In this scenario case OpCode.exists only checks the session and does nothing else, then moves on to the next link in the chain, SyncRequestProcessor.processRequest:

    public void processRequest(Request request) {
        // request.addRQRec(">sync");
        queuedRequests.add(request);
    }

  The same pattern again. Into its run method:

    public void run() {
        try {
            int logCount = 0;
            // we do this in an attempt to ensure that not all of the servers
            // in the ensemble take a snapshot at the same time
            setRandRoll(r.nextInt(snapCount/2));
            while (true) {
                Request si = null;
                if (toFlush.isEmpty()) {
                    // take the next request, blocking if none is queued
                    si = queuedRequests.take();
                } else {
                    si = queuedRequests.poll();
                    if (si == null) {
                        flush(toFlush);
                        continue;
                    }
                }
                if (si == requestOfDeath) {
                    break;
                }
                if (si != null) {
                    // roughly: the block below rolls the log and starts a
                    // thread that takes a snapshot
                    // track the number of records written to the log
                    if (zks.getZKDatabase().append(si)) {
                        logCount++;
                        if (logCount > (snapCount / 2 + randRoll)) {
                            setRandRoll(r.nextInt(snapCount/2));
                            // roll the log
                            zks.getZKDatabase().rollLog();
                            // take a snapshot
                            if (snapInProcess != null && snapInProcess.isAlive()) {
                                LOG.warn("Too busy to snap, skipping");
                            } else {
                                snapInProcess = new ZooKeeperThread("Snapshot Thread") {
                                    public void run() {
                                        try {
                                            zks.takeSnapshot();
                                        } catch(Exception e) {
                                            LOG.warn("Unexpected exception", e);
                                        }
                                    }
                                };
                                snapInProcess.start();
                            }
                            logCount = 0;
                        }
                    } else if (toFlush.isEmpty()) {
                        // optimization for read heavy workloads
                        // iff this is a read, and there are no pending
                        // flushes (writes), then just pass this to the next
                        // processor
                        if (nextProcessor != null) { // call the next processor
                            nextProcessor.processRequest(si);
                            if (nextProcessor instanceof Flushable) {
                                ((Flushable)nextProcessor).flush();
                            }
                        }
                        continue;
                    }
                    toFlush.add(si);
                    if (toFlush.size() > 1000) {
                        flush(toFlush);
                    }
                }
            }
        // ... omitted
    }

  Then on to the next link, FinalRequestProcessor.processRequest:

    public void processRequest(Request request) {
        // ... validation omitted
        switch (request.type) {
        // ... omitted
        case OpCode.exists: {
            lastOp = "EXIS";
            // TODO we need to figure out the security requirement for this!
            ExistsRequest existsRequest = new ExistsRequest();
            // Deserialize the ByteBuffer into an ExistsRequest. This is the
            // request object the client sent when it issued the call.
            ByteBufferInputStream.byteBuffer2Record(request.request, existsRequest);
            // the requested path
            String path = existsRequest.getPath();
            if (path.indexOf('\0') != -1) {
                throw new KeeperException.BadArgumentsException();
            }
            // The key line: if the request's watch flag is set, pass cnxn
            // (the ServerCnxn) along as the watcher. For an exists request
            // this adds a watcher for data changes on the path.
            Stat stat = zks.getZKDatabase().statNode(path, existsRequest.getWatch() ? cnxn : null);
            // look the path up in the server's in-memory database and wrap
            // the result in an ExistsResponse
            rsp = new ExistsResponse(stat);
            break;
        }
        // ... omitted
    }

  The cnxn here is obtained inside processRequest(Request request) as ServerCnxn cnxn = request.cnxn. Tracing back, it is the c in c.doIO(k) above, obtained through NIOServerCnxn c = (NIOServerCnxn) k.attachment().

  Finally this information is stored on the server:

    public Stat statNode(String path, Watcher watcher)
            throws KeeperException.NoNodeException {
        Stat stat = new Stat();
        DataNode n = nodes.get(path);
        if (watcher != null) {
            // store the watch
            dataWatches.addWatch(path, watcher);
        }
        if (n == null) {
            throw new KeeperException.NoNodeException();
        }
        synchronized (n) {
            n.copyStat(stat);
            return stat;
        }
    }

    // store the watch
    public synchronized void addWatch(String path, Watcher watcher) {
        HashSet<Watcher> list = watchTable.get(path);
        if (list == null) {
            // don't waste memory if there are few watches on a node
            // rehash when the 4th entry is added, doubling size thereafter
            // seems like a good compromise
            list = new HashSet<Watcher>();
            watchTable.put(path, list);
        }
        list.add(watcher);

        HashSet<String> paths = watch2Paths.get(watcher);
        if (paths == null) {
            // cnxns typically have many watches, so use default cap here
            paths = new HashSet<String>();
            watch2Paths.put(watcher, paths);
        }
        paths.add(path);
    }

  At this point the server-side processing is complete.
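  addWatch maintains two indexes: path to watchers, and watcher to paths. A plausible reason for the second index is that all watches owned by one connection can be dropped without scanning every path. The sketch below illustrates that idea under stated assumptions: watchers are represented by plain connection-id strings, and removeWatcher and watchersOf are helper names chosen for this example rather than a copy of ZooKeeper's WatchManager.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class WatchTableDemo {
    private final Map<String, Set<String>> watchTable = new HashMap<>();  // path -> watchers
    private final Map<String, Set<String>> watch2Paths = new HashMap<>(); // watcher -> paths

    /** Records the watch in both indexes. */
    synchronized void addWatch(String path, String watcher) {
        watchTable.computeIfAbsent(path, p -> new HashSet<>()).add(watcher);
        watch2Paths.computeIfAbsent(watcher, w -> new HashSet<>()).add(path);
    }

    /** Drops every watch owned by one watcher, for example when its connection closes. */
    synchronized void removeWatcher(String watcher) {
        Set<String> paths = watch2Paths.remove(watcher);
        if (paths == null) {
            return;
        }
        for (String path : paths) {
            Set<String> watchers = watchTable.get(path);
            if (watchers != null) {
                watchers.remove(watcher);
                if (watchers.isEmpty()) {
                    watchTable.remove(path); // keep the table free of empty entries
                }
            }
        }
    }

    /** Returns a copy of the watchers currently registered on a path. */
    synchronized Set<String> watchersOf(String path) {
        Set<String> watchers = watchTable.get(path);
        return watchers == null ? new HashSet<>() : new HashSet<>(watchers);
    }
}
```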

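The client receives the server's response:

  After the server has finished, the response reaches the client through the same loop that sent the exists request: the run method of ClientCnxn's SendThread keeps calling doTransport and polling the Selector. Both reads and writes on the client therefore go through ClientCnxnSocketNIO.doIO, and this time it receives the server's reply: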

    void doIO(List<Packet> pendingQueue, LinkedList<Packet> outgoingQueue, ClientCnxn cnxn)
            throws InterruptedException, IOException {
        SocketChannel sock = (SocketChannel) sockKey.channel();
        if (sock == null) {
            throw new IOException("Socket is null!");
        }
        if (sockKey.isReadable()) {
            int rc = sock.read(incomingBuffer);
            if (rc < 0) {
                throw new EndOfStreamException(
                        "Unable to read additional data from server sessionid 0x"
                        + Long.toHexString(sessionId)
                        + ", likely server has closed socket");
            }
            // the buffer is full, so this read is complete
            if (!incomingBuffer.hasRemaining()) {
                incomingBuffer.flip();
                if (incomingBuffer == lenBuffer) {
                    recvCount++;
                    readLength();
                } else if (!initialized) {
                    readConnectResult();
                    enableRead();
                    if (findSendablePacket(outgoingQueue,
                            cnxn.sendThread.clientTunneledAuthenticationInProgress()) != null) {
                        // Since SASL authentication has completed (if client is configured to do so),
                        // outgoing packets waiting in the outgoingQueue can now be sent.
                        enableWrite();
                    }
                    lenBuffer.clear();
                    incomingBuffer = lenBuffer;
                    updateLastHeard();
                    initialized = true;
                } else { // read the response
                    sendThread.readResponse(incomingBuffer);
                    lenBuffer.clear();
                    incomingBuffer = lenBuffer;
                    updateLastHeard();
                }
            }
        }
        // ... write branch omitted
    }

  In this scenario we are receiving the server's response, so the read branch is taken, which ends by calling sendThread.readResponse(incomingBuffer) to read the data:

    void readResponse(ByteBuffer incomingBuffer) throws IOException {
        ByteBufferInputStream bbis = new ByteBufferInputStream(
                incomingBuffer);
        BinaryInputArchive bbia = BinaryInputArchive.getArchive(bbis);
        ReplyHeader replyHdr = new ReplyHeader();
        // deserialize the header
        replyHdr.deserialize(bbia, "header");
        if (replyHdr.getXid() == -2) {
            // -2 is the xid for pings
            if (LOG.isDebugEnabled()) {
                LOG.debug("Got ping response for sessionid: 0x"
                        + Long.toHexString(sessionId)
                        + " after "
                        + ((System.nanoTime() - lastPingSentNs) / 1000000)
                        + "ms");
            }
            return;
        }
        if (replyHdr.getXid() == -4) {
            // -4 is the xid for AuthPacket
            if(replyHdr.getErr() == KeeperException.Code.AUTHFAILED.intValue()) {
                state = States.AUTH_FAILED;
                eventThread.queueEvent( new WatchedEvent(Watcher.Event.EventType.None,
                        Watcher.Event.KeeperState.AuthFailed, null) );
            }
            if (LOG.isDebugEnabled()) {
                LOG.debug("Got auth sessionid:0x"
                        + Long.toHexString(sessionId));
            }
            return;
        }
        if (replyHdr.getXid() == -1) {
            // -1 means notification
            // the message is a notification, i.e. a watch event pushed by the server
            if (LOG.isDebugEnabled()) {
                LOG.debug("Got notification sessionid:0x"
                        + Long.toHexString(sessionId));
            }
            WatcherEvent event = new WatcherEvent();
            // deserialize the event
            event.deserialize(bbia, "response");

            // convert from a server path to a client path
            if (chrootPath != null) {
                String serverPath = event.getPath();
                if(serverPath.compareTo(chrootPath)==0)
                    event.setPath("/");
                else if (serverPath.length() > chrootPath.length())
                    event.setPath(serverPath.substring(chrootPath.length()));
                else {
                    LOG.warn("Got server path " + event.getPath()
                            + " which is too short for chroot path "
                            + chrootPath);
                }
            }

            WatchedEvent we = new WatchedEvent(event);
            if (LOG.isDebugEnabled()) {
                LOG.debug("Got " + we + " for sessionid 0x"
                        + Long.toHexString(sessionId));
            }

            eventThread.queueEvent( we );
            return;
        }
        // If SASL authentication is currently in progress, construct and
        // send a response packet immediately, rather than queuing a
        // response as with other packets.
        if (clientTunneledAuthenticationInProgress()) {
            GetSASLRequest request = new GetSASLRequest();
            request.deserialize(bbia,"token");
            zooKeeperSaslClient.respondToServer(request.getToken(),
                    ClientCnxn.this);
            return;
        }
        Packet packet;
        synchronized (pendingQueue) {
            if (pendingQueue.size() == 0) {
                throw new IOException("Nothing in the queue, but got "
                        + replyHdr.getXid());
            }
            // this packet has received its response, so remove it from pendingQueue
            packet = pendingQueue.remove();
        }
        /*
         * Since requests are processed in order, we better get a response
         * to the first request!
         */
        try {
            // validate the packet, then update it with the information
            // returned by the server
            if (packet.requestHeader.getXid() != replyHdr.getXid()) {
                packet.replyHeader.setErr(
                        KeeperException.Code.CONNECTIONLOSS.intValue());
                throw new IOException("Xid out of order. Got Xid "
                        + replyHdr.getXid() + " with err " +
                        + replyHdr.getErr() +
                        " expected Xid "
                        + packet.requestHeader.getXid()
                        + " for a packet with details: "
  99. + packet );
  100. }
  101. packet.replyHeader.setXid(replyHdr.getXid());
  102. packet.replyHeader.setErr(replyHdr.getErr());
  103. packet.replyHeader.setZxid(replyHdr.getZxid());
  104. if (replyHdr.getZxid() > ) {
  105. lastZxid = replyHdr.getZxid();
  106. }
  107. if (packet.response != null && replyHdr.getErr() == ) {
  108. packet.response.deserialize(bbia, "response");
  109. // 获得服务端的响应,反序列化以后设置到 packet.response 属性中。
  110. // 所以我们可以在 exists 方法的最后一行通过 packet.response 拿到改请求的返回结果
  111. }
  112.  
  113. if (LOG.isDebugEnabled()) {
  114. LOG.debug("Reading reply sessionid:0x"
  115. + Long.toHexString(sessionId) + ", packet:: " + packet);
  116. }
  117. } finally {
  118. // 最后调用 finishPacket 方法完成处理
  119. finishPacket(packet);
  120. }
  121. }
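
  The branch order in readResponse can be condensed into a small stand-alone sketch. It is not the real ClientCnxn code: the class XidDispatchSketch, its methods sent and dispatch, and the string labels are hypothetical names chosen for illustration, and only the xid values (-2, -4, -1) and the head-of-queue check come from the listing above.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch only: mirrors the branch order of readResponse,
// using hypothetical names instead of the real ZooKeeper classes.
class XidDispatchSketch {
    static final int PING_XID = -2;         // reply to a heartbeat
    static final int AUTH_XID = -4;         // reply to an AuthPacket
    static final int NOTIFICATION_XID = -1; // watch event pushed by the server

    // xids of requests that were sent and still await a reply, oldest first
    private final Deque<Integer> pendingQueue = new ArrayDeque<>();

    void sent(int xid) {
        pendingQueue.addLast(xid);
    }

    /** Returns a label describing how a reply carrying replyXid is handled. */
    String dispatch(int replyXid) {
        if (replyXid == PING_XID) return "ping";
        if (replyXid == AUTH_XID) return "auth";
        if (replyXid == NOTIFICATION_XID) return "notification";
        if (pendingQueue.isEmpty()) {
            throw new IllegalStateException("Nothing in the queue, but got " + replyXid);
        }
        // Requests are answered in order, so the reply must match the oldest request.
        int expected = pendingQueue.removeFirst();
        if (expected != replyXid) {
            throw new IllegalStateException(
                    "Xid out of order. Got " + replyXid + " expected " + expected);
        }
        return "reply";
    }

    public static void main(String[] args) {
        XidDispatchSketch client = new XidDispatchSketch();
        client.sent(1);
        System.out.println(client.dispatch(-2));
        System.out.println(client.dispatch(-1));
        System.out.println(client.dispatch(1));
    }
}
```

  Note that the three special xids never touch pendingQueue; only an ordinary reply consumes the oldest pending request.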

  The main flow of this method is as follows. It first reads the header. If the xid is -2, the message is a ping response and the method returns. If the xid is -4, it is the response to an AuthPacket and the method returns. If the xid is -1, it is a notification: the method reads on, builds an event, hands it over through EventThread.queueEvent and returns. In every other case it takes a Packet from pendingQueue, validates it, updates the packet with the reply, and finally calls finishPacket. finishPacket registers the local event: it takes the Watcher that belongs to the Packet and registers it in ZKWatchManager.

  private void finishPacket(Packet p) {
      // the ExistsWatchRegistration created in exists()
      if (p.watchRegistration != null) {
          // register the watcher in ZKWatchManager
          p.watchRegistration.register(p.replyHeader.getErr());
      }

      if (p.cb == null) {
          synchronized (p) {
              p.finished = true;
              p.notifyAll();
          }
      } else {
          p.finished = true;
          // hand the packet to the event-processing thread
          eventThread.queuePacket(p);
      }
  }

  Here watchRegistration is the ExistsWatchRegistration created in the exists method; its register method is called to register the event:

  public void register(int rc) {
      if (shouldAddWatch(rc)) {
          Map<String, Set<Watcher>> watches = getWatches(rc);
          synchronized (watches) {
              Set<Watcher> watchers = watches.get(clientPath);
              if (watchers == null) {
                  watchers = new HashSet<Watcher>();
                  watches.put(clientPath, watchers);
              }
              // the user-defined Watcher for this request (the one passed to exists,
              // or the default watcher given when the client was constructed)
              watchers.add(watcher);
          }
      }
  }
  // ExistsWatchRegistration.getWatches
  protected Map<String, Set<Watcher>> getWatches(int rc) {
      return rc == 0 ? watchManager.dataWatches : watchManager.existWatches;
  }

  In this scenario the node already exists, so rc is 0 and the collection returned by ExistsWatchRegistration.getWatches is dataWatches:

  private static class ZKWatchManager implements ClientWatchManager {
      private final Map<String, Set<Watcher>> dataWatches =
              new HashMap<String, Set<Watcher>>();
      private final Map<String, Set<Watcher>> existWatches =
              new HashMap<String, Set<Watcher>>();
      private final Map<String, Set<Watcher>> childWatches =
              new HashMap<String, Set<Watcher>>();
      // ... remaining members omitted ...
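
  The registration step above can be sketched without the ZooKeeper library. This is a simplified illustration, not the real ExistsWatchRegistration: watchers are plain strings, the class name ExistsRegistrationSketch is hypothetical, and the rule that a watch is added only when rc is OK (0) or NONODE (-101) is an assumption about shouldAddWatch, which the listing does not show.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of client-side registration for an exists() watch.
class ExistsRegistrationSketch {
    static final int OK = 0;        // node exists
    static final int NONODE = -101; // assumed value for "node does not exist"

    final Map<String, Set<String>> dataWatches = new HashMap<>();
    final Map<String, Set<String>> existWatches = new HashMap<>();

    /** Stores the watcher under clientPath, choosing the map from the reply code. */
    void register(String clientPath, String watcher, int rc) {
        if (rc != OK && rc != NONODE) {
            return; // assumption: any other error means no watch is recorded
        }
        // Same choice as getWatches(rc): existing node -> dataWatches, else existWatches.
        Map<String, Set<String>> watches = (rc == OK) ? dataWatches : existWatches;
        synchronized (watches) {
            watches.computeIfAbsent(clientPath, k -> new HashSet<>()).add(watcher);
        }
    }

    public static void main(String[] args) {
        ExistsRegistrationSketch manager = new ExistsRegistrationSketch();
        manager.register("/zk-wuzz", "myWatcher", OK);
        manager.register("/not-there", "otherWatcher", NONODE);
        System.out.println("dataWatches:  " + manager.dataWatches);
        System.out.println("existWatches: " + manager.existWatches);
    }
}
```

  A watch on a node that exists is therefore stored as a data watch, which is why a later setData on the node can fire it.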

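  To sum up: when a Watcher is registered with the ZooKeeper server through the ZooKeeper constructor or through one of the getData, exists and getChildren interfaces, the request is first sent to the server; once the server has replied successfully, the client stores the mapping from the path to the Watcher for later use. For asynchronous calls (p.cb != null), finishPacket ends by calling eventThread.queuePacket, which adds the packet to the queue of events waiting to be delivered; for synchronous calls it only wakes up the thread waiting on the packet.

  public void queuePacket(Packet packet) {
      if (wasKilled) {
          synchronized (waitingEvents) {
              if (isRunning) waitingEvents.add(packet);
              else processEvent(packet);
          }
      } else {
          waitingEvents.add(packet);
      }
  }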
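
  The two ways finishPacket completes a request can be shown in isolation. The sketch below is a simplified model under stated assumptions: PacketCompletionSketch, its Packet class with a Runnable callback, and the awaitFinished helper are hypothetical stand-ins, not the real ClientCnxn types.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch: synchronous callers are woken with notifyAll,
// asynchronous callers get their packet queued for the event thread.
class PacketCompletionSketch {
    static class Packet {
        boolean finished;  // set once the reply has been processed
        final Runnable cb; // null for a synchronous call
        Packet(Runnable cb) { this.cb = cb; }
    }

    final BlockingQueue<Packet> waitingEvents = new LinkedBlockingQueue<>();

    void finishPacket(Packet p) {
        if (p.cb == null) {
            synchronized (p) {
                p.finished = true;
                p.notifyAll(); // wake the thread blocked in awaitFinished
            }
        } else {
            p.finished = true;
            waitingEvents.add(p); // the callback runs later on the event thread
        }
    }

    /** What a synchronous caller does after submitting its request. */
    static void awaitFinished(Packet p) {
        synchronized (p) {
            while (!p.finished) {
                try {
                    p.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
    }

    public static void main(String[] args) {
        PacketCompletionSketch cnxn = new PacketCompletionSketch();

        Packet sync = new Packet(null);
        new Thread(() -> cnxn.finishPacket(sync)).start();
        awaitFinished(sync);
        System.out.println("sync packet finished: " + sync.finished);

        Packet async = new Packet(() -> System.out.println("callback ran"));
        cnxn.finishPacket(async);
        cnxn.waitingEvents.remove().cb.run(); // stands in for the event thread
    }
}
```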

Event triggering:

  The long walkthrough above only explains how an event is registered. Actually firing it still requires a transactional operation. In the opening example the event is triggered by zooKeeper.setData("/zk-wuzz","1".getBytes(),stat.getVersion());, which changes the node's value and fires the watch. The exchange between client and server is the same as before and is not repeated here; the only difference is that this time an event fires. Every call chain ends in FinalRequestProcessor.processRequest, so we return to that method:

  public void processRequest(Request request) {
      if (LOG.isDebugEnabled()) {
          LOG.debug("Processing request:: " + request);
      }
      // request.addRQRec(">final");
      long traceMask = ZooTrace.CLIENT_REQUEST_TRACE_MASK;
      if (request.type == OpCode.ping) {
          traceMask = ZooTrace.SERVER_PING_TRACE_MASK;
      }
      if (LOG.isTraceEnabled()) {
          ZooTrace.logRequest(LOG, traceMask, 'E', request, "");
      }
      ProcessTxnResult rc = null;
      synchronized (zks.outstandingChanges) {
          while (!zks.outstandingChanges.isEmpty()
                  && zks.outstandingChanges.get(0).zxid <= request.zxid) {
              ChangeRecord cr = zks.outstandingChanges.remove(0);
              if (cr.zxid < request.zxid) {
                  LOG.warn("Zxid outstanding "
                          + cr.zxid
                          + " is less than current " + request.zxid);
              }
              if (zks.outstandingChangesForPath.get(cr.path) == cr) {
                  zks.outstandingChangesForPath.remove(cr.path);
              }
          }
          // the transaction header is not null
          if (request.hdr != null) {
              TxnHeader hdr = request.hdr;
              Record txn = request.txn;
              // transactional requests go through here
              rc = zks.processTxn(hdr, txn);
          }
          // do not add non quorum packets to the queue.
          if (Request.isQuorum(request.type)) {
              zks.getZKDatabase().addCommittedProposal(request);
          }
      }
      // ... part of the code omitted ...
  }
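
  The loop at the top of processRequest drains every pending change whose zxid is not newer than the current request. A minimal stand-alone sketch of that bookkeeping follows; OutstandingChangesSketch, its add and drainUpTo methods, and the simplified ChangeRecord are hypothetical and keep only the zxid and path fields used in the listing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of draining outstanding changes up to a request's zxid.
class OutstandingChangesSketch {
    static class ChangeRecord {
        final long zxid;
        final String path;
        ChangeRecord(long zxid, String path) {
            this.zxid = zxid;
            this.path = path;
        }
    }

    final List<ChangeRecord> outstandingChanges = new ArrayList<>();
    final Map<String, ChangeRecord> outstandingChangesForPath = new HashMap<>();

    void add(long zxid, String path) {
        ChangeRecord cr = new ChangeRecord(zxid, path);
        outstandingChanges.add(cr);
        outstandingChangesForPath.put(path, cr); // latest change per path wins
    }

    /** Removes all records with zxid <= requestZxid and returns how many were removed. */
    int drainUpTo(long requestZxid) {
        int removed = 0;
        synchronized (outstandingChanges) {
            while (!outstandingChanges.isEmpty()
                    && outstandingChanges.get(0).zxid <= requestZxid) {
                ChangeRecord cr = outstandingChanges.remove(0);
                // Only drop the per-path entry if it still points at this record.
                if (outstandingChangesForPath.get(cr.path) == cr) {
                    outstandingChangesForPath.remove(cr.path);
                }
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        OutstandingChangesSketch s = new OutstandingChangesSketch();
        s.add(1, "/a");
        s.add(2, "/a");
        s.add(3, "/b");
        System.out.println("drained: " + s.drainUpTo(2));
        System.out.println("paths still pending: " + s.outstandingChangesForPath.keySet());
    }
}
```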

  We follow zks.processTxn(hdr, txn):

  public ProcessTxnResult processTxn(TxnHeader hdr, Record txn) {
      ProcessTxnResult rc;
      int opCode = hdr.getType();
      long sessionId = hdr.getClientId();
      // apply the transaction to the database
      rc = getZKDatabase().processTxn(hdr, txn);
      if (opCode == OpCode.createSession) {
          if (txn instanceof CreateSessionTxn) {
              CreateSessionTxn cst = (CreateSessionTxn) txn;
              sessionTracker.addSession(sessionId, cst
                      .getTimeOut());
          } else {
              LOG.warn("*****>>>>> Got "
                      + txn.getClass() + " "
                      + txn.toString());
          }
      } else if (opCode == OpCode.closeSession) {
          sessionTracker.removeSession(sessionId);
      }
      return rc;
  }

  The getZKDatabase().processTxn(hdr, txn) chain ends in DataTree.processTxn(TxnHeader header, Record txn):

  public ProcessTxnResult processTxn(TxnHeader header, Record txn)
  {
      ProcessTxnResult rc = new ProcessTxnResult();
      try {
          rc.clientId = header.getClientId();
          rc.cxid = header.getCxid();
          rc.zxid = header.getZxid();
          rc.type = header.getType();
          rc.err = 0;
          rc.multiResult = null;
          switch (header.getType()) {
              // ... other cases omitted ...
              case OpCode.setData:
                  SetDataTxn setDataTxn = (SetDataTxn) txn;
                  rc.path = setDataTxn.getPath();
                  rc.stat = setData(setDataTxn.getPath(), setDataTxn
                          .getData(), setDataTxn.getVersion(), header
                          .getZxid(), header.getTime());
                  break;
              // ... other cases omitted ...
          }
      } // catch clauses omitted in this excerpt
      return rc;
  }

  For our request the setData case is taken, which calls DataTree.setData:

  public Stat setData(String path, byte data[], int version, long zxid,
          long time) throws KeeperException.NoNodeException {
      Stat s = new Stat();
      DataNode n = nodes.get(path);
      if (n == null) {
          throw new KeeperException.NoNodeException();
      }
      byte lastdata[] = null;
      synchronized (n) {
          lastdata = n.data;
          n.data = data;
          n.stat.setMtime(time);
          n.stat.setMzxid(zxid);
          n.stat.setVersion(version);
          n.copyStat(s);
      }
      // now update if the path is in a quota subtree.
      String lastPrefix;
      if ((lastPrefix = getMaxPrefixWithQuota(path)) != null) {
          this.updateBytes(lastPrefix, (data == null ? 0 : data.length)
                  - (lastdata == null ? 0 : lastdata.length));
      }
      // fire the NodeDataChanged event for this node
      dataWatches.triggerWatch(path, EventType.NodeDataChanged);
      return s;
  }

  As this shows, the server stores each node as a DataNode, and once the data has been saved it fires the NodeDataChanged event for that node. The two-argument triggerWatch call above delegates to the three-argument version below, passing null for supress:

  public Set<Watcher> triggerWatch(String path, EventType type, Set<Watcher> supress) {
      // build a WatchedEvent from the event type, connection state and node path
      WatchedEvent e = new WatchedEvent(type,
              KeeperState.SyncConnected, path);
      HashSet<Watcher> watchers;
      synchronized (this) {
          // remove the path from the watch table and get its set of watchers;
          // this is why a ZooKeeper watch is delivered only once
          watchers = watchTable.remove(path);
          if (watchers == null || watchers.isEmpty()) {
              if (LOG.isTraceEnabled()) {
                  ZooTrace.logTraceMessage(LOG,
                          ZooTrace.EVENT_DELIVERY_TRACE_MASK,
                          "No watchers for " + path);
              }
              return null;
          }
          // for each watcher, remove this path from its reverse index
          for (Watcher w : watchers) {
              // look up the set of paths this watcher is registered on
              HashSet<String> paths = watch2Paths.get(w);
              if (paths != null) {
                  paths.remove(path); // remove the path
              }
          }
      }
      // notify each watcher
      for (Watcher w : watchers) {
          if (supress != null && supress.contains(w)) {
              continue;
          }
          // the key question: what does w.process do?
          w.process(e);
      }
      return watchers;
  }
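
  The one-shot behaviour of triggerWatch can be reproduced in a few lines. The sketch below is a simplified model, not the server's WatchManager: the class OneShotWatchManagerSketch and its two-argument Watcher interface are hypothetical, while the two maps (watchTable and watch2Paths) and the remove-then-notify order follow the listing.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch of a server-side watch table whose entries fire once.
class OneShotWatchManagerSketch {
    interface Watcher {
        void process(String eventType, String path);
    }

    private final Map<String, Set<Watcher>> watchTable = new HashMap<>();
    private final Map<Watcher, Set<String>> watch2Paths = new HashMap<>();

    synchronized void addWatch(String path, Watcher w) {
        watchTable.computeIfAbsent(path, k -> new HashSet<>()).add(w);
        watch2Paths.computeIfAbsent(w, k -> new HashSet<>()).add(path);
    }

    /** Notifies and returns the watchers on path, or null if there are none. */
    Set<Watcher> triggerWatch(String path, String eventType) {
        Set<Watcher> watchers;
        synchronized (this) {
            // Removing the entry is what makes the watch one-shot.
            watchers = watchTable.remove(path);
            if (watchers == null || watchers.isEmpty()) {
                return null;
            }
            for (Watcher w : watchers) {
                Set<String> paths = watch2Paths.get(w);
                if (paths != null) {
                    paths.remove(path);
                }
            }
        }
        // Callbacks run outside the lock, as in the listing.
        for (Watcher w : watchers) {
            w.process(eventType, path);
        }
        return watchers;
    }

    public static void main(String[] args) {
        OneShotWatchManagerSketch manager = new OneShotWatchManagerSketch();
        manager.addWatch("/zk-wuzz", (type, path) -> System.out.println(type + "->" + path));
        manager.triggerWatch("/zk-wuzz", "NodeDataChanged"); // notifies the watcher
        System.out.println(manager.triggerWatch("/zk-wuzz", "NodeDataChanged")); // no watcher left
    }
}
```

  A second change to the same path finds no entry in the table, so nothing is delivered until the client registers the watch again.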

  Remember what was bound as the watcher when the event was registered on the server? It was the ServerCnxn, so w.process(e) actually calls the process method of ServerCnxn. ServerCnxn is an abstract class with two implementations, NIOServerCnxn and NettyServerCnxn. Let us open the process method of NIOServerCnxn and see what it does:

  synchronized public void process(WatchedEvent event) {
      ReplyHeader h = new ReplyHeader(-1, -1L, 0);
      if (LOG.isTraceEnabled()) {
          ZooTrace.logTraceMessage(LOG, ZooTrace.EVENT_DELIVERY_TRACE_MASK,
                  "Deliver event " + event + " to 0x"
                  + Long.toHexString(this.sessionId)
                  + " through " + this);
      }

      // Convert WatchedEvent to a type that can be sent over the wire
      WatcherEvent e = event.getWrapper();

      sendResponse(h, e, "notification");
  }

  Next, the client receives this response, which triggers the SendThread.readResponse method.
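
  To make the role of the xid concrete, the sketch below writes a reply header with xid -1 and reads the xid back on the receiving side. It is only an approximation: it assumes the header consists of the three constructor arguments written in order as a big-endian int, long and int, and it leaves out the length prefix and the event body. NotificationHeaderSketch and its methods are hypothetical names, not ZooKeeper classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Illustrative sketch: a reply header with xid = -1 marks a notification.
class NotificationHeaderSketch {
    static final int NOTIFICATION_XID = -1;

    /** Encodes (xid, zxid, err) in that order; assumed layout, see the lead-in. */
    static byte[] encode(int xid, long zxid, int err) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(xid);
            out.writeLong(zxid);
            out.writeInt(err);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the leading int of the header, which is the xid. */
    static int readXid(byte[] header) {
        return ByteBuffer.wrap(header).getInt();
    }

    static boolean isNotification(byte[] header) {
        return readXid(header) == NOTIFICATION_XID;
    }

    public static void main(String[] args) {
        byte[] header = encode(-1, -1L, 0); // same arguments as in process()
        System.out.println("xid = " + readXid(header)
                + ", notification = " + isNotification(header));
    }
}
```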

Client handling of the event response:

  The client is still polling the Selector, so both reads and writes on the client go through ClientCnxnSocketNIO.doIO. From there we go straight to SendThread.readResponse:

  void readResponse(ByteBuffer incomingBuffer) throws IOException {
      ByteBufferInputStream bbis = new ByteBufferInputStream(
              incomingBuffer);
      BinaryInputArchive bbia = BinaryInputArchive.getArchive(bbis);
      ReplyHeader replyHdr = new ReplyHeader();
      // deserialize the header
      replyHdr.deserialize(bbia, "header");
      // ... code omitted ...
      if (replyHdr.getXid() == -1) {
          // -1 means notification
          // the message is a notification, i.e. an event pushed by the server
          if (LOG.isDebugEnabled()) {
              LOG.debug("Got notification sessionid:0x"
                      + Long.toHexString(sessionId));
          }
          WatcherEvent event = new WatcherEvent();
          // deserialize the event payload
          event.deserialize(bbia, "response");

          // convert from a server path to a client path
          if (chrootPath != null) {
              String serverPath = event.getPath();
              if (serverPath.compareTo(chrootPath) == 0)
                  event.setPath("/");
              else if (serverPath.length() > chrootPath.length())
                  event.setPath(serverPath.substring(chrootPath.length()));
              else {
                  LOG.warn("Got server path " + event.getPath()
                          + " which is too short for chroot path "
                          + chrootPath);
              }
          }

          WatchedEvent we = new WatchedEvent(event);
          if (LOG.isDebugEnabled()) {
              LOG.debug("Got " + we + " for sessionid 0x"
                      + Long.toHexString(sessionId));
          }

          eventThread.queueEvent(we);
          return;
      }
      // ... code omitted ...
      } finally {
          // finally, finishPacket completes the processing
          finishPacket(packet);
      }
  }
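
  The chroot conversion inside the notification branch is easy to isolate. The helper below is a sketch with hypothetical names (ChrootPathSketch, toClientPath) and example paths; it follows the three cases of the listing and, where the real code only logs a warning, returns the server path unchanged.

```java
// Illustrative sketch of converting a server path to a client path under a chroot.
class ChrootPathSketch {
    static String toClientPath(String serverPath, String chrootPath) {
        if (chrootPath == null) {
            return serverPath; // no chroot configured
        }
        if (serverPath.equals(chrootPath)) {
            return "/"; // the chroot itself is the client's root
        }
        if (serverPath.length() > chrootPath.length()) {
            return serverPath.substring(chrootPath.length());
        }
        // Too short for the chroot: the listing logs a warning and keeps the path.
        return serverPath;
    }

    public static void main(String[] args) {
        System.out.println(toClientPath("/app/zk-wuzz", "/app"));
        System.out.println(toClientPath("/app", "/app"));
        System.out.println(toClientPath("/zk-wuzz", null));
    }
}
```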

  This is where the client handles the event callback; the xid received here is -1. After SendThread receives the notification from the server, it calls the queueEvent method of the EventThread class to pass the event to the EventThread. Based on the notification, queueEvent fetches all matching Watchers from ZKWatchManager; the Watchers it finds are removed from the manager at the same time, so they are no longer registered:

  public void queueEvent(WatchedEvent event) {
      // ignore a state-only event that does not change the session state
      if (event.getType() == EventType.None
              && sessionState == event.getState()) {
          return;
      }
      sessionState = event.getState();

      // materialize the watchers based on the event
      // wrap them in a WatcherSetEventPair and add it to the waitingEvents queue
      WatcherSetEventPair pair = new WatcherSetEventPair(
              watcher.materialize(event.getState(), event.getType(),
                      event.getPath()),
              event);
      // queue the pair (watch set & event) for later processing
      waitingEvents.add(pair);
  }

  The materialize method takes the matching watchers out of dataWatches, existWatches or childWatches by calling remove, which shows that on the client side a watch is also removed after it has been registered once. It uses keeperState, eventType and path to return the set of Watchers that should be notified.

  This confirms once more that ZooKeeper watcher events are not reused: a watch is gone after it fires once, unless it is registered again.

  public Set<Watcher> materialize(Watcher.Event.KeeperState state,
          Watcher.Event.EventType type,
          String clientPath)
  {
      Set<Watcher> result = new HashSet<Watcher>();

      switch (type) {
      case None:
          result.add(defaultWatcher);
          boolean clear = ClientCnxn.getDisableAutoResetWatch() &&
                  state != Watcher.Event.KeeperState.SyncConnected;

          synchronized (dataWatches) {
              for (Set<Watcher> ws : dataWatches.values()) {
                  result.addAll(ws);
              }
              if (clear) {
                  dataWatches.clear();
              }
          }

          synchronized (existWatches) {
              for (Set<Watcher> ws : existWatches.values()) {
                  result.addAll(ws);
              }
              if (clear) {
                  existWatches.clear();
              }
          }

          synchronized (childWatches) {
              for (Set<Watcher> ws : childWatches.values()) {
                  result.addAll(ws);
              }
              if (clear) {
                  childWatches.clear();
              }
          }

          return result;
      case NodeDataChanged: // node data changed
      case NodeCreated: // node created
          synchronized (dataWatches) {
              addTo(dataWatches.remove(clientPath), result);
          }
          synchronized (existWatches) {
              addTo(existWatches.remove(clientPath), result);
          }
          break;
      case NodeChildrenChanged: // children changed
          synchronized (childWatches) {
              addTo(childWatches.remove(clientPath), result);
          }
          break;
      case NodeDeleted: // node deleted
          synchronized (dataWatches) {
              addTo(dataWatches.remove(clientPath), result);
          }
          // XXX This shouldn't be needed, but just in case
          synchronized (existWatches) {
              Set<Watcher> list = existWatches.remove(clientPath);
              if (list != null) {
                  addTo(list, result);
                  LOG.warn("We are triggering an exists watch for delete! Shouldn't happen!");
              }
          }
          synchronized (childWatches) {
              addTo(childWatches.remove(clientPath), result);
          }
          break;
      default: // unhandled event type
          String msg = "Unhandled watch event type " + type
                  + " with state " + state + " on path " + clientPath;
          LOG.error(msg);
          throw new RuntimeException(msg);
      }

      return result;
  }
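
  The effect of the remove calls in materialize can be seen with a reduced model. In the sketch below watchers are plain strings, synchronization and the None branch are left out, and MaterializeSketch with its register helper is a hypothetical class written for this illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch: materializing watchers for a path event removes them.
class MaterializeSketch {
    enum EventType { NodeCreated, NodeDataChanged, NodeChildrenChanged, NodeDeleted }

    final Map<String, Set<String>> dataWatches = new HashMap<>();
    final Map<String, Set<String>> existWatches = new HashMap<>();
    final Map<String, Set<String>> childWatches = new HashMap<>();

    /** Hypothetical helper that records a data watch for a path. */
    void register(String path, String watcher) {
        dataWatches.computeIfAbsent(path, k -> new HashSet<>()).add(watcher);
    }

    Set<String> materialize(EventType type, String clientPath) {
        Set<String> result = new HashSet<>();
        switch (type) {
            case NodeDataChanged:
            case NodeCreated:
                addTo(dataWatches.remove(clientPath), result);
                addTo(existWatches.remove(clientPath), result);
                break;
            case NodeChildrenChanged:
                addTo(childWatches.remove(clientPath), result);
                break;
            case NodeDeleted:
                addTo(dataWatches.remove(clientPath), result);
                addTo(existWatches.remove(clientPath), result);
                addTo(childWatches.remove(clientPath), result);
                break;
        }
        return result;
    }

    private static void addTo(Set<String> from, Set<String> to) {
        if (from != null) {
            to.addAll(from);
        }
    }

    public static void main(String[] args) {
        MaterializeSketch manager = new MaterializeSketch();
        manager.register("/zk-wuzz", "myWatcher");
        System.out.println(manager.materialize(EventType.NodeDataChanged, "/zk-wuzz"));
        // The watch was consumed; it must be registered again to fire a second time.
        System.out.println(manager.materialize(EventType.NodeDataChanged, "/zk-wuzz"));
    }
}
```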

  This is the last step. waitingEvents is the blocking queue inside the EventThread, a thread that is also created in the very first step, when the client is instantiated. As the name suggests, waitingEvents is a queue of Watchers waiting to be processed. The run() method of EventThread keeps taking items from the queue and hands them to processEvent:

  public void run() {
      try {
          isRunning = true;
          while (true) {
              Object event = waitingEvents.take();
              if (event == eventOfDeath) {
                  wasKilled = true;
              } else {
                  processEvent(event);
              }
              if (wasKilled)
                  synchronized (waitingEvents) {
                      if (waitingEvents.isEmpty()) {
                          isRunning = false;
                          break;
                      }
                  }
          }
      } catch (InterruptedException e) {
          LOG.error("Event thread exiting due to interruption", e);
      }

      LOG.info("EventThread shut down for session: 0x{}",
              Long.toHexString(getSessionId()));
  }
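
  The loop above is a consumer thread that stops when it sees a sentinel object and the queue is empty. A minimal sketch of that pattern follows; EventThreadSketch, its processed list and the runAll helper are hypothetical, and events are simply recorded instead of being passed to watchers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch of a queue-draining thread with an "event of death" sentinel.
class EventThreadSketch extends Thread {
    private static final Object EVENT_OF_DEATH = new Object();

    private final LinkedBlockingQueue<Object> waitingEvents = new LinkedBlockingQueue<>();
    final List<Object> processed = Collections.synchronizedList(new ArrayList<>());

    void queueEvent(Object event) {
        waitingEvents.add(event);
    }

    void queueEventOfDeath() {
        waitingEvents.add(EVENT_OF_DEATH);
    }

    @Override
    public void run() {
        boolean wasKilled = false;
        try {
            while (true) {
                Object event = waitingEvents.take(); // blocks until an item arrives
                if (event == EVENT_OF_DEATH) {
                    wasKilled = true;
                } else {
                    processed.add(event); // stands in for processEvent(event)
                }
                if (wasKilled && waitingEvents.isEmpty()) {
                    break; // stop only after the queue has been drained
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Queues the events, then the sentinel, and returns them in processing order. */
    static List<Object> runAll(Object... events) {
        EventThreadSketch thread = new EventThreadSketch();
        for (Object event : events) {
            thread.queueEvent(event);
        }
        thread.queueEventOfDeath();
        thread.start();
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(thread.processed);
    }

    public static void main(String[] args) {
        System.out.println(runAll("NodeDataChanged:/zk-wuzz", "NodeDeleted:/zk-wuzz"));
    }
}
```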

  This in turn calls processEvent(event):

  private void processEvent(Object event) {
      try { // check the kind of event
          if (event instanceof WatcherSetEventPair) {
              // each watcher will process the event
              // get the WatcherSetEventPair
              WatcherSetEventPair pair = (WatcherSetEventPair) event;
              // iterate over all watchers that match the event and call each one
              for (Watcher watcher : pair.watchers) {
                  try { // invoke the client's callback
                      watcher.process(pair.event);
                  } catch (Throwable t) {
                      LOG.error("Error while calling watcher ", t);
                  }
              }
          } else {
              // ... code omitted ...
          }
      } // catch clause omitted in this excerpt
  }

  In the end the user-defined Watcher handler is called. This completes the processing of the whole Watcher event.
