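To improve performance and raise throughput, many of our company's systems have recently been undergoing asynchronous refactoring. Such refactoring inevitably surfaces more multithreading problems than before. Last week we ran into a deadlock while making a ZooKeeper client asynchronous, which this post describes.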

ZooKeeper generally provides both a synchronous and an asynchronous way to call the same API.

The synchronous interface is easy to understand. It is used as follows:

ZooKeeper zk = new ZooKeeper(...);
List<String> children = zk.getChildren( path, true );

The asynchronous interface is a little more involved. It is used as follows:

ZooKeeper zk = new ZooKeeper(...);
zk.getChildren( path, true, new AsyncCallback.Children2Callback() {
    @Override
    public void processResult( int rc, String path, Object ctx, List<String> children, Stat stat ) {
        System.out.println( "Receive the response." );
    }
}, null );

As we can see, an asynchronous call requires registering a Children2Callback and implementing its callback method, processResult.

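The problem we hit last week was this: an application registered a watch on changes to the child list of a znode, and its logic was to fetch the child list again whenever it received a child-list change notification (EventType.NodeChildrenChanged) from the ZooKeeper server. It previously used the synchronous interface, and the whole application ran normally. After this asynchronous refactoring, however, something odd happened: the application still received the change notification but could no longer fetch the child list again.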

Below, I first reproduce the application's earlier logic, which used the synchronous interface, as a simple demo:

package book.chapter05;

import java.io.IOException;
import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;

/**
 * ZooKeeper API: get the child node list using the synchronous (sync) interface.
 * @author <a href="mailto:nileader@gmail.com">银时</a>
 */
public class ZooKeeper_GetChildren_API_Sync_Usage implements Watcher {

    // Released once the session reaches SyncConnected.
    private CountDownLatch connectedSemaphore = new CountDownLatch( 1 );
    // Released once the children-changed notification has been handled.
    private static CountDownLatch _semaphore = new CountDownLatch( 1 );
    private ZooKeeper zk;

    ZooKeeper createSession( String connectString, int sessionTimeout, Watcher watcher ) throws IOException {
        ZooKeeper zookeeper = new ZooKeeper( connectString, sessionTimeout, watcher );
        try {
            connectedSemaphore.await();
        } catch ( InterruptedException e ) {
            // Preserve the interrupt status instead of silently dropping it.
            Thread.currentThread().interrupt();
        }
        return zookeeper;
    }

    /** create path by sync */
    void createPath_sync( String path, String data, CreateMode createMode ) throws IOException, KeeperException, InterruptedException {
        if ( zk == null ) {
            zk = this.createSession( "domain1.book.zookeeper:2181", 5000, this );
        }
        zk.create( path, data.getBytes(), Ids.OPEN_ACL_UNSAFE, createMode );
    }

    /** Get children znodes of path and set watches */
    List<String> getChildren( String path ) throws KeeperException, InterruptedException, IOException {
        System.out.println( "===Start to get children znodes.===" );
        if ( zk == null ) {
            zk = this.createSession( "domain1.book.zookeeper:2181", 5000, this );
        }
        // Passing true registers the default watcher (this object) on the path.
        return zk.getChildren( path, true );
    }

    public static void main( String[] args ) throws IOException, InterruptedException {
        ZooKeeper_GetChildren_API_Sync_Usage sample = new ZooKeeper_GetChildren_API_Sync_Usage();
        String path = "/get_children_test";
        try {
            sample.createPath_sync( path, "", CreateMode.PERSISTENT );
            sample.createPath_sync( path + "/c1", "", CreateMode.PERSISTENT );
            List<String> childrenList = sample.getChildren( path );
            System.out.println( childrenList );
            //Add a new child znode to test watches event notify.
            sample.createPath_sync( path + "/c2", "", CreateMode.PERSISTENT );
            _semaphore.await();
        } catch ( KeeperException e ) {
            System.err.println( "error: " + e.getMessage() );
            e.printStackTrace();
        }
    }

    /**
     * Process when receive watched event
     */
    @Override
    public void process( WatchedEvent event ) {
        System.out.println( "Receive watched event:" + event );
        if ( KeeperState.SyncConnected == event.getState() ) {
            if ( EventType.None == event.getType() && null == event.getPath() ) {
                connectedSemaphore.countDown();
            } else if ( event.getType() == EventType.NodeChildrenChanged ) {
                //children list changed
                try {
                    System.out.println( this.getChildren( event.getPath() ) );
                    _semaphore.countDown();
                } catch ( Exception e ) {
                    e.printStackTrace();
                }
            }
        }
    }
}

The output is as follows:

Receive watched event:WatchedEvent state:SyncConnected type:None path:null
===Start to get children znodes.===
[c1]
Receive watched event:WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/get_children_test
===Start to get children znodes.===
[c1, c2]

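In the program above, we first create a parent node, /get_children_test, and a child node, /get_children_test/c1. We then call the synchronous getChildren interface to fetch all children of /get_children_test, registering a watch in the same call. Next, we create another child, /get_children_test/c2, under /get_children_test. Because a watch was registered earlier, the ZooKeeper server sends the client a "children changed" notification as soon as the child is created, and the client can then call getChildren again to fetch the new child list.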

This example of course runs fine. Now let's make it asynchronous, as follows:

package book.chapter05;

import java.io.IOException;
import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.EventType;
import org.apache.zookeeper.Watcher.Event.KeeperState;
import org.apache.zookeeper.ZooDefs.Ids;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

/**
 * ZooKeeper API: get the child node list using the asynchronous (ASync) interface.
 * This version deliberately reproduces the hang described in this post.
 * @author <a href="mailto:nileader@gmail.com">银时</a>
 */
public class ZooKeeper_GetChildren_API_ASync_Usage_Deadlock implements Watcher {

    // Released once the session reaches SyncConnected.
    private CountDownLatch connectedSemaphore = new CountDownLatch( 1 );
    // Released once the children-changed notification has been handled.
    private static CountDownLatch _semaphore = new CountDownLatch( 1 );
    private ZooKeeper zk;

    ZooKeeper createSession( String connectString, int sessionTimeout, Watcher watcher ) throws IOException {
        ZooKeeper zookeeper = new ZooKeeper( connectString, sessionTimeout, watcher );
        try {
            connectedSemaphore.await();
        } catch ( InterruptedException e ) {
            // Preserve the interrupt status instead of silently dropping it.
            Thread.currentThread().interrupt();
        }
        return zookeeper;
    }

    /** create path by sync */
    void createPath_sync( String path, String data, CreateMode createMode ) throws IOException, KeeperException, InterruptedException {
        if ( zk == null ) {
            zk = this.createSession( "domain1.book.zookeeper:2181", 5000, this );
        }
        zk.create( path, data.getBytes(), Ids.OPEN_ACL_UNSAFE, createMode );
    }

    /** Get children znodes of path and set watches */
    void getChildren( String path ) throws KeeperException, InterruptedException, IOException {
        System.out.println( "===Start to get children znodes.===" );
        if ( zk == null ) {
            zk = this.createSession( "domain1.book.zookeeper:2181", 5000, this );
        }

        final CountDownLatch _semaphore_get_children = new CountDownLatch( 1 );

        zk.getChildren( path, true, new AsyncCallback.Children2Callback() {
            @Override
            public void processResult( int rc, String path, Object ctx, List<String> children, Stat stat ) {
                System.out.println( "Get Children znode result: [response code: " + rc
                        + ", param path: " + path
                        + ", ctx: " + ctx
                        + ", children list: " + children
                        + ", stat: " + stat );
                _semaphore_get_children.countDown();
            }
        }, null );
        // Blocks the calling thread until the callback has run.
        // When the caller is process(), this is where the program hangs.
        _semaphore_get_children.await();
    }

    public static void main( String[] args ) throws IOException, InterruptedException {
        ZooKeeper_GetChildren_API_ASync_Usage_Deadlock sample = new ZooKeeper_GetChildren_API_ASync_Usage_Deadlock();
        String path = "/get_children_test";
        try {
            sample.createPath_sync( path, "", CreateMode.PERSISTENT );
            sample.createPath_sync( path + "/c1", "", CreateMode.PERSISTENT );
            //Get children and register watches.
            sample.getChildren( path );
            //Add a new child znode to test watches event notify.
            sample.createPath_sync( path + "/c2", "", CreateMode.PERSISTENT );
            _semaphore.await();
        } catch ( KeeperException e ) {
            System.err.println( "error: " + e.getMessage() );
            e.printStackTrace();
        }
    }

    /**
     * Process when receive watched event
     */
    @Override
    public void process( WatchedEvent event ) {
        System.out.println( "Receive watched event:" + event );
        if ( KeeperState.SyncConnected == event.getState() ) {
            if ( EventType.None == event.getType() && null == event.getPath() ) {
                connectedSemaphore.countDown();
            } else if ( event.getType() == EventType.NodeChildrenChanged ) {
                //children list changed
                try {
                    this.getChildren( event.getPath() );
                    _semaphore.countDown();
                } catch ( Exception e ) {
                    e.printStackTrace();
                }
            }
        }
    }
}

The output is as follows:

Receive watched event:WatchedEvent state:SyncConnected type:None path:null
===Start to get children znodes.===
Get Children znode result: [response code: 0, param path: /get_children_test, ctx: null, children list: [c1], stat: 555,555,1373931727380,1373931727380,0,1,0,0,0,1,556
Receive watched event:WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/get_children_test
===Start to get children znodes.===

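In the demo above, the execution logic is essentially the same as in the synchronous version; the only difference is that fetching the child list is now asynchronous. With that one change, the problem appears: the whole program hangs on the second attempt to fetch the child list. The application team confirmed that the synchronous version had never shown this behavior, so we started investigating where the asynchronous version blocks.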

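The key point here is that a ZooKeeper client has to handle two kinds of event notifications from the server: watch event notifications, and responses to asynchronous API calls. Notably, in the ZooKeeper client's threading model both kinds of events are handled by the same thread, one after another. For details, see the core event-processing class, org.apache.zookeeper.ClientCnxn.EventThread. In the demo, process() runs on that thread and blocks on _semaphore_get_children.await(), so the processResult callback that would release the latch is queued behind it and never runs.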
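
The serial dispatch described above can be modeled without ZooKeeper at all. The following is only a minimal sketch using the JDK: a single-thread executor stands in for the client's event thread, and the class and method names (EventThreadDeadlockSketch, blockingHandlerCompletes, nonBlockingHandlerCompletes) are hypothetical and not part of the ZooKeeper API. The first method mimics a handler that waits on the event thread for its own callback; the second mimics a handler that only queues the request and leaves the follow-up work to the callback. Unlike the demo, the wait is bounded by a timeout so that the sketch terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/**
 * JDK-only model of a client whose watch notifications and async callbacks
 * share one serial dispatch thread. All names here are hypothetical.
 */
final class EventThreadDeadlockSketch {

    /** The handler waits on the event thread for a callback queued behind it. */
    static boolean blockingHandlerCompletes(long timeoutMillis) throws InterruptedException {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch handlerDone = new CountDownLatch(1);
        try {
            eventThread.execute(() -> {
                CountDownLatch callbackDone = new CountDownLatch(1);
                // The "async response" is queued behind the task running right now.
                eventThread.execute(callbackDone::countDown);
                try {
                    // Bounded wait so the sketch terminates; the demo waits forever.
                    if (callbackDone.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                        handlerDone.countDown();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            return handlerDone.await(timeoutMillis * 2, TimeUnit.MILLISECONDS);
        } finally {
            eventThread.shutdownNow();
        }
    }

    /** The handler only queues the request; the callback does the follow-up work. */
    static boolean nonBlockingHandlerCompletes(long timeoutMillis) throws InterruptedException {
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch handlerDone = new CountDownLatch(1);
        try {
            // The outer task returns immediately, freeing the thread for the callback.
            eventThread.execute(() -> eventThread.execute(handlerDone::countDown));
            return handlerDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            eventThread.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("blocking handler completed: " + blockingHandlerCompletes(200));
        System.out.println("non-blocking handler completed: " + nonBlockingHandlerCompletes(200));
    }
}
```

In this model the blocking variant reports false and the non-blocking variant reports true. Mapped back to the demo, this suggests that one way to avoid the hang is to not wait on a latch inside process(), and instead to do the follow-up work (such as _semaphore.countDown()) inside processResult.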
