title: libevent Source Reading Notes (1): How libevent Wraps epoll

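I recently started reading the source code of the libevent network library. Before starting, I skimmed a few blog posts by Zhang Liang (libevent源码深度剖析, "An In-Depth Analysis of the libevent Source Code": http://blog.csdn.net/sparkliang/article/details/4957667 ) to get an overall picture of the library, and then began reading the code itself.

Rather than working top-down, I am reading the libevent source from the parts outward. This assumes, of course, that I already have a rough understanding of libevent's overall architecture. Today's note covers how libevent wraps epoll.

libevent's epoll wrapper lives mainly in epoll.c.

First comes the epollop structure, which holds the epoll file descriptor and the array of epoll_event entries: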

struct epollop {
    struct epoll_event *events; // array of epoll_event entries
    int nevents;                // number of slots in the events array
    int epfd;                   // the epoll file descriptor
};

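Three static functions operate on epoll. Here event_base is the structure the whole libevent framework is built around, i.e. the reactor; every event we register through epoll ultimately ends up in the event_base reactor.

static void *epoll_init(struct event_base *);
static int epoll_dispatch(struct event_base *, struct timeval *);
static void epoll_dealloc(struct event_base *);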

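These functions are reached through function pointers because libevent wraps each I/O multiplexing mechanism behind one uniform interface; which backends are available is decided when libevent is built, according to what the system supports.

The uniform interface is struct eventop. Its operations are function pointers, stored alongside the backend's name and a few flags: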

const struct eventop epollops = {
    "epoll",
    epoll_init,
    epoll_nochangelist_add,
    epoll_nochangelist_del,
    epoll_dispatch,
    epoll_dealloc,
    1, /* need reinit */
    EV_FEATURE_ET|EV_FEATURE_O1,
    0
};
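
To make the function-pointer table idea concrete, here is a minimal sketch of how a table like this can be used to pick a backend. Everything in it (backend_ops, all_ops, pick_backend, the two fake backends) is hypothetical and much simpler than libevent's real eventop; it only illustrates the pattern of trying each table in order of preference and keeping the first one whose init succeeds.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for libevent's struct eventop. */
struct backend_ops {
    const char *name;
    void *(*init)(void);          /* returns backend state, or NULL if unusable */
    int (*dispatch)(void *state); /* waits for events once */
};

static int fake_state; /* stands in for a backend's private state */

static void *unsupported_init(void) { return NULL; }
static void *supported_init(void) { return &fake_state; }
static int noop_dispatch(void *state) { (void)state; return 0; }

static const struct backend_ops first_ops = {
    "first", unsupported_init, noop_dispatch
};
static const struct backend_ops second_ops = {
    "second", supported_init, noop_dispatch
};

/* Candidate backends in order of preference, NULL-terminated. */
static const struct backend_ops *const all_ops[] = {
    &first_ops, &second_ops, NULL
};

/* Return the first backend whose init() succeeds and store its state in
 * *state_out. Returns NULL (and stores NULL) if no backend is usable. */
static const struct backend_ops *pick_backend(void **state_out)
{
    assert(state_out != NULL);
    for (size_t i = 0; all_ops[i] != NULL; i++) {
        void *state = all_ops[i]->init();
        if (state != NULL) {
            *state_out = state;
            return all_ops[i];
        }
    }
    *state_out = NULL;
    return NULL;
}
```

With these tables, pick_backend skips "first" (its init returns NULL) and settles on "second"; the caller then only ever goes through ops->dispatch(state) and never needs to know which backend it got.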

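Let's go through the implementation of these functions one by one:

(1) epoll_init performs the epoll initialization: it creates an epoll file descriptor, sets the initial size of the event array, and finally performs the initialization for signal handling (covered separately later).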

static void *
epoll_init(struct event_base *base)
{
    int epfd;
    struct epollop *epollop;

    /* Initialize the kernel queue. (The size field is ignored since
     * 2.6.8.) */
    if ((epfd = epoll_create(32000)) == -1) {
        if (errno != ENOSYS)
            event_warn("epoll_create");
        return (NULL);
    }
    // Mark the descriptor close-on-exec: it is closed in a program started
    // via exec, but stays open in a forked child
    evutil_make_socket_closeonexec(epfd);

    if (!(epollop = mm_calloc(1, sizeof(struct epollop)))) {
        close(epfd);
        return (NULL);
    }

    epollop->epfd = epfd;

    /* Initialize fields */
    epollop->events = mm_calloc(INITIAL_NEVENT, sizeof(struct epoll_event));
    if (epollop->events == NULL) {
        mm_free(epollop);
        close(epfd);
        return (NULL);
    }
    epollop->nevents = INITIAL_NEVENT;
    // Switch to the changelist variant if requested
    if ((base->flags & EVENT_BASE_FLAG_EPOLL_USE_CHANGELIST) != 0 ||
        ((base->flags & EVENT_BASE_FLAG_IGNORE_ENV) == 0 &&
        evutil_getenv("EVENT_EPOLL_USE_CHANGELIST") != NULL))
        base->evsel = &epollops_changelist;
    // Signal-handling initialization
    evsig_init(base);

    return (epollop);
}
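
The close-on-exec step can be tried in isolation. The sketch below is not libevent's implementation of evutil_make_socket_closeonexec; make_closeonexec is a hypothetical helper that shows the usual POSIX way of getting the same effect with fcntl and FD_CLOEXEC.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set FD_CLOEXEC on fd while preserving any other
 * descriptor flags. Returns 0 on success, -1 on error. */
static int make_closeonexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    if (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1)
        return -1;
    return 0;
}
```

After a successful call, the descriptor is closed automatically in any program started with an exec-family call, so the epoll fd does not leak into unrelated programs; a child created by fork alone still inherits it.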

(2)epoll_nochangelist_add

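Adding an epoll event takes one of two paths, depending on whether the changelist is used. The changelist is more efficient, but we start with the case without it, epoll_nochangelist_add. This function fills in the change to apply and calls epoll_apply_one_change to carry it out.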

static int
epoll_nochangelist_add(struct event_base *base, evutil_socket_t fd,
    short old, short events, void *p)
{
    struct event_change ch;
    ch.fd = fd;
    ch.old_events = old;
    ch.read_change = ch.write_change = 0;
    // Work out which of the read/write events need to be added
    if (events & EV_WRITE)
        ch.write_change = EV_CHANGE_ADD |
            (events & EV_ET); // EV_ET requests edge-triggered behavior
    if (events & EV_READ)
        ch.read_change = EV_CHANGE_ADD |
            (events & EV_ET);
    // Without a changelist, call epoll_apply_one_change directly; underneath
    // it wraps the epoll_ctl system call
    return epoll_apply_one_change(base, base->evbase, &ch);
}

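The change itself is described by an event_change structure, which mainly holds a handle, the events that were set before the change (old_events), and the requested changes to the read event (read_change) and the write event (write_change):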

struct event_change {
    /** The fd or signal whose events are to be changed */
    evutil_socket_t fd;
    /* The events that were enabled on the fd before any of these changes
       were made. May include EV_READ or EV_WRITE. */
    short old_events;

    /* The changes that we want to make in reading and writing on this fd.
     * If this is a signal, then read_change has EV_CHANGE_SIGNAL set,
     * and write_change is unused. */
    ev_uint8_t read_change;
    ev_uint8_t write_change;
};

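The epoll_apply_one_change function mentioned above is really a wrapper around epoll_ctl. It guesses between ADD and MOD and retries with the other if the first guess fails; either way, the event is ultimately registered by calling epoll_ctl.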

static int
epoll_apply_one_change(struct event_base *base,
    struct epollop *epollop,
    const struct event_change *ch)
{
    struct epoll_event epev;
    int op, events = 0;

    if (1) {
        /* The logic here is a little tricky. If we had no events set
           on the fd before, we need to set op="ADD" and set
           events=the events we want to add. If we had any events set
           on the fd before, and we want any events to remain on the
           fd, we need to say op="MOD" and set events=the events we
           want to remain. But if we want to delete the last event,
           we say op="DEL" and set events=the remaining events. What
           fun!
        */

        /* TODO: Turn this into a switch or a table lookup. */

        if ((ch->read_change & EV_CHANGE_ADD) ||
            (ch->write_change & EV_CHANGE_ADD)) {
            /* If we are adding anything at all, we'll want to do
             * either an ADD or a MOD. */
            events = 0;
            op = EPOLL_CTL_ADD;
            // Read side
            if (ch->read_change & EV_CHANGE_ADD) {
                events |= EPOLLIN;
            } else if (ch->read_change & EV_CHANGE_DEL) {
                ;
            } else if (ch->old_events & EV_READ) {
                events |= EPOLLIN;
            }
            // Write side
            if (ch->write_change & EV_CHANGE_ADD) {
                events |= EPOLLOUT;
            } else if (ch->write_change & EV_CHANGE_DEL) {
                ;
            } else if (ch->old_events & EV_WRITE) {
                events |= EPOLLOUT;
            }
            // Edge-triggered?
            if ((ch->read_change|ch->write_change) & EV_ET)
                events |= EPOLLET;

            if (ch->old_events) {
                /* If MOD fails, we retry as an ADD, and if
                 * ADD fails we will retry as a MOD. So the
                 * only hard part here is to guess which one
                 * will work. As a heuristic, we'll try
                 * MOD first if we think there were old
                 * events and ADD if we think there were none.
                 *
                 * We can be wrong about the MOD if the file
                 * has in fact been closed and re-opened.
                 *
                 * We can be wrong about the ADD if the
                 * the fd has been re-created with a dup()
                 * of the same file that it was before.
                 */
                op = EPOLL_CTL_MOD;
            }
        } else if ((ch->read_change & EV_CHANGE_DEL) ||
            (ch->write_change & EV_CHANGE_DEL)) {
            /* If we're deleting anything, we'll want to do a MOD
             * or a DEL. */
            op = EPOLL_CTL_DEL;

            if (ch->read_change & EV_CHANGE_DEL) {
                if (ch->write_change & EV_CHANGE_DEL) {
                    events = EPOLLIN|EPOLLOUT;
                } else if (ch->old_events & EV_WRITE) {
                    events = EPOLLOUT;
                    op = EPOLL_CTL_MOD;
                } else {
                    events = EPOLLIN;
                }
            } else if (ch->write_change & EV_CHANGE_DEL) {
                if (ch->old_events & EV_READ) {
                    events = EPOLLIN;
                    op = EPOLL_CTL_MOD;
                } else {
                    events = EPOLLOUT;
                }
            }
        }

        if (!events)
            return 0;

        memset(&epev, 0, sizeof(epev));
        epev.data.fd = ch->fd;
        epev.events = events;
        if (epoll_ctl(epollop->epfd, op, ch->fd, &epev) == -1) {
            if (op == EPOLL_CTL_MOD && errno == ENOENT) {
                /* If a MOD operation fails with ENOENT, the
                 * fd was probably closed and re-opened. We
                 * should retry the operation as an ADD.
                 */
                if (epoll_ctl(epollop->epfd, EPOLL_CTL_ADD, ch->fd, &epev) == -1) {
                    event_warn("Epoll MOD(%d) on %d retried as ADD; that failed too",
                        (int)epev.events, ch->fd);
                    return -1;
                } else {
                    event_debug(("Epoll MOD(%d) on %d retried as ADD; succeeded.",
                        (int)epev.events,
                        ch->fd));
                }
            } else if (op == EPOLL_CTL_ADD && errno == EEXIST) {
                /* If an ADD operation fails with EEXIST,
                 * either the operation was redundant (as with a
                 * precautionary add), or we ran into a fun
                 * kernel bug where using dup*() to duplicate the
                 * same file into the same fd gives you the same epitem
                 * rather than a fresh one. For the second case,
                 * we must retry with MOD. */
                if (epoll_ctl(epollop->epfd, EPOLL_CTL_MOD, ch->fd, &epev) == -1) {
                    event_warn("Epoll ADD(%d) on %d retried as MOD; that failed too",
                        (int)epev.events, ch->fd);
                    return -1;
                } else {
                    event_debug(("Epoll ADD(%d) on %d retried as MOD; succeeded.",
                        (int)epev.events,
                        ch->fd));
                }
            } else if (op == EPOLL_CTL_DEL &&
                (errno == ENOENT || errno == EBADF ||
                errno == EPERM)) {
                /* If a delete fails with one of these errors,
                 * that's fine too: we closed the fd before we
                 * got around to calling epoll_dispatch. */
                event_debug(("Epoll DEL(%d) on fd %d gave %s: DEL was unnecessary.",
                    (int)epev.events,
                    ch->fd,
                    strerror(errno)));
            } else {
                event_warn("Epoll %s(%d) on fd %d failed. Old events were %d; read change was %d (%s); write change was %d (%s)",
                    epoll_op_to_string(op),
                    (int)epev.events,
                    ch->fd,
                    ch->old_events,
                    ch->read_change,
                    change_to_string(ch->read_change),
                    ch->write_change,
                    change_to_string(ch->write_change));
                return -1;
            }
        } else {
            event_debug(("Epoll %s(%d) on fd %d okay. [old events were %d; read change was %d; write change was %d]",
                epoll_op_to_string(op),
                (int)epev.events,
                (int)ch->fd,
                ch->old_events,
                ch->read_change,
                ch->write_change));
        }
    }
    return 0;
}
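
The ADD/MOD/DEL decision is easier to see once it is separated from the epoll_ctl call and its error handling. The sketch below is a simplified model, not libevent code: choose_op and all the constants are hypothetical stand-ins (the real values come from libevent and <sys/epoll.h>), and it only decides the operation, not the event mask.

```c
#include <assert.h>

/* Illustrative stand-ins only; not the real libevent/epoll values. */
enum { X_READ = 0x02, X_WRITE = 0x04 };          /* like EV_READ / EV_WRITE */
enum { CH_ADD = 0x01, CH_DEL = 0x02 };           /* like EV_CHANGE_ADD / _DEL */
enum ctl_op { OP_NONE, OP_ADD, OP_MOD, OP_DEL }; /* like EPOLL_CTL_* */

/* Decide which operation to try first, given the events believed to be
 * registered already and the requested read/write changes. */
static enum ctl_op choose_op(int old_events, int read_change, int write_change)
{
    if ((read_change | write_change) & CH_ADD) {
        /* Adding: guess MOD if the fd seems registered already, else ADD. */
        return old_events ? OP_MOD : OP_ADD;
    }
    if ((read_change | write_change) & CH_DEL) {
        /* Deleting: MOD if the other direction stays registered, else DEL. */
        int remaining = old_events & (X_READ | X_WRITE);
        if (read_change & CH_DEL)
            remaining &= ~X_READ;
        if (write_change & CH_DEL)
            remaining &= ~X_WRITE;
        return remaining ? OP_MOD : OP_DEL;
    }
    return OP_NONE; /* nothing requested */
}
```

For example, deleting the read event from an fd that was registered for both reading and writing gives OP_MOD (the write side remains), while deleting the read event from an fd registered only for reading gives OP_DEL.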

(3)epoll_nochangelist_del

Much like epoll_nochangelist_add, this removes the corresponding monitored event from the reactor.

static int
epoll_nochangelist_del(struct event_base *base, evutil_socket_t fd,
    short old, short events, void *p)
{
    struct event_change ch;
    ch.fd = fd;
    ch.old_events = old;
    ch.read_change = ch.write_change = 0;
    if (events & EV_WRITE)
        ch.write_change = EV_CHANGE_DEL;
    if (events & EV_READ)
        ch.read_change = EV_CHANGE_DEL;

    return epoll_apply_one_change(base, base->evbase, &ch);
}

(4)epoll_dispatch

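Next is epoll_dispatch, which is really a wrapper around epoll_wait. It calls epoll_wait on the events already registered with the reactor, passing the timeout, and puts every event that fired onto the reactor's active queue (the reactor then processes the events in that queue).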

static int
epoll_dispatch(struct event_base *base, struct timeval *tv)
{
    struct epollop *epollop = base->evbase;
    struct epoll_event *events = epollop->events;
    int i, res;
    long timeout = -1;

    if (tv != NULL) {
        // Convert to milliseconds; used below as the epoll_wait timeout
        timeout = evutil_tv_to_msec(tv);
        if (timeout < 0 || timeout > MAX_EPOLL_TIMEOUT_MSEC) {
            /* Linux kernels can wait forever if the timeout is
             * too big; see comment on MAX_EPOLL_TIMEOUT_MSEC. */
            timeout = MAX_EPOLL_TIMEOUT_MSEC;
        }
    }
    // Apply every change queued on the event_base (changelist mode only)
    epoll_apply_changes(base);
    // Empty the changelist (changelist mode only)
    event_changelist_remove_all(&base->changelist, base);

    EVBASE_RELEASE_LOCK(base, th_base_lock);

    res = epoll_wait(epollop->epfd, events, epollop->nevents, timeout);

    EVBASE_ACQUIRE_LOCK(base, th_base_lock);

    if (res == -1) {
        if (errno != EINTR) {
            event_warn("epoll_wait");
            return (-1);
        }

        return (0);
    }

    event_debug(("%s: epoll_wait reports %d", __func__, res));
    EVUTIL_ASSERT(res <= epollop->nevents);

    for (i = 0; i < res; i++) {
        int what = events[i].events;
        short ev = 0;
        // EPOLLHUP: the descriptor was hung up; EPOLLERR: an error occurred on it
        if (what & (EPOLLHUP|EPOLLERR)) {
            ev = EV_READ | EV_WRITE;
        } else {
            if (what & EPOLLIN)
                ev |= EV_READ;
            if (what & EPOLLOUT)
                ev |= EV_WRITE;
        }

        if (!ev)
            continue;
        // Put the event on the active queue
        evmap_io_active(base, events[i].data.fd, ev | EV_ET);
    }
    // Every slot in the events array was filled, so enlarge the array
    // so that more events fit next time
    if (res == epollop->nevents && epollop->nevents < MAX_NEVENT) {
        /* We used all of the event space this time. We should
           be ready for more events next time. */
        int new_nevents = epollop->nevents * 2;
        struct epoll_event *new_events;

        new_events = mm_realloc(epollop->events,
            new_nevents * sizeof(struct epoll_event));
        if (new_events) {
            epollop->events = new_events;
            epollop->nevents = new_nevents;
        }
    }

    return (0);
}
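
Two small pieces of epoll_dispatch can be modeled without any epoll calls: translating the ready mask into read/write events, and doubling the result buffer when a wait fills it. The sketch below uses hypothetical names and stand-in constants throughout (to_event_mask, grow_if_full, slot_buf, and an int array in place of struct epoll_event), so treat it as an illustration of the logic rather than libevent's code.

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-ins for EPOLLIN/EPOLLOUT/EPOLLERR/EPOLLHUP and
 * EV_READ/EV_WRITE; not the real values. */
enum { RDY_IN = 0x01, RDY_OUT = 0x04, RDY_ERR = 0x08, RDY_HUP = 0x10 };
enum { WANT_READ = 0x02, WANT_WRITE = 0x04 };

/* Hang-up or error wakes both readers and writers, so that whichever
 * callback runs next gets to see the failure from read() or write(). */
static int to_event_mask(int ready)
{
    int ev = 0;
    if (ready & (RDY_HUP | RDY_ERR))
        return WANT_READ | WANT_WRITE;
    if (ready & RDY_IN)
        ev |= WANT_READ;
    if (ready & RDY_OUT)
        ev |= WANT_WRITE;
    return ev;
}

/* Hypothetical result buffer; int stands in for struct epoll_event. */
struct slot_buf {
    int *slots;
    int capacity;
};

/* If the last wait filled the whole buffer and the cap is not reached,
 * double the capacity. On allocation failure the old buffer is kept.
 * Returns the resulting capacity. */
static int grow_if_full(struct slot_buf *buf, int used, int max_capacity)
{
    assert(buf->capacity > 0);
    if (used == buf->capacity && buf->capacity < max_capacity) {
        int new_capacity = buf->capacity * 2;
        int *new_slots = realloc(buf->slots,
            (size_t)new_capacity * sizeof(*new_slots));
        if (new_slots != NULL) {
            buf->slots = new_slots;
            buf->capacity = new_capacity;
        }
    }
    return buf->capacity;
}
```

A full buffer is only a hint that more descriptors may have been ready than could be reported; growing after the fact means the next wait can return them in one call instead of several.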

(5)epoll_dealloc

Nothing much to say about this one: it frees the epoll-related memory held by the reactor and closes the descriptor.

static void
epoll_dealloc(struct event_base *base)
{
    struct epollop *epollop = base->evbase;

    evsig_dealloc(base);
    if (epollop->events)
        mm_free(epollop->events);
    if (epollop->epfd >= 0)
        close(epollop->epfd);

    memset(epollop, 0, sizeof(struct epollop));
    mm_free(epollop);
}

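changelist mode

The epoll backend can also run in the more efficient changelist mode, which epoll_init selects when the EVENT_BASE_FLAG_EPOLL_USE_CHANGELIST flag or the EVENT_EPOLL_USE_CHANGELIST environment variable is set. Let's look at the changelist structure first. It records the changes made to the monitored file descriptors between two dispatch calls of the reactor, instead of issuing an epoll_ctl system call immediately for every change, which is why it is more efficient.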

struct event_changelist {
    struct event_change *changes; // start of the event_change array
    int n_changes;                // number of event_change entries in use
    int changes_size;             // allocated capacity of the array
};

With this structure in mind, the event_changelist_add and event_changelist_del functions are easier to follow.

int
event_changelist_add(struct event_base *base, evutil_socket_t fd, short old, short events,
    void *p)
{
    // Get the reactor's changelist
    struct event_changelist *changelist = &base->changelist;
    struct event_changelist_fdinfo *fdinfo = p;
    struct event_change *change;

    event_changelist_check(base);
    // Look up the event_change for fd in the changelist; return it if present,
    // otherwise construct one, append it to the changelist, and return it
    change = event_changelist_get_or_construct(changelist, fd, old, fdinfo);
    if (!change)
        return -1;

    /* An add replaces any previous delete, but doesn't result in a no-op,
     * since the delete might fail (because the fd had been closed since
     * the last add, for instance). */
    // Note that this is a replacement: if a delete was pending on this fd,
    // it is simply overwritten with an add
    if (events & (EV_READ|EV_SIGNAL)) {
        change->read_change = EV_CHANGE_ADD |
            (events & (EV_ET|EV_PERSIST|EV_SIGNAL));
    }
    if (events & EV_WRITE) {
        change->write_change = EV_CHANGE_ADD |
            (events & (EV_ET|EV_PERSIST|EV_SIGNAL));
    }

    event_changelist_check(base);
    return (0);
}

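The function worth noting here is event_changelist_get_or_construct. Looking up the event_change for a given fd does not scan the event_changelist, because the event_changelist_fdinfo structure stores that fd's index in the changelist plus one (a value of 0 means the changelist holds no event_change for this fd).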

static struct event_change *
event_changelist_get_or_construct(struct event_changelist *changelist,
    evutil_socket_t fd,
    short old_events,
    struct event_changelist_fdinfo *fdinfo)
{
    struct event_change *change;
    // Not present yet: append a new entry
    if (fdinfo->idxplus1 == 0) {
        int idx;
        EVUTIL_ASSERT(changelist->n_changes <= changelist->changes_size);
        // Out of capacity: grow the array
        if (changelist->n_changes == changelist->changes_size) {
            if (event_changelist_grow(changelist) < 0)
                return NULL;
        }

        idx = changelist->n_changes++;
        change = &changelist->changes[idx];
        fdinfo->idxplus1 = idx + 1; // idxplus1 is the index in the list plus one

        memset(change, 0, sizeof(struct event_change));
        change->fd = fd;
        change->old_events = old_events;
    } else {
        // Already present: return a pointer to the existing change
        change = &changelist->changes[fdinfo->idxplus1 - 1];
        EVUTIL_ASSERT(change->fd == fd);
    }
    return change;
}

Similarly, event_changelist_del is easy to follow. Unlike event_changelist_add, a del can cancel out an earlier add on the same fd:

int
event_changelist_del(struct event_base *base, evutil_socket_t fd, short old, short events,
    void *p)
{
    struct event_changelist *changelist = &base->changelist;
    struct event_changelist_fdinfo *fdinfo = p;
    struct event_change *change;

    event_changelist_check(base);
    change = event_changelist_get_or_construct(changelist, fd, old, fdinfo);
    event_changelist_check(base);
    if (!change)
        return -1;

    /* A delete removes any previous add, rather than replacing it:
       on those platforms where "add, delete, dispatch" is not the same
       as "no-op, dispatch", we want the no-op behavior.

       As well as checking the current operation we should also check
       the original set of events to make sure we're not ignoring
       the case where the add operation is present on an event that
       was already set.

       If we have a no-op item, we could remove it from the list
       entirely, but really there's not much point: skipping the no-op
       change when we do the dispatch later is far cheaper than rejuggling
       the array now.

       As this stands, it also lets through deletions of events that are
       not currently set.
     */

    // If an add is pending on an fd that was not registered before, the
    // delete cancels it and the change becomes a no-op
    if (events & (EV_READ|EV_SIGNAL)) {
        if (!(change->old_events & (EV_READ | EV_SIGNAL)) &&
            (change->read_change & EV_CHANGE_ADD))
            change->read_change = 0;
        else
            change->read_change = EV_CHANGE_DEL;
    }
    if (events & EV_WRITE) {
        if (!(change->old_events & EV_WRITE) &&
            (change->write_change & EV_CHANGE_ADD))
            change->write_change = 0;
        else
            change->write_change = EV_CHANGE_DEL;
    }

    event_changelist_check(base);
    return (0);
}
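
The idxplus1 lookup and the add/del cancellation can be played through with a toy model. The sketch below is a deliberate simplification under stated assumptions: it tracks only the read side, uses a fixed-capacity array instead of a growable one, and every name in it (change, fdinfo, changelist, get_or_construct, queue_read_add, queue_read_del, CH_ADD, CH_DEL, MAX_CHANGES) is hypothetical rather than libevent's own.

```c
#include <assert.h>
#include <string.h>

enum { CH_ADD = 0x01, CH_DEL = 0x02 }; /* stand-ins for EV_CHANGE_ADD / _DEL */
#define MAX_CHANGES 8                  /* fixed capacity keeps the sketch short */

struct change {
    int fd;
    int old_read;              /* was the read event registered before? */
    unsigned char read_change; /* 0 (no-op), CH_ADD or CH_DEL */
};

struct fdinfo {
    int idxplus1; /* index into changes[] plus one; 0 means "no entry yet" */
};

struct changelist {
    struct change changes[MAX_CHANGES];
    int n_changes;
};

/* Find the pending change for fd in O(1) through info, or append one. */
static struct change *get_or_construct(struct changelist *cl, int fd,
    int old_read, struct fdinfo *info)
{
    if (info->idxplus1 == 0) {
        if (cl->n_changes == MAX_CHANGES)
            return NULL; /* the real code would grow the array here */
        int idx = cl->n_changes++;
        struct change *c = &cl->changes[idx];
        memset(c, 0, sizeof(*c));
        c->fd = fd;
        c->old_read = old_read;
        info->idxplus1 = idx + 1;
        return c;
    }
    struct change *c = &cl->changes[info->idxplus1 - 1];
    assert(c->fd == fd);
    return c;
}

/* An add always overwrites whatever was pending. */
static int queue_read_add(struct changelist *cl, int fd, int old_read,
    struct fdinfo *info)
{
    struct change *c = get_or_construct(cl, fd, old_read, info);
    if (c == NULL)
        return -1;
    c->read_change = CH_ADD;
    return 0;
}

/* A delete cancels a pending add on an fd that was not registered before;
 * otherwise it records a real delete. */
static int queue_read_del(struct changelist *cl, int fd, int old_read,
    struct fdinfo *info)
{
    struct change *c = get_or_construct(cl, fd, old_read, info);
    if (c == NULL)
        return -1;
    if (!c->old_read && (c->read_change & CH_ADD))
        c->read_change = 0;
    else
        c->read_change = CH_DEL;
    return 0;
}
```

Queuing an add and then a del for a previously unregistered fd leaves a single entry whose read_change is 0, so the next dispatch has nothing to pass to epoll_ctl for that fd. Queuing a del and then an add for an fd that was already registered leaves a pending add, matching the "an add replaces any previous delete" comment in the listing above.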
