A Source Code Analysis of Python's ThreadPoolExecutor

Let's start with an example:

import time
from concurrent.futures import ThreadPoolExecutor

def foo():
    print('enter at {} ...'.format(time.strftime('%X')))
    time.sleep(5)
    print('exit  at {} ...'.format(time.strftime('%X')))

executor = ThreadPoolExecutor()
executor.submit(foo)
executor.shutdown()

Output:

enter at 16:20:31 ...
exit  at 16:20:36 ...

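By default, shutdown(wait=True) blocks the calling thread until the worker threads have finished. Even shutdown(wait=False) only closes the pool without blocking: workers that are already running a task are not stopped, and they keep running until the task completes. I also tried calling t.join(0) on a newly started worker thread inside the source to stop it immediately, and that did not work either. This is expected, because join() never stops a thread; its timeout only limits how long the caller waits. The more interesting fact is that the workers are daemon threads, which are normally abandoned when the interpreter exits, yet their tasks still run to completion. So what guarantees that the workers in a pool finish their running tasks when the pool is shut down and the program exits?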
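The behavior of shutdown(wait=False) can be observed without reading any source. The following is a minimal sketch; slow_task and shutdown_without_waiting are names made up for this illustration (they are not part of concurrent.futures), and the half-second delay is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_task(delay):
    """Hypothetical task: sleep for `delay` seconds, then return a marker."""
    time.sleep(delay)
    return 'done'


def shutdown_without_waiting(delay=0.5):
    """Submit one slow task, shut the pool down without waiting, then collect the result."""
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(slow_task, delay)

    start = time.monotonic()
    executor.shutdown(wait=False)        # does not wait for the running task
    elapsed = time.monotonic() - start

    still_pending = not future.done()    # the task is still running at this point
    result = future.result()             # blocks until the task really finishes
    return elapsed, still_pending, result


if __name__ == '__main__':
    print(shutdown_without_waiting())
```

The call to shutdown(wait=False) should return well before the delay has elapsed, with the future still pending, while future.result() still delivers the task's return value afterwards.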

Here is the ThreadPoolExecutor source (the excerpt appears to come from Python 3.6; later versions differ in some details):

class ThreadPoolExecutor(_base.Executor):
    def __init__(self, max_workers=None, thread_name_prefix=''):
        """Initializes a new ThreadPoolExecutor instance.

        Args:
            max_workers: The maximum number of threads that can be used to
                execute the given calls.
            thread_name_prefix: An optional name prefix to give our threads.
        """
        if max_workers is None:
            # Use this number because ThreadPoolExecutor is often
            # used to overlap I/O instead of CPU work.
            max_workers = (os.cpu_count() or 1) * 5
        if max_workers <= 0:
            raise ValueError("max_workers must be greater than 0")

        self._max_workers = max_workers
        self._work_queue = queue.Queue()
        self._threads = set()
        self._shutdown = False
        self._shutdown_lock = threading.Lock()
        self._thread_name_prefix = thread_name_prefix

    def submit(self, fn, *args, **kwargs):
        with self._shutdown_lock:
            if self._shutdown:
                raise RuntimeError('cannot schedule new futures after shutdown')

            f = _base.Future()
            # Wrap the target function fn in a _WorkItem; calling w.run()
            # calls fn(*args, **kwargs) and stores the outcome in future f
            w = _WorkItem(f, fn, args, kwargs)

            # Put the work item on the queue
            self._work_queue.put(w)
            # If the pool is below max_workers, start one more thread that keeps
            # taking work items from the queue and calling their run()
            self._adjust_thread_count()
            return f
    submit.__doc__ = _base.Executor.submit.__doc__

    def _adjust_thread_count(self):
        # This callback runs when the executor object is garbage collected,
        # for example after `del executor` drops the last reference
        def weakref_cb(_, q=self._work_queue):
            q.put(None)

        num_threads = len(self._threads)
        if num_threads < self._max_workers:
            thread_name = '%s_%d' % (self._thread_name_prefix or self,
                                     num_threads)
            # _worker is the target function of the new thread
            t = threading.Thread(name=thread_name, target=_worker,
                                 args=(weakref.ref(self, weakref_cb),
                                       self._work_queue))
            t.daemon = True
            t.start()
            self._threads.add(t)
            # This step matters: registering t here lets _python_exit join it
            # at interpreter exit, so the daemon thread can finish its task.
            # See the _python_exit function
            _threads_queues[t] = self._work_queue

    def shutdown(self, wait=True):
        with self._shutdown_lock:
            self._shutdown = True
            self._work_queue.put(None)
        if wait:
            for t in self._threads:
                t.join()
    shutdown.__doc__ = _base.Executor.shutdown.__doc__

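submit(func) does two things:

Wraps func in a work item and puts it on the queue.
Calls _adjust_thread_count(), which starts a new worker thread if the pool has fewer than max_workers threads. Each worker keeps taking work items from the queue and calling work_item.run(), which calls func().

_adjust_thread_count() does two things:

Starts a new thread that runs the _worker function, whose job is to keep taking work items from the queue and calling work_item.run(), which calls func().
Records the new thread and its queue in _threads_queues, so that _python_exit can join the thread at interpreter exit and its running task is not cut short.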
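One visible consequence of _adjust_thread_count() is that the pool never grows beyond max_workers, no matter how many tasks are submitted. Here is a small sketch that checks this through the public API only; record_thread_name, worker_names, and the 'demo' prefix are hypothetical names chosen for this example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def record_thread_name(delay):
    """Hypothetical task: report which pool thread ran it."""
    time.sleep(delay)
    return threading.current_thread().name


def worker_names(num_tasks=6, max_workers=2):
    """Submit more tasks than workers and collect the name of the thread that ran each one."""
    with ThreadPoolExecutor(max_workers=max_workers,
                            thread_name_prefix='demo') as executor:
        futures = [executor.submit(record_thread_name, 0.05)
                   for _ in range(num_tasks)]
        return [f.result() for f in futures]


if __name__ == '__main__':
    names = worker_names()
    print(sorted(set(names)))
```

However the tasks happen to be distributed, the set of distinct thread names should contain at most max_workers entries, because extra tasks wait on the queue until an existing worker picks them up.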

Now the source of the _worker function:

def _worker(executor_reference, work_queue):
    try:
        while True:
            # Keep taking work items from the queue
            work_item = work_queue.get(block=True)
            if work_item is not None:
                # Calls func()
                work_item.run()
                # Delete references to object. See issue16284
                del work_item
                continue
            # Dereference the weak reference to get the executor (or None)
            executor = executor_reference()
            # Exit if:
            #   - The interpreter is shutting down OR
            #   - The executor that owns the worker has been collected OR
            #   - The executor that owns the worker has been shutdown.
            #
            # shutdown() sets executor._shutdown to True and puts a None on
            # the queue. Once work_item.run() returns, the worker loops and
            # takes the next item; when that item is the None, the worker
            # reaches this check, sees that executor._shutdown is True, puts
            # another None on the queue, and returns. The None from shutdown()
            # ends one worker; the None put here ends one more. This chain
            # reaction ends every worker after it has finished running func.
            if _shutdown or executor is None or executor._shutdown:
                # Notice other workers
                work_queue.put(None)
                return
            del executor
    except BaseException:
        _base.LOGGER.critical('Exception in worker', exc_info=True)

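As you can see, _worker runs in the new thread, repeatedly takes work items from the queue, and calls their run() method. When it takes a None and finds that the pool has been shut down, it puts another None on the queue to tell the next worker to exit. Because queue.Queue is FIFO, the None put by shutdown() sits behind any work items that were already submitted, so those still run first.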
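The chain reaction is easy to reproduce outside the standard library. The sketch below is not the library's code: sentinel_worker and run_with_single_sentinel are hypothetical names, and the tasks are trivial lambdas. It only illustrates that a single None, passed on by each exiting worker, is enough to end every worker after all queued tasks have run.

```python
import queue
import threading


def sentinel_worker(work_queue, results):
    """Run callables from the queue until a None sentinel arrives."""
    while True:
        item = work_queue.get()
        if item is None:
            work_queue.put(None)   # pass the sentinel on to the next worker
            return
        results.append(item())


def run_with_single_sentinel(num_workers=3, num_tasks=5):
    """Start several workers, queue some tasks, then enqueue exactly one None."""
    work_queue = queue.Queue()
    results = []
    threads = [threading.Thread(target=sentinel_worker,
                                args=(work_queue, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for i in range(num_tasks):
        work_queue.put(lambda i=i: i * i)
    work_queue.put(None)           # one sentinel for all workers
    for t in threads:
        t.join(timeout=5)
    return sorted(results), [t.is_alive() for t in threads]


if __name__ == '__main__':
    print(run_with_single_sentinel())
```

Every task result should be present and no worker should remain alive, even though only one sentinel was ever enqueued by the caller. One None is left on the queue at the end, just as in the real pool.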

Next, let's see how the assignment _threads_queues[t] = self._work_queue in _adjust_thread_count() makes sure that a running worker thread is allowed to finish.

import atexit

_threads_queues = weakref.WeakKeyDictionary()
_shutdown = False

def _python_exit():
    global _shutdown
    _shutdown = True
    items = list(_threads_queues.items())
    for t, q in items:
        q.put(None)
    # Join every thread registered in _threads_queues, blocking until it finishes
    for t, q in items:
        t.join()

atexit.register(_python_exit)

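The atexit module registers a function to be called when the interpreter exits normally. Once the main thread's own logic has finished, the registered _python_exit function runs, so t.join() is called and blocks until the worker is done. That raises a question: if this happens after MainThread has finished, which thread runs t.join()? In the run below, threading.current_thread() reports a _DummyThread. Note that its thread ident is the same as MainThread's (12688), so this is still the main OS thread. By the time the atexit handlers run in this Python version, threading has already dropped its record of the main thread, and current_thread() creates a placeholder _DummyThread object for a caller it does not know.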

import atexit
import threading
import weakref
import time

threads_queues = weakref.WeakKeyDictionary()

def foo():
    print('enter at {} ...'.format(time.strftime('%X')))
    time.sleep(5)
    print('exit  at {} ...'.format(time.strftime('%X')))

def _python_exit():
    items = list(threads_queues.items())
    print('current thread in _python_exit --> ', threading.current_thread())
    for t, _ in items:
        t.join()

atexit.register(_python_exit)

if __name__ == '__main__':
    t = threading.Thread(target=foo)
    # Daemon thread, like the pool's workers (setDaemon() is deprecated)
    t.daemon = True
    t.start()
    # Register the thread so that _python_exit can find and join it
    threads_queues[t] = foo
    print(time.strftime('%X'))
    t.join(timeout=2)
    print(time.strftime('%X'))
    t.join(timeout=2)
    print(time.strftime('%X'))
    print('current thread in main -->', threading.current_thread())
    print(threading.current_thread(), 'end')

Output:

enter at 17:13:44 ...
17:13:44
17:13:46
17:13:48
current thread in main --> <_MainThread(MainThread, started 12688)>
<_MainThread(MainThread, started 12688)> end
current thread in _python_exit -->  <_DummyThread(Dummy-2, started daemon 12688)>
exit  at 17:13:49 ...

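In this example, thread t starts and foo blocks for 5 seconds. MainThread calls t.join(timeout=2) twice and waits 2 seconds each time, 4 seconds in total. After the second t.join(timeout=2) returns, t is still running, because a join timeout only ends the wait and does not affect the thread. The main thread's code then finishes and _python_exit is called; it calls t.join() again (reported as running in a _DummyThread) and keeps waiting until t prints exit at 17:13:49 ... and finishes.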
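Whether the atexit handler really runs on the same OS thread as the main code can be checked by comparing threading.get_ident() values rather than thread objects. The following sketch runs a tiny child script in a separate interpreter, since atexit handlers only fire at interpreter exit; CHILD_SCRIPT and idents_seen_by_child are hypothetical names, and the sketch assumes Python 3.7 or later for subprocess.run(capture_output=True).

```python
import subprocess
import sys
import textwrap

# Script executed in a child interpreter: print the thread ident once from
# the main code and once from an atexit handler.
CHILD_SCRIPT = textwrap.dedent('''
    import atexit
    import threading

    def report():
        print('atexit', threading.get_ident())

    atexit.register(report)
    print('main', threading.get_ident())
''')


def idents_seen_by_child():
    """Run the child script and return {'main': ident, 'atexit': ident} as strings."""
    completed = subprocess.run([sys.executable, '-c', CHILD_SCRIPT],
                               capture_output=True, text=True, check=True)
    return dict(line.split() for line in completed.stdout.splitlines())


if __name__ == '__main__':
    print(idents_seen_by_child())
```

If the two idents match, the handler ran on the main thread, whatever kind of object threading.current_thread() happens to return at that point.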

Summary:

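join() can be called on a thread any number of times; the waits simply add up. A join with a timeout, such as t.join(n), never stops the thread; it only stops waiting. Because _python_exit is registered with atexit and always calls t.join() at the very end, each worker thread t runs to completion even though it is a daemon thread, instead of being cut off midway when the interpreter exits.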
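The point about join() is worth verifying in isolation. Here is a minimal sketch, with join_twice_with_timeout as a hypothetical helper and arbitrary timings, showing that the thread is still alive after each timed join and only ends when its own work is done.

```python
import threading
import time


def join_twice_with_timeout(work_seconds=0.5, timeout=0.1):
    """Join a busy thread twice with a short timeout, then once without."""
    finished = threading.Event()

    def task():
        time.sleep(work_seconds)
        finished.set()

    t = threading.Thread(target=task)
    t.start()

    t.join(timeout=timeout)              # gives up waiting; the thread keeps running
    alive_after_first = t.is_alive()
    t.join(timeout=timeout)              # a second wait, same effect
    alive_after_second = t.is_alive()

    t.join()                             # now wait until the task really ends
    return alive_after_first, alive_after_second, finished.is_set(), t.is_alive()


if __name__ == '__main__':
    print(join_twice_with_timeout())
```

The first two flags should be True (the thread outlives both timed joins), the task should have completed, and the thread should no longer be alive after the final untimed join.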
