Python multithreading with threading and ThreadPoolExecutor.map

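Background:

(Running the same function in multiple threads.) An application pulls several hundred thousand records from a database and has to perform an operation on each one. Processing the records one at a time is too slow, so the data is split into segments and each segment is handled in its own thread. Two approaches follow.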
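The segmentation step can be sketched on its own before any threads are involved. This is a minimal illustration, assuming the records are already loaded into a list; the helper name `chunk` is hypothetical and does not appear in the original code.

```python
def chunk(items, size):
    """Split items into consecutive slices of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Step through the list in strides of `size`; the last slice may be shorter.
    return [items[i:i + size] for i in range(0, len(items), size)]


rows = list(range(1, 11))
print(chunk(rows, 2))  # [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
print(chunk(rows, 3))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]
```

Slicing past the end of a list is safe in Python, so no special case is needed for a final, shorter segment.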

Method 1: using the threading module

Code:

 # -*- coding: utf-8 -*-
 import math
 import random
 import time
 from threading import Thread

 # Shared by all worker threads; list.append is thread-safe in CPython
 _result_list = []


 def split_df():
     # Threads started so far
     thread_list = []
     # Data to process
     _l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
     # Number of items handled by each thread
     split_count = 2
     # Number of threads needed
     times = math.ceil(len(_l) / split_count)
     count = 0
     for item in range(times):
         _list = _l[count: count + split_count]
         # One thread per segment; item is the segment index
         thread = Thread(target=work, args=(item, _list,))
         thread_list.append(thread)
         # Run the task in a child thread
         thread.start()
         count += split_count
     # Wait for every child thread to finish before the main thread continues
     for _item in thread_list:
         _item.join()


 def work(df, _list):
     """
     Task run by each thread: sleep for a random number of seconds.

     :param df: index of the segment
     :param _list: the segment of data to process
     :return: None; the segment index is appended to _result_list
     """
     sleep_time = random.randint(1, 5)
     print(f'count is {df},sleep {sleep_time},list is {_list}')
     time.sleep(sleep_time)
     _result_list.append(df)


 if __name__ == '__main__':
     split_df()
     print(len(_result_list), _result_list)
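Because the threads finish in an unpredictable order, a shared list ends up holding results in completion order. The following is a minimal sketch of one way to get results back in segment order, assuming a dict keyed by segment index and guarded by a lock. The names `process_chunk` and `run_threads` are hypothetical, and summing the segment stands in for the real per-record operation.

```python
import threading


def process_chunk(index, chunk, results, lock):
    # Placeholder work: sum the segment. Replace with the real operation.
    total = sum(chunk)
    # Guard the shared dict so the write is safe regardless of interpreter details.
    with lock:
        results[index] = total


def run_threads(data, size):
    results = {}
    lock = threading.Lock()
    threads = []
    for index, start in enumerate(range(0, len(data), size)):
        segment = data[start:start + size]
        t = threading.Thread(target=process_chunk,
                             args=(index, segment, results, lock))
        threads.append(t)
        t.start()
    # Wait for all threads, then read the results back in segment order.
    for t in threads:
        t.join()
    return [results[i] for i in range(len(threads))]


print(run_threads([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 2))  # [3, 7, 11, 15, 19]
```

Keying by segment index keeps the output deterministic even though the threads themselves may complete in any order.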


Method 2: using ThreadPoolExecutor.map

Code:

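 # -*- coding: utf-8 -*-
 import math
 import random
 import time
 from concurrent.futures import ThreadPoolExecutor


 def split_list():
     # Segments of data, and the start offset of each segment
     new_list = []
     count_list = []
     # Data to process
     _l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
     # Number of items handled by each thread
     split_count = 2
     # Number of threads needed
     times = math.ceil(len(_l) / split_count)
     count = 0
     for item in range(times):
         _list = _l[count: count + split_count]
         new_list.append(_list)
         count_list.append(count)
         count += split_count
     return new_list, count_list


 def work(df, _list):
     """
     Task run by each thread: sleep for a random number of seconds.

     :param df: start offset of the segment
     :param _list: the segment of data to process
     :return: tuple of (sleep_time, df, _list)
     """
     sleep_time = random.randint(1, 5)
     print(f'count is {df},sleep {sleep_time},list is {_list}')
     time.sleep(sleep_time)
     return sleep_time, df, _list


 def use():
     new_list, count_list = split_list()
     # Leaving the with block waits for all submitted tasks to finish.
     # The iterables are passed in the same order as the parameters of work.
     with ThreadPoolExecutor(max_workers=len(count_list)) as t:
         results = t.map(work, count_list, new_list)
     # Alternatively, create the pool without a with block:
     # pool = ThreadPoolExecutor(max_workers=5)
     # results = pool.map(work, count_list, new_list)
     # The advantage of map is that return values do not have to be
     # collected into a result list by hand.
     # map returns an iterator. Each argument after the function must be an
     # iterable such as a list; with several iterables they should have the
     # same length, because the arguments are paired by position.
     # For example, pool.map(work, [0, 1], [[1, 2], [3, 4]]) pairs 0 with
     # [1, 2] and 1 with [3, 4], so the calls are work(0, [1, 2]) and
     # work(1, [3, 4]).
     # Results are yielded in the order of the inputs, not in the order in
     # which the tasks finish.
     print(type(results))
     # Iterating blocks until each result is ready; here all tasks have
     # already finished because the with block has exited.
     for ret in results:
         print(ret)
     print('thread execute end!')


 if __name__ == '__main__':
     use()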
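The ordering behaviour of `map` is easier to see with fixed delays instead of random ones. This is a small sketch, assuming a hypothetical three-argument `work` and hand-picked delays so that the first task finishes last; it is not the code above, only an illustration of the same call pattern.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def work(index, chunk, delay):
    # Sleep to simulate work of different durations, then return a result.
    time.sleep(delay)
    return index, sum(chunk)


indexes = [0, 1, 2]
chunks = [[1, 2], [3, 4], [5, 6]]
delays = [0.3, 0.2, 0.1]  # the first task is the slowest

with ThreadPoolExecutor(max_workers=3) as pool:
    # One iterable per parameter of work; arguments are paired by position.
    results = list(pool.map(work, indexes, chunks, delays))

print(results)  # [(0, 3), (1, 7), (2, 11)]
```

Even though the task for index 2 completes first, the results come back in input order. The pool size does not have to match the number of segments: with a smaller `max_workers`, the remaining segments wait in the queue until a worker is free.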


Reference: https://www.cnblogs.com/rgcLOVEyaya/p/RGC_LOVE_YAYA_1103_3days.html
