python3 多线程和多进程

一.线程和进程

1.操作系统中，线程是CPU调度和分派的基本单位，线程依存于程序中

2.操作系统中，进程是系统进行资源分配和调度的一个基本单位，一个程序至少有一个进程

3.一个进程由至少一个线程组成，线程组成进程

4.多进程、多进程实际是进程、线程、进程和线程的并发而不是并行，用来加快程序运行速度

5.Python既支持多线程，也支持多进程。

二.多线程threading

1.python3线程操作中常用模块：_thread和threading，其中一般都用threading模块

2.线程分为：内核线程：由操作系统内核创建和撤销；用户线程：不需要内核支持而在用户程序中实现的线程

3.Python中使用线程有两种方式：函数或者用类来包装线程对象

2. 创建线程

 import threading

 #def main():#定义一个存放多线程的函数

 #   print(threading.active_count())#获取已激活的线程数

 #   print(threading.enumerate()) # see the thread list查询线程信息

 #   print(threading.current_thread())#查询当前运行的线程

 def thread_job():#定义一个线程的工作的函数

     print('This is a thread of %s' % threading.current_thread())

 def main():

     thread = threading.Thread(target=thread_job,)#添加线程，参数为线程的目标（任务）

     thread.start()#执行线程thread

 if __name__ == '__main__':#在该程序中运行

     main()

 ---------------------------------------

 This is a thread of <Thread(Thread-1, started 5664)>

创建线程1

 #调用 _thread 模块中的start_new_thread()函数来产生新线程

 #_thread.start_new_thread ( function, args[, kwargs] )（线程函数，函数的参数tuple，可选参数）

 import _thread

 import time

 # 为线程定义一个函数

 def print_time( threadName, delay):

    count = 0

    while count < 5:

       time.sleep(delay)

       count += 1

       print ("%s: %s" % ( threadName, time.ctime(time.time()) ))

 # 创建两个线程

 try:

    _thread.start_new_thread( print_time, ("Thread-1", 2, ) )

    _thread.start_new_thread( print_time, ("Thread-2", 4, ) )

 except:

    print ("Error: 无法启动线程")

 while 1:

    pass

 ---------------------------------------------------------

 Thread-1: Thu May 17 17:13:01 2018

 Thread-2: Thu May 17 17:13:03 2018

 Thread-1: Thu May 17 17:13:03 2018

 Thread-1: Thu May 17 17:13:05 2018

 Thread-2: Thu May 17 17:13:07 2018

 。。。

创建线程 2

 #从 threading.Thread 继承创建一个新的子类，并实例化后调用 start() 方法启动新线程，即它调用了线程的 run() 方法

 import threading

 import time

 exitFlag = 0

 class myThread (threading.Thread):

     def __init__(self, threadID, name, counter):

         threading.Thread.__init__(self)

         self.threadID = threadID

         self.name = name

         self.counter = counter

     def run(self):

         print ("开始线程：" + self.name)

         print_time(self.name, self.counter, 5)

         print ("退出线程：" + self.name)

 def print_time(threadName, delay, counter):

     while counter:

         if exitFlag:

             threadName.exit()

         time.sleep(delay)

         print ("%s: %s" % (threadName, time.ctime(time.time())))

         counter -= 1

 # 创建新线程

 thread1 = myThread(1, "Thread-1", 1)

 thread2 = myThread(2, "Thread-2", 2)

 # 开启新线程

 thread1.start()

 thread2.start()

 thread1.join()

 thread2.join()

 print ("退出主线程")

 ------------------------------------------------

 开始线程：Thread-1

 开始线程：Thread-2

 Thread-1: Thu May 17 17:18:32 2018

 Thread-2: Thu May 17 17:18:33 2018

 Thread-1: Thu May 17 17:18:33 2018

 。。。

创建线程 3

3.join功能，控制执行顺序

 import threading

 import time

 def thread_job():#定义一个线程

     print('T1 start\n')#开始t1线程

     for i in range(10):#设置10步

         time.sleep(0.1)#任务间隔0.1秒，增加时耗

     print('T1 finish\n')#结束t1线程

 def T2_job():#又定义一个线程

     print('T2 start\n')#开始

     print('T2 finish\n')#结束

 def main():#定义主程序

     thread1 = threading.Thread(target=thread_job, name='T1')#添加线程t1

     thread2 = threading.Thread(target=T2_job, name='T2')#添加线程t2

     thread1.start()#开始运行t1

     thread2.start()#开始运行t2

     #使用join控制多个线程的执行顺序

     thread2.join()#等到t2运行完再进行下一步

     thread1.join()#到t1运行完再进行下一步

     print('all done\n')#表明程序执行完

 if __name__ == '__main__':

     main()

 --------------------------------------------------------

 T1 start

 T2 start

 T2 finish

 T1 finish

 all done

join

4.存储进程结果Queue,多线程调用的函数不能有返回值, 所以使用Queue存储多个线程运算的结果

 #，将数据列表中的数据传入，使用四个线程处理，将结果保存在Queue中，

 # 线程执行完后，从Queue中获取存储的结果

 import threading

 import time

 from queue import Queue#导入队列模块

 def job(l,q):#函数的参数是一个列表l和一个队列q

     #函数的功能是，对列表的每个元素进行平方计算，将结果保存在队列中

     for i in range(len(l)):

         l[i] = l[i]**2

     q.put(l)#把列表l存放进队列q中

 def multithreading():#定义一个多线程函数

     q = Queue()#创建一个空队列，用来保存返回值

     threads = []#定义一个多线程列表

     data = [[1,2,3],[3,4,5],[4,4,4],[5,5,5]]#初始化一个多维数据列表

     for i in range(4):#定义四个线程

         t = threading.Thread(target=job, args=(data[i], q))

         #创建一个线程，任务是job函数，被调用的job函数没有括号，只是一个索引，参数在后面

         t.start()#开始t

         threads.append(t)# #把每个线程append到线程列表中

     for thread in threads:

         thread.join()#分别join四个线程到主线程

     results = []#定义一个空的列表results，将四个线运行后保存在队列中的结果返回给空列表results

     for _ in range(4):

           results.append(q.get()) #q.get()按顺序从q中拿出一个值

     print(results)

 if __name__ == '__main__':

     multithreading()

 -------------------------------------------------

 [[1, 4, 9], [9, 16, 25], [16, 16, 16], [25, 25, 25]]

queue

 import queue

 import threading

 import time

 exitFlag = 0

 class myThread (threading.Thread):

     def __init__(self, threadID, name, q):

         threading.Thread.__init__(self)

         self.threadID = threadID

         self.name = name

         self.q = q

     def run(self):

         print ("开启线程：" + self.name)

         process_data(self.name, self.q)

         print ("退出线程：" + self.name)

 def process_data(threadName, q):

     while not exitFlag:

         queueLock.acquire()

         if not workQueue.empty():

             data = q.get()

             queueLock.release()

             print ("%s processing %s" % (threadName, data))

         else:

             queueLock.release()

         time.sleep(1)

 threadList = ["Thread-1", "Thread-2", "Thread-3"]

 nameList = ["One", "Two", "Three", "Four", "Five"]

 queueLock = threading.Lock()

 workQueue = queue.Queue(10)

 threads = []

 threadID = 1

 # 创建新线程

 for tName in threadList:

     thread = myThread(threadID, tName, workQueue)

     thread.start()

     threads.append(thread)

     threadID += 1

 # 填充队列

 queueLock.acquire()

 for word in nameList:

     workQueue.put(word)

 queueLock.release()

 # 等待队列清空

 while not workQueue.empty():

     pass

 # 通知线程是时候退出

 exitFlag = 1

 # 等待所有线程完成

 for t in threads:

     t.join()

 print ("退出主线程")

 ------------------------------------------------------

 开启线程：Thread-1

 开启线程：Thread-2

 开启线程：Thread-3

 Thread-3 processing One

 Thread-2 processing Two

 Thread-1 processing Three

 Thread-3 processing Four

 Thread-2 processing Five

 退出线程：Thread-3

 退出线程：Thread-2

 退出线程：Thread-1

 退出主线程

queue

5.GIL（Global Interpreter Lock）全局解释锁：它确保任何时候都只有一个Python线程执行，导致了多线程无法利用多核。

实质上python中的多线程就是节省了i/o时间

 # 测试 GIL

 import threading

 from queue import Queue

 import copy

 import time

 def job(l, q):#传入一个列表和一个队列，

     #操作：计算列表的总和，并把它传入到队列中

     res = sum(l)

     q.put(res)

 def multithreading(l):#多线程函数

     q = Queue()

     threads = []

     for i in range(4):#同时执行4个线程

         t = threading.Thread(target=job, args=(copy.copy(l), q), name='T%i' % i)

         t.start()#t开始执行

         threads.append(t)#线程列表中添加t

     [t.join() for t in threads]

     total = 0

     for _ in range(4):

         total += q.get()

     print(total)

 def normal(l):#计算列表的总值

     total = sum(l)

     print(total)

 if __name__ == '__main__':

     l = list(range(1000000))

     # 1.正常执行：一个列表扩展4倍，并打印出执行时间

     s_t = time.time()

     normal(l*4)

     print('normal: ',time.time()-s_t)

     #2.线程执行：建立四个线程执行

     s_t = time.time()

     multithreading(l)

     print('multithreading: ', time.time()-s_t)

 -------------------------------------------------------------

 1999998000000

 normal:  0.24673938751220703

 1999998000000

 multithreading:  0.244584321975708

GIL

6.lock线程锁：lock在不同线程使用同一共享内存时（只允许一个线程修改数据），能够确保线程之间互不影响，但是阻止了多线程并行，也可能导致死锁

如果多个线程共同对某个数据修改，则可能出现不可预料的结果，为了保证数据的正确性，需要对多个线程进行同步

多线程和多进程最大的不同在于:多进程中，同一个变量，各自有一份拷贝存在于每个进程中，互不影响，而多线程中，所有变量都由所有线程共享，所以，任何一个变量都可以被任何一个线

程修改，因此，线程之间共享数据最大的危险在于多个线程同时改一个变量，把内容给改乱了

 #使用lock的方法是， 在每个线程执行运算修改共享内存之前，执行lock.acquire()将共享内存上锁，

 # 确保当前线程执行时，内存不会被其他线程访问，

 # 执行运算完毕后，使用lock.release()将锁打开， 保证其他的线程可以使用该共享内存。

 import threading

 def job1():

     global A, lock#使用线程锁

     lock.acquire()#lock.acquire()将共享内存上锁

     for i in range(10):

         A += 1

         print('job1', A)

     lock.release()#lock.release()将锁打开， 保证其他的线程可以使用该共享内存

 def job2():

     global A, lock

     lock.acquire()#确保当前线程执行时，内存不会被其他线程访问

     for i in range(10):

         A += 10

         print('job2', A)

     lock.release()#保证其他的线程可以使用该共享内存

 if __name__ == '__main__':

     lock = threading.Lock()#创建一个线程锁

     A = 0

     t1 = threading.Thread(target=job1)

     t2 = threading.Thread(target=job2)

     t1.start()

     t2.start()

     t1.join()

     t2.join()

 ------------------------------------------------------------

 job1 1

 job1 2

 job1 3

 job1 4

 job1 5

 job1 6

 job1 7

 job1 8

 job1 9

 job1 10

 job2 20

 job2 30

 job2 40

 job2 50

 job2 60

 job2 70

 job2 80

 job2 90

 job2 100

lock

 #使用 Thread 对象的 Lock 和 Rlock 可以实现简单的线程同步

 import threading

 import time

 class myThread (threading.Thread):

     def __init__(self, threadID, name, counter):

         threading.Thread.__init__(self)

         self.threadID = threadID

         self.name = name

         self.counter = counter

     def run(self):

         print ("开启线程： " + self.name)

         # 获取锁，用于线程同步

         threadLock.acquire()

         print_time(self.name, self.counter, 3)

         # 释放锁，开启下一个线程

         threadLock.release()

 def print_time(threadName, delay, counter):

     while counter:

         time.sleep(delay)

         print ("%s: %s" % (threadName, time.ctime(time.time())))

         counter -= 1

 threadLock = threading.Lock()

 threads = []

 # 创建新线程

 thread1 = myThread(1, "Thread-1", 1)

 thread2 = myThread(2, "Thread-2", 2)

 # 开启新线程

 thread1.start()

 thread2.start()

 # 添加线程到线程列表

 threads.append(thread1)

 threads.append(thread2)

 # 等待所有线程完成

 for t in threads:

     t.join()

 print ("退出主线程")

 ------------------------------------------------------

 开启线程： Thread-1

 开启线程： Thread-2

 Thread-1: Thu May 17 17:25:58 2018

 Thread-1: Thu May 17 17:25:59 2018

 Thread-1: Thu May 17 17:26:00 2018

 Thread-2: Thu May 17 17:26:02 2018

 Thread-2: Thu May 17 17:26:04 2018

 Thread-2: Thu May 17 17:26:06 2018

 退出主线程

lock 补充

递归锁RLcok类的用法和Lock类一模一样，但它支持嵌套，在多个锁没有释放的时候一般会使用使用RLcok类

信号量（BoundedSemaphore类）用法和Lock类一模一样，但同时允许一定数量的线程更改数据

7.补充

三.多进程multiprocessing(或者说是多核)

1.多进程 Multiprocessing 和多线程 threading 类似，但是多核使用，用来弥补 threading 的一些劣势, 比如GIL

2.创建进程

 import multiprocessing as mp

 import threading as td

 def job(a,d):#定义一个被线程和进程调用的函数

     print('aaaaa')

 if __name__=='__main__':

 # 创建线程和进程

     t1 = td.Thread(target=job,args=(1,2))

     p1 = mp.Process(target=job,args=(1,2))

     t1.start()

     p1.start()

     t1.join()

     p1.join()

 -------------------------------------------------------

 aaaaa

 aaaaa

进程创建

3.Queue:将每个核或线程的运算结果放在队里中，等到每个线程或核运行完毕后再从队列中取出结果，继续加载运算。

 import multiprocessing as mp

 #调用的函数不能有返回值

 def job(q):

     res = 0

     for i in range(1000):

         res += i+i**2+i**3

     q.put(res) # queue把结果放到q中

 if __name__ == '__main__':

     q = mp.Queue()

     p1 = mp.Process(target=job, args=(q,))

     #要加逗号，否则TypeError: 'Queue' object is not iterable

     p2 = mp.Process(target=job, args=(q,))

     p1.start()

     p2.start()

     p1.join()

     p2.join()

     res1 = q.get()#获取q中的结果

     res2 = q.get()

     print(res1+res2)

 --------------------------------------------------------------

 499667166000

queue

4.效率对比

 #多线程与多进程的效率对比

 import multiprocessing as mp

 import threading as td

 import time

 def job(q):

     res = 0

     for i in range(1000000):

         res += i+i**2+i**3

     q.put(res) # queue

 def multicore():

     q = mp.Queue()

     p1 = mp.Process(target=job, args=(q,))

     p2 = mp.Process(target=job, args=(q,))

     p1.start()

     p2.start()

     p1.join()

     p2.join()

     res1 = q.get()

     res2 = q.get()

     print('multicore:' , res1+res2)

 def normal():#普通运算

     res = 0

     for _ in range(2):

         for i in range(1000000):

             res += i+i**2+i**3

     print('normal:', res)

 def multithread():

     q = mp.Queue()

     t1 = td.Thread(target=job, args=(q,))

     t2 = td.Thread(target=job, args=(q,))

     t1.start()

     t2.start()

     t1.join()

     t2.join()

     res1 = q.get()

     res2 = q.get()

     print('multithread:', res1+res2)

 if __name__ == '__main__':

     st = time.time()

     normal()

     st1= time.time()

     print('normal time:', st1 - st)

     st2 = time.time()

     multithread()

     st3 = time.time()

     print('multithread time:', st3 - st2)

     st4 = time.time()

     multicore()

     st5 = time.time()

     print('multicore time:', st5-st4)

 -----------------------------------------------------------------

 normal: 499999666667166666000000

 normal time: 1.8749690055847168

 multithread: 499999666667166666000000

 multithread time: 1.583195686340332

 multicore: 499999666667166666000000

 multicore time: 1.2573533058166504

效率对比

5.进程池pool

 #进程池就是我们将所要运行的东西，放到池子里，Python会自行解决多进程的问题

 import multiprocessing as mp

 def job(x):

     return x*x

 def multicore():

     pool = mp.Pool(processes=2)#创建进程池，定义调用2个cpu核

     # 让池子对应某一个函数，我们向池子里丢数据，池子就会返回函数返回的值

     # Pool和之前的Process的不同点是丢向Pool的函数有返回值，而Process的没有返回值

     res = pool.map(job, range(10))

     #用map()获取结果，在map()中需要放入函数和需要迭代运算的值，然后它会自动分配给CPU核，返回结果

     print(res)

     res = pool.apply_async(job, (2,))

     # apply_async()返回结果，只能传递一个值，它只会放入一个核进行运算，但是传入值时要注意是可迭代的，

     # 所以在传入值后需要加逗号, 同时需要用get()方法获取返回值

     print(res.get())

     # 将apply_async()放入迭代器中，来输出多个结果

     multi_res =[pool.apply_async(job, (i,)) for i in range(10)]

     print([res.get() for res in multi_res])

 if __name__ == '__main__':

     multicore()

 --------------------------------------------------

 [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

 4

 [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

pool

6.进程锁lock

 # 共享内存

 # import multiprocessing as mp

 #

 # # Value数据存储在一个共享的内存表中

 # value1 = mp.Value('i', 0)

 # value2 = mp.Value('d', 3.14)

 # #d和i参数用来设置数据类型的

 #

 # #Array类，可以和共享内存交互，来实现在进程之间共享数据。

 # array = mp.Array('i', [1, 2, 3, 4])

 # # 它只能是一维的，不能是多维的

 #进程锁lock

 import multiprocessing as mp

 import time

 def job(v, num, l):

     l.acquire()#设置进程锁的使用，保证运行时一个进程的对锁内内容的独占

     for _ in range(10):

         time.sleep(0.1)

         v.value += num# v.value获取共享变量值

         print(v.value)

     l.release()#进程锁保证了进程p1的完整运行，然后才进行了进程p2的运行

 def multicore():

     l = mp.Lock()# 定义一个进程锁

     v = mp.Value('i', 0) # 定义共享变量

     # 设定不同的number看如何抢夺内存

     p1 = mp.Process(target=job, args=(v, 1, l))

     p2 = mp.Process(target=job, args=(v, 3, l))

     p1.start()

     p2.start()

     p1.join()

     p2.join()

 if __name__ == '__main__':

     multicore()

 -------------------------------------------------

 1

 2

 3

 4

 5

 6

 7

 8

 9

 10

 13

 16

 19

 22

 25

 28

 31

 34

 37

 40

进程锁lock

python3 多线程和多进程的更多相关文章

Python3 多线程、多进程
python中的线程是假线程,不同线程之间的切换是需要耗费资源的,因为需要存储线程的上下文,不断的切换就会耗费资源.. python多线程适合io操作密集型的任务(如socket server 网络并 ...
多线程、多进程、协程、缓存（memcache、redis）
本节内容: 线程: a:基本的使用: 创建线程: 1:方法 import threading def f1(x): print(x) if __name__=='__main__': t=thread ...
Python多线程和多进程谁更快？
python多进程和多线程谁更快 python3.6 threading和multiprocessing 四核+三星250G-850-SSD 自从用多进程和多线程进行编程,一致没搞懂到底谁更快.网上很 ...
重写父类、多线程、多进程、logging模块
1. 重写父类 1) 子类定义父类同名函数后,父类函数被覆盖: 2) 如果需要,可以在子类中调用父类方法:”父类名.方法(self)”.”父类名().方法()”.”super(子类名,self). ...
Python 多线程、多进程（一）之源码执行流程、GIL
Python 多线程.多进程 (一)之源码执行流程.GIL Python 多线程.多进程 (二)之多线程.同步.通信 Python 多线程.多进程 (三)之线程进程对比.多线程一.python ...
Python 多线程、多进程（二）之多线程、同步、通信
Python 多线程.多进程 (一)之源码执行流程.GIL Python 多线程.多进程 (二)之多线程.同步.通信 Python 多线程.多进程 (三)之线程进程对比.多线程一.python ...
Python之多线程与多进程（二）
多进程上一章:Python多线程与多进程(一) 由于GIL的存在,Python的多线程并没有实现真正的并行.因此,一些问题使用threading模块并不能解决不过Python为并行提供了一个替代方 ...
python多线程与多进程及其区别
个人一直觉得对学习任何知识而言,概念是相当重要的.掌握了概念和原理,细节可以留给实践去推敲.掌握的关键在于理解,通过具体的实例和实际操作来感性的体会概念和原理可以起到很好的效果.本文通过一些具体的例子 ...
python的多线程和多进程（一）
在进入主题之前,我们先学习一下并发和并行的概念: --并发:在操作系统中,并发是指一个时间段中有几个程序都处于启动到运行完毕之间,且这几个程序都是在同一个处理机上运行.但任一时刻点上只有一个程序在处理 ...

随机推荐

poj1873（二进制枚举+求凸包周长）
题目链接:https://vjudge.net/problem/POJ-1873 题意:n个点(2<=n<=15),给出n个点的坐标(x,y).价值v.做篱笆时的长度l,求选择哪些点来做篱 ...
【CodeForces】1172E. Nauuo and ODT
题解看了一遍题解(以及代码)但是没写代码-- 后来做梦的时候忽然梦到了这道题--意识到我需要补一下-- 这道题就是,对于每种颜色,把没有染成这种颜色的点标成黑点,然后计算每个联通块的平方然后每个点 ...
剑指offer60：把二叉树打印成多行。上到下按层打印二叉树。
1 题目描述从上到下按层打印二叉树,同一层结点从左至右输出.每一层输出一行. 2 思路和方法 vector变量存储每一层的元素vector<vector<int> > ans ...
PAT甲级堆相关题_C++题解
堆目录 <算法笔记>重点摘要 1147 Heaps (30) 1155 Heap Paths (30) <算法笔记> 9.7 堆重点摘要 1. 定义堆是完全二叉树,树中每 ...
C++Primer 5th Chap5 Statements
else语句对应的始终是最近的那条if语句,除非有{}强行控制,如: if(A){ if(B){/*.............*/} }else{/*.......*/}//这里else和if(A)对 ...
Python--yaml文件写入
原文地址:https://www.cnblogs.com/yoyoketang/p/9255109.html yaml作为配置文件是非常友好的一种格式,前面一篇讲了yaml的一些基础语法和读取方法,本 ...
SAS学习笔记8 循环语句（do函数）
do-end函数
css 学习笔记常用到的知识
做 loading 居中剧中通常就是 top left 50% 再调一下自己就可以了关键是要有 width height 遇到一些base on content 决定 height 的情况一般上有 ...
（一）mybatis介绍
一.mybatis简介 MyBatis 是一款优秀的持久层框架,它支持定制化 SQL.存储过程以及高级映射.MyBatis 避免了几乎所有的 JDBC 代码和手动设置参数以及获取结果集.MyBatis ...
（二十二）SpringBoot之使用mybatis generator自动生成bean、mapper、mapper xml
一.下载mybatis generator插件二.生成generatorConfig.xml new一个generatorConfig.xml 三.修改generatorConfig.xml 里面的 ...

python3 多线程和多进程

python3 多线程和多进程的更多相关文章

随机推荐

热门专题