Python Study Day 53: Advanced Web Scraping, Processes and Threads 01

2021-05-16 16:08

Processes and Threads

I. Overview of Processes and Threads

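Program

A program is a collection of computer instructions that implements a specific function

Process

1. A program that has been started is called a process; the system allocates memory space to the process
2. A process contains at least one thread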

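Thread

1. The basic unit that the CPU schedules and executes
2. A process can contain multiple threads
3. When a process ends, all of its threads end; when a thread ends, the process does not necessarily end
4. Multiple threads in the same process share the same memory address space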
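
Point 4 can be illustrated with a minimal sketch, assuming the standard threading module. The names shared and worker are made up for this illustration. A worker thread appends to a list created by the main thread, and the main thread sees the change because both threads live in the same address space.

```python
import threading

shared = []  # created by the main thread (hypothetical name)

def worker():
    # Runs in a second thread but modifies the very same list object
    shared.append("written by " + threading.current_thread().name)

t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()  # wait until the worker thread has finished

print(shared)
```

No copying or message passing is involved: the worker and the main thread simply refer to the same object.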

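II. Differences Between Processes and Threads

Difference	Process	Thread
Fundamental difference	Unit of resource allocation	Unit of scheduling and execution
Overhead	Each process has its own code and data space (registers, stack, context), so switching between processes is expensive	A thread is a lightweight process; threads in the same process share its code and data space, so switching between threads is cheap
Environment	Multiple tasks (programs) run at the same time within the operating system	Multiple sequential flows of execution run at the same time within one application
Memory allocation	The system allocates a separate memory space to each process at run time	Apart from CPU time, the system allocates no separate memory space to a thread (a thread uses the resources of the process it belongs to); threads in the same process share resources
Containment	A process can contain multiple threads; a process with only one thread is called single-threaded	A thread is part of a process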
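
The containment row can be checked from Python. The following is only a sketch, assuming a standard CPython installation where sys.executable points at the interpreter; the dictionary pids and the function record_pid are names invented here. A thread reports the same process ID as the main thread, while a separately launched interpreter is a different process with its own ID.

```python
import os
import subprocess
import sys
import threading

pids = {}  # hypothetical container for the collected process IDs

def record_pid():
    # A thread runs inside its process, so it reports that process's PID
    pids["thread"] = os.getpid()

t = threading.Thread(target=record_pid)
t.start()
t.join()

# Start a separate Python interpreter as a child process and ask for its PID
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    capture_output=True, text=True, check=True,
)
pids["child"] = int(child.stdout.strip())
pids["main"] = os.getpid()

print("thread shares the main PID:", pids["thread"] == pids["main"])
print("child has its own PID:", pids["child"] != pids["main"])
```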

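III. Creating Processes and Threads in PyCharm

Single-threaded: fun1() and fun2() each take about 5 s to finish, and they run one after the other, so the whole program takes about 10 s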

import time
def fun1():
    for i in range(5):
        print('------value of i in fun1:',i)
        # Sleep for one second before continuing
        time.sleep(1)

def fun2():
    for i in range(5):
        print('===============value of i in fun2:',i)
        # Sleep for one second before continuing
        time.sleep(1)

def single():
    fun1()
    fun2()

if __name__ == '__main__':
    single()

------value of i in fun1: 0
------value of i in fun1: 1
------value of i in fun1: 2
------value of i in fun1: 3
------value of i in fun1: 4
===============value of i in fun2: 0
===============value of i in fun2: 1
===============value of i in fun2: 2
===============value of i in fun2: 3
===============value of i in fun2: 4

Process finished with exit code 0

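Multithreaded: fun1 and fun2 run concurrently, each in its own thread. Both spend almost all of their time sleeping, so the waits overlap and the program takes about 5 s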

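import time
import threading
def fun1():
    for i in range(5):
        print('------value of i in fun1:',i)
        # Sleep for one second before continuing
        time.sleep(1)

def fun2():
    for i in range(5):
        print('===============value of i in fun2:',i)
        # Sleep for one second before continuing
        time.sleep(1)

def single():
    fun1()
    fun2()

def mult():
    # Create the thread objects
    t1 = threading.Thread(target=fun1) # Note: pass the function itself, without parentheses; fun1() would call it here and pass its return value
    t2 = threading.Thread(target=fun2)
    # Start the threads
    t1.start()
    t2.start()

if __name__ == '__main__':
    #single()
    mult()

The order in which the two threads print is not fixed and can differ from run to run

------value of i in fun1: 0
===============value of i in fun2: 0
===============value of i in fun2: 1
------value of i in fun1: 1
------value of i in fun1: 2
===============value of i in fun2: 2
===============value of i in fun2: 3
------value of i in fun1: 3
------value of i in fun1: 4
===============value of i in fun2: 4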

Process finished with exit code 0
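
The 10 s versus 5 s claim can be measured rather than estimated. The sketch below is not the author's code: task, run_sequential and run_threaded are hypothetical names, and the delay is shortened to 0.2 s so that it runs quickly. It also calls join(), which the example above does not use; without join() the main thread would stop the clock before the workers have finished.

```python
import threading
import time

def task(delay):
    time.sleep(delay)  # stands in for waiting on I/O

def run_sequential(delay):
    start = time.perf_counter()
    task(delay)
    task(delay)
    return time.perf_counter() - start

def run_threaded(delay):
    start = time.perf_counter()
    threads = [threading.Thread(target=task, args=(delay,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # block until each thread has finished
    return time.perf_counter() - start

sequential = run_sequential(0.2)
threaded = run_threaded(0.2)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The threaded time should be close to one delay and the sequential time close to two. The gain comes from overlapping waits, so it applies to sleeping or I/O-bound work such as the network requests a crawler makes.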

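IV. Implementing Multithreading by Subclassing

Why create threads with a class

Classes make code easier to manage and let us program in an object-oriented style

How to implement multithreading this way

1. Subclass threading.Thread
2. Implement the run() method
3. Call the start() method inherited from Thread to start the thread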

import threading
import time
class CodingThread(threading.Thread):
    def run(self):
        for i in range(5):
            print('------value of i in fun1:', i)
            # Sleep for one second before continuing
            time.sleep(1)

class CodingThread2(threading.Thread):
    def run(self):
        for i in range(5):
            print('===============value of i in fun2:', i)
            # Sleep for one second before continuing
            time.sleep(1)

def mult():
    # Create the thread objects
    t1 = CodingThread()
    t2 = CodingThread2()
    # Start the threads
    t1.start()
    t2.start()

if __name__ == '__main__':
    mult()

------value of i in fun1: 0
===============value of i in fun2: 0
------value of i in fun1: 1
===============value of i in fun2: 1
------value of i in fun1: 2
===============value of i in fun2: 2
===============value of i in fun2: 3
------value of i in fun1: 3
===============value of i in fun2: 4
------value of i in fun1: 4

Process finished with exit code 0
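
The two classes above differ only in the text they print, which suggests one parameterized class. The following is a sketch under that assumption; CountingThread and its attributes label, count, delay and seen are hypothetical and not part of the original post. The one rule to remember when overriding __init__ is to call the base-class constructor first.

```python
import threading
import time

class CountingThread(threading.Thread):
    """Hypothetical subclass that takes its label and loop count as arguments."""

    def __init__(self, label, count, delay=0.01):
        super().__init__()  # must run before the thread can be started
        self.label = label
        self.count = count
        self.delay = delay
        self.seen = []  # values visited by run(), kept for inspection

    def run(self):
        for i in range(self.count):
            print(self.label, "value of i:", i)
            self.seen.append(i)
            time.sleep(self.delay)

t1 = CountingThread("------fun1", 3)
t2 = CountingThread("===============fun2", 3)
t1.start()
t2.start()
t1.join()
t2.join()
```

State stored on self in run() stays readable after join(), which is a simple way to get results back from a thread.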

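V. Common Thread Methods

Method	Description
threading.current_thread()	Returns the current thread object
threading.enumerate()	Returns a list of all thread objects that are currently alive
getName()	Returns the thread's name (deprecated since Python 3.10 in favor of the name attribute)
setName()	Sets the thread's name (deprecated since Python 3.10 in favor of the name attribute)

1. Getting the current thread object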

class CodingThread(threading.Thread):
    def run(self):
        # Get the current thread object
        thread = threading.current_thread()
        print(thread)
        for i in range(5):
            print('------value of i in fun1:', i)
            # Sleep for one second before continuing
            time.sleep(1)

<CodingThread(Thread-1, started 5576)>
------value of i in fun1: 0
<CodingThread2(Thread-2, started 12292)>
===============value of i in fun2: 0
<CodingThread2(Thread-3, started 10356)>
===============value of i in fun2: 0
------value of i in fun1: 1
===============value of i in fun2: 1
===============value of i in fun2: 1
------value of i in fun1: 2
===============value of i in fun2: 2
===============value of i in fun2: 2
===============value of i in fun2: 3
===============value of i in fun2: 3
------value of i in fun1: 3
===============value of i in fun2: 4
------value of i in fun1: 4
===============value of i in fun2: 4

Process finished with exit code 0

2. Getting thread information

def mult():
    # Create the thread objects
    t1 = CodingThread()
    t2 = CodingThread2()
    t3 = CodingThread2()
    # Get thread information (called before start(), so only the main thread is listed)
    print(threading.enumerate())
    # Start the threads
    t1.start()
    t2.start()
    t3.start()

[<_MainThread(MainThread, started 3652)>]
<CodingThread(Thread-1, started 10860)>
------value of i in fun1: 0
<CodingThread2(Thread-2, started 7836)>
===============value of i in fun2: 0
<CodingThread2(Thread-3, started 6292)>
...
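
The list printed above contains only MainThread because threading.enumerate() was called before any start(). As a sketch of the other case, the snippet below calls it while three workers are alive. The Event named stop and the function wait_for_stop are helpers invented for this illustration to keep the workers running long enough to be listed.

```python
import threading

stop = threading.Event()  # hypothetical helper that keeps the workers alive

def wait_for_stop():
    stop.wait()  # block until the main thread calls stop.set()

workers = [threading.Thread(target=wait_for_stop, name=f"worker-{n}") for n in range(3)]
for w in workers:
    w.start()

# All three workers have been started and are still blocked, so they are listed
running = [t.name for t in threading.enumerate()]
print(running)

stop.set()  # let the workers finish
for w in workers:
    w.join()
```

A thread appears in the list from the moment start() returns until its run() method ends.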

3. Getting the thread name

class CodingThread(threading.Thread):
    def run(self):
        # Get the current thread object
        thread = threading.current_thread()
        print(thread)
        # getName() is deprecated since Python 3.10; thread.name is the equivalent attribute
        print('Thread name:',thread.getName())
        for i in range(5):
            print('------value of i in fun1:', i)
            # Sleep for one second before continuing
            time.sleep(1)

Thread name: Thread-1
<CodingThread2(Thread-2, started 10888)>
------value of i in fun1: 0
Thread name: Thread-2
===============value of i in fun2: 0
<CodingThread2(Thread-3, started 8068)>
Thread name: Thread-3
===============value of i in fun2: 0
------value of i in fun1: 1
...

4. Changing the thread name

class CodingThread(threading.Thread):
    def run(self):
        # Get the current thread object
        thread = threading.current_thread()
        print(thread)
        print('Thread name:',thread.getName())
        # Change the thread name
        thread.setName('my thread')
        # Read back the new thread name
        print(thread.getName())
        for i in range(5):
            print('------value of i in fun1:', i)
            # Sleep for one second before continuing
            time.sleep(1)

[<_MainThread(MainThread, started 10372)>]
<CodingThread(Thread-1, started 1836)>
Thread name: Thread-1
<CodingThread2(Thread-2, started 3916)>
my thread
Thread name: Thread-2
------value of i in fun1: 0
===============value of i in fun2: 0
...
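
Since Python 3.10, getName() and setName() are deprecated and the name attribute is the recommended way to read or change a thread's name. A name can also be given when the thread is created. The following sketch shows both; the names report, seen and "ticket-window" are made up for the example.

```python
import threading

seen = {}  # hypothetical dict used to hand the names back to the main thread

def report():
    thread = threading.current_thread()
    seen["at_start"] = thread.name   # same value getName() would return
    thread.name = "my-thread"        # same effect as setName('my-thread')
    seen["renamed"] = thread.name

# The name= argument replaces the default "Thread-N" style name
t = threading.Thread(target=report, name="ticket-window")
t.start()
t.join()

print(seen)
```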

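VI. Safety Issues When Multiple Threads Access a Global Variable

Multithreading can make a program run more efficiently, but it also introduces safety problems when several threads access the same global variable

Example: selling tickets at a station. Two threads check and decrement ticket without any coordination, so the same ticket number can be printed twice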

import threading
import time

ticket = 100 # global variable
def sale_ticket():
    global ticket
    for i in range(1000):
        if ticket > 0:
            print(threading.current_thread().getName()+'--->>selling ticket no. {}'.format(ticket))
            ticket-=1
            time.sleep(1) # sell one ticket per second
def start():
    for i in range(2):
        # Create the thread object
        t = threading.Thread(target=sale_ticket)
        # Start the thread
        t.start()

if __name__ == '__main__':
    start()
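
Whether the ticket example misbehaves on a given run depends on timing, so the problem is easy to miss. The sketch below forces the bad interleaving on purpose. It is an illustration, not the author's code: the Barrier named both_checked, the list sold and the function sell_unsafe are hypothetical. With one ticket left, both threads pass the ticket check before either one sells, so the single ticket is sold twice.

```python
import threading

tickets = 1                          # only one ticket left
sold = []                            # records every "sale"
both_checked = threading.Barrier(2)  # hypothetical helper to force the bad interleaving

def sell_unsafe():
    global tickets
    if tickets > 0:                  # step 1: check
        both_checked.wait()          # both threads pass the check before either one sells
        sold.append(threading.current_thread().name)  # step 2: act
        tickets -= 1

threads = [threading.Thread(target=sell_unsafe, name=f"window-{n}") for n in (1, 2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(sold), "sales for 1 ticket")
```

The bug is the gap between the check and the update. In real code no barrier is needed for it to happen; a thread switch at the wrong moment has the same effect.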

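VII. Locks

The threading.Lock class

1. A lock can be used to solve the safety problems caused by multiple threads accessing a global variable
2. Merely reading a global variable generally does not require a lock
3. A lock is needed when modifying a global variable, including any check the modification depends on; release the lock once the modification is done

Steps for using a lock

1. Create the lock object: threading.Lock()
2. Acquire the lock: .acquire()
3. Release the lock: .release()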

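import threading
import time

ticket = 100 # global variable
# Create the lock object
lock = threading.Lock()
def sale_ticket():
    global ticket
    for i in range(1000):
        lock.acquire() # acquire the lock
        if ticket > 0:
            print(threading.current_thread().getName()+'--->>selling ticket no. {}'.format(ticket))
            ticket-=1
            time.sleep(1) # sell one ticket per second
        # Release on every iteration; releasing only inside the if would
        # leave the lock held once the tickets run out and block all threads
        lock.release()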
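
A Lock can also be used as a context manager, which pairs every acquire() with a release() automatically. The following is a minimal sketch of the ticket example in that style, not the author's version: the sleep is left out so that it finishes immediately, and the list sold and the function sell are names introduced here so the result can be checked.

```python
import threading

tickets = 100
sold = []  # ticket numbers in the order they were sold
lock = threading.Lock()

def sell():
    global tickets
    while True:
        with lock:  # acquire() on entry, release() on exit, even on break or error
            if tickets <= 0:
                break
            sold.append(tickets)
            tickets -= 1

threads = [threading.Thread(target=sell) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(sold), "tickets sold,", tickets, "left")
```

Because the check and the decrement happen inside the same locked block, each ticket number is sold exactly once no matter how the four threads are scheduled.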

Reposted from: https://blog.csdn.net/ShengXIABai/article/details/116794852
