[The Road to Python 26] Modules

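Module overview

I. time module
II. sys module
III. datetime module
IV. pickle module
V. json module
VI. os module
VII. hashlib hashing module
VIII. Installing third-party modules
IX. requests module
X. XML module
XI. configparser module
XII. shutil
XIII. subprocess module
XIV. logging module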

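Module categories

1. Built-in modules

2. User-defined modules

3. Third-party modules (must be installed first)

Preface: importing modules

Two points to keep in mind when importing modules:

1. Any module located in a directory listed in sys.path can be imported directly. Because sys.path is an ordinary list, you can add a directory to it with sys.path.append().

2. When a module that has already been imported is imported again, Python finds it in the in-memory cache (sys.modules) and skips loading it a second time.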
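The two points above can be checked with a short sketch. The directory name is hypothetical and only stands in for a folder that holds your own modules; json is used simply because it is always available.

```python
import sys

# Hypothetical directory; replace it with the folder that holds your own modules.
extra_dir = "/tmp/my_modules"
if extra_dir not in sys.path:
    sys.path.append(extra_dir)      # sys.path is a plain list

import json                         # first import: loads the module and caches it
first = sys.modules["json"]
import json                         # second import: reuses the cached module
second = sys.modules["json"]

same_object = first is second
print(same_object)
```

Both names refer to the same module object, which is why a second import costs almost nothing.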

I. time module

1. time.sleep(5)  # wait 5 seconds

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import time

print('start to sleep.....')

time.sleep(5)  # wait 5 seconds

print('wake up.....')

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import time

# time.clock() returned processor time; it was deprecated in Python 3.3 and removed in 3.8
# print(time.clock())

print(time.process_time())  # CPU time of the current process (added in Python 3.3)

print(time.time())  # timestamp: seconds elapsed since the epoch (1970-01-01 00:00:00 UTC on most platforms)

print(time.ctime(time.time()))  # timestamp -> string, e.g. Thu Feb 16 10:10:20 2017

print(time.ctime(time.time()-24*60*60))  # the same time yesterday, e.g. Wed Feb 15 10:10:20 2017

print(time.gmtime(time.time()))  # returns a time.struct_time in UTC; year, month, day, hour, minute and second can be read individually

print(time.localtime(time.time()))  # returns a time.struct_time in local time; the fields can be read individually

print(time.mktime(time.localtime(time.time())))  # inverse of localtime: struct_time -> timestamp

print(time.strftime('%Y-%m-%d %H:%M:%S',time.localtime(time.time())))  # struct_time -> formatted string
print(time.strftime('%Y-%m-%d %H:%M:%S'))  # formats the current local time

print(time.strptime('2017-2-16 11:40:27','%Y-%m-%d %H:%M:%S'))  # string -> time.struct_time

#!/usr/bin/env python

# -*- coding:utf-8 -*-

# Using time.gmtime()

# time.localtime() is used the same way, but returns local time

import time

time_object = time.gmtime(time.time())

print(time_object)

#time.struct_time(tm_year=2017, tm_mon=2, tm_mday=16,

# tm_hour=2, tm_min=35, tm_sec=54, tm_wday=3, tm_yday=47, tm_isdst=0)

time_str = '%s-%s-%s %s:%s:%s' %(time_object.tm_year,time_object.tm_mon,time_object.tm_mday,time_object.tm_hour,time_object.tm_min,time_object.tm_sec)

print(time_str)  # e.g. 2017-2-16 2:45:26, Greenwich Mean Time (UTC)
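The conversions above chain together: string to struct_time, struct_time to timestamp, and timestamp back to string. Here is a minimal sketch of that round trip. It also uses time.perf_counter() to time a short sleep; that function is not mentioned above, and is included only as the usual replacement for the removed time.clock().

```python
import time

fmt = '%Y-%m-%d %H:%M:%S'

parsed = time.strptime('2017-02-16 11:40:27', fmt)   # string -> struct_time
stamp = time.mktime(parsed)                          # struct_time (local time) -> timestamp
text = time.strftime(fmt, time.localtime(stamp))     # timestamp -> string

# Measuring elapsed wall-clock time with a high-resolution counter
start = time.perf_counter()
time.sleep(0.01)
elapsed = time.perf_counter() - start

print(text, elapsed)
```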

II. sys module

1) sys.argv ---- list of command-line arguments; the first element is the path of the script itself

import sys

print(sys.argv)

if sys.argv[1] == 'one':

    print('11111')

else:

    print('.....')

Running it in a terminal gives:

E:\python_code\2\11day>python time2.py one
['time2.py', 'one']
11111
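Note that the script above raises IndexError when it is started without an argument, because sys.argv[1] does not exist. A minimal sketch of a safer check follows; the function name describe and its messages are made up for illustration.

```python
import sys

def describe(argv):
    """Return a message for the first command-line argument, if there is one."""
    if len(argv) < 2:                      # only the script name was given
        return 'usage: python time2.py <word>'
    return 'got one' if argv[1] == 'one' else 'got something else'

print(describe(sys.argv))
```

Passing the argument list into a function also makes the logic easy to try with hand-written lists such as ['time2.py', 'one'].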

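2) sys.path ---- returns the module search path; it is initialized from the script's directory, the PYTHONPATH environment variable and installation-dependent defaults

#!/usr/bin/env python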

# -*- coding:utf-8 -*-

import sys

print(sys.path)

# Output:

#['E:\\python_code\\2\\11day',

# 'E:\\python_code\\2',

# 'D:\\python_lib',

#  'D:\\Program Files (x86)\\Python36\\python36.zip',

# 'D:\\Program Files (x86)\\Python36\\DLLs',

# 'D:\\Program Files (x86)\\Python36\\lib',

# 'D:\\Program Files (x86)\\Python36',

# 'D:\\Program Files (x86)\\Python36\\lib\\site-packages']

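When a module or package is imported, Python searches the directories in sys.path in order and loads the first match it finds.

Third-party packages are normally installed in:

D:\Program Files (x86)\Python36\lib\site-packages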

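3) sys.exit(n) ---- exit the program; use exit(0) for a normal exit

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import sys

st = input('Enter q to quit, 1 to print one, 2 to print two: ')

if st == 'q':

    sys.exit('goodbye!')  # prints the message to stderr and exits with status 1; the builtin exit('goodbye!') also ends the program

elif st == '1':

    print('one')

elif st == '2':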

    print('two')

print('thanks!')

4) sys.version ---- version information of the Python interpreter

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import sys

print(sys.version)

# Output:

#3.6.0 (v3.6.0:41df79263a11, Dec 23 2016, 08:06:12) [MSC v.1900 64 bit (AMD64)]

5) sys.platform ---- returns the name of the operating system platform

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import sys

print(sys.platform)  # prints win32 on Windows

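6) sys.stdout.write("#")  # writes a string without a trailing newline, so the output stays on one line; this is the principle behind progress bars

pip.exe install django ---- downloads django from the network and installs it, showing a percentage and a progress bar

pip uninstall django ---- uninstalls django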

import sys

import time

for i in range(20):

    sys.stdout.write('#')

    time.sleep(0.3)

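Run the code above in a terminal and it keeps printing # like a progress bar. If the characters only appear once the loop has finished, call sys.stdout.flush() after each write.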

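7) sys.stdin.readline()  # reads one line from standard input

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import sys

val = sys.stdin.readline()[:-1]  # the [:-1] slice drops the trailing newline

print(val)  # echoes the input; input() reads a line in much the same way and strips the newline for you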

8) Exercises:

Read a directory name supplied by the user and create the corresponding directory

# -*- coding:utf-8 -*-

import sys,os

os.mkdir(sys.argv[1])

Write a simple script of your own that can be imported from any path

Build a progress bar that shows a percentage

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import sys,time

for i in range(11):

    sys.stdout.write('\r')  # carriage return: move the cursor back to the start of the line

    sys.stdout.write('%s%% %s' %(i *10,'#' * i))

    sys.stdout.flush()  # flush the output buffer to the screen after every write

    time.sleep(0.2)
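The same idea can be wrapped in a small helper so the bar has a fixed width and the drawing logic is separate from the loop. This is only a sketch; the name render_bar and the default width of 20 are arbitrary choices.

```python
import sys
import time

def render_bar(done, total, width=20):
    """Build one frame of the bar, e.g. ' 50% [##########          ]'."""
    filled = width * done // total
    percent = 100 * done // total
    return '\r%3d%% [%s%s]' % (percent, '#' * filled, ' ' * (width - filled))

for step in range(11):
    sys.stdout.write(render_bar(step, 10))   # '\r' redraws over the previous frame
    sys.stdout.flush()                       # show the frame immediately
    time.sleep(0.01)
sys.stdout.write('\n')
```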

III. datetime module

# Date and time arithmetic

import datetime

import time

print(datetime.datetime.now())  # e.g. 2016-08-19 12:47:03.941925

print(datetime.date.fromtimestamp(time.time()))  # timestamp -> date, e.g. 2016-08-19

print(datetime.datetime.now() + datetime.timedelta(3))  # now + 3 days

print(datetime.datetime.now() + datetime.timedelta(-3))  # now - 3 days

print(datetime.datetime.now() + datetime.timedelta(hours=3))  # now + 3 hours

print(datetime.datetime.now() + datetime.timedelta(minutes=30))  # now + 30 minutes

# timedelta parameters: days=0, seconds=0, microseconds=0,
# milliseconds=0, minutes=0, hours=0, weeks=0

c_time = datetime.datetime.now()

print(c_time.replace(minute=3,hour=2))  # replace individual fields of a datetime
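Because now() changes on every run, the arithmetic is easier to follow with a fixed starting point. The sketch below uses an arbitrary date taken from the sample output above.

```python
import datetime

base = datetime.datetime(2016, 8, 19, 12, 47, 3)

plus_three_days = base + datetime.timedelta(days=3)
minus_three_days = base - datetime.timedelta(days=3)
plus_ninety_minutes = base + datetime.timedelta(hours=1, minutes=30)

replaced = base.replace(hour=2, minute=3)        # returns a new object, base is unchanged
gap = plus_three_days - minus_three_days         # subtracting datetimes gives a timedelta

print(plus_three_days, replaced, gap.days)
```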

IV. pickle module

1. Converting a dict to a string and writing it to a file

#!/usr/bin/env python

# -*- coding:utf-8 -*-

accounts = {1000:

                {'name':'sunshuhai',

                 'email':'888@163.com',

                 'password':'',

                 'balance':15000,

                 'phone':'',

                 'bank_num':{'ICBC':'','CBC':''}

                 },

            1001:

                {'name': 'zhangsan',

                 'email': '777@163.com',

                 'password': '',

                 'balance': 12000,

                 'phone': '',

                 'bank_num': {'ICBC': '', 'CBC': ''}

                 },

}

f = open('ac.txt','w',encoding='utf-8')

text = str(accounts)

print(text)

f.write(text)

f.close()

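However, once the dict has been written out as plain text, it is awkward to read it back as a dict and modify it, so the pickle module is a better fit.

2. Using pickle to convert a dict to bytes and store it in a file

#!/usr/bin/env python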

# -*- coding:utf-8 -*-

accounts = {1000:

                {'name':'sunshuhai',

                 'email':'888@163.com',

                 'password':'',

                 'balance':15000,

                 'phone':'',

                 'bank_num':{'ICBC':'','CBC':''}

                 },

            1001:

                {'name': 'zhangsan',

                 'email': '777@163.com',

                 'password': '',

                 'balance': 12000,

                 'phone': '',

                 'bank_num': {'ICBC': '', 'CBC': ''}

                 },

}

import pickle

f = open('ac.txt','wb')

text = pickle.dumps(accounts)  # dumps always returns bytes, however the file was opened, so open the file in 'wb' mode to write the result directly

f.write(text)  # write to the file

f.close()

dumps and loads both work with bytes: dumps produces bytes and loads consumes them.

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import pickle

f = open('ac.txt','rb')

f_read = f.read()  # read the pickled data back as bytes

print(f_read)

dic_text = pickle.loads(f_read)  # convert the bytes produced by pickle.dumps back into a dict

print(type(dic_text))  # prints: <class 'dict'>

Example: add 500 to the balance of the account with ID 1000 stored in the file

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import pickle

f = open('ac.txt','rb')

text = f.read()  # read the pickled bytes

dic_text = pickle.loads(text)  # convert the bytes into a dict

dic_text[1000]['balance'] += 500  # modify a value in the dict

f.close()

f = open('ac.txt','wb')

text_write = pickle.dumps(dic_text)  # convert the dict back into bytes

f.write(text_write)  # overwrite the file with the new bytes

f.close()

f = open('ac.txt','rb')

text_read = f.read()  # read the bytes from the file

text_dic = pickle.loads(text_read)  # convert the bytes into a dict

print(type(text_dic),text_dic)  # print the type and the contents of the dict

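3. dump and load versus dumps and loads in pickle

dumps converts a dict to bytes, which you then write to a file opened in binary mode

dump writes a dict directly to a file opened in binary mode

loads converts a bytes object directly into a dict

load reads a file object opened in binary mode and returns the dict

#!/usr/bin/env python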

# -*- coding:utf-8 -*-

accounts = {1000:

                {'name':'sunshuhai',

                 'email':'888@163.com',

                 'password':'',

                 'balance':15000,

                 'phone':'',

                 'bank_num':{'ICBC':'','CBC':''}

                 },

            1001:

                {'name': 'zhangsan',

                 'email': '777@163.com',

                 'password': '',

                 'balance': 12000,

                 'phone': '',

                 'bank_num': {'ICBC': '', 'CBC': ''}

                 },

}

import pickle

f = open('ac.txt','wb')

pickle.dump(accounts,f)  # dump writes the dict straight to the file

f.close()

f = open('ac.txt','rb')

dic = pickle.load(f)   # load reads the file and returns the dict

print(type(dic),dic)
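The read, modify and write-back cycle shown in this section can be written more compactly with the with statement, which closes each file automatically. This is a sketch on a reduced account dict; the temporary directory and the file name ac.pkl are assumptions chosen so the example does not touch ac.txt.

```python
import os
import pickle
import tempfile

accounts = {1000: {'name': 'sunshuhai', 'balance': 15000}}
path = os.path.join(tempfile.mkdtemp(), 'ac.pkl')   # scratch file for the demo

with open(path, 'wb') as f:          # binary mode is required for pickle
    pickle.dump(accounts, f)

with open(path, 'rb') as f:
    restored = pickle.load(f)

restored[1000]['balance'] += 500     # modify the dict in memory

with open(path, 'wb') as f:          # overwrite the file with the new state
    pickle.dump(restored, f)

with open(path, 'rb') as f:
    final = pickle.load(f)

print(type(final), final)
```

Unlike json, pickle keeps the integer key 1000 as an integer. Only unpickle data you trust, since loading a pickle can execute arbitrary code.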

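V. json module

The json module is used in much the same way as the pickle module.

Differences:

1. pickle converts objects to bytes, whereas json converts them to strings.

2. pickle can serialize almost any Python object, whereas json supports only dicts, lists, strings, numbers, booleans and None. Types such as date objects are rejected, and tuples are written out as lists, so they do not survive a round trip.

3. pickle is specific to Python. For interfaces and data exchange between different languages or different systems, json is usually chosen because it is universal.

Using dumps and loads:

#!/usr/bin/env python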

# -*- coding:utf-8 -*-

accounts = {1000:

                {'name':'sunshuhai',

                 'email':'888@163.com',

                 'password':'',

                 'balance':15000,

                 'phone':'',

                 'bank_num':{'ICBC':'','CBC':''}

                 },

            1001:

                {'name': 'zhangsan',

                 'email': '777@163.com',

                 'password': '',

                 'balance': 12000,

                 'phone': '',

                 'bank_num': {'ICBC': '', 'CBC': ''}

                 },

}

import json

f = open('ac.txt','w',encoding='utf-8')

text = json.dumps(accounts)  # convert the dict to a string

f.write(text)  # write the resulting string to the file

f.close()

f = open('ac.txt','r',encoding='utf-8')

text1 = f.read()  # read the string from the file

dic = json.loads(text1)  # convert the string to a dict

print(type(dic),dic)

f.close()

Using dump and load:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

accounts = {1000:

                {'name':'sunshuhai',

                 'email':'888@163.com',

                 'password':'',

                 'balance':15000,

                 'phone':'',

                 'bank_num':{'ICBC':'','CBC':''}

                 },

            1001:

                {'name': 'zhangsan',

                 'email': '777@163.com',

                 'password': '',

                 'balance': 12000,

                 'phone': '',

                 'bank_num': {'ICBC': '', 'CBC': ''}

                 },

}

import json

# dump and load

f = open('ac.txt','w',encoding='utf-8')

json.dump(accounts,f)  # serialize accounts to text and write it to f

f.close()

f = open('ac.txt','r',encoding='utf-8')

dic = json.load(f)  # read the text from f and convert it to a dict

print(type(dic),dic)

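Note: the quoting rule applies to the JSON text, not to Python string literals. json.dumps does not care whether the Python source uses single or double quotes, and it always emits double quotes. json.loads, however, requires the strings inside the JSON text to be double-quoted; single-quoted strings raise an error. For example:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import json

st = '{"name":"sunshuhai","age":100}'

# st = "{'name':'sunshuhai','age':100}" would raise an error

st_dic = json.loads(st)

print(type(st_dic),st_dic)

json has no tuple type

tup = (11,22,33)

import json

r2 = json.dumps(tup)

print(r2)   # prints [11, 22, 33]: the tuple is written as a JSON array

s = '("11","22","33")'

import json

r1 = json.loads(s)

print(r1)  # never reached: loads raises an error because parentheses are not valid JSON; the only containers are arrays [] and objects {}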
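Two consequences of these rules are easy to miss: tuples come back as lists, and non-string dict keys such as the account ID 1000 come back as strings. A minimal sketch with made-up sample data:

```python
import json

as_text = json.dumps({1000: ('ICBC', 'CBC')})   # int key and tuple value
back = json.loads(as_text)

# Single-quoted JSON text is rejected by loads
try:
    json.loads("{'name': 'sunshuhai'}")
    single_quotes_ok = True
except json.JSONDecodeError:
    single_quotes_ok = False

print(as_text)
print(back, single_quotes_ok)
```

So after a json round trip the account would have to be looked up as back['1000'], not back[1000].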

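VI. os module

 os.getcwd()                 get the current working directory, i.e. the directory the Python script is running in

 os.chdir("dirname")         change the current working directory; equivalent to cd in a shell

 os.curdir                   the string for the current directory: ('.')

 os.pardir                   the string for the parent directory: ('..')

 os.makedirs('dir1/dir2')    create nested directories recursively

 os.removedirs('dirname1')   remove the directory if it is empty, then move up to the parent and remove it too if it is empty, and so on

 os.mkdir('dirname')         create a single directory; equivalent to mkdir dirname in a shell

 os.rmdir('dirname')         remove a single empty directory; raises an error if it is not empty; equivalent to rmdir dirname in a shell

 os.listdir('dirname')       return a list of all files and subdirectories in the given directory, including hidden files

 os.remove()                 delete a file

 os.rename("oldname","new")  rename a file or directory

 os.stat('path/filename')    get information about a file or directory

 os.sep                      the path separator of the operating system: "\\" on Windows, "/" on Linux

 os.linesep                  the line terminator of the current platform: "\r\n" on Windows, "\n" on Linux

 os.pathsep                  the separator between entries of a search path such as PATH: ";" on Windows, ":" on Linux

 os.name                     a string naming the current platform: Windows -> 'nt'; Linux -> 'posix'

 os.system("bash command")   run a shell command and show its output directly

 os.popen("dir")             run a shell command and return its output through a file-like object

 os.environ                  the environment variables of the system

 os.path.abspath(path)       return the normalized absolute version of path

 os.path.split(path)         split path into a (directory, filename) tuple

 os.path.dirname(path)       return the directory part of path, i.e. the first element of os.path.split(path)

 os.path.basename(path)      return the final file name in path, i.e. the second element of os.path.split(path); if path ends with / or \, an empty string is returned

 os.path.exists(path)        return True if path exists, otherwise False

 os.path.isabs(path)         return True if path is an absolute path

 os.path.isfile(path)        return True if path is an existing file, otherwise False

 os.path.isdir(path)         return True if path is an existing directory, otherwise False

 os.path.join(path1[, path2[, ...]])  join several path components; if a component is an absolute path, all components before it are discarded

 os.path.getatime(path)      return the last access time of the file or directory that path points to

 os.path.getmtime(path)      return the last modification time of the file or directory that path points to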

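stat structure:

st_mode: inode protection mode (file type and permission bits)

st_ino: inode number.

st_dev: device the inode resides on.

st_nlink: number of links to the inode.

st_uid: user ID of the owner.

st_gid: group ID of the owner.

st_size: size of a regular file in bytes; for some special files, the amount of data waiting.

st_atime: time of the last access.

st_mtime: time of the last modification.

st_ctime: the "ctime" reported by the operating system. On some systems (such as Unix) it is the time of the latest metadata change; on others (such as Windows) it is the creation time (see the platform documentation for details)

Note: the list above describes the fields of the result returned by os.stat('path/filename').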
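Several of the functions listed above can be tried safely inside a scratch directory. The sketch below creates nested directories, writes a file, inspects it and cleans up again; the directory comes from tempfile.mkdtemp(), and the names dir1, dir2 and demo.txt are arbitrary.

```python
import os
import tempfile

base = tempfile.mkdtemp()                        # scratch directory for the demo
nested = os.path.join(base, 'dir1', 'dir2')      # joined with the platform separator
os.makedirs(nested)                              # creates dir1 and dir2 in one call

file_path = os.path.join(nested, 'demo.txt')
with open(file_path, 'w', encoding='utf-8') as f:
    f.write('hello')

directory, filename = os.path.split(file_path)   # (directory, file name)
size = os.stat(file_path).st_size                # size in bytes
listing = os.listdir(nested)

os.remove(file_path)                             # delete the file first
os.removedirs(nested)                            # then remove the now-empty directories upward

print(filename, size, listing)
```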

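# Two ways to run a shell command: the difference between os.system and os.popen

# os.system runs the command and shows its output directly; it returns only the exit status, not the output

# os.popen runs the command and returns a file-like object; call its read() method to get the output

import os

os.system('dir')

obj = os.popen('dir')

ret = obj.read()

print(ret)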


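VII. hashlib hashing module

Used for hashing operations. It replaces the old md5 and sha modules and provides the SHA1, SHA224, SHA256, SHA384, SHA512 and MD5 algorithms. Strictly speaking these are one-way hash (digest) functions rather than encryption.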

import hashlib

# ######## md5 ########

hash = hashlib.md5()

# help(hash.update)

hash.update(bytes('admin', encoding='utf-8'))

print(hash.hexdigest())

print(hash.digest())

######## sha1 ########

hash = hashlib.sha1()

hash.update(bytes('admin', encoding='utf-8'))

print(hash.hexdigest())

# ######## sha256 ########

hash = hashlib.sha256()

hash.update(bytes('admin', encoding='utf-8'))

print(hash.hexdigest())

# ######## sha384 ########

hash = hashlib.sha384()

hash.update(bytes('admin', encoding='utf-8'))

print(hash.hexdigest())

# ######## sha512 ########

hash = hashlib.sha512()

hash.update(bytes('admin', encoding='utf-8'))

print(hash.hexdigest())

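These algorithms are still widely used, but a plain hash has a weakness: it can be reversed by looking it up in tables of precomputed hashes of common passwords. It is therefore worth mixing a custom key (a salt) into the hash.

import hashlib

# ######## md5 ########

hash = hashlib.md5(bytes('898oaFs09f',encoding="utf-8"))  # add a salt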

hash.update(bytes('admin',encoding="utf-8"))

print(hash.hexdigest())

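# Python also ships with the hmac module (Hash-based Message Authentication Code), which processes the key and the message further before hashing them

import hmac

h = hmac.new(bytes('898oaFs09f',encoding="utf-8"), digestmod='md5')  # digestmod is required since Python 3.8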

h.update(bytes('admin',encoding="utf-8"))

print(h.hexdigest())

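A fixed salt is still not secure enough. Someone who obtains the database can register, say, 100 accounts on the site with known passwords, collect the 100 stored hashes, and then search the database for other users whose hashes match those passwords. A better approach is a dynamic salt, for example using the user name as the salt.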
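A minimal sketch of the dynamic-salt idea follows, assuming the user name serves as the per-user salt as suggested above. The function names hash_password and verify are made up for illustration, and SHA-256 through hmac is one possible choice rather than the method used elsewhere in this post.

```python
import hashlib
import hmac

def hash_password(user, pwd):
    """Hash pwd with the user name as a per-user key (assumption: user names are unique)."""
    return hmac.new(user.encode('utf-8'), pwd.encode('utf-8'), hashlib.sha256).hexdigest()

def verify(user, pwd, stored):
    """Compare in constant time to avoid leaking information through timing."""
    return hmac.compare_digest(hash_password(user, pwd), stored)

stored = hash_password('alice', 'admin')
print(stored)
print(verify('alice', 'admin', stored), verify('bob', 'admin', stored))
```

The same password now produces a different hash for every user. For real password storage, a deliberately slow function such as hashlib.pbkdf2_hmac with a random salt is generally preferred over a single fast hash.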

Example: account registration and login with hashed passwords

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import hashlib

def register(user,pwd):

    with open('user_information','a',encoding='utf-8') as f:

        f.write(user + '|' + md5(pwd) + '\n')

def login(user,pwd):

    with open('user_information','r',encoding='utf-8') as f:

        for line in f:

            u,p = line.strip().split('|')

            if user == u and md5(pwd) == p:

                return True

        return False

def md5(arg):

    hash = hashlib.md5(bytes('aaatttbbbccceee',encoding='utf-8'))

    hash.update(bytes(arg,encoding='utf-8'))

    return hash.hexdigest()

option = input('Press 1 to register, 2 to log in: ')

if option == '1':

    user = input('Enter user name: ')

    pwd = input('Enter password: ')

    register(user,pwd)

elif option == '2':

    user = input('Enter user name: ')

    pwd = input('Enter password: ')

    if login(user,pwd):

        print('Login succeeded!')

    else:

        print('Login failed!')


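VIII. Installing third-party modules

There are two ways to install a third-party module:

1) Installing with a package tool, for example to install requests

In Python 3, use pip3. pip3 must itself be installed first, and because it depends on setuptools, setuptools has to be installed before pip3.

Python 3 normally ships with pip3 already installed, at: D:\Program Files (x86)\Python36\Scripts

Set the environment variable: add the path D:\Program Files (x86)\Python36\Scripts to the system variable PATH (the lower list in the dialog)
In CMD, run: pip3 install requests, and requests is downloaded and installed automatically

2) Installing from source

Download the source archive, extract it, and locate the setup.py file
In CMD, change to the directory that contains setup.py
Type: python setup.py install and press Enter to start the installation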

IX. requests module

1. Fetching HTML with requests

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import requests

response = requests.get('http://www.baidu.com')

response.encoding = 'utf-8'

html = response.text

print(html)

2. Checking whether a QQ account is online with the requests and xml modules

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import requests

response = requests.get('http://www.webxml.com.cn//webservices/qqOnlineWebService.asmx/qqCheckOnline?qqCode=258888289')

response.encoding = 'utf-8'

result = response.text

print(result)

# <?xml version="1.0" encoding="utf-8"?>

# <string xmlns="http://WebXml.com.cn/">Y</string>

# We need to extract the Y from the XML above

from xml.etree import ElementTree as ET

xml_obj = ET.XML(result)

if xml_obj.text == 'Y':

    print('The QQ account you looked up is online!')

else:

    print('The QQ account you looked up is offline!')

3. Looking up train schedule information with the requests module

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import requests

response = requests.get('http://www.webxml.com.cn/WebServices/TrainTimeWebService.asmx/getDetailInfoByTrainCode?TrainCode=G666&UserID=')

response.encoding = 'utf-8'

result = response.text

from xml.etree import ElementTree as ET

root = ET.XML(result)

for node in root.iter('TrainDetailInfo'):

    #print(node.tag,node.attrib)  # tag is the tag name, attrib is the dict of attributes

    print(node.find('TrainStation').text,node.find('ArriveTime').text,node.find('StartTime').text,node.find('KM').text)

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import requests

import json

import xlwings as xw

def PostDataA():

    wb = xw.Book.caller()

    mastUrl = wb.sheets["配置表"].range('B1').value  # base URL

    #wb.sheets["Sheet99"].range('D8').value = ""

    coo = wb.sheets["Sheet99"].range('D1').value

    #coo = '{"_c_cookie": "1", "_c_aut": "nm%3FwuskI%60p%2CX%5Ezo3FrAF%60%2CQTNEi_PTp_%2CnN%40By_oSP%3D%2CtnR%5EbpWg%7C%7E","PHPSESSID": "le50ak8oncg4cvv0kr5dum2d17", "g_language": "zh-cn", "_c_chk": "7814UHBHV18GT0FEUw%3D%3D"}'

    cookie_dic = json.loads(coo)  # original cookie dict

    heads = {

                    "Accept":"application/json, text/javascript, */*; q=0.01",

                    "Accept-Encoding":"gzip, deflate, sdch",

                    "Accept-Language":"zh-CN,zh;q=0.9",

                    "Connection":"keep-alive",

                    "Host": "ww2671.ws5170.com",

                    "Referer": mastUrl + "op.php?op=member_5h&fp=bet_beforedo&palygroup=r1&gametype=24",

                    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36",

                    "X-Requested-With":"XMLHttpRequest",

    }

    data_str = wb.sheets["Sheet99"].range('D8').value

    data_dic =  json.loads(data_str)

    r = requests.post(mastUrl + 'op.php?op=member_5h&fp=bet_do&palygroup=r1&gametype=24&disk=4',cookies=cookie_dic,headers=heads,data=data_dic)

    # r.encoding='gb2312'

    html=r.text

    if html[0:13]=="<script>paren":

        wb.sheets["Sheet99"].range('D7').value="Y"

X. XML module

Parsing a single line of XML

response = '<string xmlns="http://WebXml.com.cn/">Y</string>'

from xml.etree import ElementTree as ET

node = ET.XML(response)

print(node.text)  # prints Y
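One detail worth knowing about this response: the xmlns attribute puts the element in a namespace, so ElementTree stores the tag with the namespace URI in braces. The text is unaffected, which is why reading node.text works as shown. A short sketch on the same string:

```python
from xml.etree import ElementTree as ET

response = '<string xmlns="http://WebXml.com.cn/">Y</string>'
node = ET.XML(response)

tag = node.tag                 # namespace-qualified tag name
online = node.text == 'Y'

print(tag, online)
```

This matters when searching by tag name: a lookup for 'string' alone would not match this element, while '{http://WebXml.com.cn/}string' would.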

Take the following XML file as an example; the file is named first.xml:

<data>

    <country name="Liechtenstein">

        <rank updated="yes">2</rank>

        <year>2023</year>

        <gdppc>141100</gdppc>

        <neighbor direction="E" name="Austria" />

        <neighbor direction="W" name="Switzerland" />

    </country>

    <country name="Singapore">

        <rank updated="yes">5</rank>

        <year>2026</year>

        <gdppc>59900</gdppc>

        <neighbor direction="N" name="Malaysia" />

    </country>

    <country name="Panama">

        <rank updated="yes">69</rank>

        <year>2026</year>

        <gdppc>13600</gdppc>

        <neighbor direction="W" name="Costa Rica" />

        <neighbor direction="E" name="Colombia" />

    </country>

</data>

Read the XML file into a string and convert the string into an XML object:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

with open('first.xml','r',encoding='utf-8') as f:

    xml_text = f.read()

from xml.etree import ElementTree as ET

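root = ET.XML(xml_text)  # this is the <data> level

for node in root:   # loops over the level below data, i.e. the country elements

    print(node.find('rank').tag)  # print the tag name

    print(node.find('rank').attrib)  # print the attributes of the rank tag

    print(node.find('rank').text)  # print the text of the rank tag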

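from xml.etree import ElementTree as ET

root = ET.XML(xml_text)  # xml_text is the string read from first.xml above

for r in root.iter('country'):  # every country element among the descendants of the root node

    print(r.tag)

# country

# country

# country

# Iterate over the text of every tag

from xml.etree import ElementTree as ET

root = ET.XML(xml_text)

for r2 in root:  # root is the data level, so r2 is each country element under data

    for r3 in r2:  # r2 is a country element, so r3 is each child tag of that country

        print(r3.text)  # print the text of each child tag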

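Parse the XML file directly into an XML object; with this approach you can modify the object and write the changes back to the file:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

from xml.etree import ElementTree as ET

tree = ET.parse('first.xml')  # read the file into memory and parse it into an ElementTree

root =  tree.getroot()   # get the root element

print(root.tag)  # prints data

for node in root.iter('year'):  # loop over every year tag under root

    new_year = int(node.text) + 1  # convert the text of the year tag to int and add 1

    node.text = str(new_year)  # convert new_year to a string and assign it back as the text of the year tag

    node.set('name','sea')  # add an attribute to the year tag: name='sea'

    node.set('age','28')  # add an attribute to the year tag: age='28'

    del node.attrib['name']  # delete the name attribute from the year tag

tree.write('first.xml')  # write the modified tree back to first.xml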

Deleting nodes:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

# delete nodes

from xml.etree import ElementTree as ET

tree = ET.parse('first.xml')

print(tree)

root = tree.getroot()

for country in root.findall('country'):

    rank = int(country.find('rank').text)

    if rank > 50:

        root.remove(country)  # remove this country node from root

tree.write('first.xml')
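The same removal can be tried without touching first.xml by parsing a string in memory. The sketch below uses a shortened version of the sample data, keeping only the rank of each country.

```python
from xml.etree import ElementTree as ET

xml_text = '''<data>
    <country name="Liechtenstein"><rank>2</rank></country>
    <country name="Singapore"><rank>5</rank></country>
    <country name="Panama"><rank>69</rank></country>
</data>'''

root = ET.fromstring(xml_text)

# findall returns a list, so removing children inside the loop is safe
for country in root.findall('country'):
    if int(country.find('rank').text) > 50:
        root.remove(country)

remaining = [c.get('name') for c in root]
print(remaining)
```

Looping over root.findall(...) rather than over root itself is the important part: removing children while iterating directly over the element can skip nodes.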

Listing the main features of the root element:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

# list the available features

from xml.etree import ElementTree as ET

tree = ET.parse('first.xml')

root = tree.getroot()

print(dir(root))

# root offers the following attributes and methods:

['__class__', '__copy__', '__deepcopy__', '__delattr__', '__delitem__',

 '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__',

 '__getitem__', '__getstate__', '__gt__', '__hash__', '__init__',

 '__init_subclass__', '__le__', '__len__', '__lt__', '__ne__', '__new__',

 '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__',

 '__setstate__', '__sizeof__', '__str__', '__subclasshook__',

 'append', 'attrib', 'clear', 'extend', 'find', 'findall', 'findtext',

 'get', 'getchildren', 'getiterator', 'insert', 'items', 'iter', 'iterfind',

 'itertext', 'keys', 'makeelement', 'remove', 'set', 'tag', 'tail', 'text']

Commonly used ones:

tag attrib find set get iter

Example: creating an XML document:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

from xml.etree import ElementTree as ET

# create the root node

new_xml = ET.Element('namelist')

# create child nodes

name1 = ET.SubElement(new_xml,'name',attrib={"enrolled":"yes"})

age1 = ET.SubElement(name1,'age',attrib={"checked":"no"})

sex1 = ET.SubElement(name1,'sex')

sex1.text = ''

# create another child of the root node

name2 = ET.SubElement(new_xml,'name',attrib={"enrolled":"no"})

age2 = ET.SubElement(name2,'age')

age2.text = ''

# write to disk

et = ET.ElementTree(new_xml)  # create an ElementTree object from the root node; this object has a write method

et.write('test.xml',encoding='utf-8',xml_declaration=True)
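To see what a tree built this way looks like without writing a file, ET.tostring can serialize it to a string. A small sketch on a reduced tree; the value '28' is a made-up placeholder.

```python
from xml.etree import ElementTree as ET

root = ET.Element('namelist')
name = ET.SubElement(root, 'name', attrib={'enrolled': 'yes'})
age = ET.SubElement(name, 'age')
age.text = '28'                                   # placeholder value for the demo

xml_text = ET.tostring(root, encoding='unicode')  # 'unicode' returns str instead of bytes
print(xml_text)
```

The output is a single line with no indentation, which is also how et.write() lays out the file.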

Working with XML

XML is a format of nodes nested within nodes. Every node offers the following features for operating on it:

 class Element:

     """An XML element.

     This class is the reference implementation of the Element interface.

     An element's length is its number of subelements.  That means if you

     want to check if an element is truly empty, you should check BOTH

     its length AND its text attribute.

     The element tag, attribute names, and attribute values can be either

     bytes or strings.

     *tag* is the element name.  *attrib* is an optional dictionary containing

     element attributes. *extra* are additional element attributes given as

     keyword arguments.

     Example form:

         <tag attrib>text<child/>...</tag>tail

     """

     # tag name of the current node

     tag = None

     """The element's name."""

     # attributes of the current node

     attrib = None

     """Dictionary of the element's attributes."""

     # text content of the current node

     text = None

     """

     Text before first subelement. This is either a string or the value None.

     Note that if there is no text, this attribute may be either

     None or the empty string, depending on the parser.

     """

     tail = None

     """

     Text after this element's end tag, but before the next sibling element's

     start tag.  This is either a string or the value None.  Note that if there

     was no text, this attribute may be either None or an empty string,

     depending on the parser.

     """

     def __init__(self, tag, attrib={}, **extra):

         if not isinstance(attrib, dict):

             raise TypeError("attrib must be dict, not %s" % (

                 attrib.__class__.__name__,))

         attrib = attrib.copy()

         attrib.update(extra)

         self.tag = tag

         self.attrib = attrib

         self._children = []

     def __repr__(self):

         return "<%s %r at %#x>" % (self.__class__.__name__, self.tag, id(self))

     def makeelement(self, tag, attrib):

         # create a new node

         """Create a new element with the same type.

         *tag* is a string containing the element name.

         *attrib* is a dictionary containing the element attributes.

         Do not call this method, use the SubElement factory function instead.

         """

         return self.__class__(tag, attrib)

     def copy(self):

         """Return copy of current element.

         This creates a shallow copy. Subelements will be shared with the

         original tree.

         """

         elem = self.makeelement(self.tag, self.attrib)

         elem.text = self.text

         elem.tail = self.tail

         elem[:] = self

         return elem

     def __len__(self):

         return len(self._children)

     def __bool__(self):

         warnings.warn(

             "The behavior of this method will change in future versions.  "

             "Use specific 'len(elem)' or 'elem is not None' test instead.",

             FutureWarning, stacklevel=2

             )

         return len(self._children) != 0 # emulate old behaviour, for now

     def __getitem__(self, index):

         return self._children[index]

     def __setitem__(self, index, element):

         # if isinstance(index, slice):

         #     for elt in element:

         #         assert iselement(elt)

         # else:

         #     assert iselement(element)

         self._children[index] = element

     def __delitem__(self, index):

         del self._children[index]

     def append(self, subelement):

         # append one child node to the current node

         """Add *subelement* to the end of this element.

         The new element will appear in document order after the last existing

         subelement (or directly after the text, if it's the first subelement),

         but before the end tag for this element.

         """

         self._assert_is_element(subelement)

         self._children.append(subelement)

     def extend(self, elements):

         # extend the current node with n child nodes

         """Append subelements from a sequence.

         *elements* is a sequence with zero or more elements.

         """

         for element in elements:

             self._assert_is_element(element)

         self._children.extend(elements)

     def insert(self, index, subelement):

         # insert a node among the children of the current node at the given position

         """Insert *subelement* at position *index*."""

         self._assert_is_element(subelement)

         self._children.insert(index, subelement)

     def _assert_is_element(self, e):

         # Need to refer to the actual Python implementation, not the

         # shadowing C implementation.

         if not isinstance(e, _Element_Py):

             raise TypeError('expected an Element, not %s' % type(e).__name__)

     def remove(self, subelement):

         # remove a node from the children of the current node

         """Remove matching subelement.

         Unlike the find methods, this method compares elements based on

         identity, NOT ON tag value or contents.  To remove subelements by

         other means, the easiest way is to use a list comprehension to

         select what elements to keep, and then use slice assignment to update

         the parent element.

         ValueError is raised if a matching element could not be found.

         """

         # assert iselement(element)

         self._children.remove(subelement)

     def getchildren(self):

         # get all child nodes (deprecated)

         """(Deprecated) Return all subelements.

         Elements are returned in document order.

         """

         warnings.warn(

             "This method will be removed in future versions.  "

             "Use 'list(elem)' or iteration over elem instead.",

             DeprecationWarning, stacklevel=2

             )

         return self._children

     def find(self, path, namespaces=None):

         # get the first matching child node

         """Find first matching element by tag name or path.

         *path* is a string having either an element tag or an XPath,

         *namespaces* is an optional mapping from namespace prefix to full name.

         Return the first matching element, or None if no element was found.

         """

         return ElementPath.find(self, path, namespaces)

     def findtext(self, path, default=None, namespaces=None):

         # get the text of the first matching child node

         """Find text for first matching element by tag name or path.

         *path* is a string having either an element tag or an XPath,

         *default* is the value to return if the element was not found,

         *namespaces* is an optional mapping from namespace prefix to full name.

         Return text content of first matching element, or default value if

         none was found.  Note that if an element is found having no text

         content, the empty string is returned.

         """

         return ElementPath.findtext(self, path, default, namespaces)

     def findall(self, path, namespaces=None):

         # get all matching child nodes

         """Find all matching subelements by tag name or path.

         *path* is a string having either an element tag or an XPath,

         *namespaces* is an optional mapping from namespace prefix to full name.

         Returns list containing all matching elements in document order.

         """

         return ElementPath.findall(self, path, namespaces)

     def iterfind(self, path, namespaces=None):

         # get all matching nodes as an iterator (usable in a for loop)

         """Find all matching subelements by tag name or path.

         *path* is a string having either an element tag or an XPath,

         *namespaces* is an optional mapping from namespace prefix to full name.

         Return an iterable yielding all matching elements in document order.

         """

         return ElementPath.iterfind(self, path, namespaces)

     def clear(self):

         # clear the node

         """Reset element.

         This function removes all subelements, clears all attributes, and sets

         the text and tail attributes to None.

         """

         self.attrib.clear()

         self._children = []

         self.text = self.tail = None

     def get(self, key, default=None):

         # get an attribute value of the current node

         """Get element attribute.

         Equivalent to attrib.get, but some implementations may handle this a

         bit more efficiently.  *key* is what attribute to look for, and

         *default* is what to return if the attribute was not found.

         Returns a string containing the attribute value, or the default if

         attribute was not found.

         """

         return self.attrib.get(key, default)

     def set(self, key, value):

         # set an attribute value on the current node

         """Set element attribute.

         Equivalent to attrib[key] = value, but some implementations may handle

         this a bit more efficiently.  *key* is what attribute to set, and

         *value* is the attribute value to set it to.

         """

         self.attrib[key] = value

     def keys(self):

         # get the keys of all attributes of the current node

         """Get list of attribute names.

         Names are returned in an arbitrary order, just like an ordinary

         Python dict.  Equivalent to attrib.keys()

         """

         return self.attrib.keys()

     def items(self):

         # get all attributes of the current node as (key, value) pairs

         """Get element attributes as a sequence.

         The attributes are returned in arbitrary order.  Equivalent to

         attrib.items().

         Return a list of (name, value) tuples.

         """

         return self.attrib.items()

     def iter(self, tag=None):

         # find all nodes with the given tag name in the current node and its descendants; returns an iterator (usable in a for loop)

         """Create tree iterator.

         The iterator loops over the element and all subelements in document

         order, returning all elements with a matching tag.

         If the tree structure is modified during iteration, new or removed

         elements may or may not be included.  To get a stable set, use the

         list() function on the iterator, and loop over the resulting list.

         *tag* is what tags to look for (default is to return all elements)

         Return an iterator containing all the matching elements.

         """

         if tag == "*":

             tag = None

         if tag is None or self.tag == tag:

             yield self

         for e in self._children:

             yield from e.iter(tag)

     # compatibility

     def getiterator(self, tag=None):

         # Change for a DeprecationWarning in 1.4

         warnings.warn(

             "This method will be removed in future versions.  "

             "Use 'elem.iter()' or 'list(elem.iter())' instead.",

             PendingDeprecationWarning, stacklevel=2

         )

         return list(self.iter(tag))

     def itertext(self):

         # Iterate over the text of the current node and all its descendants; returns an iterator (usable in a for loop).

         """Create text iterator.

         The iterator loops over the element and all subelements in document

         order, returning all inner text.

         """

         tag = self.tag

         if not isinstance(tag, str) and tag is not None:

             return

         if self.text:

             yield self.text

         for e in self:

             yield from e.itertext()

             if e.tail:

                 yield e.tail

Overview of node methods
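The methods listed above can be tried on a small in-memory document. The following is a minimal sketch; the XML string and the attribute names are made up for illustration and are not part of first.xml.

```python
from xml.etree import ElementTree as ET

# A small document built from a string, so the sketch is self-contained.
root = ET.fromstring(
    '<data><country name="Panama"><rank>68</rank><year>2011</year></country></data>'
)
country = root.find('country')

name = country.get('name')              # read an attribute
missing = country.get('area', 'n/a')    # default when the attribute is absent
country.set('continent', 'America')     # add or overwrite an attribute
attr_names = sorted(country.keys())
attr_pairs = sorted(country.items())

tags = [node.tag for node in root.iter()]          # the element itself plus all descendants
ranks = [node.text for node in root.iter('rank')]  # only <rank> elements
text = ''.join(root.itertext())                    # all inner text, in document order

country.clear()   # drops children, attributes, text and tail
print(name, missing, attr_names, tags, ranks, text, len(country))
```

Note that iter() yields the element it is called on as well as its descendants, and itertext() ignores tag names entirely.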

All XML features

Example: create a node and append it to the root of an existing file:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

from xml.etree import ElementTree as ET

tree = ET.parse('first.xml')

root = tree.getroot()

new_root = root.makeelement('name',{'age':''}) # create a node in memory; it is not yet attached to root

root.append(new_root)   # append the new node as a child of root

tree.write('first.xml') # write the in-memory tree back to the file

#!/usr/bin/env python

# -*- coding:utf-8 -*-

from xml.etree import ElementTree as ET

tree = ET.parse('first.xml')

root = tree.getroot()

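new_root = root.makeelement('name',{'age':''}) # create a node in memory; it is not yet attached to root

new_root1 = ET.Element('name1',{'age1':''}) # create a node directly from the class; makeelement above does essentially this

new_root.append(new_root1)   # new_root1 becomes a child of new_root

root.append(new_root)

tree.write('first.xml',encoding='utf-8',xml_declaration=True,short_empty_elements=False) # write the in-memory tree to the file; encoding='utf-8' supports Chinese text

# xml_declaration=True adds the XML declaration at the top of the file

# short_empty_elements=False writes empty elements with explicit open and close tags, e.g. <country name="Panama"></country>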

Adding indentation to an XML file:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

from xml.etree import ElementTree as ET

root = ET.Element('data')

son_root1 = ET.SubElement(root,'country',{'name':'Liechtenstein'})

son_root2 = ET.SubElement(root,'country',{'name':'Singapore'})

son_root3 = ET.SubElement(root,'country',{'name':'Panama'})

son_root1_son1 = ET.SubElement(son_root1,'rank',{'updated':'yes'})

son_root1_son1.text = ''

son_root1_son1 = ET.SubElement(son_root1,'year')

son_root1_son1.text = ''

son_root1_son1 = ET.SubElement(son_root1,'gdppc')

son_root1_son1.text = ''

son_root1_son1 = ET.SubElement(son_root1,'neighbor',attrib = {'direction':'B','name':'Austria'})

son_root1_son1.text = ''

son_root1_son1 = ET.SubElement(son_root1,'neighbor',attrib = {'direction':'W','name':'Switzerland'})

# et = ET.ElementTree(root)

# et.write('test.xml',encoding='utf-8',xml_declaration=True,short_empty_elements=False)

# The following converts the XML string into an indented string

# minidom has fewer features and is slower, so ElementTree is the usual choice, but minidom can pretty-print XML

from xml.dom import minidom

root_string = ET.tostring(root,encoding='utf-8')  # serialize the ET node (returns bytes when an encoding is given)

reparsed = minidom.parseString(root_string)    # parse it into an xml.dom.minidom.Document object

pretty = reparsed.toprettyxml(indent='\t')  # produce indented XML text

with open('first.xml','w',encoding='utf-8') as f:

    f.write(pretty)
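On Python 3.9 and later, ElementTree can add the indentation itself through ET.indent, without the round trip through minidom. The following is a minimal sketch under that version assumption; the element names are illustrative.

```python
from xml.etree import ElementTree as ET

root = ET.Element('data')
country = ET.SubElement(root, 'country', {'name': 'Panama'})
ET.SubElement(country, 'rank').text = '68'

# ET.indent modifies the tree in place (Python 3.9+).
# The hasattr guard keeps older interpreters from failing.
if hasattr(ET, 'indent'):
    ET.indent(root, space='\t')

pretty = ET.tostring(root, encoding='unicode')
print(pretty)
```

Unlike toprettyxml, this does not add an XML declaration; pass xml_declaration=True to tree.write if one is needed.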

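XML namespaces:

XML namespaces provide a way to avoid element name conflicts.

Name conflicts

In XML, element names are defined by the developer, so a name conflict occurs when two different documents use the same element name.

This XML document carries information about an HTML table: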

<table>

   <tr>

   <td>Apples</td>

   <td>Bananas</td>

   </tr>

</table>

This XML document carries information about a table (a piece of furniture):

<table>

   <name>African Coffee Table</name>

   <width>80</width>

   <length>120</length>

</table>

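If these two XML documents are used together, a name conflict occurs, because both contain a <table> element with different content and a different definition.

An XML parser cannot determine how to handle such a conflict.

Using prefixes to avoid name conflicts

This document carries information about an HTML table: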

<h:table>

   <h:tr>

   <h:td>Apples</h:td>

   <h:td>Bananas</h:td>

   </h:tr>

</h:table>

This XML document carries information about a piece of furniture:

<f:table>

   <f:name>African Coffee Table</f:name>

   <f:width>80</f:width>

   <f:length>120</f:length>

</f:table>

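The name conflict is now gone, because the two documents use different names for their <table> elements (<h:table> and <f:table>).

By using prefixes, we have created two different kinds of <table> element.

Using namespaces

This XML document carries information about an HTML table: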

<h:table xmlns:h="http://www.w3.org/TR/html4/">

   <h:tr>

   <h:td>Apples</h:td>

   <h:td>Bananas</h:td>

   </h:tr>

</h:table>

This XML document carries information about a piece of furniture:

<f:table xmlns:f="http://www.w3school.com.cn/furniture">

   <f:name>African Coffee Table</f:name>

   <f:width>80</f:width>

   <f:length>120</f:length>

</f:table>

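Rather than using prefixes alone, we added an xmlns attribute to the <table> tag, which binds the prefix to a qualified name associated with a namespace.

The XML namespace (xmlns) attribute

The namespace attribute is placed in the start tag of an element and uses the following syntax:

xmlns:namespace-prefix="namespaceURI"

When a namespace is defined in the start tag of an element, all child elements with the same prefix are associated with the same namespace.

Note: the address used to identify the namespace is not used by the parser to look up information. Its only purpose is to give the namespace a unique name. Many companies do, however, use the namespace as a pointer to a real web page that contains information about the namespace.

See http://www.w3.org/TR/html4/.

Uniform Resource Identifier (URI)

A Uniform Resource Identifier is a string of characters that identifies an Internet resource. The most common kind of URI is the Uniform Resource Locator (URL), which identifies an Internet domain address. A less common kind is the Uniform Resource Name (URN). The examples here use only URLs.

Default namespaces

Defining a default namespace for an element saves us from using prefixes on all of its child elements.

Use the following syntax:

xmlns="namespaceURI"

This XML document carries information about an HTML table: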

<table xmlns="http://www.w3.org/TR/html4/">

   <tr>

   <td>Apples</td>

   <td>Bananas</td>

   </tr>

</table>

This XML document carries information about a piece of furniture:

<table xmlns="http://www.w3school.com.cn/furniture">

   <name>African Coffee Table</name>

   <width>80</width>

   <length>120</length>

</table>

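Namespaces in practice

Once you start using XSL, you will soon see namespaces in real use. XSL style sheets are used to transform XML documents into other formats, such as HTML.

If you look closely at the XSL document below, you will see that most of the tags are HTML tags. The non-HTML tags carry the prefix xsl and are identified by the namespace "http://www.w3.org/1999/XSL/Transform":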

<?xml version="1.0" encoding="ISO-8859-1"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">

<html>

<body>

  <h2>My CD Collection</h2>

  <table border="1">

    <tr>

      <th align="left">Title</th>

      <th align="left">Artist</th>

    </tr>

    <xsl:for-each select="catalog/cd">

    <tr>

      <td><xsl:value-of select="title"/></td>

      <td><xsl:value-of select="artist"/></td>

    </tr>

    </xsl:for-each>

  </table>

</body>

</html>

</xsl:template>

</xsl:stylesheet>

Introduction to XML

Namespace example:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

#*******************************************************************************************

# <f:table xmlns:f="http://www.w3school.com.cn/furniture">

#    <f:name>African Coffee Table</f:name>

#    <f:width hhh="123">80</f:width>

#    <f:length>120</f:length>

# </f:table>

#*******************************************************************************************

from xml.etree import ElementTree as ET

ET.register_namespace('f','http://www.w3school.com.cn/furniture')

# create the nodes

root = ET.Element('{http://www.w3school.com.cn/furniture}table')

root_son1 = ET.SubElement(root,'{http://www.w3school.com.cn/furniture}name')

root_son1.text = 'African Coffee Table'

root_son2 = ET.SubElement(root,'{http://www.w3school.com.cn/furniture}width',attrib={'hhh':'123'})  # matches hhh="123" in the target document above

root_son2.text = '80'

root_son3 = ET.SubElement(root,'{http://www.w3school.com.cn/furniture}length')

root_son3.text = '120'

root_str = ET.tostring(root)

from xml.dom import minidom

reparse = minidom.parseString(root_str)

pretty = reparse.toprettyxml(indent='\t')

with open('abc.xml','w',encoding='utf-8') as f:

    f.write(pretty)
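Reading a namespaced document back works the other way round: ElementTree expands every prefixed name to the form {namespace-uri}local-name, and lookups take a prefix map. The following is a minimal sketch that parses the furniture document from a string; the prefix furn in the map is an arbitrary choice for illustration.

```python
from xml.etree import ElementTree as ET

xml_text = (
    '<f:table xmlns:f="http://www.w3school.com.cn/furniture">'
    '<f:name>African Coffee Table</f:name>'
    '<f:width>80</f:width>'
    '</f:table>'
)
root = ET.fromstring(xml_text)

# ElementTree stores the expanded name "{namespace-uri}local-name", not the prefix.
print(root.tag)

# A prefix map for lookups; the prefix used here need not match the one in the document.
ns = {'furn': 'http://www.w3school.com.cn/furniture'}
name = root.find('furn:name', ns).text
width = int(root.find('furn:width', ns).text)
print(name, width)
```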

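Example of handling an XML API:

The file is named bb.xml

Example of reading an XML API

11. The configparser module

configparser handles files in a specific (INI-style) format; under the hood it simply uses open to read and write the file.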

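# comment 1

;  comment 2

# a section
[section1]

# values; both "=" and ":" are accepted as delimiters
k1 = v1

k2:v2

# a section
[section2]

k1 = v1

File format (comments are placed on their own lines because Python 3's ConfigParser treats a trailing "# ..." as part of the value unless inline_comment_prefixes is set)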

1. Get all sections

import configparser

config = configparser.ConfigParser()

config.read('xxxooo', encoding='utf-8')

ret = config.sections()

print(ret)

2. Get all key/value pairs in a given section

import configparser

config = configparser.ConfigParser()

config.read('xxxooo', encoding='utf-8')

ret = config.items('section1')

print(ret)

3. Get all keys in a given section

import configparser

config = configparser.ConfigParser()

config.read('xxxooo', encoding='utf-8')

ret = config.options('section1')

print(ret)

4. Get the value of a given key in a given section

import configparser

config = configparser.ConfigParser()

config.read('xxxooo', encoding='utf-8')

v = config.get('section1', 'k1')

# v = config.getint('section1', 'k1')

# v = config.getfloat('section1', 'k1')

# v = config.getboolean('section1', 'k1')

print(v)

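5. Check, add, and remove sections

import configparser

config = configparser.ConfigParser()

config.read('xxxooo', encoding='utf-8')

# check whether a section exists

has_sec = config.has_section('section1')

print(has_sec)

# add a section

config.add_section("SEC_1")

with open('xxxooo', 'w', encoding='utf-8') as f:  # the with block closes the file after writing
    config.write(f)

# remove a section

config.remove_section("SEC_1")

with open('xxxooo', 'w', encoding='utf-8') as f:
    config.write(f)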

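6. Check, remove, and set key/value pairs in a given section

import configparser

config = configparser.ConfigParser()

config.read('xxxooo', encoding='utf-8')

# check whether a key exists

has_opt = config.has_option('section1', 'k1')

print(has_opt)

# remove a key

config.remove_option('section1', 'k1')

with open('xxxooo', 'w', encoding='utf-8') as f:  # the with block closes the file after writing
    config.write(f)

# set a key

config.set('section1', 'k10', "")

with open('xxxooo', 'w', encoding='utf-8') as f:
    config.write(f)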
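The operations above can be tried end to end without touching the disk. The following is a minimal sketch that assumes an in-memory sample: read_string stands in for config.read, io.StringIO stands in for the open file, and the section names and values are made up for illustration.

```python
import configparser
import io

sample = """
[section1]
k1 = 10
k2: v2

[section2]
k1 = yes
"""

config = configparser.ConfigParser()
config.read_string(sample)          # same parser as config.read(), but from a string

sections = config.sections()
options = config.options('section1')
number = config.getint('section1', 'k1')
flag = config.getboolean('section2', 'k1')

config.add_section('SEC_1')
config.set('SEC_1', 'k10', '123')   # values must be strings
config.remove_option('section1', 'k2')

buffer = io.StringIO()              # stands in for an open file
config.write(buffer)
print(buffer.getvalue())
```

getint, getfloat, and getboolean convert the stored string on the way out; the file itself only ever holds text.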

12. shutil

A high-level module for working with files, directories, and archives.

shutil.copyfileobj(fsrc, fdst[, length])
Copy the contents of one file object into another.

import shutil

with open('old.xml','r') as fsrc, open('new.xml', 'w') as fdst:  # both files are closed when the block exits
    shutil.copyfileobj(fsrc, fdst)

shutil.copyfile(src, dst)
Copy the contents of a file (no metadata).

shutil.copyfile('f1.log', 'f2.log')

shutil.copymode(src, dst)
Copy only the permission bits; the contents, owner, and group are unchanged.

shutil.copymode('f1.log', 'f2.log')

shutil.copystat(src, dst)
Copy only the status information: mode bits, atime, mtime, and flags.

shutil.copystat('f1.log', 'f2.log')

shutil.copy(src, dst)
Copy a file together with its permission bits.

import shutil

shutil.copy('f1.log', 'f2.log')

shutil.copy2(src, dst)
Copy a file together with its status information.

import shutil

shutil.copy2('f1.log', 'f2.log')
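The difference between copy and copy2 shows up in the modification time of the copy. The following is a minimal sketch that works in a temporary directory; the file names and the timestamp are arbitrary values chosen for illustration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, 'f1.log')
    with open(src, 'w', encoding='utf-8') as f:
        f.write('hello')
    # Give the source an old modification time so the difference is visible.
    os.utime(src, (1_000_000_000, 1_000_000_000))

    plain = shutil.copy(src, os.path.join(workdir, 'f2.log'))   # content + permission bits
    full = shutil.copy2(src, os.path.join(workdir, 'f3.log'))   # content + metadata where possible

    with open(plain, encoding='utf-8') as f:
        copied_text = f.read()
    src_mtime = int(os.stat(src).st_mtime)
    plain_mtime = int(os.stat(plain).st_mtime)
    full_mtime = int(os.stat(full).st_mtime)

print(copied_text, src_mtime, plain_mtime, full_mtime)
```

The file written by copy gets the current time as its mtime, while the one written by copy2 keeps the source's mtime.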

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
Recursively copy a directory tree.

import shutil

shutil.copytree('folder1', 'folder2', ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

import shutil

shutil.copytree('f1', 'f2', symlinks=True, ignore=shutil.ignore_patterns('*.pyc', 'tmp*'))

# symlinks=True copies symbolic links as links; symlinks=False (the default) copies the files they point to

shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively delete a directory tree.

import shutil

shutil.rmtree('folder1')

shutil.move(src, dst)
Recursively move a file or directory, like the mv command; within one filesystem this is just a rename.

import shutil

shutil.move('folder1', 'folder3')

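shutil.make_archive(base_name, format,...)

Create an archive (e.g. zip or tar) and return its path.

base_name: the name of the archive file, without the format-specific extension; it may include a path. A bare name is saved in the current directory, otherwise the archive is saved at the given path,
e.g. www => saved in the current directory
e.g. /Users/wupeiqi/www => saved in /Users/wupeiqi/
format: the archive format: "zip", "tar", "bztar", or "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: the owner, defaults to the current user
group: the group, defaults to the current group
logger: used for logging, usually a logging.Logger object

# Archive the files under /Users/wupeiqi/Downloads/test and place the archive in the program's current directory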

import shutil

ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')

# Archive the files under /Users/wupeiqi/Downloads/test and place the archive in /Users/wupeiqi/

import shutil

ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
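The same call can be tried in a throwaway directory, together with its counterpart shutil.unpack_archive. The following is a minimal sketch; the directory and file names are made up for illustration.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    source_dir = os.path.join(workdir, 'test')
    os.makedirs(source_dir)
    with open(os.path.join(source_dir, 'a.log'), 'w', encoding='utf-8') as f:
        f.write('log line\n')

    # base_name has no extension; make_archive appends one that matches the format.
    archive_path = shutil.make_archive(os.path.join(workdir, 'backup'), 'gztar', root_dir=source_dir)
    archive_name = os.path.basename(archive_path)

    # unpack_archive picks the format from the file extension.
    restore_dir = os.path.join(workdir, 'restored')
    shutil.unpack_archive(archive_path, restore_dir)
    restored = sorted(os.listdir(restore_dir))

print(archive_name, restored)
```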

shutil handles archives by calling the zipfile and tarfile modules. In detail:

import zipfile

# compress

z = zipfile.ZipFile('laxi.zip', 'w')

z.write('a.log')

z.write('data.data')

z.close()

# extract

z = zipfile.ZipFile('laxi.zip', 'r')

z.extractall()

z.close()

zipfile: compressing and extracting

import tarfile

# compress

tar = tarfile.open('your.tar','w')

tar.add('/Users/wupeiqi/PycharmProjects/bbs2.log', arcname='bbs2.log')

tar.add('/Users/wupeiqi/PycharmProjects/cmdb.log', arcname='cmdb.log')

tar.close()

# extract

tar = tarfile.open('your.tar','r')

tar.extractall()  # a destination directory can be passed via path=

tar.close()

tarfile: compressing and extracting

13. The subprocess module

The subprocess module implements the functionality of the older modules and functions for running shell commands, and offers richer features.

call

Run a command and return its exit status.

ret = subprocess.call(["ls", "-l"], shell=False)
ret = subprocess.call("ls -l", shell=True)

check_call

Run a command; return 0 if the exit status is 0, otherwise raise CalledProcessError.

subprocess.check_call(["ls", "-l"])
subprocess.check_call("exit 1", shell=True)

check_output

Run a command; return its output if the exit status is 0, otherwise raise CalledProcessError.

subprocess.check_output(["echo", "Hello World!"])
subprocess.check_output("exit 1", shell=True)
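In Python 3.5 and later, subprocess.run covers all three helpers above with one call. The following is a minimal sketch; it runs the current interpreter (sys.executable) rather than ls or echo, an assumption made only so that it behaves the same on every platform.

```python
import subprocess
import sys

# sys.executable is used instead of "ls" so the sketch behaves the same on every platform.
ok = subprocess.run(
    [sys.executable, '-c', 'print("Hello World!")'],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True,
)
print(ok.returncode, ok.stdout.strip())

# check=True gives check_call/check_output behaviour: a non-zero status raises.
try:
    subprocess.run([sys.executable, '-c', 'import sys; sys.exit(1)'], check=True)
    failed_code = 0
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
print(failed_code)
```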

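subprocess.Popen(...)

Used to run more complex system commands.

Parameters:

args: the command, either a string or a sequence (e.g. a list or tuple)
bufsize: buffering. 0 means unbuffered, 1 means line buffered, any other positive value is the buffer size, and a negative value means the system default
stdin, stdout, stderr: the program's standard input, output, and error handles
preexec_fn: Unix only; a callable object that is called in the child process just before the command runs
close_fds: on Windows, if close_fds is True, the child process does not inherit the parent's input, output, and error pipes.
In Python versions before 3.7 this meant close_fds could not be True while also redirecting the child's standard input, output, or error (stdin, stdout, stderr).
shell: same as above
cwd: sets the child process's working directory
env: the environment variables for the child process. If env is None, the child inherits the parent's environment.
universal_newlines: newline characters differ between systems; True -> use \n uniformly (and open the pipes in text mode)
startupinfo and creationflags are Windows only
and are passed to the underlying CreateProcess() function to set properties of the child process, such as the appearance of the main window and the process priority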

import subprocess

ret1 = subprocess.Popen(["mkdir","t1"])

ret2 = subprocess.Popen("mkdir t2", shell=True)

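Commands typed in a terminal fall into two kinds:

Those that produce their output right away, e.g. ifconfig
Those that enter an environment and then depend on further input, e.g. python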

import subprocess

obj = subprocess.Popen("mkdir t3", shell=True, cwd='/home/dev',)

import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)

obj.stdin.write("print(1)\n")

obj.stdin.write("print(2)")

out_error_list = obj.communicate()

print(out_error_list)

import subprocess

obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)

out_error_list = obj.communicate('print("hello")')

print(out_error_list)
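A portable variant of the interactive case is sketched below. It assumes the child should be the same interpreter that runs the script, so it uses sys.executable instead of the bare name python, and it passes "-" so the interpreter reads its program from standard input.

```python
import subprocess
import sys

# "-" makes the interpreter read the program from standard input.
proc = subprocess.Popen(
    [sys.executable, '-'],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
    universal_newlines=True,
)
# communicate() sends the input, closes stdin, and waits for the process to exit.
out, err = proc.communicate('print(1)\nprint(2)\n')
print(repr(out), repr(err), proc.returncode)
```

communicate returns a (stdout, stderr) tuple, which is why the examples above print a two-element result.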

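14. The logging module

A module for convenient, thread-safe logging: several threads can log to the same file at once, and the module serializes the writes.

1. Logging to a single file

#!/usr/bin/env python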

# -*- coding:utf-8 -*-

import logging

logging.basicConfig(filename='log.log',

                    format='%(asctime)s - %(name)s - %(levelname)s -%(module)s: %(message)s',

                    level=10)  # with level=10 (DEBUG), only messages at level >= 10 are written to the log

logging.debug('debug')

logging.info('info')

logging.warning('warning')

logging.error('error')

logging.critical('critical')

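Note: a message is recorded only when its level is greater than or equal to the configured log level.

The formatter defines a handler's output format, and many fields are available.

Format fields that may be used in the format argument:
%(name)s the name of the logger
%(levelno)s the log level as a number
%(levelname)s the log level as text
%(filename)s the file name of the module that issued the logging call
%(module)s the name of the module that issued the logging call
%(funcName)s the name of the function that issued the logging call
%(lineno)d the source line number of the logging call
%(created)f the time the record was created, as a UNIX timestamp (a float)
%(relativeCreated)d the time the record was created, in milliseconds since the logging module was loaded
%(asctime)s the record's creation time as a string; the default format is "2003-07-08 16:49:45,896", where the number after the comma is milliseconds
%(message)s the message supplied by the user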
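The level rule and a few of the format fields can be checked without creating a log file. The following is a minimal sketch; the logger name demo and the io.StringIO stream that captures the output are assumptions made for illustration.

```python
import io
import logging

stream = io.StringIO()                       # captures output instead of a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(name)s - %(levelname)s - %(levelno)s: %(message)s'))

demo = logging.getLogger('demo')             # hypothetical logger name
demo.setLevel(logging.WARNING)               # WARNING == 30
demo.propagate = False                       # keep records away from the root logger
demo.addHandler(handler)

demo.info('dropped: 20 < 30')
demo.warning('kept: 30 >= 30')
demo.error('kept: 40 >= 30')

lines = stream.getvalue().splitlines()
print(lines)
```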

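2. Logging to multiple files

The approach above can only write to a single file. To write to several log files, logging.basicConfig is not enough; you need to define the file handlers and the logger object yourself.

#!/usr/bin/env python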

# -*- coding:utf-8 -*-

import logging

#*************define the files******************************************************************************

# instantiate a FileHandler, which writes formatted log records to disk

file_1_1 = logging.FileHandler('11_1.log','a',encoding='utf-8') # instantiate a FileHandler

# instantiate a Formatter, which turns a log record into text

fmt = logging.Formatter(fmt='%(asctime)s - %(name)s - %(levelname)s - %(module)s: %(message)s')

file_1_1.setFormatter(fmt)  # use fmt as the format for file_1_1

file_1_2 = logging.FileHandler('11_2.log','a',encoding='utf-8')

fmt = logging.Formatter(fmt='%(asctime)s - %(name)s - %(levelname)s - %(module)s: %(message)s')

file_1_2.setFormatter(fmt)

#**********************************************************************************************************

#*************define the logger*****************************************************************************

# instantiate a Logger object

logger1 = logging.Logger('s1',level=logging.ERROR)

logger1.addHandler(file_1_1) # attach the handlers to the logger

logger1.addHandler(file_1_2)

#**********************************************************************************************************

#*************write to the log******************************************************************************

logger1.critical('')

Of course, the same custom approach can also log to a single file:

#!/usr/bin/env python

# -*- coding:utf-8 -*-

import logging

# define the file handler

file_2_1 = logging.FileHandler('21-1.log','a',encoding='utf-8')

fmt = logging.Formatter(fmt='%(asctime)s - %(name)s - %(levelname)s - %(module)s: %(message)s')

file_2_1.setFormatter(fmt)

# define the logger

logger2 = logging.Logger('s2',level=10)

logger2.addHandler(file_2_1)

# write to the log

logger2.info('an info-level message')

Example of object-based logging:

import logging

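# 1. Create a logger object

logger = logging.getLogger()

logger.setLevel(logging.DEBUG)  # the root logger defaults to WARNING, which would drop the debug and info messages below

# 2. Create a file handler, used to write to the log file

fh = logging.FileHandler('test.log',encoding='utf-8')

# 3. Create a second handler, a stream handler, used to print to the console

ch = logging.StreamHandler()

# 4. Create a log output format

formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

fh.setLevel(logging.DEBUG)

# 5. Bind the output format to the file handler

fh.setFormatter(formatter)

# 6. Bind the output format to the console handler

ch.setFormatter(formatter)

# 7. Attach the file handler to the logger

logger.addHandler(fh) # a logger can have several handlers attached

# 8. Attach the console handler to the logger

logger.addHandler(ch)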

logger.debug('logger debug message')

logger.info('logger info message')

logger.warning('logger warning message')

logger.error('logger error message')

logger.critical('logger critical message')
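A record passes two filters in turn: the logger's own level first, then the level of each handler. The following is a minimal sketch of that behaviour; the logger name app.demo is hypothetical, and two io.StringIO streams stand in for the log file and the console.

```python
import io
import logging

everything = io.StringIO()   # stands in for the log file
errors_only = io.StringIO()  # stands in for the console

fh = logging.StreamHandler(everything)
fh.setLevel(logging.DEBUG)
ch = logging.StreamHandler(errors_only)
ch.setLevel(logging.ERROR)

logger = logging.getLogger('app.demo')   # hypothetical name
logger.setLevel(logging.DEBUG)           # the logger filters first, then each handler
logger.propagate = False
logger.addHandler(fh)
logger.addHandler(ch)

logger.debug('debug message')
logger.error('error message')

all_lines = everything.getvalue().splitlines()
error_lines = errors_only.getvalue().splitlines()
print(all_lines, error_lines)
```

If the logger's level were left at its default, the debug message would be dropped before either handler saw it, regardless of fh.setLevel(logging.DEBUG).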

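The stat structure:

st_mode: inode protection mode

st_ino: inode number

st_dev: the device the inode resides on

st_nlink: number of links to the inode

st_uid: user ID of the owner

st_gid: group ID of the owner

st_size: size of a regular file in bytes; for some special files, the amount of data waiting

st_atime: time of last access

st_mtime: time of last modification

st_ctime: the "ctime" as reported by the operating system. On some systems (such as Unix) it is the time of the last metadata change; on others (such as Windows) it is the creation time (see the platform documentation for details)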
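These fields are attributes of the object returned by os.stat. The following is a minimal sketch that inspects a temporary file; the file name and contents are made up for illustration.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, 'sample.txt')
    with open(path, 'wb') as f:
        f.write(b'hello')
    info = os.stat(path)

size = info.st_size                        # 5 bytes were written above
is_regular = stat.S_ISREG(info.st_mode)    # decode the file type from st_mode
mode_text = stat.filemode(info.st_mode)    # human-readable form, like the first column of ls -l
print(size, is_regular, mode_text, info.st_nlink, info.st_mtime)
```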