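Python Automation Development - Common Modules (Part 2)

Contents

1. The shutil module

2. The shelve module

3. The xml processing module

4. The configparser module

5. The hashlib module

6. The subprocess module

7. The re module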

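I. The shutil module

A high-level module for working with files, directories, and archives.

1. shutil.copyfileobj(fsrc, fdst[, length])

Copies the contents of one file object into another. The optional length argument is the buffer size used for each chunk, not a limit on how much is copied.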

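import shutil

# copyfileobj takes open file objects, not file names;
# the with statement closes both files when the copy is done
with open("access.log.1") as f1, open("access02", "w") as f2:
    shutil.copyfileobj(f1, f2)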

2. shutil.copyfile(src, dst)

Copies the file contents only, without permissions or other metadata.

shutil.copyfile("access.log.1", "access03")   # the arguments are file names

3. shutil.copymode(src, dst)
Copies only the permission bits. The contents, owner, and group of dst are left unchanged.

shutil.copymode("access.log.1", "access02")  # dst must already exist

4. shutil.copystat(src, dst)
Copies status information: mode bits, atime, mtime, and flags.

shutil.copystat("access.log.1", "access02")  # dst must already exist

5. shutil.copy(src, dst)
Copies the file contents and the permission bits.

shutil.copy("access.log.1", "access03")

6. shutil.copy2(src, dst)
Copies the file contents and the status information (copy followed by copystat).

shutil.copy2("access.log.1", "access04")
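The difference between the copy variants is easiest to see by checking what survives each call. The following is a minimal sketch, not part of the original notes: the helper name compare_copies and the file names are made up for illustration, and only the modification time is compared because permission bits behave differently across platforms.

```python
import os
import shutil
import tempfile

OLD_TIME = 1_000_000_000  # an arbitrary old timestamp used as a marker


def compare_copies(workdir):
    """Copy one file with copyfile, copy and copy2; report what each preserved."""
    src = os.path.join(workdir, "src.txt")
    with open(src, "w") as f:
        f.write("hello")
    os.utime(src, (OLD_TIME, OLD_TIME))  # give the source an old atime/mtime

    results = {}
    for name, func in [("copyfile", shutil.copyfile),
                       ("copy", shutil.copy),
                       ("copy2", shutil.copy2)]:
        dst = os.path.join(workdir, name + ".txt")
        func(src, dst)
        with open(dst) as f:
            content = f.read()
        results[name] = {
            "same_content": content == "hello",
            "same_mtime": int(os.stat(dst).st_mtime) == OLD_TIME,
        }
    return results


with tempfile.TemporaryDirectory() as tmp:
    report = compare_copies(tmp)
    for name, info in report.items():
        print(name, info)
```

All three calls copy the contents, but only copy2 carries the timestamps over, since it is the only one that also applies copystat.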

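7. shutil.copytree(src, dst, symlinks=False, ignore=None)

Recursively copies the contents of directory src into a new directory dst. The src directory itself is not nested inside dst, so dst may be given the same name as src. The ignore argument accepts a callable such as shutil.ignore_patterns(*patterns).

shutil.copytree(r"E:\s16\day5", "day5", ignore=shutil.ignore_patterns("access*", "*.py"))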
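To try the ignore filter without touching real data, the call can be run inside a temporary directory. This is only a sketch: the helper copy_without_logs and the three file names are invented for the example.

```python
import os
import shutil
import tempfile


def copy_without_logs(src_dir, dst_dir):
    """Copy src_dir to dst_dir, skipping access* files and *.py files."""
    shutil.copytree(src_dir, dst_dir,
                    ignore=shutil.ignore_patterns("access*", "*.py"))
    return sorted(os.listdir(dst_dir))


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "day5")
    os.mkdir(src)
    # Create a few sample files; two of them match the ignore patterns
    for name in ("access.log.1", "notes.txt", "script.py"):
        with open(os.path.join(src, name), "w") as f:
            f.write(name)

    copied = copy_without_logs(src, os.path.join(tmp, "day5_copy"))
    print(copied)
```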

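8. shutil.rmtree(path[, ignore_errors[, onerror]])

Recursively deletes a directory tree.

shutil.rmtree(r'E:\s16\day6\day5')

9. shutil.move(src, dst)

Recursively moves a file or directory.

src may be a file or a directory and must exist. If dst is an existing directory, src is moved inside it, and an error is raised if an entry with the same name already exists there; otherwise src is moved (renamed) to dst.

shutil.move(r'E:\s16\day6\day5', r'E:\s16\day7')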

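10. shutil.make_archive(base_name, format, ...)

Creates an archive (for example zip or tar) and returns the path of the archive file.

base_name: name of the archive file without the format-specific extension; it may include a path. With a bare name the archive is saved in the current directory, otherwise in the given location, for example:

    www => saved in the current directory
    /Users/wupeiqi/www => saved in /Users/wupeiqi/

format: archive type: "zip", "tar", "bztar", or "gztar"
root_dir: directory to archive (defaults to the current directory)
owner: owner name, defaults to the current user
group: group name, defaults to the current group
logger: used for logging, usually a logging.Logger object

res = shutil.make_archive("www", 'zip', root_dir=r'E:\s16\day7')
print(res)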
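A round trip shows how base_name and root_dir interact. The sketch below is an illustration under assumed names (archive_roundtrip, a.log, unpacked); it also uses shutil.unpack_archive, which these notes do not cover, as the counterpart of make_archive.

```python
import os
import shutil
import tempfile


def archive_roundtrip(workdir):
    """Zip a small directory with make_archive, then unpack it again."""
    src = os.path.join(workdir, "day7")
    os.mkdir(src)
    with open(os.path.join(src, "a.log"), "w") as f:
        f.write("log line\n")

    # base_name carries no extension; make_archive appends ".zip" itself
    archive_path = shutil.make_archive(os.path.join(workdir, "www"), "zip",
                                       root_dir=src)

    out = os.path.join(workdir, "unpacked")
    shutil.unpack_archive(archive_path, out)
    return os.path.basename(archive_path), sorted(os.listdir(out))


with tempfile.TemporaryDirectory() as tmp:
    archive_name, extracted = archive_roundtrip(tmp)
    print(archive_name, extracted)
```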

shutil handles archives by calling the zipfile and tarfile modules.

import zipfile

# Compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# Extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()

zipfile: compressing and extracting

import tarfile

# Compress
tar = tarfile.open('your.tar', 'w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
tar.close()

# Extract
tar = tarfile.open('your.tar', 'r')
tar.extractall()  # an extraction path can be given
tar.close()

tarfile: compressing and extracting

II. The shelve module

shelve is a simple key-value module that persists in-memory data to a file. It can store any Python object that pickle supports.
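The notes give no example for shelve, so here is a minimal sketch of writing and reading back a few values. The helper save_and_load, the file name shelve_test, and the stored values are all invented for illustration.

```python
import os
import shelve
import tempfile


def save_and_load(path):
    """Store a few picklable objects under string keys, then read them back."""
    with shelve.open(path) as db:
        db["name"] = "alex"
        db["scores"] = [90, 85]
        db["info"] = {"age": 22}

    # Reopen the shelf to show that the data survived on disk
    with shelve.open(path) as db:
        return db["name"], db["scores"], sorted(db.keys())


with tempfile.TemporaryDirectory() as tmp:
    name, scores, keys = save_and_load(os.path.join(tmp, "shelve_test"))
    print(name, scores, keys)
```

Keys must be strings; values can be any object that pickle can serialize.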

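III. The xml processing module

XML is a format for exchanging data between different languages or programs. It serves much the same purpose as JSON, although JSON is simpler to use.

Many systems at traditional companies, for example in the financial industry, still expose mainly XML interfaces.

An XML document looks like the following; nested <> nodes express the data structure.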

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes"></rank>
        <year></year>
        <gdppc></gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes"></rank>
        <year></year>
        <gdppc></gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes"></rank>
        <year></year>
        <gdppc></gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

The XML data file

XML is supported in every language. In Python it can be processed with the following module:

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# Walk the whole XML document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag, i.text)

# Iterate over the year nodes only
for node in root.iter('year'):
    print(node.tag, node.text)

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()

# Modify (the increment was lost in the source; 1 is assumed here)
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated", "yes")

tree.write("xmltest.xml")

# Delete nodes (the rank threshold was lost in the source; 50 is assumed here)
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)

tree.write('output.xml')

Modifying and deleting XML content
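The same modify-and-delete steps can be tried without a file by parsing a string. This is a sketch under stated assumptions: the sample document, its values, and the helper bump_years_and_prune are made up for illustration and are not the xmltest.xml used above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, much smaller than the data file shown earlier
SAMPLE = """<data>
    <country name="A"><rank>2</rank><year>2008</year></country>
    <country name="B"><rank>69</rank><year>2011</year></country>
</data>"""


def bump_years_and_prune(xml_text, max_rank):
    """Add 1 to every <year> and drop countries ranked above max_rank."""
    root = ET.fromstring(xml_text)

    for node in root.iter("year"):
        node.text = str(int(node.text) + 1)
        node.set("updated", "yes")

    # findall returns a list, so removing children inside the loop is safe
    for country in root.findall("country"):
        if int(country.find("rank").text) > max_rank:
            root.remove(country)
    return root


root = bump_years_and_prune(SAMPLE, max_rank=50)
remaining = [(c.get("name"), c.find("year").text) for c in root.findall("country")]
print(remaining)
```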

import xml.etree.ElementTree as ET

new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = ''
name2 = ET.SubElement(new_xml, "name", attrib={"enrolled": "no"})
age = ET.SubElement(name2, "age")
age.text = ''

et = ET.ElementTree(new_xml)  # build the document object
et.write("test.xml", encoding="utf-8", xml_declaration=True)

ET.dump(new_xml)  # print the generated XML

Creating an XML document

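IV. The configparser module

Used to create and modify common configuration files. The module was called ConfigParser in Python 2 and was renamed configparser in Python 3.

Many programs use a configuration file format like this: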

[DEFAULT]

ServerAliveInterval = 45

Compression = yes

CompressionLevel = 9

ForwardX11 = yes

[bitbucket.org]

User = hg

[topsecret.server.com]

Port = 50022

ForwardX11 = no

import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}
config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
    config.write(configfile)

Creating a configuration file with configparser

>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
serveraliveinterval
compression
compressionlevel
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'

Reading a configuration file with configparser

[section1]

k1 = v1

k2:v2

[section2]

k1 = v1

import configparser

config = configparser.ConfigParser()
config.read('i.cfg')

# ########## Read ##########
#secs = config.sections()
#print(secs)
#options = config.options('group2')
#print(options)
#item_list = config.items('group2')
#print(item_list)
#val = config.get('group1', 'key')
#val = config.getint('group1', 'key')

# ########## Modify ##########
#sec = config.remove_section('group1')
#config.write(open('i.cfg', "w"))

#sec = config.has_section('wupeiqi')
#sec = config.add_section('wupeiqi')
#config.write(open('i.cfg', "w"))

#config.set('group2', 'k1', 'v1')  # the value was lost in the source; 'v1' is a placeholder
#config.write(open('i.cfg', "w"))

#config.remove_option('group2', 'age')
#config.write(open('i.cfg', "w"))

Adding, deleting, modifying, and querying with configparser
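The operations listed above can be exercised end to end on an in-memory configuration. The following is only a sketch: it reuses the section1/section2 sample shown earlier, while the helper edit_config, the section name section3, and the value 11111 are assumptions added for the example.

```python
import configparser
import io

INI_TEXT = """
[section1]
k1 = v1
k2:v2

[section2]
k1 = v1
"""


def edit_config(text):
    """Read, query, modify and re-serialize a configuration held in a string."""
    config = configparser.ConfigParser()
    config.read_string(text)

    before = config.sections()
    k2 = config.get("section1", "k2")      # ':' works as a delimiter like '='

    config.remove_section("section2")
    config.add_section("section3")         # hypothetical new section
    config.set("section3", "k1", "11111")  # values must be strings
    number = config.getint("section3", "k1")

    buf = io.StringIO()
    config.write(buf)                      # write to memory instead of a file
    return before, k2, number, buf.getvalue()


before, k2, number, written = edit_config(INI_TEXT)
print(before, k2, number)
print(written)
```

Writing to an io.StringIO keeps the example free of side effects; replacing it with a file opened in "w" mode gives the on-disk behaviour used in the notes.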

V. The hashlib module

Used for hashing (message digests). In Python 3 it replaces the old md5 and sha modules and provides the SHA1, SHA224, SHA256, SHA384, SHA512, and MD5 algorithms.

import hashlib

m = hashlib.md5()
m.update(b'Hello')
m.update(b' World')
print('md5:', m.hexdigest())

# Feeding the data in one call gives the same digest as the two updates above
m2 = hashlib.md5()
m2.update(b'Hello World')
print('md5:', m2.hexdigest())        # hash as a hexadecimal string
print('md5 (bytes):', m2.digest())   # hash as raw bytes

sh1 = hashlib.sha1()
sh1.update(b'admin')
print('sha1:', sh1.hexdigest())

sh256 = hashlib.sha256()
sh256.update(b'xxx')
print('sha256:', sh256.hexdigest())

sh384 = hashlib.sha384()
sh384.update(b'xxx')
print('sha384:', sh384.hexdigest())

sh512 = hashlib.sha512()
sh512.update(b'xxx')
print('sha512:', sh512.hexdigest())
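Because update() can be called repeatedly, a large file can be hashed in chunks without loading it into memory. A minimal sketch, assuming a helper named sha256_of_file and an arbitrary 8192-byte chunk size, neither of which comes from the notes:

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=8192):
    """Hash a file chunk by chunk using repeated update() calls."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    payload = b"Hello World" * 5000
    with open(path, "wb") as f:
        f.write(payload)

    file_digest = sha256_of_file(path)
    direct_digest = hashlib.sha256(payload).hexdigest()
    print(file_digest == direct_digest)
```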

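Python also has an hmac module, which combines a key with the message internally before hashing.

HMAC (hash-based message authentication code) is an authentication mechanism built on a message authentication code (MAC).

With HMAC, the two parties authenticate a message by checking a digest computed with a shared secret key K.

It is typically used to authenticate messages in network communication. Both sides first agree on a key, much like a secret password. The sender computes a digest from the key and the message and sends it along with the message.

The receiver computes the digest again from the key and the plaintext message and compares it with the sender's value. If the two are equal, the message is genuine and the sender is legitimate.

import hashlib
import hmac

# digestmod is required since Python 3.8; SHA-256 is an assumed choice here
h = hmac.new('天王盖地虎'.encode('utf-8'), '宝塔镇河妖'.encode('utf-8'), digestmod=hashlib.sha256)
print(h.hexdigest())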
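The sender/receiver check described above might look like the sketch below. The key, the messages, and the helpers sign and verify are invented for illustration; SHA-256 is an assumed digest, and hmac.compare_digest is used for the comparison because it runs in constant time.

```python
import hashlib
import hmac


def sign(key, message):
    """Sender side: compute the HMAC of the message with the shared key."""
    return hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()


def verify(key, message, signature):
    """Receiver side: recompute the HMAC and compare it with the received one."""
    return hmac.compare_digest(sign(key, message), signature)


key = b"shared-secret"        # hypothetical pre-agreed key
message = b"transfer 100"
tag = sign(key, message)

genuine = verify(key, message, tag)
tampered = verify(key, b"transfer 900", tag)
print(genuine, tampered)
```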

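VI. The subprocess module

This module spawns new processes, connects to their input/output/error pipes, and obtains their return codes. Common examples:

# run executes a command and covers most needs
>>> subprocess.run(['ls', '-l'])
>>> subprocess.run("ls -l", shell=True)

# Run a command and return its exit status: 0 or non-zero
>>> retcode = subprocess.call(["ls", "-l"])

# Run a command; return normally if the exit status is 0, otherwise raise an exception
>>> subprocess.check_call(["ls", "-l"])

# Takes the command as a string and returns a tuple: the first element is the exit status, the second is the command output
>>> subprocess.getstatusoutput('ls /bin/ls')
(0, '/bin/ls')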
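For scripts that need the output as well as the exit status, subprocess.run can capture both. The sketch below runs the current Python interpreter instead of ls so that it behaves the same on every platform; the helper run_python is an assumption for the example, and capture_output and text require Python 3.7 or later.

```python
import subprocess
import sys


def run_python(code):
    """Run a snippet in a child Python process and capture its output as text."""
    return subprocess.run([sys.executable, "-c", code],
                          capture_output=True, text=True)


ok = run_python("print('hello')")
print(ok.returncode, ok.stdout.strip())

bad = run_python("import sys; sys.exit(3)")
print(bad.returncode)

# check=True turns a non-zero exit status into an exception, like check_call
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
    caught = None
except subprocess.CalledProcessError as exc:
    caught = exc.returncode
print(caught)
```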

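# All of the functions above are wrappers around subprocess.Popen

poll()
Check whether the child process has terminated. Returns returncode, or None if it is still running.

wait()
Wait for the child process to terminate. Returns the returncode attribute.

terminate(): terminates the started process

communicate(): sends input if given, reads the output, and waits for the process to finish

stdin: standard input

stdout: standard output

stderr: standard error

pid
The process ID of the child process.

# Example
>>> p = subprocess.Popen("df -h|grep disk", stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=True)
>>> p.stdout.read()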

Calling subprocess.run(...) is the recommended approach and is sufficient in most cases.

For more complex interaction with the system you can use subprocess.Popen() directly, for example:

p = subprocess.Popen(r"find / -size +1000000 -exec ls -shl {} \;", shell=True, stdout=subprocess.PIPE)
print(p.stdout.read())

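args: the command, as a string or a sequence (e.g. list or tuple)
bufsize: buffering. 0 means unbuffered, 1 means line-buffered, any other positive value is the buffer size, and a negative value uses the system default
stdin, stdout, stderr: the program's standard input, output, and error handles
preexec_fn: Unix only; a callable object that is invoked in the child process just before the command is executed
close_fds: on Windows, if close_fds is True the child process does not inherit the parent's input, output, and error pipes,
so close_fds could not be set to True while redirecting the child's stdin, stdout, and stderr (this restriction was lifted in Python 3.7).
shell: as above (run the command through the shell)
cwd: sets the working directory of the child process
env: environment variables for the child process. If env is None, the child inherits the parent's environment.
universal_newlines: line endings differ between systems; True opens the pipes in text mode and normalizes line endings to \n
startupinfo and creationflags: Windows only;
they are passed to the underlying CreateProcess() function to set properties of the child process such as the appearance of the main window and the process priority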

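Commands typed in a terminal fall into two kinds:

Those that produce output immediately, e.g. ifconfig
Those that enter an environment and then wait for further input, e.g. python

Example of a command that requires interaction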

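import subprocess

# text=True (Python 3.7+) lets us write str to stdin instead of bytes
obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

# Each statement ends with a bare newline; a trailing space would indent the next line
obj.stdin.write('print(1)\n')
obj.stdin.write('print(2)\n')
obj.stdin.write('print(3)\n')
obj.stdin.write('print(4)\n')

# communicate() closes stdin and returns a (stdout, stderr) tuple
out_error_list = obj.communicate(timeout=10)
print(out_error_list)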

Using subprocess to supply the sudo password automatically

import subprocess

def mypass():
    mypass = '123'  # or get the password from anywhere
    return mypass

# echo writes the password to a pipe ...
echo = subprocess.Popen(['echo', mypass()],
                        stdout=subprocess.PIPE,
                        )

# ... and sudo -S reads the password from its stdin
sudo = subprocess.Popen(['sudo', '-S', 'iptables', '-L'],
                        stdin=echo.stdout,
                        stdout=subprocess.PIPE,
                        )

end_of_pipe = sudo.stdout

print("Password ok \n Iptables Chains %s" % end_of_pipe.read().decode())

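VII. The re module

Common regular expression symbols

'.'     By default matches any single character except \n; with the DOTALL flag it matches any character, including newlines

'^'     Matches the start of the string; with the MULTILINE flag it also matches at the start of each line, e.g. re.search(r"^a", "\nabc\neee", flags=re.MULTILINE)

'$'     Matches the end of the string; with MULTILINE it also matches at the end of each line, e.g. re.search("foo$", "bfoo\nsdfsf", flags=re.MULTILINE).group()

'*'     Matches the preceding character 0 or more times; re.findall("ab*", "cabb3abcbbac") gives ['abb', 'ab', 'a']

'+'     Matches the preceding character 1 or more times; re.findall("ab+", "ab+cd+abb+bba") gives ['ab', 'abb']

'?'     Matches the preceding character 0 or 1 times

'{m}'   Matches the preceding character exactly m times

'{n,m}' Matches the preceding character n to m times; re.findall("ab{1,3}", "abb abc abbcbbb") gives ['abb', 'ab', 'abb']

'|'     Matches the pattern on the left or on the right of |; re.search("abc|ABC", "ABCBabcCD").group() gives 'ABC'

'(...)' Grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives 'abcabca456c'

'\A'    Matches only at the start of the string; re.search(r"\Aabc", "alexabc") finds nothing

'\Z'    Matches only at the end of the string; similar to $, except that $ also matches just before a trailing newline

'\d'    Matches a digit 0-9

'\D'    Matches a non-digit

'\w'    Matches [A-Za-z0-9_]

'\W'    Matches anything not in [A-Za-z0-9_]

'\s'    Matches whitespace such as space, \t, \n, \r; re.search(r"\s+", "ab\tc1\n3").group() gives '\t'

'(?P<name>...)' Named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})", "371481199306143242").groupdict() gives {'province': '3714', 'city': '81', 'birthday': '1993'}

Most commonly used matching functions

re.match    matches from the beginning of the string

re.search   matches anywhere in the string and returns the first match

re.findall  returns all matches as a list

re.split    splits the string, using the matches as separators

re.sub      matches and replaces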
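The five functions listed above can be compared side by side on one input. This is a small sketch; the sample text and variable names are made up for illustration.

```python
import re

text = "alex 22, jack 31, rain 19"

# match only succeeds if the pattern matches at the very start of the string
first_word = re.match(r"\w+", text).group()
no_match = re.match(r"\d+", text)            # the string starts with a letter

# search scans the whole string and returns the first match
first_number = re.search(r"\d+", text).group()

# findall returns every non-overlapping match as a list
all_numbers = re.findall(r"\d+", text)

# split uses the matches as separators
parts = re.split(r",\s*", text)

# sub replaces every match
masked = re.sub(r"\d+", "#", text)

# named groups are available as a dict through groupdict()
named = re.search(r"(?P<name>[a-z]+) (?P<age>\d+)", text).groupdict()

print(first_word, no_match, first_number)
print(all_numbers, parts)
print(masked, named)
```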
