Python: common modules xml, shelve, configparser, hashlib

I. The shelve module

The shelve module is another tool for serialization.

Usage:

　　1. open

　　2. read and write

　　3. close

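import shelve

# Serialize: open the shelf, store a value under a key, then close

sl = shelve.open('shlvetest.txt')

sl['date'] = '8-13'

sl.close()

# Deserialize: reopen the same shelf and read the value back

s2 = shelve.open('shlvetest.txt')

print(s2['date'])

s2.close()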

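Features: it is simple to use; give it a file name and you can start reading and writing.

　　Reading and writing work the same way as with a dictionary, so a shelf can be treated as a dictionary that serializes itself automatically.

Note: shelve uses pickle internally, so the stored data is Python-specific and ports poorly across languages and platforms. Only the program that stored the data knows how to read it back, so shelve is generally used for standalone (single-machine) programs.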
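The dictionary-like behaviour described above can be sketched as follows. This is a minimal illustration, not the only way to use shelve: the temporary directory, the file name shelve_demo, and the scores key are assumptions made for the example. It also shows one caveat that follows from the pickle-based design: changing a stored object in place is not saved unless the shelf is opened with writeback=True.

```python
import os
import shelve
import tempfile

# Hypothetical scratch location so the example does not touch real files.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "shelve_demo")

# The with-statement closes the shelf automatically.
with shelve.open(path) as db:
    db["date"] = "8-13"
    db["scores"] = [90, 85]      # any picklable object can be stored

with shelve.open(path) as db:
    stored_date = db["date"]
    keys = sorted(db.keys())     # dict-style methods are available
    # db["scores"] returns a fresh unpickled copy, so this append is lost.
    db["scores"].append(100)
    scores_after_append = db["scores"]

# writeback=True caches accessed entries and writes them back on close.
with shelve.open(path, writeback=True) as db:
    db["scores"].append(100)

with shelve.open(path) as db:
    scores_with_writeback = db["scores"]

print(stored_date, keys, scores_after_append, scores_with_writeback)
```

writeback=True is convenient but keeps every accessed entry in memory until the shelf is closed, so it is best reserved for small shelves.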

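II. The XML module

XML stands for Extensible Markup Language; its tags are delimited by angle brackets (<>).

XML was designed for exchanging data across platforms.

XML syntax rules:

　　1. Every start tag must have a matching end tag: <tagname> </tagname>

　　2. A shorthand lets a single tag act as both start and end tag: put a slash (/) just before the closing angle bracket, for example <tagname/>

　　3. Tags must be properly nested, so end tags must appear in the mirror order of their start tags

　　4. Every attribute must have a value

　　5. Every attribute value must be enclosed in quotes (double quotes by convention; single quotes are also allowed).

Note: there is exactly one outermost tag, called the root tag. The first line should be a document declaration that tells the parser how to interpret the file: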

<?xml version="1.0" encoding="utf-8"?>
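The syntax rules above can be checked with the standard-library parser. The sketch below assumes a small helper named is_well_formed (a name invented for this example) and a handful of made-up snippets; it simply reports which ones the parser accepts.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative snippets, one per rule.
samples = {
    "paired tags": "<a><b>text</b></a>",
    "self-closing tag": "<a><b/></a>",
    "wrong nesting": "<a><b></a></b>",
    "attribute without value": "<a flag></a>",
    "unquoted attribute": "<a flag=yes></a>",
    "single-quoted attribute": "<a flag='yes'></a>",
    "two root elements": "<a/><b/>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'rejected'}")
```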

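Use cases:

　　1. Configuration files　　2. General data exchange

Differences between XML and JSON:

　　Both serve the same purpose: each is a data format

　　XML appeared before JSON

　　JSON data is usually smaller than the equivalent XML

　　JSON is the mainstream choice today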
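To make the size comparison concrete, here is a rough sketch that serializes the same small record both ways. The record and its field names are made up for illustration, and the exact sizes depend on the data and on how each format is laid out.

```python
import json
import xml.etree.ElementTree as ET

# A made-up record used only for this comparison.
record = {"name": "Singapore", "rank": "5", "year": "2011"}

json_text = json.dumps(record)

# Build the equivalent XML: one child element per field.
country = ET.Element("country")
for key, value in record.items():
    ET.SubElement(country, key).text = value
xml_text = ET.tostring(country, encoding="unicode")

print(len(json_text), json_text)
print(len(xml_text), xml_text)
```

XML repeats every field name in both the start and end tag, which is the main reason it tends to come out longer.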

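Handling XML in Python

　　ElementTree: represents the element tree of the whole document

　　Element: represents a single node

　　　　Attributes: 1. text: the text between the start tag and the end tag

　　　　　　　2. attrib: all of the node's attributes, as a dictionary

　　　　　　　3. tag: the name of the tag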

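1. Parsing XML

　　Finding nodes (tags)

　　　　find: returns the first matching direct child

　　　　findall: returns all direct children whose name matches

　　　　iter(name): searches the whole subtree for matching tags and returns an iterator

　　　　get: a method that returns the value of one attribute

XML data: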

<?xml version="1.0"?>

<data>

    <country name="Liechtenstein">

        <rank updated="yes">2</rank>

        <year>2008</year>

        <gdppc>141100</gdppc>

        <neighbor name="Austria" direction="E"/>

        <neighbor name="Switzerland" direction="W"/>

    </country>

    <country name="Singapore">

        <rank updated="yes">5</rank>

        <year>2011</year>

        <gdppc>59900</gdppc>

        <neighbor name="Malaysia" direction="N"/>

    </country>

    <country name="Panama">

        <rank updated="yes">69</rank>

        <year>2011</year>

        <gdppc>13600</gdppc>

        <neighbor name="Costa Rica" direction="W"/>

        <neighbor name="Colombia" direction="E"/>

    </country>

</data>


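print(root.iter('year')) # search the whole document

print(root.find('country')) # search root's direct children, return only the first match

print(root.findall('country')) # search root's direct children, return all matches

# To get a node's attributes, use the attrib attribute.

# To get a node's text, use the text attribute.

# To get a node's name, use the tag attribute.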

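import xml.etree.ElementTree as et  # cElementTree was removed in Python 3.9

# Read the XML document into memory, producing a tree of nodes that holds all the data

# Each tag is called a node, or element

tree = et.parse("test.xml")

# Get the root tag

root = tree.getroot()

print(root) # <Element 'data' at 0x00000204FFD1CA48>

# Get country; find returns the first match

print(root.find("country"))

# Get all of them

print(root.findall("country")) # returns a list

# iter returns an iterator over every year node, not a single node

print(root.iter("year"))

# Iterate over the year nodes only

for i in root.iter("year"):

    print(i)

# Walk the whole XML tree

for country in root:

    print(country.tag,country.attrib,country.text)

    for t in country:

        print(t.tag, t.attrib, t.text)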
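The lookups above can also be tried without a file on disk. The sketch below parses a shortened copy of the sample data from a string with fromstring (an assumption made to keep the example self-contained) and highlights the difference between find/findall, which look only at direct children, and iter, which walks the whole subtree.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample data, embedded as a string.
xml_text = """<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
    </country>
</data>"""

root = ET.fromstring(xml_text)          # returns the root Element directly

first = root.find("country")            # first direct child named country
names = [c.get("name") for c in root.findall("country")]
years = [y.text for y in root.iter("year")]   # iter searches every level
direct_years = root.findall("year")     # empty: year is not a direct child of root

rank = first.find("rank")
print(rank.tag, rank.attrib, rank.text)
print(names, years, direct_years)
print(first.get("population"))          # get returns None for a missing attribute
```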

Modifying:

# Modify: add 1 to the year text of every country

# Read the document into memory

tree = et.parse('test.xml')

for country in tree.findall("country"):

    yeartag = country.find("year")

    yeartag.text = str(int(yeartag.text)+1)

# Write the tree back to the file

tree.write('test.xml',encoding="utf-8",xml_declaration = False)

Note: changes are made only to the in-memory tree, so after modifying it you must write it back to the file.

Deleting:

# Delete the year node from every country

tree = et.parse('test.xml')

for country in tree.findall("country"):

    print(country.find("year"))

    country.remove(country.find("year"))

# Write the tree back to the file

tree.write('test.xml',encoding="utf-8",xml_declaration = False)

Adding a custom tag:

tree = et.parse('test.xml')

for country in tree.findall("country"):

    # Create the new child tag

    newtag = et.Element("newTag")

    # Text

    newtag.text = ""

    # Attribute

    newtag.attrib["name"] = "DSB"

    # Append it to country

    country.append(newtag)

# Write the tree back to the file

tree.write('test.xml',encoding="utf-8",xml_declaration = False)

Passing xml_declaration = True adds the XML declaration to the written file.
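The effect of xml_declaration can be seen by writing the same tree twice into an in-memory buffer. The serialize helper and the tiny document below are assumptions made for this sketch; tree.write accepts any object with a write method, so io.BytesIO stands in for a real file.

```python
import io
import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.fromstring("<data><year>2008</year></data>"))


def serialize(tree, declaration):
    """Write the tree to an in-memory buffer and return the text."""
    buffer = io.BytesIO()
    tree.write(buffer, encoding="utf-8", xml_declaration=declaration)
    return buffer.getvalue().decode("utf-8")


without_decl = serialize(tree, False)
with_decl = serialize(tree, True)

print(without_decl)
print(with_decl)
```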

Generating an XML document in code:

import xml.etree.ElementTree as et

# Create the root tag

root = et.Element("root")

# Create the element tree

t1 = et.ElementTree(root)

# Add a person tag

person = et.Element("person")

person.attrib["name"] = "zfj"

person.attrib["sex"] = "man"

person.attrib["age"] = ""

person.text = "This is a person tag"

root.append(person)

# Write to a file

t1.write("newXML.xml",encoding="utf-8",xml_declaration=True)
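As an alternative to creating an Element and appending it by hand, ElementTree also offers SubElement, which creates the child and attaches it to its parent in one call. The sketch below is one possible variation on the example above; the hobby child and its text are hypothetical additions.

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")

# SubElement creates the child and appends it to root in one step;
# keyword arguments become attributes.
person = ET.SubElement(root, "person", name="zfj", sex="man")
person.text = "This is a person tag"

# Hypothetical nested child, to show that nesting works the same way.
ET.SubElement(person, "hobby").text = "reading"

# tostring returns the XML as text instead of writing a file.
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```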

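III. The configparser module

The configparser module parses configuration files; a configuration file supplies information that a program needs at run time.

Configuration file format: it contains only two kinds of elements, sections and options.

A file can have multiple sections, and a section can have multiple options.

Core functions:

　　1. sections(): get all sections

　　2. options(section): get all options in a section

　　3. get(section, option): get one value, given the section and the option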
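The three core functions can be tried without creating a file by feeding the parser a string. The sketch below assumes a cut-down configuration text and uses read_string in place of read; the nickname option is deliberately missing to show the fallback argument.

```python
import configparser

# Cut-down configuration text, embedded as a string for the example.
sample = """
[section1]
user = egon
age = 18
is_admin = true

[section2]
k1 = v1
"""

config = configparser.ConfigParser()
config.read_string(sample)

sections = config.sections()
options = config.options("section1")
user = config.get("section1", "user")                  # always a string
age = config.getint("section1", "age")                 # converted to int
is_admin = config.getboolean("section1", "is_admin")   # converted to bool
# fallback avoids an exception when the option does not exist.
nickname = config.get("section1", "nickname", fallback="n/a")

print(sections, options, user, age, is_admin, nickname)
```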

======== Configuration file ========

[section1]

k1 = v1

k2:v2

user=egon

age=18

is_admin=true

salary=31

[section2]

k1 = v1

Reading the configuration:

import configparser

config=configparser.ConfigParser()

config.read('a.cfg')

# List all section titles

res=config.sections() #['section1', 'section2']

print(res)

# List the keys of all key=value pairs under section1

options=config.options('section1')

print(options) #['k1', 'k2', 'user', 'age', 'is_admin', 'salary']

# List all pairs under section1 as (key, value) tuples

item_list=config.items('section1')

print(item_list) #[('k1', 'v1'), ('k2', 'v2'), ('user', 'egon'), ('age', '18'), ('is_admin', 'true'), ('salary', '31')]

# Get the value of user under section1 as a string

val=config.get('section1','user')

print(val) #egon

# Get the value of age under section1 as an integer

val1=config.getint('section1','age')

print(val1) #18

# Get the value of is_admin under section1 as a boolean

val2=config.getboolean('section1','is_admin')

print(val2) #True

# Get the value of salary under section1 as a float

val3=config.getfloat('section1','salary')

print(val3) #31.0

Rewriting the configuration:

import configparser

config=configparser.ConfigParser()

config.read('a.cfg',encoding='utf-8')

# Remove the whole section section2

config.remove_section('section2')

# Remove options k1 and k2 from section1

config.remove_option('section1','k1')

config.remove_option('section1','k2')

# Check whether a section exists

print(config.has_section('section1'))

# Check whether section1 has the option user

print(config.has_option('section1','user'))

# Add a section

config.add_section('egon')

# Under section egon, add name=egon and age=18

config.set('egon','name','egon')

config.set('egon','age','18') # values must be strings; passing the int 18 raises TypeError

# Finally, write the changes to the file to complete the update

with open('a.cfg','w') as f:

    config.write(f)
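The string-only rule for set can be demonstrated in isolation. The sketch below builds a parser from scratch and writes it to an in-memory io.StringIO buffer instead of a.cfg, which is an assumption made so the example leaves no files behind.

```python
import configparser
import io

config = configparser.ConfigParser()
config.add_section("egon")
config.set("egon", "name", "egon")

# Passing a non-string value is rejected by ConfigParser.set.
try:
    config.set("egon", "age", 18)
    error = None
except TypeError as exc:
    error = exc

# Convert to a string first, then convert back with getint when reading.
config.set("egon", "age", str(18))

buffer = io.StringIO()
config.write(buffer)
text = buffer.getvalue()

print(repr(error))
print(text)
```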

Example: simulate a download feature whose maximum connection speed is controlled by the user. Users cannot read the code, so a configuration file is provided.

import configparser

cfg = configparser.ConfigParser()

cfg.read("download.ini")

print(cfg.sections())

print(cfg.options("section1"))

print(type(cfg.get("section1","maxspeed")))

print(cfg.get("section1","maxspeed"))

print(cfg.getint("section2","minspeed"))

# Change the maximum speed to 2048

cfg.set("section1","maxspeed","2048")

with open("download.ini","w",encoding="utf-8") as f:

    cfg.write(f)

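IV. The hashlib module

A hash is an algorithm that maps data of any length to a fixed-length string (a fingerprint of the data). It is commonly used for storing passwords and verifying files. Note that hashing is one-way, so it is not encryption in the strict sense.

Properties of hash values:

　　1. The same input always yields the same hash; different inputs can occasionally yield the same hash (a collision)

　　2. The original content cannot be recovered from the hash value

　　3. For a given hash algorithm, the hash value has the same length no matter how long the input is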
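Two of these properties are easy to observe directly. The sketch below uses a small helper named md5_hex (invented for this example) to show that the digest length does not depend on the input length and that the same input always gives the same digest; SHA-256 is included only to show that a different algorithm has its own fixed length.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of a string as hexadecimal text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


short_hash = md5_hex("a")
long_hash = md5_hex("a" * 10000)

# Same algorithm, very different input sizes, same output length.
print(len(short_hash), len(long_hash))

# Same input, same output, every time.
repeatable = md5_hex("hello") == md5_hex("hello")

# A small change in the input gives an unrelated-looking digest.
differs = md5_hex("hello") != md5_hex("hellp")

# A different algorithm has a different, but still fixed, length.
sha256_length = len(hashlib.sha256(b"a").hexdigest())

print(repeatable, differs, sha256_length)
```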

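One way to crack MD5 is a lookup-table attack: a database stores common plaintexts together with their hashes, and an attacker looks up a hash to find the matching plaintext. If the entry exists, the attack succeeds; whether it works depends entirely on whether the plaintext happens to be in the database.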
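The idea can be sketched with an ordinary dictionary standing in for the database. The password list and the two test hashes below are made up for illustration; real tables are vastly larger, but the lookup works the same way.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of a string as hexadecimal text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# A tiny stand-in for the attacker's database: hash -> plaintext.
common_passwords = ["123456", "password", "qwerty"]
table = {md5_hex(p): p for p in common_passwords}

# A hash of a common password is found immediately.
leaked_hash = md5_hex("qwerty")
cracked = table.get(leaked_hash)

# A hash of an uncommon password is simply not in the table.
uncommon_hash = md5_hex("Xk#9!vQ2-long-and-random")
not_cracked = table.get(uncommon_hash)

print(cracked, not_cracked)
```

Nothing is reversed here: the attacker only recognises a hash that was computed in advance, which is why uncommon inputs survive.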

import hashlib

md = hashlib.md5()

md.update("".encode("utf-8"))

print(md.hexdigest())
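For the file verification use mentioned earlier, update can be called repeatedly, and the result is the same as hashing everything at once. The sketch below uses an in-memory bytes value in place of real file contents and a chunk size of 4096 bytes; both are assumptions made for the example.

```python
import hashlib

# Stand-in for the contents of a large file.
data = b"hello world" * 1000

# Hash everything in one call.
one_shot = hashlib.md5(data).hexdigest()

# Hash the same data piece by piece, as you would when reading a file
# in chunks to avoid loading it into memory all at once.
chunked = hashlib.md5()
chunk_size = 4096
for start in range(0, len(data), chunk_size):
    chunked.update(data[start:start + chunk_size])
chunked_hex = chunked.hexdigest()

print(one_shot == chunked_hex)
```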

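A common way to improve security is to add a salt:

pwd = "abc123"  # example password to hash

md2 = hashlib.md5()

md2.update("".encode("utf-8"))  # salt placed before the password

md2.update(pwd.encode("utf-8"))

md2.update("".encode("utf-8"))  # salt placed after the password

print(md2.hexdigest())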
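A fixed salt written into the code protects less than a random salt generated per password. The sketch below swaps in a different technique from the MD5 example above: hashlib.pbkdf2_hmac with SHA-256 and a random 16-byte salt. The function names hash_password and verify_password and the iteration count are assumptions chosen for illustration, not a recommendation from the original text.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=100_000):
    """Return (salt, digest); a new random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected, iterations=100_000):
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


salt_a, digest_a = hash_password("abc123")
salt_b, digest_b = hash_password("abc123")

# The same password gives different digests because the salts differ.
print(digest_a != digest_b)
print(verify_password("abc123", salt_a, digest_a))
print(verify_password("wrong", salt_a, digest_a))
```

The salt is not secret and is stored next to the digest; its job is to make precomputed lookup tables useless.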

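There is also the hmac module, which combines a key with the content and processes them further before hashing. The key is mandatory: creating an hmac object without one raises an error.

Using hmac follows basically the same steps as using hashlib. The only difference is in step 1: an hmac object can only be obtained with hmac.new(), because the hmac module does not provide per-algorithm constructor functions.

import hmac

h = hmac.new(b"net", digestmod="md5")  # digestmod is required since Python 3.8; md5 was the old default

h.update(b"luzhuo.me")

h_str = h.hexdigest()

print(h_str)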
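A typical use of hmac is checking that a message has not been altered by someone who does not know the key. The sketch below reuses the key and message from the example above but assumes SHA-256 as the digest; the tampered message is made up for illustration.

```python
import hashlib
import hmac

key = b"net"
message = b"luzhuo.me"

# Compute the tag in one call by passing the message to hmac.new.
tag = hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()

# Feeding the message in pieces with update gives the same tag.
h = hmac.new(key, digestmod=hashlib.sha256)
h.update(b"luzhuo")
h.update(b".me")
same_tag = hmac.compare_digest(tag, h.hexdigest())

# A changed message, or a different key, produces a different tag.
tampered = hmac.new(key, b"luzhuo.com", digestmod=hashlib.sha256).hexdigest()
other_key = hmac.new(b"other", message, digestmod=hashlib.sha256).hexdigest()

print(same_tag, tag != tampered, tag != other_key)
```

compare_digest is used instead of == because it takes the same time whether or not the values match, which avoids leaking information through timing.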

Supplement:

hash.digest()
Returns the digest as a bytes object (raw binary data)

hash.hexdigest()
Returns the digest as a string of hexadecimal digits
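The two methods return the same digest in different representations, as the short sketch below shows. The input b"hello" is an arbitrary example value.

```python
import hashlib

h = hashlib.md5(b"hello")

raw = h.digest()          # bytes object, 16 bytes for MD5
hex_text = h.hexdigest()  # str of hexadecimal digits, 32 characters for MD5

# Each byte becomes two hexadecimal characters.
print(type(raw).__name__, len(raw))
print(type(hex_text).__name__, len(hex_text))

# Converting between the two forms gives the same value.
print(raw.hex() == hex_text)
print(bytes.fromhex(hex_text) == raw)
```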

Keep learning every day!