本片文章主要是对pickle官网的阅读记录。

The pickle module implements binary protocols for serializing and de-serializing a Python object structure. “Pickling” is the process whereby a Python object hierarchy is converted into a byte stream, and “unpickling” is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy. Pickling (and unpickling) is alternatively known as “serialization”, “marshalling,” [1] or “flattening”; however, to avoid confusion, the terms used here are “pickling” and “unpickling”.

pickle是python标准模块之一,不需要再额外安装。

pickle用来 序列化和反序列化 Python object structure。其实就是一种数据存储方式,将python的数据结构以特定的形式保存下来。另外,经过pickle序列化后的数据不是human-readable的。

这里提一下老外对事物的命名习惯,pickle是腌制的意思,那么对python object的"腌制",其实就是一种数据处理,至于数据处理的规则是什么,这里暂时不做进一步介绍。

“Pickling”  就是将有层次结构的python object转换成字节流;“unpickling” 就是相反的过程。

说明: 如果碰到“Pickling” “serialization”, “marshalling,”  or “flattening”,都是表达相同的意思,翻译成"序列化"就好了;如果单词前加了un,就翻成“反序列化”。

Warning:The pickle module is not secure against erroneous or maliciously constructed data. Never unpickle data received from an untrusted or unauthenticated source.

不要去序列化 错误的或者恶意的 结构化数据,也不要去反序列化 不受信任或未授权的数据源。意思就是“序列化”和“反序列化”要按照pickle模块的规则来进行。

Data stream format

The data format used by pickle is Python-specific. This has the advantage that there are no restrictions imposed by external standards such as JSON or XDR (which can’t represent pointer sharing); however it means that non-Python programs may not be able to reconstruct pickled Python objects.

pickle使用的数据格式是Python语言特有的。非Python程序可能不能重构 被序列化 的数据。

By default, the pickle data format uses a relatively compact binary representation. If you need optimal size characteristics, you can efficiently compress pickled data.

默认,pickle的序列化数据格式是一种相对紧凑的二进制表示。如果对数据大小有更高要求,可以压缩 已序列化的数据。

The module pickletools contains tools for analyzing data streams generated by picklepickletools source code has extensive comments about opcodes used by pickle protocols.

pickletools包含很多用来解析 已序列化数据的工具。

There are currently 5 different protocols which can be used for pickling. The higher the protocol used, the more recent the version of Python needed to read the pickle produced.

  • Protocol version 0 is the original “human-readable” protocol and is backwards compatible with earlier versions of Python.
  • Protocol version 1 is an old binary format which is also compatible with earlier versions of Python.
  • Protocol version 2 was introduced in Python 2.3. It provides much more efficient pickling of new-style classes. Refer to PEP 307 for information about improvements brought by protocol 2.
  • Protocol version 3 was added in Python 3.0. It has explicit support for bytes objects and cannot be unpickled by Python 2.x. This is the default protocol, and the recommended protocol when compatibility with other Python 3 versions is required.
  • Protocol version 4 was added in Python 3.4. It adds support for very large objects, pickling more kinds of objects, and some data format optimizations. Refer to PEP 3154 for information about improvements brought by protocol 4.

Note:

Serialization is a more primitive notion than persistence; although pickle reads and writes file objects, it does not handle the issue of naming persistent objects, nor the (even more complicated) issue of concurrent access to persistent objects. The pickle module can transform a complex object into a byte stream and it can transform the byte stream into an object with the same internal structure. Perhaps the most obvious thing to do with these byte streams is to write them onto a file, but it is also conceivable to send them across a network or store them in a database. The shelve module provides a simple interface to pickle and unpickle objects on DBM-style database files.

Module Interface

To serialize an object hierarchy, you simply call the dumps() function. Similarly, to de-serialize a data stream, you call the loads() function. However, if you want more control over serialization and de-serialization, you can create a Pickler or an Unpickler object, respectively.

通过dumps()进行序列化,通过loads()进行反序列化.

The pickle module provides the following constants:

  pickle.HIGHEST_PROTOCOL 即指定协议版本号为最高版本号。

    An integer, the highest protocol version available. This value can be passed as a protocol value to functions dump() and dumps() as well as the Pickler constructor.

  pickle.DEFAULT_PROTOCOL 即指定默认版本号。当前的默认版本号是version3

    An integer, the default protocol version used for pickling. May be less than HIGHEST_PROTOCOL. Currently the default protocol is 3, a new protocol designed for Python 3.

The pickle module provides the following functions to make the pickling process more convenient:

  pickle.dump(objfileprotocol=None*fix_imports=True) 即将数据obj写进文件

Write a pickled representation of obj to the open file object file. This is equivalent to Pickler(file, protocol).dump(obj).

The optional protocol argument, an integer, tells the pickler to use the given protocol; supported protocols are 0 to HIGHEST_PROTOCOL. If not specified, the default is DEFAULT_PROTOCOL. If a negative number is specified, HIGHEST_PROTOCOL is selected.

The file argument must have a write() method that accepts a single bytes argument. It can thus be an on-disk file opened for binary writing, an io.BytesIO instance, or any other custom object that meets this interface.

If fix_imports is true and protocol is less than 3, pickle will try to map the new Python 3 names to the old module names used in Python 2, so that the pickle data stream is readable with Python 2.

pickle.dumps(objprotocol=None*fix_imports=True)

Return the pickled representation of the object as a bytes object, instead of writing it to a file.

Arguments protocol and fix_imports have the same meaning as in dump().

pickle.load(file*fix_imports=Trueencoding="ASCII"errors="strict")

Read a pickled object representation from the open file object file and return the reconstituted object hierarchy specified therein. This is equivalent to Unpickler(file).load().

The protocol version of the pickle is detected automatically, so no protocol argument is needed. Bytes past the pickled object’s representation are ignored.

The argument file must have two methods, a read() method that takes an integer argument, and a readline() method that requires no arguments. Both methods should return bytes. Thus file can be an on-disk file opened for binary reading, an io.BytesIO object, or any other custom object that meets this interface.

Optional keyword arguments are fix_importsencoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.

pickle.loads(bytes_object*fix_imports=Trueencoding="ASCII"errors="strict")

Read a pickled object hierarchy from a bytes object and return the reconstituted object hierarchy specified therein.

The protocol version of the pickle is detected automatically, so no protocol argument is needed. Bytes past the pickled object’s representation are ignored.

Optional keyword arguments are fix_importsencoding and errors, which are used to control compatibility support for pickle stream generated by Python 2. If fix_imports is true, pickle will try to map the old Python 2 names to the new names used in Python 3. The encoding and errors tell pickle how to decode 8-bit string instances pickled by Python 2; these default to ‘ASCII’ and ‘strict’, respectively. The encoding can be ‘bytes’ to read these 8-bit string instances as bytes objects.

The pickle module defines three exceptions:

exception pickle.PickleError

Common base class for the other pickling exceptions. It inherits Exception.

exception pickle.PicklingError

Error raised when an unpicklable object is encountered by Pickler. It inherits PickleError.

Refer to What can be pickled and unpickled? to learn what kinds of objects can be pickled.

exception pickle.UnpicklingError

Error raised when there is a problem unpickling an object, such as a data corruption or a security violation. It inherits PickleError.

Note that other exceptions may also be raised during unpickling, including (but not necessarily limited to) AttributeError, EOFError, ImportError, and IndexError.

What can be pickled and unpickled?

The following types can be pickled:

  • NoneTrue, and False
  • integers, floating point numbers, complex numbers
  • strings, bytes, bytearrays
  • tuples, lists, sets, and dictionaries containing only picklable objects
  • functions defined at the top level of a module (using def, not lambda)
  • built-in functions defined at the top level of a module
  • classes that are defined at the top level of a module
  • instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section Pickling Class Instances for details).

python对象序列化之pickle的更多相关文章

  1. Python学习笔记12:标准库之对象序列化(pickle包,cPickle包)

    计算机的内存中存储的是二进制的序列. 我们能够直接将某个对象所相应位置的数据抓取下来,转换成文本流 (这个过程叫做serialize),然后将文本流存入到文件里. 因为Python在创建对象时,要參考 ...

  2. python对象序列化或持久化的方法

    http://blog.csdn.net/chen_lovelotus/article/details/7233293 一.Python对象持久化方法 目前为止,据我所知,在python中对象持久化有 ...

  3. Python基础-序列化(json/pickle)

    我们把对象(变量)从内存中变成可存储的过程称之为序列化,比如XML,在Python中叫pickling,在其他语言中也被称之为serialization,marshalling,flattening等 ...

  4. python对象序列化pickle

    import pickle class A: users = {} c = 1 def get_self(self): return self def n(self): return 1 def pi ...

  5. 对象序列化:pickle和shelve

    import pickle class DVD: def __init__(self,tilte,year=None,duration=None,director_id=None): self.tit ...

  6. python 序列化 json pickle

    python的pickle模块实现了基本的数据序列和反序列化.通过pickle模块的序列化操作我们能够将程序中运行的对象信息保存到文件中去,永久存储:通过pickle模块的反序列化操作,我们能够从文件 ...

  7. cPickle对python对象进行序列化,序列化到文件或内存

    pickle模块使用的数据格式是python专用的,并且不同版本不向后兼容,同时也不能被其他语言说识别.要和其他语言交互,可以使用内置的json包 cPickle可以对任意一种类型的python对象进 ...

  8. Python:序列化 pickle JSON

    序列化 在程序运行的过程中,所有的变量都储存在内存中,例如定义一个dict d=dict(name='Bob',age=20,score=88) 可以随时修改变量,比如把name修改为'Bill',但 ...

  9. python pickle模块的使用/将python数据对象序列化保存到文件中

    # Python 使用pickle/cPickle模块进行数据的序列化 """Python序列化的概念很简单.内存里面有一个数据结构, 你希望将它保存下来,重用,或者发送 ...

随机推荐

  1. Dreamweaver 支持Jquery智能提示

    a.下载扩展插件:jQuery_API.mxp b.选择菜单栏:命令->扩展管理,选择刚下载的文件安装 c.重启DW 可以看到

  2. 关于Java代码优化的44条建议!

    关于Java代码优化的N条建议! 本文是作者:五月的仓颉 结合自己的工作和平时学习的体验重新谈一下为什么要进行代码优化.在修改之前,作者的说法是这样的: 就像鲸鱼吃虾米一样,也许吃一个两个虾米对于鲸鱼 ...

  3. select标签中的选项分组

    select标签中的选项分组 <select name="showtimes"> <optgroup label="下午一点"> < ...

  4. SQL注入深入剖析

    SQL注入是一门很深的学问,也是一门很有技巧性的学问 1.  运算符的优先级介绍 2.  SQL语句执行函数介绍 mysql_query() 仅对 SELECT,SHOW,EXPLAIN 或 DESC ...

  5. centos使用密钥替换密码登录服务器

    一.首先登陆centos,切换用户,切换到你要免密码登陆的用户,进入到家目录,以下我以admin为例,命令:su admincd ~ 二.创建钥匙,命令:ssh-keygen -t rsa,一路按Y搞 ...

  6. 体积雾 global fog unity 及改进

    https://github.com/keijiro/KinoFog lighting box 2.7.2 halfspace fog 相机位于一个位置 这个位置可以在fog volumn里面或者外面 ...

  7. css Table布局:基于display:table的CSS布局

    两种类型的表格布局 你有两种方式使用表格布局 -HTML Table(<table>标签)和CSS Table(display:table 等相关属性). HTML Table是指使用原生 ...

  8. Junit参数化测试Spring应用Dubbo接口

    一.创建基础类. package com.tree.autotest; import org.junit.Before;import org.springframework.context.annot ...

  9. Windows注册表中修改CMD默认路径

    一.开启注册表“win键+R键”并输入regedit 二.在注册表项 HKEY_CURRENT_USER\ Software\ Microsoft\ Command Processor 新建一个项,并 ...

  10. xcode arc 下使用 block警告 Capturing [an object] strongly in this block is likely to lead to a retain cycle” in ARC-enabled code

    xcode arc 下使用 block警告 Capturing [an object] strongly in this block is likely to lead to a retain cyc ...