The pickle and json modules
The json module
The json module implements serialization and deserialization; it is mainly used for exchanging data between different programs. Let's take a look:
Serializing with dumps()
- import json
- '''the json module implements serialization and deserialization'''
- users = ["alex","tom","wupeiqi","sb","耿长学"]
- mes = json.dumps(users) # serialize to a JSON string, then print it
- print(mes)
The output is:
["alex", "tom", "wupeiqi", "sb", "\u803f\u957f\u5b66"]
As the output shows, dumps() produces a serialized JSON string (we will contrast it with dump() below), and non-ASCII characters such as Chinese are escaped as \uXXXX sequences rather than written out literally.
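The escaping is controlled by the ensure_ascii parameter of json.dumps(), which defaults to True. A minimal sketch (reusing the list above) that keeps the Chinese characters readable:
- import json
- users = ["alex","tom","wupeiqi","sb","耿长学"]
- print(json.dumps(users, ensure_ascii=False)) # ["alex", "tom", "wupeiqi", "sb", "耿长学"]
Either form is valid JSON, and json.loads() accepts both.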
Deserializing with loads()
- import json
- '''the json module implements serialization and deserialization'''
- users = ["alex","tom","wupeiqi","sb","耿长学"]
- mes = json.dumps(users) # serialize to a JSON string
- print("serialized:",mes)
- data = json.loads(mes)
- print("deserialized:",data)
The output is:
serialized: ["alex", "tom", "wupeiqi", "sb", "\u803f\u957f\u5b66"]
deserialized: ['alex', 'tom', 'wupeiqi', 'sb', '耿长学']
The dumps()/loads() pair above exchanges data directly between programs, without going through a file: one side serializes an object with json.dumps(), and the receiver rebuilds it with json.loads(). So how do we exchange data through a file?
Exchanging data through a file:
- import json
- '''the json module implements serialization and deserialization'''
- users = ["alex","tom","wupeiqi","sb","耿长学"]
- mes = json.dumps(users) # serialize to a JSON string
- '''write the serialized string mes to a file'''
- with open("users",'w+') as fw:
- fw.write(mes)
In the code above we store the serialized string in a file: json.dumps() produces the string, and we have to call the file's write() method ourselves to save it.
Deserializing from the file:
- import json
- with open('users','r+') as fr:
- mess = json.loads(fr.read())
- print(mess)
- The output is:
- ['alex', 'tom', 'wupeiqi', 'sb', '耿长学']
To deserialize from the file, we first read the contents back out: since the data was stored with write(), it has to be fetched with read(), and only then can json.loads() rebuild the object.
Now let's look at how dump() and load() serialize and deserialize:
Serializing with dump()
- import json
- users = {"alex": "sb", "wupeiqi": 2, "耿长学": 3}
- with open("users","w+") as fw:
- json.dump(users,fw)
As you can see, json.dump() serializes the object straight into the given file; it does the writing itself, whereas dumps() only produces a string that you then have to write() to the file yourself.
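In other words, json.dump(obj, fp) behaves like fp.write(json.dumps(obj)). A minimal sketch of the two spellings side by side (the file names here are arbitrary):
- import json
- users = {"alex": "sb"}
- with open("users_a","w") as fw:
- json.dump(users, fw) # dump() writes to the file itself
- with open("users_b","w") as fw:
- fw.write(json.dumps(users)) # dumps() leaves the write() to you
Both files end up containing the same JSON text.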
Deserializing with load()
- import json
- with open('users','r+') as fr:
- users = json.load(fr)
- print(users)
- The output is:
- {'alex': 'sb', '耿长学': 3, 'wupeiqi': 2}
load() needs no separate read() step; it deserializes straight from the file object, mirroring how dump() wrote to it.
Summary:
Comparing the two pairs: dumps() serializes an object to a string and loads() parses such a string back, so dumps()/loads() suit data exchange between different programs or interfaces, while dump()/load() write to and read from files directly and so suit file-level persistence. Each pair has its own emphasis.
This is not a case of identical functionality with different usage; the two pairs were designed with different emphases from the start.
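The title of this post also covers pickle, which exposes the same four functions but serializes to bytes and handles Python-specific types that json rejects; its source follows the json source below. A minimal sketch (the set value is just an arbitrary example of such a type, and data.pkl an arbitrary file name):
- import pickle
- data = {"name": "耿长学", "tags": {"python", "pickle"}} # json.dumps() would raise TypeError on the set
- blob = pickle.dumps(data) # bytes, not str
- print(pickle.loads(blob) == data) # True
- with open("data.pkl","wb") as fw: # pickle files must be opened in binary mode
- pickle.dump(data, fw)
- with open("data.pkl","rb") as fr:
- print(pickle.load(fr) == data) # True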
The json module's source code
- r"""JSON (JavaScript Object Notation) <http://json.org> is a subset of
- JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data
- interchange format.
- :mod:`json` exposes an API familiar to users of the standard library
- :mod:`marshal` and :mod:`pickle` modules. It is derived from a
- version of the externally maintained simplejson library.
- Encoding basic Python object hierarchies::
- >>> import json
- >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
- '["foo", {"bar": ["baz", null, 1.0, 2]}]'
- >>> print(json.dumps("\"foo\bar"))
- "\"foo\bar"
- >>> print(json.dumps('\u1234'))
- "\u1234"
- >>> print(json.dumps('\\'))
- "\\"
- >>> print(json.dumps({"c": , "b": , "a": }, sort_keys=True))
- {"a": , "b": , "c": }
- >>> from io import StringIO
- >>> io = StringIO()
- >>> json.dump(['streaming API'], io)
- >>> io.getvalue()
- '["streaming API"]'
- Compact encoding::
- >>> import json
- >>> from collections import OrderedDict
- >>> mydict = OrderedDict([('4', 5), ('6', 7)])
- >>> json.dumps([1,2,3,mydict], separators=(',', ':'))
- '[1,2,3,{"4":5,"6":7}]'
- Pretty printing::
- >>> import json
- >>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4))
- {
- "4": 5,
- "6": 7
- }
- Decoding JSON::
- >>> import json
- >>> obj = ['foo', {'bar': ['baz', None, 1.0, 2]}]
- >>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]') == obj
- True
- >>> json.loads('"\\"foo\\bar"') == '"foo\x08ar'
- True
- >>> from io import StringIO
- >>> io = StringIO('["streaming API"]')
- >>> json.load(io)[0] == 'streaming API'
- True
- Specializing JSON object decoding::
- >>> import json
- >>> def as_complex(dct):
- ... if '__complex__' in dct:
- ... return complex(dct['real'], dct['imag'])
- ... return dct
- ...
- >>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
- ... object_hook=as_complex)
- (1+2j)
- >>> from decimal import Decimal
- >>> json.loads('1.1', parse_float=Decimal) == Decimal('1.1')
- True
- Specializing JSON object encoding::
- >>> import json
- >>> def encode_complex(obj):
- ... if isinstance(obj, complex):
- ... return [obj.real, obj.imag]
- ... raise TypeError(repr(obj) + " is not JSON serializable")
- ...
- >>> json.dumps(2 + 1j, default=encode_complex)
- '[2.0, 1.0]'
- >>> json.JSONEncoder(default=encode_complex).encode(2 + 1j)
- '[2.0, 1.0]'
- >>> ''.join(json.JSONEncoder(default=encode_complex).iterencode(2 + 1j))
- '[2.0, 1.0]'
- Using json.tool from the shell to validate and pretty-print::
- $ echo '{"json":"obj"}' | python -m json.tool
- {
- "json": "obj"
- }
- $ echo '{ 1.2:3.4}' | python -m json.tool
- Expecting property name enclosed in double quotes: line 1 column 3 (char 2)
- """
- __version__ = '2.0.9'
- __all__ = [
- 'dump', 'dumps', 'load', 'loads',
- 'JSONDecoder', 'JSONDecodeError', 'JSONEncoder',
- ]
- __author__ = 'Bob Ippolito <bob@redivi.com>'
- from .decoder import JSONDecoder, JSONDecodeError
- from .encoder import JSONEncoder
- _default_encoder = JSONEncoder(
- skipkeys=False,
- ensure_ascii=True,
- check_circular=True,
- allow_nan=True,
- indent=None,
- separators=None,
- default=None,
- )
- def dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True,
- allow_nan=True, cls=None, indent=None, separators=None,
- default=None, sort_keys=False, **kw):
- """Serialize ``obj`` as a JSON formatted stream to ``fp`` (a
- ``.write()``-supporting file-like object).
- If ``skipkeys`` is true then ``dict`` keys that are not basic types
- (``str``, ``int``, ``float``, ``bool``, ``None``) will be skipped
- instead of raising a ``TypeError``.
- If ``ensure_ascii`` is false, then the strings written to ``fp`` can
- contain non-ASCII characters if they appear in strings contained in
- ``obj``. Otherwise, all such characters are escaped in JSON strings.
- If ``check_circular`` is false, then the circular reference check
- for container types will be skipped and a circular reference will
- result in an ``OverflowError`` (or worse).
- If ``allow_nan`` is false, then it will be a ``ValueError`` to
- serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``)
- in strict compliance of the JSON specification, instead of using the
- JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
- If ``indent`` is a non-negative integer, then JSON array elements and
- object members will be pretty-printed with that indent level. An indent
- level of 0 will only insert newlines. ``None`` is the most compact
- representation.
- If specified, ``separators`` should be an ``(item_separator, key_separator)``
- tuple. The default is ``(', ', ': ')`` if *indent* is ``None`` and
- ``(',', ': ')`` otherwise. To get the most compact JSON representation,
- you should specify ``(',', ':')`` to eliminate whitespace.
- ``default(obj)`` is a function that should return a serializable version
- of obj or raise TypeError. The default simply raises TypeError.
- If *sort_keys* is ``True`` (default: ``False``), then the output of
- dictionaries will be sorted by key.
- To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
- ``.default()`` method to serialize additional types), specify it with
- the ``cls`` kwarg; otherwise ``JSONEncoder`` is used.
- """
- # cached encoder
- if (not skipkeys and ensure_ascii and
- check_circular and allow_nan and
- cls is None and indent is None and separators is None and
- default is None and not sort_keys and not kw):
- iterable = _default_encoder.iterencode(obj)
- else:
- if cls is None:
- cls = JSONEncoder
- iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii,
- check_circular=check_circular, allow_nan=allow_nan, indent=indent,
- separators=separators,
- default=default, sort_keys=sort_keys, **kw).iterencode(obj)
- # could accelerate with writelines in some versions of Python, at
- # a debuggability cost
- for chunk in iterable:
- fp.write(chunk)
- def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True,
- allow_nan=True, cls=None, indent=None, separators=None,
- default=None, sort_keys=False, **kw):
- """Serialize ``obj`` to a JSON formatted ``str``.
- If ``skipkeys`` is true then ``dict`` keys that are not basic types
- (``str``, ``int``, ``float``, ``bool``, ``None``) will be skipped
- instead of raising a ``TypeError``.
- If ``ensure_ascii`` is false, then the return value can contain non-ASCII
- characters if they appear in strings contained in ``obj``. Otherwise, all
- such characters are escaped in JSON strings.
- If ``check_circular`` is false, then the circular reference check
- for container types will be skipped and a circular reference will
- result in an ``OverflowError`` (or worse).
- If ``allow_nan`` is false, then it will be a ``ValueError`` to
- serialize out of range ``float`` values (``nan``, ``inf``, ``-inf``) in
- strict compliance of the JSON specification, instead of using the
- JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``).
- If ``indent`` is a non-negative integer, then JSON array elements and
- object members will be pretty-printed with that indent level. An indent
- level of 0 will only insert newlines. ``None`` is the most compact
- representation.
- If specified, ``separators`` should be an ``(item_separator, key_separator)``
- tuple. The default is ``(', ', ': ')`` if *indent* is ``None`` and
- ``(',', ': ')`` otherwise. To get the most compact JSON representation,
- you should specify ``(',', ':')`` to eliminate whitespace.
- ``default(obj)`` is a function that should return a serializable version
- of obj or raise TypeError. The default simply raises TypeError.
- If *sort_keys* is ``True`` (default: ``False``), then the output of
- dictionaries will be sorted by key.
- To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the
- ``.default()`` method to serialize additional types), specify it with
- the ``cls`` kwarg; otherwise ``JSONEncoder`` is used.
- """
- # cached encoder
- if (not skipkeys and ensure_ascii and
- check_circular and allow_nan and
- cls is None and indent is None and separators is None and
- default is None and not sort_keys and not kw):
- return _default_encoder.encode(obj)
- if cls is None:
- cls = JSONEncoder
- return cls(
- skipkeys=skipkeys, ensure_ascii=ensure_ascii,
- check_circular=check_circular, allow_nan=allow_nan, indent=indent,
- separators=separators, default=default, sort_keys=sort_keys,
- **kw).encode(obj)
- _default_decoder = JSONDecoder(object_hook=None, object_pairs_hook=None)
- def load(fp, cls=None, object_hook=None, parse_float=None,
- parse_int=None, parse_constant=None, object_pairs_hook=None, **kw):
- """Deserialize ``fp`` (a ``.read()``-supporting file-like object containing
- a JSON document) to a Python object.
- ``object_hook`` is an optional function that will be called with the
- result of any object literal decode (a ``dict``). The return value of
- ``object_hook`` will be used instead of the ``dict``. This feature
- can be used to implement custom decoders (e.g. JSON-RPC class hinting).
- ``object_pairs_hook`` is an optional function that will be called with the
- result of any object literal decoded with an ordered list of pairs. The
- return value of ``object_pairs_hook`` will be used instead of the ``dict``.
- This feature can be used to implement custom decoders that rely on the
- order that the key and value pairs are decoded (for example,
- collections.OrderedDict will remember the order of insertion). If
- ``object_hook`` is also defined, the ``object_pairs_hook`` takes priority.
- To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
- kwarg; otherwise ``JSONDecoder`` is used.
- """
- return loads(fp.read(),
- cls=cls, object_hook=object_hook,
- parse_float=parse_float, parse_int=parse_int,
- parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
- def loads(s, encoding=None, cls=None, object_hook=None, parse_float=None,
- parse_int=None, parse_constant=None, object_pairs_hook=None, **kw):
- """Deserialize ``s`` (a ``str`` instance containing a JSON
- document) to a Python object.
- ``object_hook`` is an optional function that will be called with the
- result of any object literal decode (a ``dict``). The return value of
- ``object_hook`` will be used instead of the ``dict``. This feature
- can be used to implement custom decoders (e.g. JSON-RPC class hinting).
- ``object_pairs_hook`` is an optional function that will be called with the
- result of any object literal decoded with an ordered list of pairs. The
- return value of ``object_pairs_hook`` will be used instead of the ``dict``.
- This feature can be used to implement custom decoders that rely on the
- order that the key and value pairs are decoded (for example,
- collections.OrderedDict will remember the order of insertion). If
- ``object_hook`` is also defined, the ``object_pairs_hook`` takes priority.
- ``parse_float``, if specified, will be called with the string
- of every JSON float to be decoded. By default this is equivalent to
- float(num_str). This can be used to use another datatype or parser
- for JSON floats (e.g. decimal.Decimal).
- ``parse_int``, if specified, will be called with the string
- of every JSON int to be decoded. By default this is equivalent to
- int(num_str). This can be used to use another datatype or parser
- for JSON integers (e.g. float).
- ``parse_constant``, if specified, will be called with one of the
- following strings: -Infinity, Infinity, NaN, null, true, false.
- This can be used to raise an exception if invalid JSON numbers
- are encountered.
- To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
- kwarg; otherwise ``JSONDecoder`` is used.
- The ``encoding`` argument is ignored and deprecated.
- """
- if not isinstance(s, str):
- raise TypeError('the JSON object must be str, not {!r}'.format(
- s.__class__.__name__))
- if s.startswith(u'\ufeff'):
- raise JSONDecodeError("Unexpected UTF-8 BOM (decode using utf-8-sig)",
- s, 0)
- if (cls is None and object_hook is None and
- parse_int is None and parse_float is None and
- parse_constant is None and object_pairs_hook is None and not kw):
- return _default_decoder.decode(s)
- if cls is None:
- cls = JSONDecoder
- if object_hook is not None:
- kw['object_hook'] = object_hook
- if object_pairs_hook is not None:
- kw['object_pairs_hook'] = object_pairs_hook
- if parse_float is not None:
- kw['parse_float'] = parse_float
- if parse_int is not None:
- kw['parse_int'] = parse_int
- if parse_constant is not None:
- kw['parse_constant'] = parse_constant
- return cls(**kw).decode(s)
The pickle module's source code
- """Create portable serialized representations of Python objects.
- See module copyreg for a mechanism for registering custom picklers.
- See module pickletools source for extensive comments.
- Classes:
- Pickler
- Unpickler
- Functions:
- dump(object, file)
- dumps(object) -> string
- load(file) -> object
- loads(string) -> object
- Misc variables:
- __version__
- format_version
- compatible_formats
- """
- from types import FunctionType
- from copyreg import dispatch_table
- from copyreg import _extension_registry, _inverted_registry, _extension_cache
- from itertools import islice
- import sys
- from sys import maxsize
- from struct import pack, unpack
- import re
- import io
- import codecs
- import _compat_pickle
- __all__ = ["PickleError", "PicklingError", "UnpicklingError", "Pickler",
- "Unpickler", "dump", "dumps", "load", "loads"]
- # Shortcut for use in isinstance testing
- bytes_types = (bytes, bytearray)
- # These are purely informational; no code uses these.
- format_version = "4.0" # File format version we write
- compatible_formats = ["1.0", # Original protocol 0
- "1.1", # Protocol 0 with INST added
- "1.2", # Original protocol 1
- "1.3", # Protocol 1 with BINFLOAT added
- "2.0", # Protocol 2
- "3.0", # Protocol 3
- "4.0", # Protocol 4
- ] # Old format versions we can read
- # This is the highest protocol number we know how to read.
- HIGHEST_PROTOCOL = 4
- # The protocol we write by default. May be less than HIGHEST_PROTOCOL.
- # We intentionally write a protocol that Python 2.x cannot read;
- # there are too many issues with that.
- DEFAULT_PROTOCOL = 3
- class PickleError(Exception):
- """A common base class for the other pickling exceptions."""
- pass
- class PicklingError(PickleError):
- """This exception is raised when an unpicklable object is passed to the
- dump() method.
- """
- pass
- class UnpicklingError(PickleError):
- """This exception is raised when there is a problem unpickling an object,
- such as a security violation.
- Note that other exceptions may also be raised during unpickling, including
- (but not necessarily limited to) AttributeError, EOFError, ImportError,
- and IndexError.
- """
- pass
- # An instance of _Stop is raised by Unpickler.load_stop() in response to
- # the STOP opcode, passing the object that is the result of unpickling.
- class _Stop(Exception):
- def __init__(self, value):
- self.value = value
- # Jython has PyStringMap; it's a dict subclass with string keys
- try:
- from org.python.core import PyStringMap
- except ImportError:
- PyStringMap = None
- # Pickle opcodes. See pickletools.py for extensive docs. The listing
- # here is in kind-of alphabetical order of 1-character pickle code.
- # pickletools groups them by purpose.
- MARK = b'(' # push special markobject on stack
- STOP = b'.' # every pickle ends with STOP
- POP = b'0' # discard topmost stack item
- POP_MARK = b'1' # discard stack top through topmost markobject
- DUP = b'2' # duplicate top stack item
- FLOAT = b'F' # push float object; decimal string argument
- INT = b'I' # push integer or bool; decimal string argument
- BININT = b'J' # push four-byte signed int
- BININT1 = b'K' # push 1-byte unsigned int
- LONG = b'L' # push long; decimal string argument
- BININT2 = b'M' # push 2-byte unsigned int
- NONE = b'N' # push None
- PERSID = b'P' # push persistent object; id is taken from string arg
- BINPERSID = b'Q' # " " " ; " " " " stack
- REDUCE = b'R' # apply callable to argtuple, both on stack
- STRING = b'S' # push string; NL-terminated string argument
- BINSTRING = b'T' # push string; counted binary string argument
- SHORT_BINSTRING= b'U' # " " ; " " " " < 256 bytes
- UNICODE = b'V' # push Unicode string; raw-unicode-escaped'd argument
- BINUNICODE = b'X' # " " " ; counted UTF-8 string argument
- APPEND = b'a' # append stack top to list below it
- BUILD = b'b' # call __setstate__ or __dict__.update()
- GLOBAL = b'c' # push self.find_class(modname, name); 2 string args
- DICT = b'd' # build a dict from stack items
- EMPTY_DICT = b'}' # push empty dict
- APPENDS = b'e' # extend list on stack by topmost stack slice
- GET = b'g' # push item from memo on stack; index is string arg
- BINGET = b'h' # " " " " " " ; " " 1-byte arg
- INST = b'i' # build & push class instance
- LONG_BINGET = b'j' # push item from memo on stack; index is 4-byte arg
- LIST = b'l' # build list from topmost stack items
- EMPTY_LIST = b']' # push empty list
- OBJ = b'o' # build & push class instance
- PUT = b'p' # store stack top in memo; index is string arg
- BINPUT = b'q' # " " " " " ; " " 1-byte arg
- LONG_BINPUT = b'r' # " " " " " ; " " 4-byte arg
- SETITEM = b's' # add key+value pair to dict
- TUPLE = b't' # build tuple from topmost stack items
- EMPTY_TUPLE = b')' # push empty tuple
- SETITEMS = b'u' # modify dict by adding topmost key+value pairs
- BINFLOAT = b'G' # push float; arg is 8-byte float encoding
- TRUE = b'I01\n' # not an opcode; see INT docs in pickletools.py
- FALSE = b'I00\n' # not an opcode; see INT docs in pickletools.py
- # Protocol 2
- PROTO = b'\x80' # identify pickle protocol
- NEWOBJ = b'\x81' # build object by applying cls.__new__ to argtuple
- EXT1 = b'\x82' # push object from extension registry; 1-byte index
- EXT2 = b'\x83' # ditto, but 2-byte index
- EXT4 = b'\x84' # ditto, but 4-byte index
- TUPLE1 = b'\x85' # build 1-tuple from stack top
- TUPLE2 = b'\x86' # build 2-tuple from two topmost stack items
- TUPLE3 = b'\x87' # build 3-tuple from three topmost stack items
- NEWTRUE = b'\x88' # push True
- NEWFALSE = b'\x89' # push False
- LONG1 = b'\x8a' # push long from < 256 bytes
- LONG4 = b'\x8b' # push really big long
- _tuplesize2code = [EMPTY_TUPLE, TUPLE1, TUPLE2, TUPLE3]
- # Protocol 3 (Python 3.x)
- BINBYTES = b'B' # push bytes; counted binary string argument
- SHORT_BINBYTES = b'C' # " " ; " " " " < 256 bytes
- # Protocol 4
- SHORT_BINUNICODE = b'\x8c' # push short string; UTF-8 length < 256 bytes
- BINUNICODE8 = b'\x8d' # push very long string
- BINBYTES8 = b'\x8e' # push very long bytes string
- EMPTY_SET = b'\x8f' # push empty set on the stack
- ADDITEMS = b'\x90' # modify set by adding topmost stack items
- FROZENSET = b'\x91' # build frozenset from topmost stack items
- NEWOBJ_EX = b'\x92' # like NEWOBJ but work with keyword only arguments
- STACK_GLOBAL = b'\x93' # same as GLOBAL but using names on the stacks
- MEMOIZE = b'\x94' # store top of the stack in memo
- FRAME = b'\x95' # indicate the beginning of a new frame
- __all__.extend([x for x in dir() if re.match("[A-Z][A-Z0-9_]+$", x)])
- class _Framer:
- _FRAME_SIZE_TARGET = 64 * 1024
- def __init__(self, file_write):
- self.file_write = file_write
- self.current_frame = None
- def start_framing(self):
- self.current_frame = io.BytesIO()
- def end_framing(self):
- if self.current_frame and self.current_frame.tell() > 0:
- self.commit_frame(force=True)
- self.current_frame = None
- def commit_frame(self, force=False):
- if self.current_frame:
- f = self.current_frame
- if f.tell() >= self._FRAME_SIZE_TARGET or force:
- with f.getbuffer() as data:
- n = len(data)
- write = self.file_write
- write(FRAME)
- write(pack("<Q", n))
- write(data)
- f.seek(0)
- f.truncate()
- def write(self, data):
- if self.current_frame:
- return self.current_frame.write(data)
- else:
- return self.file_write(data)
- class _Unframer:
- def __init__(self, file_read, file_readline, file_tell=None):
- self.file_read = file_read
- self.file_readline = file_readline
- self.current_frame = None
- def read(self, n):
- if self.current_frame:
- data = self.current_frame.read(n)
- if not data and n != 0:
- self.current_frame = None
- return self.file_read(n)
- if len(data) < n:
- raise UnpicklingError(
- "pickle exhausted before end of frame")
- return data
- else:
- return self.file_read(n)
- def readline(self):
- if self.current_frame:
- data = self.current_frame.readline()
- if not data:
- self.current_frame = None
- return self.file_readline()
- if data[-1] != b'\n'[0]:
- raise UnpicklingError(
- "pickle exhausted before end of frame")
- return data
- else:
- return self.file_readline()
- def load_frame(self, frame_size):
- if self.current_frame and self.current_frame.read() != b'':
- raise UnpicklingError(
- "beginning of a new frame before end of current frame")
- self.current_frame = io.BytesIO(self.file_read(frame_size))
- # Tools used for pickling.
- def _getattribute(obj, name):
- for subpath in name.split('.'):
- if subpath == '<locals>':
- raise AttributeError("Can't get local attribute {!r} on {!r}"
- .format(name, obj))
- try:
- parent = obj
- obj = getattr(obj, subpath)
- except AttributeError:
- raise AttributeError("Can't get attribute {!r} on {!r}"
- .format(name, obj))
- return obj, parent
- def whichmodule(obj, name):
- """Find the module an object belong to."""
- module_name = getattr(obj, '__module__', None)
- if module_name is not None:
- return module_name
- # Protect the iteration by using a list copy of sys.modules against dynamic
- # modules that trigger imports of other modules upon calls to getattr.
- for module_name, module in list(sys.modules.items()):
- if module_name == '__main__' or module is None:
- continue
- try:
- if _getattribute(module, name)[0] is obj:
- return module_name
- except AttributeError:
- pass
- return '__main__'
- def encode_long(x):
- r"""Encode a long to a two's complement little-endian binary string.
- Note that 0 is a special case, returning an empty string, to save a
- byte in the LONG1 pickling context.
- >>> encode_long(0)
- b''
- >>> encode_long(255)
- b'\xff\x00'
- >>> encode_long(32767)
- b'\xff\x7f'
- >>> encode_long(-256)
- b'\x00\xff'
- >>> encode_long(-32768)
- b'\x00\x80'
- >>> encode_long(-128)
- b'\x80'
- >>> encode_long(127)
- b'\x7f'
- >>>
- """
- if x == 0:
- return b''
- nbytes = (x.bit_length() >> 3) + 1
- result = x.to_bytes(nbytes, byteorder='little', signed=True)
- if x < 0 and nbytes > 1:
- if result[-1] == 0xff and (result[-2] & 0x80) != 0:
- result = result[:-1]
- return result
- def decode_long(data):
- r"""Decode a long from a two's complement little-endian binary string.
- >>> decode_long(b'')
- 0
- >>> decode_long(b"\xff\x00")
- 255
- >>> decode_long(b"\xff\x7f")
- 32767
- >>> decode_long(b"\x00\xff")
- -256
- >>> decode_long(b"\x00\x80")
- -32768
- >>> decode_long(b"\x80")
- -128
- >>> decode_long(b"\x7f")
- 127
- """
- return int.from_bytes(data, byteorder='little', signed=True)
- # Pickling machinery
- class _Pickler:
- def __init__(self, file, protocol=None, *, fix_imports=True):
- """This takes a binary file for writing a pickle data stream.
- The optional *protocol* argument tells the pickler to use the
- given protocol; supported protocols are 0, 1, 2, 3 and 4. The
- default protocol is 3; a backward-incompatible protocol designed
- for Python 3.
- Specifying a negative protocol version selects the highest
- protocol version supported. The higher the protocol used, the
- more recent the version of Python needed to read the pickle
- produced.
- The *file* argument must have a write() method that accepts a
- single bytes argument. It can thus be a file object opened for
- binary writing, an io.BytesIO instance, or any other custom
- object that meets this interface.
- If *fix_imports* is True and *protocol* is less than 3, pickle
- will try to map the new Python 3 names to the old module names
- used in Python 2, so that the pickle data stream is readable
- with Python 2.
- """
- if protocol is None:
- protocol = DEFAULT_PROTOCOL
- if protocol < 0:
- protocol = HIGHEST_PROTOCOL
- elif not 0 <= protocol <= HIGHEST_PROTOCOL:
- raise ValueError("pickle protocol must be <= %d" % HIGHEST_PROTOCOL)
- try:
- self._file_write = file.write
- except AttributeError:
- raise TypeError("file must have a 'write' attribute")
- self.framer = _Framer(self._file_write)
- self.write = self.framer.write
- self.memo = {}
- self.proto = int(protocol)
- self.bin = protocol >= 1
- self.fast = 0
- self.fix_imports = fix_imports and protocol < 3
- def clear_memo(self):
- """Clears the pickler's "memo".
- The memo is the data structure that remembers which objects the
- pickler has already seen, so that shared or recursive objects
- are pickled by reference and not by value. This method is
- useful when re-using picklers.
- """
- self.memo.clear()
- def dump(self, obj):
- """Write a pickled representation of obj to the open file."""
- # Check whether Pickler was initialized correctly. This is
- # only needed to mimic the behavior of _pickle.Pickler.dump().
- if not hasattr(self, "_file_write"):
- raise PicklingError("Pickler.__init__() was not called by "
- "%s.__init__()" % (self.__class__.__name__,))
- if self.proto >= 2:
- self.write(PROTO + pack("<B", self.proto))
- if self.proto >= 4:
- self.framer.start_framing()
- self.save(obj)
- self.write(STOP)
- self.framer.end_framing()
- def memoize(self, obj):
- """Store an object in the memo."""
- # The Pickler memo is a dictionary mapping object ids to 2-tuples
- # that contain the Unpickler memo key and the object being memoized.
- # The memo key is written to the pickle and will become
- # the key in the Unpickler's memo. The object is stored in the
- # Pickler memo so that transient objects are kept alive during
- # pickling.
- # The use of the Unpickler memo length as the memo key is just a
- # convention. The only requirement is that the memo values be unique.
- # But there appears no advantage to any other scheme, and this
- # scheme allows the Unpickler memo to be implemented as a plain (but
- # growable) array, indexed by memo key.
- if self.fast:
- return
- assert id(obj) not in self.memo
- idx = len(self.memo)
- self.write(self.put(idx))
- self.memo[id(obj)] = idx, obj
- # Return a PUT (BINPUT, LONG_BINPUT) opcode string, with argument i.
- def put(self, idx):
- if self.proto >= 4:
- return MEMOIZE
- elif self.bin:
- if idx < 256:
- return BINPUT + pack("<B", idx)
- else:
- return LONG_BINPUT + pack("<I", idx)
- else:
- return PUT + repr(idx).encode("ascii") + b'\n'
- # Return a GET (BINGET, LONG_BINGET) opcode string, with argument i.
- def get(self, i):
- if self.bin:
- if i < 256:
- return BINGET + pack("<B", i)
- else:
- return LONG_BINGET + pack("<I", i)
- return GET + repr(i).encode("ascii") + b'\n'
- def save(self, obj, save_persistent_id=True):
- self.framer.commit_frame()
- # Check for persistent id (defined by a subclass)
- pid = self.persistent_id(obj)
- if pid is not None and save_persistent_id:
- self.save_pers(pid)
- return
- # Check the memo
- x = self.memo.get(id(obj))
- if x is not None:
- self.write(self.get(x[0]))
- return
- # Check the type dispatch table
- t = type(obj)
- f = self.dispatch.get(t)
- if f is not None:
- f(self, obj) # Call unbound method with explicit self
- return
- # Check private dispatch table if any, or else copyreg.dispatch_table
- reduce = getattr(self, 'dispatch_table', dispatch_table).get(t)
- if reduce is not None:
- rv = reduce(obj)
- else:
- # Check for a class with a custom metaclass; treat as regular class
- try:
- issc = issubclass(t, type)
- except TypeError: # t is not a class (old Boost; see SF #502085)
- issc = False
- if issc:
- self.save_global(obj)
- return
- # Check for a __reduce_ex__ method, fall back to __reduce__
- reduce = getattr(obj, "__reduce_ex__", None)
- if reduce is not None:
- rv = reduce(self.proto)
- else:
- reduce = getattr(obj, "__reduce__", None)
- if reduce is not None:
- rv = reduce()
- else:
- raise PicklingError("Can't pickle %r object: %r" %
- (t.__name__, obj))
- # Check for string returned by reduce(), meaning "save as global"
- if isinstance(rv, str):
- self.save_global(obj, rv)
- return
- # Assert that reduce() returned a tuple
- if not isinstance(rv, tuple):
- raise PicklingError("%s must return string or tuple" % reduce)
- # Assert that it returned an appropriately sized tuple
- l = len(rv)
- if not (2 <= l <= 5):
- raise PicklingError("Tuple returned by %s must have "
- "two to five elements" % reduce)
- # Save the reduce() output and finally memoize the object
- self.save_reduce(obj=obj, *rv)
- def persistent_id(self, obj):
- # This exists so a subclass can override it
- return None
- def save_pers(self, pid):
- # Save a persistent id reference
- if self.bin:
- self.save(pid, save_persistent_id=False)
- self.write(BINPERSID)
- else:
- self.write(PERSID + str(pid).encode("ascii") + b'\n')
- def save_reduce(self, func, args, state=None, listitems=None,
- dictitems=None, obj=None):
- # This API is called by some subclasses
- if not isinstance(args, tuple):
- raise PicklingError("args from save_reduce() must be a tuple")
- if not callable(func):
- raise PicklingError("func from save_reduce() must be callable")
- save = self.save
- write = self.write
- func_name = getattr(func, "__name__", "")
- if self.proto >= and func_name == "__newobj_ex__":
- cls, args, kwargs = args
- if not hasattr(cls, "__new__"):
- raise PicklingError("args[0] from {} args has no __new__"
- .format(func_name))
- if obj is not None and cls is not obj.__class__:
- raise PicklingError("args[0] from {} args has the wrong class"
- .format(func_name))
- save(cls)
- save(args)
- save(kwargs)
- write(NEWOBJ_EX)
- elif self.proto >= 2 and func_name == "__newobj__":
- # A __reduce__ implementation can direct protocol 2 or newer to
- # use the more efficient NEWOBJ opcode, while still
- # allowing protocol 0 and 1 to work normally. For this to
- # work, the function returned by __reduce__ should be
- # called __newobj__, and its first argument should be a
- # class. The implementation for __newobj__
- # should be as follows, although pickle has no way to
- # verify this:
- #
- # def __newobj__(cls, *args):
- # return cls.__new__(cls, *args)
- #
- # Protocols 0 and 1 will pickle a reference to __newobj__,
- # while protocol 2 (and above) will pickle a reference to
- # cls, the remaining args tuple, and the NEWOBJ code,
- # which calls cls.__new__(cls, *args) at unpickling time
- # (see load_newobj below). If __reduce__ returns a
- # three-tuple, the state from the third tuple item will be
- # pickled regardless of the protocol, calling __setstate__
- # at unpickling time (see load_build below).
- #
- # Note that no standard __newobj__ implementation exists;
- # you have to provide your own. This is to enforce
- # compatibility with Python 2.2 (pickles written using
- # protocol 0 or 1 in Python 2.3 should be unpicklable by
- # Python 2.2).
- cls = args[0]
- if not hasattr(cls, "__new__"):
- raise PicklingError(
- "args[0] from __newobj__ args has no __new__")
- if obj is not None and cls is not obj.__class__:
- raise PicklingError(
- "args[0] from __newobj__ args has the wrong class")
- args = args[1:]
- save(cls)
- save(args)
- write(NEWOBJ)
- else:
- save(func)
- save(args)
- write(REDUCE)
- if obj is not None:
- # If the object is already in the memo, this means it is
- # recursive. In this case, throw away everything we put on the
- # stack, and fetch the object back from the memo.
- if id(obj) in self.memo:
- write(POP + self.get(self.memo[id(obj)][0]))
- else:
- self.memoize(obj)
- # More new special cases (that work with older protocols as
- # well): when __reduce__ returns a tuple with 4 or 5 items,
- # the 4th and 5th item should be iterators that provide list
- # items and dict items (as (key, value) tuples), or None.
- if listitems is not None:
- self._batch_appends(listitems)
- if dictitems is not None:
- self._batch_setitems(dictitems)
- if state is not None:
- save(state)
- write(BUILD)
- # Methods below this point are dispatched through the dispatch table
- dispatch = {}
- def save_none(self, obj):
- self.write(NONE)
- dispatch[type(None)] = save_none
- def save_bool(self, obj):
- if self.proto >= 2:
- self.write(NEWTRUE if obj else NEWFALSE)
- else:
- self.write(TRUE if obj else FALSE)
- dispatch[bool] = save_bool
- def save_long(self, obj):
- if self.bin:
- # If the int is small enough to fit in a signed 4-byte 2's-comp
- # format, we can store it more efficiently than the general
- # case.
- # First one- and two-byte unsigned ints:
- if obj >= 0:
- if obj <= 0xff:
- self.write(BININT1 + pack("<B", obj))
- return
- if obj <= 0xffff:
- self.write(BININT2 + pack("<H", obj))
- return
- # Next check for 4-byte signed ints:
- if -0x80000000 <= obj <= 0x7fffffff:
- self.write(BININT + pack("<i", obj))
- return
- if self.proto >= 2:
- encoded = encode_long(obj)
- n = len(encoded)
- if n < 256:
- self.write(LONG1 + pack("<B", n) + encoded)
- else:
- self.write(LONG4 + pack("<i", n) + encoded)
- return
- self.write(LONG + repr(obj).encode("ascii") + b'L\n')
- dispatch[int] = save_long
- def save_float(self, obj):
- if self.bin:
- self.write(BINFLOAT + pack('>d', obj))
- else:
- self.write(FLOAT + repr(obj).encode("ascii") + b'\n')
- dispatch[float] = save_float
- def save_bytes(self, obj):
- if self.proto < 3:
- if not obj: # bytes object is empty
- self.save_reduce(bytes, (), obj=obj)
- else:
- self.save_reduce(codecs.encode,
- (str(obj, 'latin1'), 'latin1'), obj=obj)
- return
- n = len(obj)
- if n <= 0xff:
- self.write(SHORT_BINBYTES + pack("<B", n) + obj)
- elif n > 0xffffffff and self.proto >= 4:
- self.write(BINBYTES8 + pack("<Q", n) + obj)
- else:
- self.write(BINBYTES + pack("<I", n) + obj)
- self.memoize(obj)
- dispatch[bytes] = save_bytes
- def save_str(self, obj):
- if self.bin:
- encoded = obj.encode('utf-8', 'surrogatepass')
- n = len(encoded)
- if n <= 0xff and self.proto >= 4:
- self.write(SHORT_BINUNICODE + pack("<B", n) + encoded)
- elif n > 0xffffffff and self.proto >= 4:
- self.write(BINUNICODE8 + pack("<Q", n) + encoded)
- else:
- self.write(BINUNICODE + pack("<I", n) + encoded)
- else:
- obj = obj.replace("\\", "\\u005c")
- obj = obj.replace("\n", "\\u000a")
- self.write(UNICODE + obj.encode('raw-unicode-escape') +
- b'\n')
- self.memoize(obj)
- dispatch[str] = save_str
- def save_tuple(self, obj):
- if not obj: # tuple is empty
- if self.bin:
- self.write(EMPTY_TUPLE)
- else:
- self.write(MARK + TUPLE)
- return
- n = len(obj)
- save = self.save
- memo = self.memo
- if n <= 3 and self.proto >= 2:
- for element in obj:
- save(element)
- # Subtle. Same as in the big comment below.
- if id(obj) in memo:
- get = self.get(memo[id(obj)][0])
- self.write(POP * n + get)
- else:
- self.write(_tuplesize2code[n])
- self.memoize(obj)
- return
- # proto 0 or proto 1 and tuple isn't empty, or proto > 1 and tuple
- # has more than 3 elements.
- write = self.write
- write(MARK)
- for element in obj:
- save(element)
- if id(obj) in memo:
- # Subtle. d was not in memo when we entered save_tuple(), so
- # the process of saving the tuple's elements must have saved
- # the tuple itself: the tuple is recursive. The proper action
- # now is to throw away everything we put on the stack, and
- # simply GET the tuple (it's already constructed). This check
- # could have been done in the "for element" loop instead, but
- # recursive tuples are a rare thing.
- get = self.get(memo[id(obj)][0])
- if self.bin:
- write(POP_MARK + get)
- else: # proto 0 -- POP_MARK not available
- write(POP * (n+1) + get)
- return
- # No recursion.
- write(TUPLE)
- self.memoize(obj)
- dispatch[tuple] = save_tuple
- def save_list(self, obj):
- if self.bin:
- self.write(EMPTY_LIST)
- else: # proto 0 -- can't use EMPTY_LIST
- self.write(MARK + LIST)
- self.memoize(obj)
- self._batch_appends(obj)
- dispatch[list] = save_list
- _BATCHSIZE = 1000
- def _batch_appends(self, items):
- # Helper to batch up APPENDS sequences
- save = self.save
- write = self.write
- if not self.bin:
- for x in items:
- save(x)
- write(APPEND)
- return
- it = iter(items)
- while True:
- tmp = list(islice(it, self._BATCHSIZE))
- n = len(tmp)
- if n > 1:
- write(MARK)
- for x in tmp:
- save(x)
- write(APPENDS)
- elif n:
- save(tmp[0])
- write(APPEND)
- # else tmp is empty, and we're done
- if n < self._BATCHSIZE:
- return
- def save_dict(self, obj):
- if self.bin:
- self.write(EMPTY_DICT)
- else: # proto 0 -- can't use EMPTY_DICT
- self.write(MARK + DICT)
- self.memoize(obj)
- self._batch_setitems(obj.items())
- dispatch[dict] = save_dict
- if PyStringMap is not None:
- dispatch[PyStringMap] = save_dict
- def _batch_setitems(self, items):
- # Helper to batch up SETITEMS sequences; proto >= 1 only
- save = self.save
- write = self.write
- if not self.bin:
- for k, v in items:
- save(k)
- save(v)
- write(SETITEM)
- return
- it = iter(items)
- while True:
- tmp = list(islice(it, self._BATCHSIZE))
- n = len(tmp)
- if n > 1:
- write(MARK)
- for k, v in tmp:
- save(k)
- save(v)
- write(SETITEMS)
- elif n:
- k, v = tmp[0]
- save(k)
- save(v)
- write(SETITEM)
- # else tmp is empty, and we're done
- if n < self._BATCHSIZE:
- return
- def save_set(self, obj):
- save = self.save
- write = self.write
- if self.proto < 4:
- self.save_reduce(set, (list(obj),), obj=obj)
- return
- write(EMPTY_SET)
- self.memoize(obj)
- it = iter(obj)
- while True:
- batch = list(islice(it, self._BATCHSIZE))
- n = len(batch)
- if n > 0:
- write(MARK)
- for item in batch:
- save(item)
- write(ADDITEMS)
- if n < self._BATCHSIZE:
- return
- dispatch[set] = save_set
- def save_frozenset(self, obj):
- save = self.save
- write = self.write
- if self.proto < 4:
- self.save_reduce(frozenset, (list(obj),), obj=obj)
- return
- write(MARK)
- for item in obj:
- save(item)
- if id(obj) in self.memo:
- # If the object is already in the memo, this means it is
- # recursive. In this case, throw away everything we put on the
- # stack, and fetch the object back from the memo.
- write(POP_MARK + self.get(self.memo[id(obj)][0]))
- return
- write(FROZENSET)
- self.memoize(obj)
- dispatch[frozenset] = save_frozenset
- def save_global(self, obj, name=None):
- write = self.write
- memo = self.memo
- if name is None:
- name = getattr(obj, '__qualname__', None)
- if name is None:
- name = obj.__name__
- module_name = whichmodule(obj, name)
- try:
- __import__(module_name, level=0)
- module = sys.modules[module_name]
- obj2, parent = _getattribute(module, name)
- except (ImportError, KeyError, AttributeError):
- raise PicklingError(
- "Can't pickle %r: it's not found as %s.%s" %
- (obj, module_name, name))
- else:
- if obj2 is not obj:
- raise PicklingError(
- "Can't pickle %r: it's not the same object as %s.%s" %
- (obj, module_name, name))
- if self.proto >= 2:
- code = _extension_registry.get((module_name, name))
- if code:
- assert code > 0
- if code <= 0xff:
- write(EXT1 + pack("<B", code))
- elif code <= 0xffff:
- write(EXT2 + pack("<H", code))
- else:
- write(EXT4 + pack("<i", code))
- return
- lastname = name.rpartition('.')[2]
- if parent is module:
- name = lastname
- # Non-ASCII identifiers are supported only with protocols >= 3.
- if self.proto >= 4:
- self.save(module_name)
- self.save(name)
- write(STACK_GLOBAL)
- elif parent is not module:
- self.save_reduce(getattr, (parent, lastname))
- elif self.proto >= 3:
- write(GLOBAL + bytes(module_name, "utf-8") + b'\n' +
- bytes(name, "utf-8") + b'\n')
- else:
- if self.fix_imports:
- r_name_mapping = _compat_pickle.REVERSE_NAME_MAPPING
- r_import_mapping = _compat_pickle.REVERSE_IMPORT_MAPPING
- if (module_name, name) in r_name_mapping:
- module_name, name = r_name_mapping[(module_name, name)]
- elif module_name in r_import_mapping:
- module_name = r_import_mapping[module_name]
- try:
- write(GLOBAL + bytes(module_name, "ascii") + b'\n' +
- bytes(name, "ascii") + b'\n')
- except UnicodeEncodeError:
- raise PicklingError(
- "can't pickle global identifier '%s.%s' using "
- "pickle protocol %i" % (module, name, self.proto))
- self.memoize(obj)
- def save_type(self, obj):
- if obj is type(None):
- return self.save_reduce(type, (None,), obj=obj)
- elif obj is type(NotImplemented):
- return self.save_reduce(type, (NotImplemented,), obj=obj)
- elif obj is type(...):
- return self.save_reduce(type, (...,), obj=obj)
- return self.save_global(obj)
- dispatch[FunctionType] = save_global
- dispatch[type] = save_type
- # Unpickling machinery
- class _Unpickler:
- def __init__(self, file, *, fix_imports=True,
- encoding="ASCII", errors="strict"):
- """This takes a binary file for reading a pickle data stream.
- The protocol version of the pickle is detected automatically, so
- no proto argument is needed.
- The argument *file* must have two methods, a read() method that
- takes an integer argument, and a readline() method that requires
- no arguments. Both methods should return bytes. Thus *file*
- can be a binary file object opened for reading, an io.BytesIO
- object, or any other custom object that meets this interface.
- Optional keyword arguments are *fix_imports*, *encoding* and
- *errors*, which are used to control compatibility support for
- pickle stream generated by Python 2. If *fix_imports* is True,
- pickle will try to map the old Python 2 names to the new names
- used in Python 3. The *encoding* and *errors* tell pickle how
- to decode 8-bit string instances pickled by Python 2; these
- default to 'ASCII' and 'strict', respectively. *encoding* can be
- 'bytes' to read these 8-bit string instances as bytes objects.
- """
- self._file_readline = file.readline
- self._file_read = file.read
- self.memo = {}
- self.encoding = encoding
- self.errors = errors
- self.proto = 0
- self.fix_imports = fix_imports
- def load(self):
- """Read a pickled object representation from the open file.
- Return the reconstituted object hierarchy specified in the file.
- """
- # Check whether Unpickler was initialized correctly. This is
- # only needed to mimic the behavior of _pickle.Unpickler.dump().
- if not hasattr(self, "_file_read"):
- raise UnpicklingError("Unpickler.__init__() was not called by "
- "%s.__init__()" % (self.__class__.__name__,))
- self._unframer = _Unframer(self._file_read, self._file_readline)
- self.read = self._unframer.read
- self.readline = self._unframer.readline
- self.mark = object() # any new unique object
- self.stack = []
- self.append = self.stack.append
- self.proto = 0
- read = self.read
- dispatch = self.dispatch
- try:
- while True:
- key = read(1)
- if not key:
- raise EOFError
- assert isinstance(key, bytes_types)
- dispatch[key[0]](self)
- except _Stop as stopinst:
- return stopinst.value
- # Return largest index k such that self.stack[k] is self.mark.
- # If the stack doesn't contain a mark, eventually raises IndexError.
- # This could be sped by maintaining another stack, of indices at which
- # the mark appears. For that matter, the latter stack would suffice,
- # and we wouldn't need to push mark objects on self.stack at all.
- # Doing so is probably a good thing, though, since if the pickle is
- # corrupt (or hostile) we may get a clue from finding self.mark embedded
- # in unpickled objects.
- def marker(self):
- stack = self.stack
- mark = self.mark
- k = len(stack)-1
- while stack[k] is not mark: k = k-1
- return k
- def persistent_load(self, pid):
- raise UnpicklingError("unsupported persistent id encountered")
- dispatch = {}
- def load_proto(self):
- proto = self.read(1)[0]
- if not 0 <= proto <= HIGHEST_PROTOCOL:
- raise ValueError("unsupported pickle protocol: %d" % proto)
- self.proto = proto
- dispatch[PROTO[0]] = load_proto
- def load_frame(self):
- frame_size, = unpack('<Q', self.read(8))
- if frame_size > sys.maxsize:
- raise ValueError("frame size > sys.maxsize: %d" % frame_size)
- self._unframer.load_frame(frame_size)
- dispatch[FRAME[0]] = load_frame
- def load_persid(self):
- pid = self.readline()[:-1].decode("ascii")
- self.append(self.persistent_load(pid))
- dispatch[PERSID[0]] = load_persid
- def load_binpersid(self):
- pid = self.stack.pop()
- self.append(self.persistent_load(pid))
- dispatch[BINPERSID[0]] = load_binpersid
- def load_none(self):
- self.append(None)
- dispatch[NONE[0]] = load_none
- def load_false(self):
- self.append(False)
- dispatch[NEWFALSE[0]] = load_false
- def load_true(self):
- self.append(True)
- dispatch[NEWTRUE[0]] = load_true
- def load_int(self):
- data = self.readline()
- if data == FALSE[1:]:
- val = False
- elif data == TRUE[1:]:
- val = True
- else:
- val = int(data, 0)
- self.append(val)
- dispatch[INT[0]] = load_int
- def load_binint(self):
- self.append(unpack('<i', self.read(4))[0])
- dispatch[BININT[0]] = load_binint
- def load_binint1(self):
- self.append(self.read(1)[0])
- dispatch[BININT1[0]] = load_binint1
- def load_binint2(self):
- self.append(unpack('<H', self.read(2))[0])
- dispatch[BININT2[0]] = load_binint2
- def load_long(self):
- val = self.readline()[:-1]
- if val and val[-1] == b'L'[0]:
- val = val[:-1]
- self.append(int(val, 0))
- dispatch[LONG[0]] = load_long
- def load_long1(self):
- n = self.read(1)[0]
- data = self.read(n)
- self.append(decode_long(data))
- dispatch[LONG1[0]] = load_long1
- def load_long4(self):
- n, = unpack('<i', self.read(4))
- if n < 0:
- # Corrupt or hostile pickle -- we never write one like this
- raise UnpicklingError("LONG pickle has negative byte count")
- data = self.read(n)
- self.append(decode_long(data))
- dispatch[LONG4[0]] = load_long4
- def load_float(self):
- self.append(float(self.readline()[:-1]))
- dispatch[FLOAT[0]] = load_float
- def load_binfloat(self):
- self.append(unpack('>d', self.read(8))[0])
- dispatch[BINFLOAT[0]] = load_binfloat
- def _decode_string(self, value):
- # Used to allow strings from Python 2 to be decoded either as
- # bytes or Unicode strings. This should be used only with the
- # STRING, BINSTRING and SHORT_BINSTRING opcodes.
- if self.encoding == "bytes":
- return value
- else:
- return value.decode(self.encoding, self.errors)
- def load_string(self):
- data = self.readline()[:-1]
- # Strip outermost quotes
- if len(data) >= 2 and data[0] == data[-1] and data[0] in b'"\'':
- data = data[1:-1]
- else:
- raise UnpicklingError("the STRING opcode argument must be quoted")
- self.append(self._decode_string(codecs.escape_decode(data)[0]))
- dispatch[STRING[0]] = load_string
- def load_binstring(self):
- # Deprecated BINSTRING uses signed 32-bit length
- len, = unpack('<i', self.read(4))
- if len < 0:
- raise UnpicklingError("BINSTRING pickle has negative byte count")
- data = self.read(len)
- self.append(self._decode_string(data))
- dispatch[BINSTRING[0]] = load_binstring
- def load_binbytes(self):
- len, = unpack('<I', self.read(4))
- if len > maxsize:
- raise UnpicklingError("BINBYTES exceeds system's maximum size "
- "of %d bytes" % maxsize)
- self.append(self.read(len))
- dispatch[BINBYTES[0]] = load_binbytes
- def load_unicode(self):
- self.append(str(self.readline()[:-1], 'raw-unicode-escape'))
- dispatch[UNICODE[0]] = load_unicode
- def load_binunicode(self):
- len, = unpack('<I', self.read(4))
- if len > maxsize:
- raise UnpicklingError("BINUNICODE exceeds system's maximum size "
- "of %d bytes" % maxsize)
- self.append(str(self.read(len), 'utf-8', 'surrogatepass'))
- dispatch[BINUNICODE[0]] = load_binunicode
- def load_binunicode8(self):
- len, = unpack('<Q', self.read(8))
- if len > maxsize:
- raise UnpicklingError("BINUNICODE8 exceeds system's maximum size "
- "of %d bytes" % maxsize)
- self.append(str(self.read(len), 'utf-8', 'surrogatepass'))
- dispatch[BINUNICODE8[0]] = load_binunicode8
- def load_binbytes8(self):
- len, = unpack('<Q', self.read(8))
- if len > maxsize:
- raise UnpicklingError("BINBYTES8 exceeds system's maximum size "
- "of %d bytes" % maxsize)
- self.append(self.read(len))
- dispatch[BINBYTES8[0]] = load_binbytes8
- def load_short_binstring(self):
- len = self.read(1)[0]
- data = self.read(len)
- self.append(self._decode_string(data))
- dispatch[SHORT_BINSTRING[0]] = load_short_binstring
- def load_short_binbytes(self):
- len = self.read(1)[0]
- self.append(self.read(len))
- dispatch[SHORT_BINBYTES[0]] = load_short_binbytes
- def load_short_binunicode(self):
- len = self.read(1)[0]
- self.append(str(self.read(len), 'utf-8', 'surrogatepass'))
- dispatch[SHORT_BINUNICODE[0]] = load_short_binunicode
- def load_tuple(self):
- k = self.marker()
- self.stack[k:] = [tuple(self.stack[k+1:])]
- dispatch[TUPLE[0]] = load_tuple
- def load_empty_tuple(self):
- self.append(())
- dispatch[EMPTY_TUPLE[0]] = load_empty_tuple
- def load_tuple1(self):
- self.stack[-1] = (self.stack[-1],)
- dispatch[TUPLE1[0]] = load_tuple1
- def load_tuple2(self):
- self.stack[-2:] = [(self.stack[-2], self.stack[-1])]
- dispatch[TUPLE2[0]] = load_tuple2
- def load_tuple3(self):
- self.stack[-3:] = [(self.stack[-3], self.stack[-2], self.stack[-1])]
- dispatch[TUPLE3[0]] = load_tuple3
- def load_empty_list(self):
- self.append([])
- dispatch[EMPTY_LIST[0]] = load_empty_list
- def load_empty_dictionary(self):
- self.append({})
- dispatch[EMPTY_DICT[0]] = load_empty_dictionary
- def load_empty_set(self):
- self.append(set())
- dispatch[EMPTY_SET[0]] = load_empty_set
- def load_frozenset(self):
- k = self.marker()
- self.stack[k:] = [frozenset(self.stack[k+1:])]
- dispatch[FROZENSET[0]] = load_frozenset
- def load_list(self):
- k = self.marker()
- self.stack[k:] = [self.stack[k+1:]]
- dispatch[LIST[0]] = load_list
- def load_dict(self):
- k = self.marker()
- items = self.stack[k+1:]
- d = {items[i]: items[i+1]
- for i in range(0, len(items), 2)}
- self.stack[k:] = [d]
- dispatch[DICT[0]] = load_dict
- # INST and OBJ differ only in how they get a class object. It's not
- # only sensible to do the rest in a common routine, the two routines
- # previously diverged and grew different bugs.
- # klass is the class to instantiate, and k points to the topmost mark
- # object, following which are the arguments for klass.__init__.
- def _instantiate(self, klass, k):
- args = tuple(self.stack[k+1:])
- del self.stack[k:]
- if (args or not isinstance(klass, type) or
- hasattr(klass, "__getinitargs__")):
- try:
- value = klass(*args)
- except TypeError as err:
- raise TypeError("in constructor for %s: %s" %
- (klass.__name__, str(err)), sys.exc_info()[2])
- else:
- value = klass.__new__(klass)
- self.append(value)
- def load_inst(self):
- module = self.readline()[:-1].decode("ascii")
- name = self.readline()[:-1].decode("ascii")
- klass = self.find_class(module, name)
- self._instantiate(klass, self.marker())
- dispatch[INST[0]] = load_inst
- def load_obj(self):
- # Stack is ... markobject classobject arg1 arg2 ...
- k = self.marker()
- klass = self.stack.pop(k+1)
- self._instantiate(klass, k)
- dispatch[OBJ[0]] = load_obj
- def load_newobj(self):
- args = self.stack.pop()
- cls = self.stack.pop()
- obj = cls.__new__(cls, *args)
- self.append(obj)
- dispatch[NEWOBJ[0]] = load_newobj
- def load_newobj_ex(self):
- kwargs = self.stack.pop()
- args = self.stack.pop()
- cls = self.stack.pop()
- obj = cls.__new__(cls, *args, **kwargs)
- self.append(obj)
- dispatch[NEWOBJ_EX[0]] = load_newobj_ex
- def load_global(self):
- module = self.readline()[:-1].decode("utf-8")
- name = self.readline()[:-1].decode("utf-8")
- klass = self.find_class(module, name)
- self.append(klass)
- dispatch[GLOBAL[0]] = load_global
- def load_stack_global(self):
- name = self.stack.pop()
- module = self.stack.pop()
- if type(name) is not str or type(module) is not str:
- raise UnpicklingError("STACK_GLOBAL requires str")
- self.append(self.find_class(module, name))
- dispatch[STACK_GLOBAL[0]] = load_stack_global
- def load_ext1(self):
- code = self.read(1)[0]
- self.get_extension(code)
- dispatch[EXT1[0]] = load_ext1
- def load_ext2(self):
- code, = unpack('<H', self.read(2))
- self.get_extension(code)
- dispatch[EXT2[0]] = load_ext2
- def load_ext4(self):
- code, = unpack('<i', self.read(4))
- self.get_extension(code)
- dispatch[EXT4[0]] = load_ext4
- def get_extension(self, code):
- nil = []
- obj = _extension_cache.get(code, nil)
- if obj is not nil:
- self.append(obj)
- return
- key = _inverted_registry.get(code)
- if not key:
- if code <= 0: # note that 0 is forbidden
- # Corrupt or hostile pickle.
- raise UnpicklingError("EXT specifies code <= 0")
- raise ValueError("unregistered extension code %d" % code)
- obj = self.find_class(*key)
- _extension_cache[code] = obj
- self.append(obj)
- def find_class(self, module, name):
- # Subclasses may override this.
- if self.proto < 3 and self.fix_imports:
- if (module, name) in _compat_pickle.NAME_MAPPING:
- module, name = _compat_pickle.NAME_MAPPING[(module, name)]
- elif module in _compat_pickle.IMPORT_MAPPING:
- module = _compat_pickle.IMPORT_MAPPING[module]
- __import__(module, level=0)
- if self.proto >= 4:
- return _getattribute(sys.modules[module], name)[0]
- else:
- return getattr(sys.modules[module], name)
- def load_reduce(self):
- stack = self.stack
- args = stack.pop()
- func = stack[-1]
- stack[-1] = func(*args)
- dispatch[REDUCE[0]] = load_reduce
- def load_pop(self):
- del self.stack[-1]
- dispatch[POP[0]] = load_pop
- def load_pop_mark(self):
- k = self.marker()
- del self.stack[k:]
- dispatch[POP_MARK[0]] = load_pop_mark
- def load_dup(self):
- self.append(self.stack[-1])
- dispatch[DUP[0]] = load_dup
- def load_get(self):
- i = int(self.readline()[:-1])
- self.append(self.memo[i])
- dispatch[GET[0]] = load_get
- def load_binget(self):
- i = self.read(1)[0]
- self.append(self.memo[i])
- dispatch[BINGET[0]] = load_binget
- def load_long_binget(self):
- i, = unpack('<I', self.read(4))
- self.append(self.memo[i])
- dispatch[LONG_BINGET[0]] = load_long_binget
- def load_put(self):
- i = int(self.readline()[:-1])
- if i < 0:
- raise ValueError("negative PUT argument")
- self.memo[i] = self.stack[-1]
- dispatch[PUT[0]] = load_put
- def load_binput(self):
- i = self.read(1)[0]
- if i < 0:
- raise ValueError("negative BINPUT argument")
- self.memo[i] = self.stack[-1]
- dispatch[BINPUT[0]] = load_binput
- def load_long_binput(self):
- i, = unpack('<I', self.read(4))
- if i > maxsize:
- raise ValueError("negative LONG_BINPUT argument")
- self.memo[i] = self.stack[-1]
- dispatch[LONG_BINPUT[0]] = load_long_binput
- def load_memoize(self):
- memo = self.memo
- memo[len(memo)] = self.stack[-1]
- dispatch[MEMOIZE[0]] = load_memoize
- def load_append(self):
- stack = self.stack
- value = stack.pop()
- list = stack[-1]
- list.append(value)
- dispatch[APPEND[0]] = load_append
- def load_appends(self):
- stack = self.stack
- mark = self.marker()
- list_obj = stack[mark - 1]
- items = stack[mark + 1:]
- if isinstance(list_obj, list):
- list_obj.extend(items)
- else:
- append = list_obj.append
- for item in items:
- append(item)
- del stack[mark:]
- dispatch[APPENDS[0]] = load_appends
- def load_setitem(self):
- stack = self.stack
- value = stack.pop()
- key = stack.pop()
- dict = stack[-1]
- dict[key] = value
- dispatch[SETITEM[0]] = load_setitem
- def load_setitems(self):
- stack = self.stack
- mark = self.marker()
- dict = stack[mark - 1]
- for i in range(mark + 1, len(stack), 2):
- dict[stack[i]] = stack[i + 1]
- del stack[mark:]
- dispatch[SETITEMS[0]] = load_setitems
- def load_additems(self):
- stack = self.stack
- mark = self.marker()
- set_obj = stack[mark - 1]
- items = stack[mark + 1:]
- if isinstance(set_obj, set):
- set_obj.update(items)
- else:
- add = set_obj.add
- for item in items:
- add(item)
- del stack[mark:]
- dispatch[ADDITEMS[0]] = load_additems
- def load_build(self):
- stack = self.stack
- state = stack.pop()
- inst = stack[-1]
- setstate = getattr(inst, "__setstate__", None)
- if setstate is not None:
- setstate(state)
- return
- slotstate = None
- if isinstance(state, tuple) and len(state) == 2:
- state, slotstate = state
- if state:
- inst_dict = inst.__dict__
- intern = sys.intern
- for k, v in state.items():
- if type(k) is str:
- inst_dict[intern(k)] = v
- else:
- inst_dict[k] = v
- if slotstate:
- for k, v in slotstate.items():
- setattr(inst, k, v)
- dispatch[BUILD[0]] = load_build
- def load_mark(self):
- self.append(self.mark)
- dispatch[MARK[0]] = load_mark
- def load_stop(self):
- value = self.stack.pop()
- raise _Stop(value)
- dispatch[STOP[0]] = load_stop
- # Shorthands
- def _dump(obj, file, protocol=None, *, fix_imports=True):
- _Pickler(file, protocol, fix_imports=fix_imports).dump(obj)
- def _dumps(obj, protocol=None, *, fix_imports=True):
- f = io.BytesIO()
- _Pickler(f, protocol, fix_imports=fix_imports).dump(obj)
- res = f.getvalue()
- assert isinstance(res, bytes_types)
- return res
- def _load(file, *, fix_imports=True, encoding="ASCII", errors="strict"):
- return _Unpickler(file, fix_imports=fix_imports,
- encoding=encoding, errors=errors).load()
- def _loads(s, *, fix_imports=True, encoding="ASCII", errors="strict"):
- if isinstance(s, str):
- raise TypeError("Can't load pickle from unicode string")
- file = io.BytesIO(s)
- return _Unpickler(file, fix_imports=fix_imports,
- encoding=encoding, errors=errors).load()
- # Use the faster _pickle if possible
- try:
- from _pickle import (
- PickleError,
- PicklingError,
- UnpicklingError,
- Pickler,
- Unpickler,
- dump,
- dumps,
- load,
- loads
- )
- except ImportError:
- Pickler, Unpickler = _Pickler, _Unpickler
- dump, dumps, load, loads = _dump, _dumps, _load, _loads
- # Doctest
- def _test():
- import doctest
- return doctest.testmod()
- if __name__ == "__main__":
- import argparse
- parser = argparse.ArgumentParser(
- description='display contents of the pickle files')
- parser.add_argument(
- 'pickle_file', type=argparse.FileType('br'),
- nargs='*', help='the pickle file')
- parser.add_argument(
- '-t', '--test', action='store_true',
- help='run self-test suite')
- parser.add_argument(
- '-v', action='store_true',
- help='run verbosely; only affects self-test run')
- args = parser.parse_args()
- if args.test:
- _test()
- else:
- if not args.pickle_file:
- parser.print_help()
- else:
- import pprint
- for f in args.pickle_file:
- obj = load(f)
- pprint.pprint(obj)