python3-day5

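A module is a collection of code that implements a particular piece of functionality.

The idea is the same as moving from procedural code to functions: a function implements one piece of functionality and other code simply calls it, which improves code reuse and reduces coupling between pieces of code. A more complex feature may need several functions, which can live in different .py files; a collection of code made up of n .py files is what these notes call a module. (Strictly speaking, in Python each .py file is a module and a directory of modules is a package.)

For example, os is the module for operating-system interfaces and shutil is the module for high-level file operations.

There are three kinds of modules:

Custom modules
Built-in standard modules (also called the standard library)
Open-source (third-party) modules

For how to use custom and open-source modules, see http://www.cnblogs.com/wupeiqi/articles/4963027.html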

1. The time and datetime modules

# -*- coding: utf-8 -*-
__author__ = 'Alex Li'

import time

# time.clock() returned processor time; it was deprecated in Python 3.3 and
# removed in 3.8. Use time.process_time() instead: it measures CPU time and
# does not include time spent in sleep().
print(time.process_time())
print(time.altzone)      # offset of the local DST time zone, in seconds west of UTC
print(time.asctime())    # current time as a string such as "Fri Aug 19 11:14:16 2016"
print(time.localtime())  # local time as a struct_time object
print(time.gmtime(time.time() - 800000))  # UTC time as a struct_time object

print(time.asctime(time.localtime()))  # same string format as time.asctime()
print(time.ctime())      # e.g. "Fri Aug 19 12:38:29 2016", same format as above

# Date string -> timestamp
string_2_struct = time.strptime("2016/05/22", "%Y/%m/%d")  # date string -> struct_time
print(string_2_struct)
struct_2_stamp = time.mktime(string_2_struct)  # struct_time (taken as local time) -> timestamp
print(struct_2_stamp)

# Timestamp -> string
print(time.gmtime(time.time() - 86640))  # timestamp -> UTC struct_time
print(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime()))  # UTC struct_time -> formatted string

# Date and time arithmetic
import datetime

print(datetime.datetime.now())  # e.g. 2016-08-19 12:47:03.941925
print(datetime.date.fromtimestamp(time.time()))  # timestamp -> date, e.g. 2016-08-19
print(datetime.datetime.now() + datetime.timedelta(3))           # now + 3 days
print(datetime.datetime.now() + datetime.timedelta(-3))          # now - 3 days
print(datetime.datetime.now() + datetime.timedelta(hours=3))     # now + 3 hours
print(datetime.datetime.now() + datetime.timedelta(minutes=30))  # now + 30 minutes

c_time = datetime.datetime.now()
print(c_time.replace(minute=3, hour=2))  # replace the hour and minute fields

The format directives accepted by strftime() and strptime():

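The string-to-timestamp conversion above uses time.mktime(), which interprets the struct_time as local time, so the number it prints depends on the machine's time zone. As a rough sketch, calendar.timegm() is the UTC counterpart and gives a reproducible round trip. The helper names utc_string_to_stamp and stamp_to_utc_string are made up for this illustration.

```python
import calendar
import datetime
import time

FMT = "%Y/%m/%d"

def utc_string_to_stamp(text, fmt=FMT):
    """Parse a date string and interpret it as UTC (hypothetical helper)."""
    return calendar.timegm(time.strptime(text, fmt))

def stamp_to_utc_string(stamp, fmt=FMT):
    """Format a timestamp as a UTC date string (hypothetical helper)."""
    return time.strftime(fmt, time.gmtime(stamp))

stamp = utc_string_to_stamp("2016/05/22")
round_trip = stamp_to_utc_string(stamp)

# timedelta arithmetic on a fixed datetime instead of now(),
# so the results are the same on every run
base = datetime.datetime(2016, 8, 19, 12, 47, 3)
three_days_later = base + datetime.timedelta(days=3)
half_hour_later = base + datetime.timedelta(minutes=30)
```

Using a fixed base value rather than datetime.now() is only a choice for repeatability; the arithmetic is identical.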

Directive	Meaning	Notes
`%a`	Locale’s abbreviated weekday name.
`%A`	Locale’s full weekday name.
`%b`	Locale’s abbreviated month name.
`%B`	Locale’s full month name.
`%c`	Locale’s appropriate date and time representation.
`%d`	Day of the month as a decimal number [01,31].
`%H`	Hour (24-hour clock) as a decimal number [00,23].
`%I`	Hour (12-hour clock) as a decimal number [01,12].
`%j`	Day of the year as a decimal number [001,366].
`%m`	Month as a decimal number [01,12].
`%M`	Minute as a decimal number [00,59].
`%p`	Locale’s equivalent of either AM or PM.	(1)
`%S`	Second as a decimal number [00,61].	(2)
`%U`	Week number of the year (Sunday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Sunday are considered to be in week 0.	(3)
`%w`	Weekday as a decimal number [0(Sunday),6].
`%W`	Week number of the year (Monday as the first day of the week) as a decimal number [00,53]. All days in a new year preceding the first Monday are considered to be in week 0.	(3)
`%x`	Locale’s appropriate date representation.
`%X`	Locale’s appropriate time representation.
`%y`	Year without century as a decimal number [00,99].
`%Y`	Year with century as a decimal number.
`%z`	Time zone offset indicating a positive or negative time difference from UTC/GMT of the form +HHMM or -HHMM, where H represents decimal hour digits and M represents decimal minute digits [-23:59, +23:59].
`%Z`	Time zone name (no characters if no time zone exists).
`%%`	A literal `'%'` character.
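
A few of the directives in the table, applied to a fixed datetime so the output does not depend on the current time. This is only an illustrative sketch; locale-dependent directives such as %a and %c are left out on purpose because their output varies between systems.

```python
import datetime

# The moment used in the sample output earlier: Fri Aug 19 11:14:16 2016
moment = datetime.datetime(2016, 8, 19, 11, 14, 16)

iso_like = moment.strftime("%Y-%m-%d %H:%M:%S")
twelve_hour = moment.strftime("%I:%M")     # hour on a 12-hour clock
day_of_year = moment.strftime("%j")        # zero-padded to three digits
weekday_number = moment.strftime("%w")     # 0 is Sunday
short_year = moment.strftime("%y")         # year without century
literal_percent = moment.strftime("100%%") # %% produces a single %
```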

2. The random module

Random numbers

import random
print(random.random())          # float in [0.0, 1.0)
print(random.randint(1, 2))     # integer in [1, 2], both ends included
print(random.randrange(1, 10))  # integer in [1, 10), upper bound excluded

Generating a random verification code

import random
checkcode = ''
for i in range(4):
    current = random.randrange(0, 4)
    if current != i:
        temp = chr(random.randint(65, 90))  # uppercase letter A-Z
    else:
        temp = random.randint(0, 9)         # digit 0-9
    checkcode += str(temp)
print(checkcode)
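
The loop above picks a letter or a digit per position with unequal odds. A shorter variant, sketched here under the assumption that every character of A-Z and 0-9 should be equally likely, draws from a single alphabet with random.choice(). The function name make_checkcode and its parameters are hypothetical.

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits

def make_checkcode(length=4, rng=random):
    """Return `length` characters drawn uniformly from A-Z and 0-9."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

# Seeded only so that the demo is repeatable; real codes should not be seeded.
code = make_checkcode(4, random.Random(0))
```

For codes that protect anything sensitive, the standard library's secrets.choice() is the usual replacement for random.choice().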

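3. The os module

Provides an interface for calling into the operating system.

os.getcwd()  returns the current working directory, i.e. the directory the Python script is running in
os.chdir("dirname")  changes the script's working directory; like cd in a shell
os.curdir  the string for the current directory: ('.')
os.pardir  the string for the parent directory: ('..')
os.makedirs('dirname1/dirname2')  creates nested directories recursively
os.removedirs('dirname1')  removes the directory if it is empty, then moves up to the parent and removes it too if it is empty, and so on
os.mkdir('dirname')  creates a single directory; like mkdir dirname in a shell
os.rmdir('dirname')  removes a single empty directory and raises an error if it is not empty; like rmdir dirname in a shell
os.listdir('dirname')  returns a list of all files and subdirectories in the given directory, including hidden files
os.remove()  deletes a file
os.rename("oldname","newname")  renames a file or directory
os.stat('path/filename')  returns information about a file or directory
os.sep  the platform-specific path separator: "\\" on Windows, "/" on Linux
os.linesep  the platform's line terminator: "\r\n" on Windows, "\n" on Linux
os.pathsep  the separator used between entries of a search path such as PATH: ";" on Windows, ":" on Linux
os.name  a string naming the current platform: Windows -> 'nt'; Linux -> 'posix'
os.system("bash command")  runs a shell command; its output goes straight to the terminal
os.environ  the system environment variables
os.path.abspath(path)  returns the normalized absolute version of path
os.path.split(path)  splits path into a (directory, filename) tuple
os.path.dirname(path)  returns the directory part of path, i.e. the first element of os.path.split(path)
os.path.basename(path)  returns the final file name in path, i.e. the second element of os.path.split(path); if path ends with a path separator the result is an empty string
os.path.exists(path)  returns True if path exists, False otherwise
os.path.isabs(path)  returns True if path is an absolute path
os.path.isfile(path)  returns True if path is an existing file, False otherwise
os.path.isdir(path)  returns True if path is an existing directory, False otherwise
os.path.join(path1[, path2[, ...]])  joins several path components; components before the last absolute path are discarded
os.path.getatime(path)  returns the last access time of the file or directory at path
os.path.getmtime(path)  returns the last modification time of the file or directory at path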
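
A short sketch of several of the calls listed above. The string examples use posixpath, which is the implementation os.path resolves to on Linux and macOS, so the results are the same on any platform. The directory examples run inside a temporary directory; the file name keep.txt is an arbitrary choice for this demo.

```python
import os
import posixpath
import tempfile

path = "/usr/local/bin/python"
head, tail = posixpath.split(path)                # directory part and file name
trailing = posixpath.basename("/usr/local/bin/")  # trailing separator gives ''
joined = posixpath.join("/a", "/b", "c")          # "/a" is discarded

with tempfile.TemporaryDirectory() as root:
    # Keep the temporary root non-empty so removedirs() stops there.
    open(os.path.join(root, "keep.txt"), "w").close()

    nested = os.path.join(root, "dirname1", "dirname2")
    os.makedirs(nested)                           # creates both levels
    created = os.path.isdir(nested)
    listing = os.listdir(os.path.join(root, "dirname1"))

    os.removedirs(nested)  # removes dirname2, then the now-empty dirname1
    removed = not os.path.exists(os.path.join(root, "dirname1"))
```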

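4. The sys module

import sys
sys.argv           list of command-line arguments; the first element is the path of the program itself
sys.exit(n)        exits the program; use exit(0) for a normal exit
sys.version        version information of the Python interpreter
sys.maxsize        largest value of a native index-sized integer (sys.maxint was removed in Python 3, where int has no upper bound)
sys.path           the module search path; initialized from the PYTHONPATH environment variable plus installation defaults
sys.platform       name of the operating-system platform
sys.stdout.write('please:')
val = sys.stdin.readline()[:-1]    reads one line from standard input and drops its last character (the newline)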
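
The last two lines above form a small prompt-and-read pattern. One caveat is that readline()[:-1] also removes the last character when the input ends without a newline. The sketch below uses rstrip() instead and takes the streams as parameters so it can run without a terminal; the function name prompt and the simulated streams are assumptions for illustration.

```python
import io
import sys

def prompt(message, stdin=sys.stdin, stdout=sys.stdout):
    """Write a prompt, then return one input line without its line ending."""
    stdout.write(message)
    stdout.flush()
    return stdin.readline().rstrip("\r\n")

# Simulated streams stand in for the keyboard and the screen.
fake_in = io.StringIO("alex\n")
fake_out = io.StringIO()
val = prompt("please:", stdin=fake_in, stdout=fake_out)

# sys.argv[0] is the script path; anything after it is a user argument.
script_name = sys.argv[0] if sys.argv else ""
extra_args = sys.argv[1:]
```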

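5. shutil

A module for high-level operations on files, directories, and archives.

shutil.copyfileobj(fsrc, fdst[, length])

Copies the contents of one file object into another, reading length bytes at a time. Copying starts at fsrc's current position, so a partial copy is possible.

def copyfileobj(fsrc, fdst, length=16*1024):
    """copy data from file-like object fsrc to file-like object fdst"""
    while 1:
        buf = fsrc.read(length)
        if not buf:
            break
        fdst.write(buf)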
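
A minimal sketch of copyfileobj() using in-memory streams, so nothing touches the disk. It shows the chunked copy and the partial copy that results when the source position is not at the start.

```python
import io
import shutil

payload = b"0123456789" * 1000   # 10,000 bytes

source = io.BytesIO(payload)
full_copy = io.BytesIO()
shutil.copyfileobj(source, full_copy, length=1024)  # copy in 1 KiB chunks

source.seek(5000)                # start halfway through the data
tail_copy = io.BytesIO()
shutil.copyfileobj(source, tail_copy)               # copies only the rest
```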

shutil.copyfile(src, dst)
Copies the contents of the file src to the file dst.

def copyfile(src, dst):
    """Copy data from src to dst"""
    if _samefile(src, dst):
        raise Error("`%s` and `%s` are the same file" % (src, dst))

    for fn in [src, dst]:
        try:
            st = os.stat(fn)
        except OSError:
            # File most likely does not exist
            pass
        else:
            # XXX What about other special files? (sockets, devices...)
            if stat.S_ISFIFO(st.st_mode):
                raise SpecialFileError("`%s` is a named pipe" % fn)

    with open(src, 'rb') as fsrc:
        with open(dst, 'wb') as fdst:
            copyfileobj(fsrc, fdst)

shutil.copymode(src, dst)
Copies only the permission bits. The file contents, owner, and group are left unchanged.

def copymode(src, dst):
    """Copy mode bits from src to dst"""
    if hasattr(os, 'chmod'):
        st = os.stat(src)
        mode = stat.S_IMODE(st.st_mode)
        os.chmod(dst, mode)

shutil.copystat(src, dst)
Copies the file's status information: mode bits, atime, mtime, flags.

# Excerpt from the Python 2.7 shutil source. "except OSError, why" is
# Python 2 syntax; Python 3 spells it "except OSError as why".
def copystat(src, dst):
    """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
    st = os.stat(src)
    mode = stat.S_IMODE(st.st_mode)
    if hasattr(os, 'utime'):
        os.utime(dst, (st.st_atime, st.st_mtime))
    if hasattr(os, 'chmod'):
        os.chmod(dst, mode)
    if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
        try:
            os.chflags(dst, st.st_flags)
        except OSError, why:
            for err in 'EOPNOTSUPP', 'ENOTSUP':
                if hasattr(errno, err) and why.errno == getattr(errno, err):
                    break
            else:
                raise

shutil.copy(src, dst)
Copies the file contents and the permission bits.

def copy(src, dst):
    """Copy data and mode bits ("cp src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copymode(src, dst)

shutil.copy2(src, dst)
Copies the file contents and the status information.

def copy2(src, dst):
    """Copy data and all stat info ("cp -p src dst").

    The destination may be a directory.

    """
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))
    copyfile(src, dst)
    copystat(src, dst)
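
The difference between copyfile(), copy(), and copy2() is easiest to see side by side. The sketch below, run in a temporary directory with made-up file names, gives the source file an old modification time and checks which copies keep it. The timestamp value is arbitrary.

```python
import os
import shutil
import tempfile

OLD_MTIME = 1_000_000_000  # arbitrary fixed timestamp (September 2001)

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.txt")
    with open(src, "w") as f:
        f.write("hello")
    os.chmod(src, 0o600)
    os.utime(src, (OLD_MTIME, OLD_MTIME))

    plain = shutil.copyfile(src, os.path.join(tmp, "plain.txt"))  # data only
    with_mode = shutil.copy(src, os.path.join(tmp, "mode.txt"))   # data + permission bits
    with_stat = shutil.copy2(src, os.path.join(tmp, "stat.txt"))  # data + permission bits + times

    with open(with_stat) as f:
        copied_text = f.read()
    mode_kept = os.stat(with_mode).st_mode == os.stat(src).st_mode
    mtime_kept = int(os.stat(with_stat).st_mtime) == OLD_MTIME
    mtime_fresh = int(os.stat(plain).st_mtime) != OLD_MTIME
```

In Python 3 all three functions return the destination path, which is why the results can be assigned directly.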

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)
Recursively copies a directory tree.

For example: copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

# Excerpt from the Python 2.7 shutil source. "except Error, err" and
# "raise Error, errors" are Python 2 syntax; Python 3 spells them
# "except Error as err" and "raise Error(errors)".
def ignore_patterns(*patterns):
    """Function that can be used as copytree() ignore parameter.

    Patterns is a sequence of glob-style patterns
    that are used to exclude files"""
    def _ignore_patterns(path, names):
        ignored_names = []
        for pattern in patterns:
            ignored_names.extend(fnmatch.filter(names, pattern))
        return set(ignored_names)
    return _ignore_patterns

def copytree(src, dst, symlinks=False, ignore=None):
    """Recursively copy a directory tree using copy2().

    The destination directory must not already exist.
    If exception(s) occur, an Error is raised with a list of reasons.

    If the optional symlinks flag is true, symbolic links in the
    source tree result in symbolic links in the destination tree; if
    it is false, the contents of the files pointed to by symbolic
    links are copied.

    The optional ignore argument is a callable. If given, it
    is called with the `src` parameter, which is the directory
    being visited by copytree(), and `names` which is the list of
    `src` contents, as returned by os.listdir():

        callable(src, names) -> ignored_names

    Since copytree() is called recursively, the callable will be
    called once for each directory that is copied. It returns a
    list of names relative to the `src` directory that should
    not be copied.

    XXX Consider this example code rather than the ultimate tool.

    """
    names = os.listdir(src)
    if ignore is not None:
        ignored_names = ignore(src, names)
    else:
        ignored_names = set()

    os.makedirs(dst)
    errors = []
    for name in names:
        if name in ignored_names:
            continue
        srcname = os.path.join(src, name)
        dstname = os.path.join(dst, name)
        try:
            if symlinks and os.path.islink(srcname):
                linkto = os.readlink(srcname)
                os.symlink(linkto, dstname)
            elif os.path.isdir(srcname):
                copytree(srcname, dstname, symlinks, ignore)
            else:
                # Will raise a SpecialFileError for unsupported file types
                copy2(srcname, dstname)
        # catch the Error from the recursive copytree so that we can
        # continue with other files
        except Error, err:
            errors.extend(err.args[0])
        except EnvironmentError, why:
            errors.append((srcname, dstname, str(why)))
    try:
        copystat(src, dst)
    except OSError, why:
        if WindowsError is not None and isinstance(why, WindowsError):
            # Copying file access times may fail on Windows
            pass
        else:
            errors.append((src, dst, str(why)))
    if errors:
        raise Error, errors
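
A small end-to-end sketch of copytree() with ignore_patterns(), using the same patterns as the example above. The directory layout and file names are invented for the demo and live in a temporary directory.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "source")
    os.makedirs(os.path.join(source, "pkg"))
    for rel in ("main.py", "main.pyc", "tmp_cache", os.path.join("pkg", "mod.py")):
        with open(os.path.join(source, rel), "w") as f:
            f.write("x")

    destination = os.path.join(tmp, "destination")  # must not exist yet
    shutil.copytree(source, destination,
                    ignore=shutil.ignore_patterns("*.pyc", "tmp*"))

    # Collect every copied file as a path relative to the destination.
    copied = sorted(
        os.path.relpath(os.path.join(dirpath, name), destination)
        for dirpath, _dirs, files in os.walk(destination)
        for name in files
    )
```

Only main.py and pkg/mod.py reach the destination; the ignore callable is consulted once per visited directory, as the docstring describes.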

shutil.rmtree(path[, ignore_errors[, onerror]])
Recursively deletes a directory tree.

# Excerpt from the Python 2.7 shutil source. "except os.error, err" is
# Python 2 syntax; Python 3 spells it "except os.error as err".
def rmtree(path, ignore_errors=False, onerror=None):
    """Recursively delete a directory tree.

    If ignore_errors is set, errors are ignored; otherwise, if onerror
    is set, it is called to handle the error with arguments (func,
    path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
    path is the argument to that function that caused it to fail; and
    exc_info is a tuple returned by sys.exc_info().  If ignore_errors
    is false and onerror is None, an exception is raised.

    """
    if ignore_errors:
        def onerror(*args):
            pass
    elif onerror is None:
        def onerror(*args):
            raise
    try:
        if os.path.islink(path):
            # symlinks to directories are forbidden, see bug #1669
            raise OSError("Cannot call rmtree on a symbolic link")
    except OSError:
        onerror(os.path.islink, path, sys.exc_info())
        # can't continue even if onerror hook returns
        return
    names = []
    try:
        names = os.listdir(path)
    except os.error, err:
        onerror(os.listdir, path, sys.exc_info())
    for name in names:
        fullname = os.path.join(path, name)
        try:
            mode = os.lstat(fullname).st_mode
        except os.error:
            mode = 0
        if stat.S_ISDIR(mode):
            rmtree(fullname, ignore_errors, onerror)
        else:
            try:
                os.remove(fullname)
            except os.error, err:
                onerror(os.remove, fullname, sys.exc_info())
    try:
        os.rmdir(path)
    except os.error:
        onerror(os.rmdir, path, sys.exc_info())
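
A brief sketch of rmtree() and its ignore_errors flag on a throwaway directory tree. The layout is invented for the demo.

```python
import os
import shutil
import tempfile

tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, "a", "b"))
with open(os.path.join(tmp, "a", "b", "file.txt"), "w") as f:
    f.write("x")

shutil.rmtree(tmp)               # removes the files and every directory level
gone = not os.path.exists(tmp)

# The path no longer exists, so this call only succeeds because
# ignore_errors=True swallows the failure.
shutil.rmtree(tmp, ignore_errors=True)

# Without the flag the same call raises.
try:
    shutil.rmtree(tmp)
    raised = False
except FileNotFoundError:
    raised = True
```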

shutil.move(src, dst)
Recursively moves a file or directory.

# Excerpt from the Python 2.7 shutil source. 'raise Error, "message"' is
# Python 2 syntax; Python 3 spells it 'raise Error("message")'.
def move(src, dst):
    """Recursively move a file or directory to another location. This is
    similar to the Unix "mv" command.

    If the destination is a directory or a symlink to a directory, the source
    is moved inside the directory. The destination path must not already
    exist.

    If the destination already exists but is not a directory, it may be
    overwritten depending on os.rename() semantics.

    If the destination is on our current filesystem, then rename() is used.
    Otherwise, src is copied to the destination and then removed.
    A lot more could be done here...  A look at a mv.c shows a lot of
    the issues this implementation glosses over.

    """
    real_dst = dst
    if os.path.isdir(dst):
        if _samefile(src, dst):
            # We might be on a case insensitive filesystem,
            # perform the rename anyway.
            os.rename(src, dst)
            return

        real_dst = os.path.join(dst, _basename(src))
        if os.path.exists(real_dst):
            raise Error, "Destination path '%s' already exists" % real_dst
    try:
        os.rename(src, real_dst)
    except OSError:
        if os.path.isdir(src):
            if _destinsrc(src, dst):
                raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
            copytree(src, real_dst, symlinks=True)
            rmtree(src)
        else:
            copy2(src, real_dst)
            os.unlink(src)
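
The two destination cases described in the docstring, sketched with made-up file names in a temporary directory: moving into an existing directory, and moving to a new name.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "report.txt")
    with open(src, "w") as f:
        f.write("data")
    archive_dir = os.path.join(tmp, "archive")
    os.mkdir(archive_dir)

    # Destination is an existing directory: the file is moved inside it.
    moved_to = shutil.move(src, archive_dir)
    moved_name = os.path.basename(moved_to)
    moved_parent = os.path.dirname(moved_to)
    source_gone = not os.path.exists(src)

    # Destination does not exist: the move acts as a rename.
    renamed_to = shutil.move(moved_to, os.path.join(tmp, "renamed.txt"))
    final_name = os.path.basename(renamed_to)
    with open(renamed_to) as f:
        final_text = f.read()
```

In Python 3, move() returns the final destination path, which the sketch uses to locate the file after each step.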

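shutil.make_archive(base_name, format, ...)

Creates an archive file (for example zip or tar) and returns its path.

base_name: the name of the archive file, without the format-specific extension; it may also include a path. With a bare file name the archive is saved in the current directory, otherwise in the given location, e.g.
    www                 => saved in the current directory
    /Users/wupeiqi/www  => saved in /Users/wupeiqi/
format: the archive format, one of "zip", "tar", "bztar", "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: the owner, defaults to the current user
group: the group, defaults to the current group
logger: used for logging, usually a logging.Logger object

# Archive the files under /Users/wupeiqi/Downloads/test into the program's current directory
import shutil
ret = shutil.make_archive("wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')

# Archive the files under /Users/wupeiqi/Downloads/test into /Users/wupeiqi/
import shutil
ret = shutil.make_archive("/Users/wupeiqi/wwwwwwwwww", 'gztar', root_dir='/Users/wupeiqi/Downloads/test')
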

# Excerpt from the Python 2.7 shutil source. 'raise ValueError, "message"' is
# Python 2 syntax; Python 3 spells it 'raise ValueError("message")'.
def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0,
                 dry_run=0, owner=None, group=None, logger=None):
    """Create an archive file (eg. zip or tar).

    'base_name' is the name of the file to create, minus any format-specific
    extension; 'format' is the archive format: one of "zip", "tar", "bztar"
    or "gztar".

    'root_dir' is a directory that will be the root directory of the
    archive; ie. we typically chdir into 'root_dir' before creating the
    archive.  'base_dir' is the directory where we start archiving from;
    ie. 'base_dir' will be the common prefix of all files and
    directories in the archive.  'root_dir' and 'base_dir' both default
    to the current directory.  Returns the name of the archive file.

    'owner' and 'group' are used when creating a tar archive. By default,
    uses the current owner and group.
    """
    save_cwd = os.getcwd()
    if root_dir is not None:
        if logger is not None:
            logger.debug("changing into '%s'", root_dir)
        base_name = os.path.abspath(base_name)
        if not dry_run:
            os.chdir(root_dir)

    if base_dir is None:
        base_dir = os.curdir

    kwargs = {'dry_run': dry_run, 'logger': logger}

    try:
        format_info = _ARCHIVE_FORMATS[format]
    except KeyError:
        raise ValueError, "unknown archive format '%s'" % format

    func = format_info[0]
    for arg, val in format_info[1]:
        kwargs[arg] = val

    if format != 'zip':
        kwargs['owner'] = owner
        kwargs['group'] = group

    try:
        filename = func(base_name, base_dir, **kwargs)
    finally:
        if root_dir is not None:
            if logger is not None:
                logger.debug("changing back to '%s'", save_cwd)
            os.chdir(save_cwd)

    return filename
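
A runnable round trip for make_archive(), sketched in a temporary directory instead of the /Users/wupeiqi paths used above. It uses the "zip" format and pairs it with shutil.unpack_archive() (available in Python 3) to check the result; "gztar" is called the same way. The names backup, test, and out are arbitrary.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Directory whose contents will be archived.
    root = os.path.join(tmp, "test")
    os.mkdir(root)
    with open(os.path.join(root, "a.log"), "w") as f:
        f.write("log line\n")

    # base_name carries a path, so the archive is written there;
    # the ".zip" extension is added by make_archive itself.
    archive_path = shutil.make_archive(os.path.join(tmp, "backup"), "zip",
                                       root_dir=root)
    archive_name = os.path.basename(archive_path)

    out = os.path.join(tmp, "out")
    shutil.unpack_archive(archive_path, out)
    restored = os.listdir(out)
```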

shutil handles archives by calling the zipfile and tarfile modules (the ZipFile and TarFile classes). In detail:

zipfile: compressing and extracting

import zipfile

# Compress
z = zipfile.ZipFile('laxi.zip', 'w')
z.write('a.log')
z.write('data.data')
z.close()

# Extract
z = zipfile.ZipFile('laxi.zip', 'r')
z.extractall()
z.close()
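
The same zipfile workflow, sketched with context managers so the archive is closed even if an error occurs. The file names follow the example above but are created in a temporary directory; writestr() and testzip() are shown as optional extras.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "a.log")
    with open(log_path, "w") as f:
        f.write("hello\n")
    zip_path = os.path.join(tmp, "laxi.zip")

    # Compress: the default is ZIP_STORED (no compression), so ask for deflate.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as z:
        z.write(log_path, arcname="a.log")               # from a file on disk
        z.writestr("data.data", "generated in memory")   # from a string

    # Inspect and extract.
    with zipfile.ZipFile(zip_path, "r") as z:
        names = z.namelist()
        first_bad = z.testzip()          # None means every CRC check passed
        data_text = z.read("data.data").decode("utf-8")
        z.extractall(os.path.join(tmp, "unzipped"))

    extracted = sorted(os.listdir(os.path.join(tmp, "unzipped")))
```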

tarfile: compressing and extracting

import tarfile

# Compress
tar = tarfile.open('your.tar', 'w')
tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
tar.close()

# Extract
tar = tarfile.open('your.tar', 'r')
tar.extractall()  # a target directory can be passed as the path argument
tar.close()
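
A comparable tarfile sketch using context managers and gzip compression (mode "w:gz"). It reads a member back with extractfile() instead of unpacking to disk. The file names are placeholders created in a temporary directory.

```python
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "bbs2.txt")
    with open(src, "w") as f:
        f.write("payload")
    tar_path = os.path.join(tmp, "your.tar.gz")

    # Compress: arcname sets the name stored inside the archive,
    # so the temporary directory path is not recorded.
    with tarfile.open(tar_path, "w:gz") as tar:
        tar.add(src, arcname="bbs2.txt")

    # Inspect: list the members and read one without extracting it.
    with tarfile.open(tar_path, "r:gz") as tar:
        member_names = tar.getnames()
        member_text = tar.extractfile("bbs2.txt").read().decode("utf-8")
```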


# Source excerpt of the ZipFile class from Python 2.7's zipfile.py (Python 2 syntax).
class ZipFile(object):
    """ Class with methods to open, read, write, close, list zip files.

    z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False)

    file: Either the path to the file, or a file-like object.
          If it is a path, the file will be opened and closed by ZipFile.
    mode: The mode can be either read "r", write "w" or append "a".
    compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).
    allowZip64: if True ZipFile will create files with ZIP64 extensions when
                needed, otherwise it will raise an exception when this would
                be necessary.

    """

    fp = None                   # Set here since __del__ checks it

    def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False):
        """Open the ZIP file with mode read "r", write "w" or append "a"."""
        if mode not in ("r", "w", "a"):
            raise RuntimeError('ZipFile() requires mode "r", "w", or "a"')

        if compression == ZIP_STORED:
            pass
        elif compression == ZIP_DEFLATED:
            if not zlib:
                raise RuntimeError,\
                      "Compression requires the (missing) zlib module"
        else:
            raise RuntimeError, "That compression method is not supported"

        self._allowZip64 = allowZip64
        self._didModify = False
        self.debug = 0  # Level of printing: 0 through 3
        self.NameToInfo = {}    # Find file info given name
        self.filelist = []      # List of ZipInfo instances for archive
        self.compression = compression  # Method of compression
        self.mode = key = mode.replace('b', '')[0]
        self.pwd = None
        self._comment = ''

        # Check if we were passed a file-like object
        if isinstance(file, basestring):
            self._filePassed = 0
            self.filename = file
            modeDict = {'r' : 'rb', 'w': 'wb', 'a' : 'r+b'}
            try:
                self.fp = open(file, modeDict[mode])
            except IOError:
                if mode == 'a':
                    mode = key = 'w'
                    self.fp = open(file, modeDict[mode])
                else:
                    raise
        else:
            self._filePassed = 1
            self.fp = file
            self.filename = getattr(file, 'name', None)

        try:
            if key == 'r':
                self._RealGetContents()
            elif key == 'w':
                # set the modified flag so central directory gets written
                # even if no files are added to the archive
                self._didModify = True
            elif key == 'a':
                try:
                    # See if file is a zip file
                    self._RealGetContents()
                    # seek to start of directory and overwrite
                    self.fp.seek(self.start_dir, 0)
                except BadZipfile:
                    # file is not a zip file, just append
                    self.fp.seek(0, 2)

                    # set the modified flag so central directory gets written
                    # even if no files are added to the archive
                    self._didModify = True
            else:
                raise RuntimeError('Mode must be "r", "w" or "a"')
        except:
            fp = self.fp
            self.fp = None
            if not self._filePassed:
                fp.close()
            raise

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.close()

    def _RealGetContents(self):
        """Read in the table of contents for the ZIP file."""
        fp = self.fp
        try:
            endrec = _EndRecData(fp)
        except IOError:
            raise BadZipfile("File is not a zip file")
        if not endrec:
            raise BadZipfile, "File is not a zip file"
        if self.debug > 1:
            print endrec
        size_cd = endrec[_ECD_SIZE]             # bytes in central directory
        offset_cd = endrec[_ECD_OFFSET]         # offset of central directory
        self._comment = endrec[_ECD_COMMENT]    # archive comment

        # "concat" is zero, unless zip was concatenated to another file
        concat = endrec[_ECD_LOCATION] - size_cd - offset_cd
        if endrec[_ECD_SIGNATURE] == stringEndArchive64:
            # If Zip64 extension structures are present, account for them
            concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator)

        if self.debug > 2:
            inferred = concat + offset_cd
            print "given, inferred, offset", offset_cd, inferred, concat
        # self.start_dir:  Position of start of central directory
        self.start_dir = offset_cd + concat
        fp.seek(self.start_dir, 0)
        data = fp.read(size_cd)
        fp = cStringIO.StringIO(data)
        total = 0
        while total < size_cd:
            centdir = fp.read(sizeCentralDir)
            if len(centdir) != sizeCentralDir:
                raise BadZipfile("Truncated central directory")
            centdir = struct.unpack(structCentralDir, centdir)
            if centdir[_CD_SIGNATURE] != stringCentralDir:
                raise BadZipfile("Bad magic number for central directory")
            if self.debug > 2:
                print centdir
            filename = fp.read(centdir[_CD_FILENAME_LENGTH])
            # Create ZipInfo instance to store file information
            x = ZipInfo(filename)
            x.extra = fp.read(centdir[_CD_EXTRA_FIELD_LENGTH])
            x.comment = fp.read(centdir[_CD_COMMENT_LENGTH])
            x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET]
            (x.create_version, x.create_system, x.extract_version, x.reserved,
                x.flag_bits, x.compress_type, t, d,
                x.CRC, x.compress_size, x.file_size) = centdir[1:12]
            x.volume, x.internal_attr, x.external_attr = centdir[15:18]
            # Convert date/time code to (year, month, day, hour, min, sec)
            x._raw_time = t
            x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F,
                                     t>>11, (t>>5)&0x3F, (t&0x1F) * 2 )

            x._decodeExtra()
            x.header_offset = x.header_offset + concat
            x.filename = x._decodeFilename()
            self.filelist.append(x)
            self.NameToInfo[x.filename] = x

            # update total bytes read from central directory
            total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH]
                     + centdir[_CD_EXTRA_FIELD_LENGTH]
                     + centdir[_CD_COMMENT_LENGTH])

            if self.debug > 2:
                print "total", total

    def namelist(self):
        """Return a list of file names in the archive."""
        l = []
        for data in self.filelist:
            l.append(data.filename)
        return l

    def infolist(self):
        """Return a list of class ZipInfo instances for files in the
        archive."""
        return self.filelist

    def printdir(self):
        """Print a table of contents for the zip file."""
        print "%-46s %19s %12s" % ("File Name", "Modified    ", "Size")
        for zinfo in self.filelist:
            date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6]
            print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size)

    def testzip(self):
        """Read all the files and check the CRC."""
        chunk_size = 2 ** 20
        for zinfo in self.filelist:
            try:
                # Read by chunks, to avoid an OverflowError or a
                # MemoryError with very large embedded files.
                with self.open(zinfo.filename, "r") as f:
                    while f.read(chunk_size):     # Check CRC-32
                        pass
            except BadZipfile:
                return zinfo.filename

    def getinfo(self, name):
        """Return the instance of ZipInfo given 'name'."""
        info = self.NameToInfo.get(name)
        if info is None:
            raise KeyError(
                'There is no item named %r in the archive' % name)

        return info

    def setpassword(self, pwd):
        """Set default password for encrypted files."""
        self.pwd = pwd

    @property
    def comment(self):
        """The comment text associated with the ZIP file."""
        return self._comment

    @comment.setter
    def comment(self, comment):
        # check for valid comment length
        if len(comment) > ZIP_MAX_COMMENT:
            import warnings
            warnings.warn('Archive comment is too long; truncating to %d bytes'
                          % ZIP_MAX_COMMENT, stacklevel=2)
            comment = comment[:ZIP_MAX_COMMENT]
        self._comment = comment
        self._didModify = True

    def read(self, name, pwd=None):
        """Return file bytes (as a string) for name."""
        return self.open(name, "r", pwd).read()

    def open(self, name, mode="r", pwd=None):
        """Return file-like object for 'name'."""
        if mode not in ("r", "U", "rU"):
            raise RuntimeError, 'open() requires mode "r", "U", or "rU"'
        if not self.fp:
            raise RuntimeError, \
                  "Attempt to read ZIP archive that was already closed"

        # Only open a new file for instances where we were not
        # given a file object in the constructor
        if self._filePassed:
            zef_file = self.fp
            should_close = False
        else:
            zef_file = open(self.filename, 'rb')
            should_close = True

        try:
            # Make sure we have an info object
            if isinstance(name, ZipInfo):
                # 'name' is already an info object
                zinfo = name
            else:
                # Get info object for name
                zinfo = self.getinfo(name)

            zef_file.seek(zinfo.header_offset, 0)

            # Skip the file header:
            fheader = zef_file.read(sizeFileHeader)
            if len(fheader) != sizeFileHeader:
                raise BadZipfile("Truncated file header")
            fheader = struct.unpack(structFileHeader, fheader)
            if fheader[_FH_SIGNATURE] != stringFileHeader:
                raise BadZipfile("Bad magic number for file header")

            fname = zef_file.read(fheader[_FH_FILENAME_LENGTH])
            if fheader[_FH_EXTRA_FIELD_LENGTH]:
                zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])

            if fname != zinfo.orig_filename:
                raise BadZipfile, \
                        'File name in directory "%s" and header "%s" differ.' % (
                            zinfo.orig_filename, fname)

            # check for encrypted flag & handle password
            is_encrypted = zinfo.flag_bits & 0x1
            zd = None
            if is_encrypted:
                if not pwd:
                    pwd = self.pwd
                if not pwd:
                    raise RuntimeError, "File %s is encrypted, " \
                        "password required for extraction" % name

                zd = _ZipDecrypter(pwd)
                # The first 12 bytes in the cypher stream is an encryption header
                #  used to strengthen the algorithm. The first 11 bytes are
                #  completely random, while the 12th contains the MSB of the CRC,
                #  or the MSB of the file time depending on the header type
                #  and is used to check the correctness of the password.
                bytes = zef_file.read(12)
                h = map(zd, bytes[0:12])
                if zinfo.flag_bits & 0x8:
                    # compare against the file type from extended local headers
                    check_byte = (zinfo._raw_time >> 8) & 0xff
                else:
                    # compare against the CRC otherwise
                    check_byte = (zinfo.CRC >> 24) & 0xff
                if ord(h[11]) != check_byte:
                    raise RuntimeError("Bad password for file", name)

            return ZipExtFile(zef_file, mode, zinfo, zd,
                    close_fileobj=should_close)
        except:
            if should_close:
                zef_file.close()
            raise

324     def extract(self, member, path=None, pwd=None):

325         """Extract a member from the archive to the current working directory,

326            using its full name. Its file information is extracted as accurately

327            as possible. `member' may be a filename or a ZipInfo object. You can

328            specify a different directory using `path'.

329         """

330         if not isinstance(member, ZipInfo):

331             member = self.getinfo(member)

332

333         if path is None:

334             path = os.getcwd()

335

336         return self._extract_member(member, path, pwd)

337

338     def extractall(self, path=None, members=None, pwd=None):

339         """Extract all members from the archive to the current working

340            directory. `path' specifies a different directory to extract to.

341            `members' is optional and must be a subset of the list returned

342            by namelist().

343         """

344         if members is None:

345             members = self.namelist()

346

347         for zipinfo in members:

348             self.extract(zipinfo, path, pwd)

349

350     def _extract_member(self, member, targetpath, pwd):

351         """Extract the ZipInfo object 'member' to a physical

352            file on the path targetpath.

353         """

354         # build the destination pathname, replacing

355         # forward slashes to platform specific separators.

356         arcname = member.filename.replace('/', os.path.sep)

357

358         if os.path.altsep:

359             arcname = arcname.replace(os.path.altsep, os.path.sep)

360         # interpret absolute pathname as relative, remove drive letter or

361         # UNC path, redundant separators, "." and ".." components.

362         arcname = os.path.splitdrive(arcname)[1]

363         arcname = os.path.sep.join(x for x in arcname.split(os.path.sep)

364                     if x not in ('', os.path.curdir, os.path.pardir))

365         if os.path.sep == '\\':

366             # filter illegal characters on Windows

367             illegal = ':<>|"?*'

368             if isinstance(arcname, unicode):

369                 table = {ord(c): ord('_') for c in illegal}

370             else:

371                 table = string.maketrans(illegal, '_' * len(illegal))

372             arcname = arcname.translate(table)

373             # remove trailing dots

374             arcname = (x.rstrip('.') for x in arcname.split(os.path.sep))

375             arcname = os.path.sep.join(x for x in arcname if x)

376

377         targetpath = os.path.join(targetpath, arcname)

378         targetpath = os.path.normpath(targetpath)

379

380         # Create all upper directories if necessary.

381         upperdirs = os.path.dirname(targetpath)

382         if upperdirs and not os.path.exists(upperdirs):

383             os.makedirs(upperdirs)

384

385         if member.filename[-1] == '/':

386             if not os.path.isdir(targetpath):

387                 os.mkdir(targetpath)

388             return targetpath

389

390         with self.open(member, pwd=pwd) as source, \

391              open(targetpath, "wb") as target:

392             shutil.copyfileobj(source, target)

393

394         return targetpath

395

396     def _writecheck(self, zinfo):

397         """Check for errors before writing a file to the archive."""

398         if zinfo.filename in self.NameToInfo:

399             import warnings

400             warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)

401         if self.mode not in ("w", "a"):

402             raise RuntimeError('write() requires mode "w" or "a"')

403         if not self.fp:

404             raise RuntimeError(

405                   "Attempt to write ZIP archive that was already closed")

406         if zinfo.compress_type == ZIP_DEFLATED and not zlib:

407             raise RuntimeError(

408                   "Compression requires the (missing) zlib module")

409         if zinfo.compress_type not in (ZIP_STORED, ZIP_DEFLATED):

410             raise RuntimeError(

411                   "That compression method is not supported")

412         if not self._allowZip64:

413             requires_zip64 = None

414             if len(self.filelist) >= ZIP_FILECOUNT_LIMIT:

415                 requires_zip64 = "Files count"

416             elif zinfo.file_size > ZIP64_LIMIT:

417                 requires_zip64 = "Filesize"

418             elif zinfo.header_offset > ZIP64_LIMIT:

419                 requires_zip64 = "Zipfile size"

420             if requires_zip64:

421                 raise LargeZipFile(requires_zip64 +

422                                    " would require ZIP64 extensions")

423

424     def write(self, filename, arcname=None, compress_type=None):

425         """Put the bytes from filename into the archive under the name

426         arcname."""

427         if not self.fp:

428             raise RuntimeError(

429                   "Attempt to write to ZIP archive that was already closed")

430

431         st = os.stat(filename)

432         isdir = stat.S_ISDIR(st.st_mode)

433         mtime = time.localtime(st.st_mtime)

434         date_time = mtime[0:6]

435         # Create ZipInfo instance to store file information

436         if arcname is None:

437             arcname = filename

438         arcname = os.path.normpath(os.path.splitdrive(arcname)[1])

439         while arcname[0] in (os.sep, os.altsep):

440             arcname = arcname[1:]

441         if isdir:

442             arcname += '/'

443         zinfo = ZipInfo(arcname, date_time)

444         zinfo.external_attr = (st[0] & 0xFFFF) << 16      # Unix attributes

445         if compress_type is None:

446             zinfo.compress_type = self.compression

447         else:

448             zinfo.compress_type = compress_type

449

450         zinfo.file_size = st.st_size

451         zinfo.flag_bits = 0x00

452         zinfo.header_offset = self.fp.tell()    # Start of header bytes

453

454         self._writecheck(zinfo)

455         self._didModify = True

456

457         if isdir:

458             zinfo.file_size = 0

459             zinfo.compress_size = 0

460             zinfo.CRC = 0

461             zinfo.external_attr |= 0x10  # MS-DOS directory flag

462             self.filelist.append(zinfo)

463             self.NameToInfo[zinfo.filename] = zinfo

464             self.fp.write(zinfo.FileHeader(False))

465             return

466

467         with open(filename, "rb") as fp:

468             # Must overwrite CRC and sizes with correct data later

469             zinfo.CRC = CRC = 0

470             zinfo.compress_size = compress_size = 0

471             # Compressed size can be larger than uncompressed size

472             zip64 = self._allowZip64 and \

473                     zinfo.file_size * 1.05 > ZIP64_LIMIT

474             self.fp.write(zinfo.FileHeader(zip64))

475             if zinfo.compress_type == ZIP_DEFLATED:

476                 cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,

477                      zlib.DEFLATED, -15)

478             else:

479                 cmpr = None

480             file_size = 0

481             while 1:

482                 buf = fp.read(1024 * 8)

483                 if not buf:

484                     break

485                 file_size = file_size + len(buf)

486                 CRC = crc32(buf, CRC) & 0xffffffff

487                 if cmpr:

488                     buf = cmpr.compress(buf)

489                     compress_size = compress_size + len(buf)

490                 self.fp.write(buf)

491         if cmpr:

492             buf = cmpr.flush()

493             compress_size = compress_size + len(buf)

494             self.fp.write(buf)

495             zinfo.compress_size = compress_size

496         else:

497             zinfo.compress_size = file_size

498         zinfo.CRC = CRC

499         zinfo.file_size = file_size

500         if not zip64 and self._allowZip64:

501             if file_size > ZIP64_LIMIT:

502                 raise RuntimeError('File size has increased during compressing')

503             if compress_size > ZIP64_LIMIT:

504                 raise RuntimeError('Compressed size larger than uncompressed size')

505         # Seek backwards and write file header (which will now include

506         # correct CRC and file sizes)

507         position = self.fp.tell()       # Preserve current position in file

508         self.fp.seek(zinfo.header_offset, 0)

509         self.fp.write(zinfo.FileHeader(zip64))

510         self.fp.seek(position, 0)

511         self.filelist.append(zinfo)

512         self.NameToInfo[zinfo.filename] = zinfo

513

514     def writestr(self, zinfo_or_arcname, bytes, compress_type=None):

515         """Write a file into the archive.  The contents is the string

516         'bytes'.  'zinfo_or_arcname' is either a ZipInfo instance or

517         the name of the file in the archive."""

518         if not isinstance(zinfo_or_arcname, ZipInfo):

519             zinfo = ZipInfo(filename=zinfo_or_arcname,

520                             date_time=time.localtime(time.time())[:6])

521

522             zinfo.compress_type = self.compression

523             if zinfo.filename[-1] == '/':

524                 zinfo.external_attr = 0o40775 << 16   # drwxrwxr-x

525                 zinfo.external_attr |= 0x10           # MS-DOS directory flag

526             else:

527                 zinfo.external_attr = 0o600 << 16     # ?rw-------

528         else:

529             zinfo = zinfo_or_arcname

530

531         if not self.fp:

532             raise RuntimeError(

533                   "Attempt to write to ZIP archive that was already closed")

534

535         if compress_type is not None:

536             zinfo.compress_type = compress_type

537

538         zinfo.file_size = len(bytes)            # Uncompressed size

539         zinfo.header_offset = self.fp.tell()    # Start of header bytes

540         self._writecheck(zinfo)

541         self._didModify = True

542         zinfo.CRC = crc32(bytes) & 0xffffffff       # CRC-32 checksum

543         if zinfo.compress_type == ZIP_DEFLATED:

544             co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,

545                  zlib.DEFLATED, -15)

546             bytes = co.compress(bytes) + co.flush()

547             zinfo.compress_size = len(bytes)    # Compressed size

548         else:

549             zinfo.compress_size = zinfo.file_size

550         zip64 = zinfo.file_size > ZIP64_LIMIT or \

551                 zinfo.compress_size > ZIP64_LIMIT

552         if zip64 and not self._allowZip64:

553             raise LargeZipFile("Filesize would require ZIP64 extensions")

554         self.fp.write(zinfo.FileHeader(zip64))

555         self.fp.write(bytes)

556         if zinfo.flag_bits & 0x08:

557             # Write CRC and file sizes after the file data

558             fmt = '<LQQ' if zip64 else '<LLL'

559             self.fp.write(struct.pack(fmt, zinfo.CRC, zinfo.compress_size,

560                   zinfo.file_size))

561         self.fp.flush()

562         self.filelist.append(zinfo)

563         self.NameToInfo[zinfo.filename] = zinfo

564

565     def __del__(self):

566         """Call the "close()" method in case the user forgot."""

567         self.close()

568

569     def close(self):

570         """Close the file, and for mode "w" and "a" write the ending

571         records."""

572         if self.fp is None:

573             return

574

575         try:

576             if self.mode in ("w", "a") and self._didModify: # write ending records

577                 pos1 = self.fp.tell()

578                 for zinfo in self.filelist:         # write central directory

579                     dt = zinfo.date_time

580                     dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2]

581                     dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2)

582                     extra = []

583                     if zinfo.file_size > ZIP64_LIMIT \

584                             or zinfo.compress_size > ZIP64_LIMIT:

585                         extra.append(zinfo.file_size)

586                         extra.append(zinfo.compress_size)

587                         file_size = 0xffffffff

588                         compress_size = 0xffffffff

589                     else:

590                         file_size = zinfo.file_size

591                         compress_size = zinfo.compress_size

592

593                     if zinfo.header_offset > ZIP64_LIMIT:

594                         extra.append(zinfo.header_offset)

595                         header_offset = 0xffffffff

596                     else:

597                         header_offset = zinfo.header_offset

598

599                     extra_data = zinfo.extra

600                     if extra:

601                         # Append a ZIP64 field to the extra's

602                         extra_data = struct.pack(

603                                 '<HH' + 'Q'*len(extra),

604                                 1, 8*len(extra), *extra) + extra_data

605

606                         extract_version = max(45, zinfo.extract_version)

607                         create_version = max(45, zinfo.create_version)

608                     else:

609                         extract_version = zinfo.extract_version

610                         create_version = zinfo.create_version

611

612                     try:

613                         filename, flag_bits = zinfo._encodeFilenameFlags()

614                         centdir = struct.pack(structCentralDir,

615                         stringCentralDir, create_version,

616                         zinfo.create_system, extract_version, zinfo.reserved,

617                         flag_bits, zinfo.compress_type, dostime, dosdate,

618                         zinfo.CRC, compress_size, file_size,

619                         len(filename), len(extra_data), len(zinfo.comment),

620                         0, zinfo.internal_attr, zinfo.external_attr,

621                         header_offset)

622                     except DeprecationWarning:

623                         sys.stderr.write(repr((structCentralDir,

624                         stringCentralDir, create_version,

625                         zinfo.create_system, extract_version, zinfo.reserved,

626                         zinfo.flag_bits, zinfo.compress_type, dostime, dosdate,

627                         zinfo.CRC, compress_size, file_size,

628                         len(zinfo.filename), len(extra_data), len(zinfo.comment),

629                         0, zinfo.internal_attr, zinfo.external_attr,

630                         header_offset)) + "\n")

631                         raise

632                     self.fp.write(centdir)

633                     self.fp.write(filename)

634                     self.fp.write(extra_data)

635                     self.fp.write(zinfo.comment)

636

637                 pos2 = self.fp.tell()

638                 # Write end-of-zip-archive record

639                 centDirCount = len(self.filelist)

640                 centDirSize = pos2 - pos1

641                 centDirOffset = pos1

642                 requires_zip64 = None

643                 if centDirCount > ZIP_FILECOUNT_LIMIT:

644                     requires_zip64 = "Files count"

645                 elif centDirOffset > ZIP64_LIMIT:

646                     requires_zip64 = "Central directory offset"

647                 elif centDirSize > ZIP64_LIMIT:

648                     requires_zip64 = "Central directory size"

649                 if requires_zip64:

650                     # Need to write the ZIP64 end-of-archive records

651                     if not self._allowZip64:

652                         raise LargeZipFile(requires_zip64 +

653                                            " would require ZIP64 extensions")

654                     zip64endrec = struct.pack(

655                             structEndArchive64, stringEndArchive64,

656                             44, 45, 45, 0, 0, centDirCount, centDirCount,

657                             centDirSize, centDirOffset)

658                     self.fp.write(zip64endrec)

659

660                     zip64locrec = struct.pack(

661                             structEndArchive64Locator,

662                             stringEndArchive64Locator, 0, pos2, 1)

663                     self.fp.write(zip64locrec)

664                     centDirCount = min(centDirCount, 0xFFFF)

665                     centDirSize = min(centDirSize, 0xFFFFFFFF)

666                     centDirOffset = min(centDirOffset, 0xFFFFFFFF)

667

668                 endrec = struct.pack(structEndArchive, stringEndArchive,

669                                     0, 0, centDirCount, centDirCount,

670                                     centDirSize, centDirOffset, len(self._comment))

671                 self.fp.write(endrec)

672                 self.fp.write(self._comment)

673                 self.fp.flush()

674         finally:

675             fp = self.fp

676             self.fp = None

677             if not self._filePassed:

678                 fp.close()

zipfile
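
The `ZipFile` methods listed above (`write`, `writestr`, `namelist`, `extractall`) are easier to follow with a small round trip. The following is only a minimal sketch for Python 3; the helper name `zip_roundtrip` and the file names `hello.txt`, `demo.zip` and `notes/readme.txt` are made up for illustration and do not come from the library source.

```python
import os
import tempfile
import zipfile


def zip_roundtrip(workdir):
    """Create a small archive in workdir, extract it, return (names, text)."""
    # A throwaway source file to put into the archive.
    src = os.path.join(workdir, "hello.txt")
    with open(src, "w") as f:
        f.write("hello zip")

    archive = os.path.join(workdir, "demo.zip")
    # Mode "w" creates the archive; ZIP_DEFLATED requires the zlib module.
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(src, arcname="hello.txt")                  # add a file from disk
        zf.writestr("notes/readme.txt", "in-memory data")   # add data directly

    out_dir = os.path.join(workdir, "out")
    with zipfile.ZipFile(archive) as zf:                    # default mode is "r"
        names = zf.namelist()
        zf.extractall(out_dir)                              # creates out_dir/notes/ as needed

    with open(os.path.join(out_dir, "notes", "readme.txt")) as f:
        return names, f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(zip_roundtrip(tmp))
```

Using `ZipFile` as a context manager calls `close()` on exit, which is the step that writes the central directory and end-of-archive records shown in the `close()` source above.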

  1 class TarFile(object):

  2     """The TarFile Class provides an interface to tar archives.

  3     """

  4

  5     debug = 0                   # May be set from 0 (no msgs) to 3 (all msgs)

  6

  7     dereference = False         # If true, add content of linked file to the

  8                                 # tar file, else the link.

  9

 10     ignore_zeros = False        # If true, skips empty or invalid blocks and

 11                                 # continues processing.

 12

 13     errorlevel = 1              # If 0, fatal errors only appear in debug

 14                                 # messages (if debug >= 0). If > 0, errors

 15                                 # are passed to the caller as exceptions.

 16

 17     format = DEFAULT_FORMAT     # The format to use when creating an archive.

 18

 19     encoding = ENCODING         # Encoding for 8-bit character strings.

 20

 21     errors = None               # Error handler for unicode conversion.

 22

 23     tarinfo = TarInfo           # The default TarInfo class to use.

 24

 25     fileobject = ExFileObject   # The default ExFileObject class to use.

 26

 27     def __init__(self, name=None, mode="r", fileobj=None, format=None,

 28             tarinfo=None, dereference=None, ignore_zeros=None, encoding=None,

 29             errors=None, pax_headers=None, debug=None, errorlevel=None):

 30         """Open an (uncompressed) tar archive `name'. `mode' is either 'r' to

 31            read from an existing archive, 'a' to append data to an existing

 32            file or 'w' to create a new file overwriting an existing one. `mode'

 33            defaults to 'r'.

 34            If `fileobj' is given, it is used for reading or writing data. If it

 35            can be determined, `mode' is overridden by `fileobj's mode.

 36            `fileobj' is not closed, when TarFile is closed.

 37         """

 38         modes = {"r": "rb", "a": "r+b", "w": "wb"}

 39         if mode not in modes:

 40             raise ValueError("mode must be 'r', 'a' or 'w'")

 41         self.mode = mode

 42         self._mode = modes[mode]

 43

 44         if not fileobj:

 45             if self.mode == "a" and not os.path.exists(name):

 46                 # Create nonexistent files in append mode.

 47                 self.mode = "w"

 48                 self._mode = "wb"

 49             fileobj = bltn_open(name, self._mode)

 50             self._extfileobj = False

 51         else:

 52             if name is None and hasattr(fileobj, "name"):

 53                 name = fileobj.name

 54             if hasattr(fileobj, "mode"):

 55                 self._mode = fileobj.mode

 56             self._extfileobj = True

 57         self.name = os.path.abspath(name) if name else None

 58         self.fileobj = fileobj

 59

 60         # Init attributes.

 61         if format is not None:

 62             self.format = format

 63         if tarinfo is not None:

 64             self.tarinfo = tarinfo

 65         if dereference is not None:

 66             self.dereference = dereference

 67         if ignore_zeros is not None:

 68             self.ignore_zeros = ignore_zeros

 69         if encoding is not None:

 70             self.encoding = encoding

 71

 72         if errors is not None:

 73             self.errors = errors

 74         elif mode == "r":

 75             self.errors = "utf-8"

 76         else:

 77             self.errors = "strict"

 78

 79         if pax_headers is not None and self.format == PAX_FORMAT:

 80             self.pax_headers = pax_headers

 81         else:

 82             self.pax_headers = {}

 83

 84         if debug is not None:

 85             self.debug = debug

 86         if errorlevel is not None:

 87             self.errorlevel = errorlevel

 88

 89         # Init datastructures.

 90         self.closed = False

 91         self.members = []       # list of members as TarInfo objects

 92         self._loaded = False    # flag if all members have been read

 93         self.offset = self.fileobj.tell()

 94                                 # current position in the archive file

 95         self.inodes = {}        # dictionary caching the inodes of

 96                                 # archive members already added

 97

 98         try:

 99             if self.mode == "r":

100                 self.firstmember = None

101                 self.firstmember = self.next()

102

103             if self.mode == "a":

104                 # Move to the end of the archive,

105                 # before the first empty block.

106                 while True:

107                     self.fileobj.seek(self.offset)

108                     try:

109                         tarinfo = self.tarinfo.fromtarfile(self)

110                         self.members.append(tarinfo)

111                     except EOFHeaderError:

112                         self.fileobj.seek(self.offset)

113                         break

114                     except HeaderError as e:

115                         raise ReadError(str(e))

116

117             if self.mode in "aw":

118                 self._loaded = True

119

120                 if self.pax_headers:

121                     buf = self.tarinfo.create_pax_global_header(self.pax_headers.copy())

122                     self.fileobj.write(buf)

123                     self.offset += len(buf)

124         except:

125             if not self._extfileobj:

126                 self.fileobj.close()

127             self.closed = True

128             raise

129

130     def _getposix(self):

131         return self.format == USTAR_FORMAT

132     def _setposix(self, value):

133         import warnings

134         warnings.warn("use the format attribute instead", DeprecationWarning,

135                       2)

136         if value:

137             self.format = USTAR_FORMAT

138         else:

139             self.format = GNU_FORMAT

140     posix = property(_getposix, _setposix)

141

142     #--------------------------------------------------------------------------

143     # Below are the classmethods which act as alternate constructors to the

144     # TarFile class. The open() method is the only one that is needed for

145     # public use; it is the "super"-constructor and is able to select an

146     # adequate "sub"-constructor for a particular compression using the mapping

147     # from OPEN_METH.

148     #

149     # This concept allows one to subclass TarFile without losing the comfort of

150     # the super-constructor. A sub-constructor is registered and made available

151     # by adding it to the mapping in OPEN_METH.

152

153     @classmethod

154     def open(cls, name=None, mode="r", fileobj=None, bufsize=RECORDSIZE, **kwargs):

155         """Open a tar archive for reading, writing or appending. Return

156            an appropriate TarFile class.

157

158            mode:

159            'r' or 'r:*' open for reading with transparent compression

160            'r:'         open for reading exclusively uncompressed

161            'r:gz'       open for reading with gzip compression

162            'r:bz2'      open for reading with bzip2 compression

163            'a' or 'a:'  open for appending, creating the file if necessary

164            'w' or 'w:'  open for writing without compression

165            'w:gz'       open for writing with gzip compression

166            'w:bz2'      open for writing with bzip2 compression

167

168            'r|*'        open a stream of tar blocks with transparent compression

169            'r|'         open an uncompressed stream of tar blocks for reading

170            'r|gz'       open a gzip compressed stream of tar blocks

171            'r|bz2'      open a bzip2 compressed stream of tar blocks

172            'w|'         open an uncompressed stream for writing

173            'w|gz'       open a gzip compressed stream for writing

174            'w|bz2'      open a bzip2 compressed stream for writing

175         """

176

177         if not name and not fileobj:

178             raise ValueError("nothing to open")

179

180         if mode in ("r", "r:*"):

181             # Find out which *open() is appropriate for opening the file.

182             for comptype in cls.OPEN_METH:

183                 func = getattr(cls, cls.OPEN_METH[comptype])

184                 if fileobj is not None:

185                     saved_pos = fileobj.tell()

186                 try:

187                     return func(name, "r", fileobj, **kwargs)

188                 except (ReadError, CompressionError) as e:

189                     if fileobj is not None:

190                         fileobj.seek(saved_pos)

191                     continue

192             raise ReadError("file could not be opened successfully")

193

194         elif ":" in mode:

195             filemode, comptype = mode.split(":", 1)

196             filemode = filemode or "r"

197             comptype = comptype or "tar"

198

199             # Select the *open() function according to

200             # given compression.

201             if comptype in cls.OPEN_METH:

202                 func = getattr(cls, cls.OPEN_METH[comptype])

203             else:

204                 raise CompressionError("unknown compression type %r" % comptype)

205             return func(name, filemode, fileobj, **kwargs)

206

207         elif "|" in mode:

208             filemode, comptype = mode.split("|", 1)

209             filemode = filemode or "r"

210             comptype = comptype or "tar"

211

212             if filemode not in ("r", "w"):

213                 raise ValueError("mode must be 'r' or 'w'")

214

215             stream = _Stream(name, filemode, comptype, fileobj, bufsize)

216             try:

217                 t = cls(name, filemode, stream, **kwargs)

218             except:

219                 stream.close()

220                 raise

221             t._extfileobj = False

222             return t

223

224         elif mode in ("a", "w"):

225             return cls.taropen(name, mode, fileobj, **kwargs)

226

227         raise ValueError("undiscernible mode")

228

229     @classmethod

230     def taropen(cls, name, mode="r", fileobj=None, **kwargs):

231         """Open uncompressed tar archive name for reading or writing.

232         """

233         if mode not in ("r", "a", "w"):

234             raise ValueError("mode must be 'r', 'a' or 'w'")

235         return cls(name, mode, fileobj, **kwargs)

236

237     @classmethod

238     def gzopen(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):

239         """Open gzip compressed tar archive name for reading or writing.

240            Appending is not allowed.

241         """

242         if mode not in ("r", "w"):

243             raise ValueError("mode must be 'r' or 'w'")

244

245         try:

246             import gzip

247             gzip.GzipFile

248         except (ImportError, AttributeError):

249             raise CompressionError("gzip module is not available")

250

251         try:

252             fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)

253         except OSError:

254             if fileobj is not None and mode == 'r':

255                 raise ReadError("not a gzip file")

256             raise

257

258         try:

259             t = cls.taropen(name, mode, fileobj, **kwargs)

260         except IOError:

261             fileobj.close()

262             if mode == 'r':

263                 raise ReadError("not a gzip file")

264             raise

265         except:

266             fileobj.close()

267             raise

268         t._extfileobj = False

269         return t

270

271     @classmethod

272     def bz2open(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):

273         """Open bzip2 compressed tar archive name for reading or writing.

274            Appending is not allowed.

275         """

276         if mode not in ("r", "w"):

277             raise ValueError("mode must be 'r' or 'w'.")

278

279         try:

280             import bz2

281         except ImportError:

282             raise CompressionError("bz2 module is not available")

283

284         if fileobj is not None:

285             fileobj = _BZ2Proxy(fileobj, mode)

286         else:

287             fileobj = bz2.BZ2File(name, mode, compresslevel=compresslevel)

288

289         try:

290             t = cls.taropen(name, mode, fileobj, **kwargs)

291         except (IOError, EOFError):

292             fileobj.close()

293             if mode == 'r':

294                 raise ReadError("not a bzip2 file")

295             raise

296         except:

297             fileobj.close()

298             raise

299         t._extfileobj = False

300         return t

301

302     # All *open() methods are registered here.

303     OPEN_METH = {

304         "tar": "taropen",   # uncompressed tar

305         "gz":  "gzopen",    # gzip compressed tar

306         "bz2": "bz2open"    # bzip2 compressed tar

307     }

308

309     #--------------------------------------------------------------------------

310     # The public methods which TarFile provides:

311

312     def close(self):

313         """Close the TarFile. In write-mode, two finishing zero blocks are

314            appended to the archive.

315         """

316         if self.closed:

317             return

318

319         if self.mode in "aw":

320             self.fileobj.write(NUL * (BLOCKSIZE * 2))

321             self.offset += (BLOCKSIZE * 2)

322             # fill up the end with zero-blocks

323             # (like option -b20 for tar does)

324             blocks, remainder = divmod(self.offset, RECORDSIZE)

325             if remainder > 0:

326                 self.fileobj.write(NUL * (RECORDSIZE - remainder))

327

328         if not self._extfileobj:

329             self.fileobj.close()

330         self.closed = True

331

332     def getmember(self, name):

333         """Return a TarInfo object for member `name'. If `name' can not be

334            found in the archive, KeyError is raised. If a member occurs more

335            than once in the archive, its last occurrence is assumed to be the

336            most up-to-date version.

337         """

338         tarinfo = self._getmember(name)

339         if tarinfo is None:

340             raise KeyError("filename %r not found" % name)

341         return tarinfo

342

343     def getmembers(self):

344         """Return the members of the archive as a list of TarInfo objects. The

345            list has the same order as the members in the archive.

346         """

347         self._check()

348         if not self._loaded:    # if we want to obtain a list of

349             self._load()        # all members, we first have to

350                                 # scan the whole archive.

351         return self.members

352

353     def getnames(self):

354         """Return the members of the archive as a list of their names. It has

355            the same order as the list returned by getmembers().

356         """

357         return [tarinfo.name for tarinfo in self.getmembers()]

358

359     def gettarinfo(self, name=None, arcname=None, fileobj=None):

360         """Create a TarInfo object for either the file `name' or the file

361            object `fileobj' (using os.fstat on its file descriptor). You can

362            modify some of the TarInfo's attributes before you add it using

363            addfile(). If given, `arcname' specifies an alternative name for the

364            file in the archive.

365         """

366         self._check("aw")

367

368         # When fileobj is given, replace name by

369         # fileobj's real name.

370         if fileobj is not None:

371             name = fileobj.name

372

373         # Building the name of the member in the archive.

374         # Backward slashes are converted to forward slashes,

375         # Absolute paths are turned to relative paths.

376         if arcname is None:

377             arcname = name

378         drv, arcname = os.path.splitdrive(arcname)

379         arcname = arcname.replace(os.sep, "/")

380         arcname = arcname.lstrip("/")

381

382         # Now, fill the TarInfo object with

383         # information specific for the file.

384         tarinfo = self.tarinfo()

385         tarinfo.tarfile = self

386

387         # Use os.stat or os.lstat, depending on platform

388         # and if symlinks shall be resolved.

389         if fileobj is None:

390             if hasattr(os, "lstat") and not self.dereference:

391                 statres = os.lstat(name)

392             else:

393                 statres = os.stat(name)

394         else:

395             statres = os.fstat(fileobj.fileno())

396         linkname = ""

397

398         stmd = statres.st_mode

399         if stat.S_ISREG(stmd):

400             inode = (statres.st_ino, statres.st_dev)

401             if not self.dereference and statres.st_nlink > 1 and \

402                     inode in self.inodes and arcname != self.inodes[inode]:

403                 # Is it a hardlink to an already

404                 # archived file?

405                 type = LNKTYPE

406                 linkname = self.inodes[inode]

407             else:

408                 # The inode is added only if its valid.

409                 # For win32 it is always 0.

410                 type = REGTYPE

411                 if inode[0]:

412                     self.inodes[inode] = arcname

413         elif stat.S_ISDIR(stmd):

414             type = DIRTYPE

415         elif stat.S_ISFIFO(stmd):

416             type = FIFOTYPE

417         elif stat.S_ISLNK(stmd):

418             type = SYMTYPE

419             linkname = os.readlink(name)

420         elif stat.S_ISCHR(stmd):

421             type = CHRTYPE

422         elif stat.S_ISBLK(stmd):

423             type = BLKTYPE

424         else:

425             return None

426

427         # Fill the TarInfo object with all

428         # information we can get.

429         tarinfo.name = arcname

430         tarinfo.mode = stmd

431         tarinfo.uid = statres.st_uid

432         tarinfo.gid = statres.st_gid

433         if type == REGTYPE:

434             tarinfo.size = statres.st_size

435         else:

436             tarinfo.size = 0L

437         tarinfo.mtime = statres.st_mtime

438         tarinfo.type = type

439         tarinfo.linkname = linkname

440         if pwd:

441             try:

442                 tarinfo.uname = pwd.getpwuid(tarinfo.uid)[0]

443             except KeyError:

444                 pass

445         if grp:

446             try:

447                 tarinfo.gname = grp.getgrgid(tarinfo.gid)[0]

448             except KeyError:

449                 pass

450

451         if type in (CHRTYPE, BLKTYPE):

452             if hasattr(os, "major") and hasattr(os, "minor"):

453                 tarinfo.devmajor = os.major(statres.st_rdev)

454                 tarinfo.devminor = os.minor(statres.st_rdev)

455         return tarinfo

456

457     def list(self, verbose=True):

458         """Print a table of contents to sys.stdout. If `verbose' is False, only

459            the names of the members are printed. If it is True, an `ls -l'-like

460            output is produced.

461         """

462         self._check()

463

464         for tarinfo in self:

465             if verbose:

466                 print filemode(tarinfo.mode),

467                 print "%s/%s" % (tarinfo.uname or tarinfo.uid,

468                                  tarinfo.gname or tarinfo.gid),

469                 if tarinfo.ischr() or tarinfo.isblk():

470                     print "%10s" % ("%d,%d" \

471                                     % (tarinfo.devmajor, tarinfo.devminor)),

472                 else:

473                     print "%10d" % tarinfo.size,

474                 print "%d-%02d-%02d %02d:%02d:%02d" \

475                       % time.localtime(tarinfo.mtime)[:6],

476

477             print tarinfo.name + ("/" if tarinfo.isdir() else ""),

478

479             if verbose:

480                 if tarinfo.issym():

481                     print "->", tarinfo.linkname,

482                 if tarinfo.islnk():

483                     print "link to", tarinfo.linkname,

484             print

485

486     def add(self, name, arcname=None, recursive=True, exclude=None, filter=None):

487         """Add the file `name' to the archive. `name' may be any type of file

488            (directory, fifo, symbolic link, etc.). If given, `arcname'

489            specifies an alternative name for the file in the archive.

490            Directories are added recursively by default. This can be avoided by

491            setting `recursive' to False. `exclude' is a function that should

492            return True for each filename to be excluded. `filter' is a function

493            that expects a TarInfo object argument and returns the changed

494            TarInfo object, if it returns None the TarInfo object will be

495            excluded from the archive.

496         """

497         self._check("aw")

498

499         if arcname is None:

500             arcname = name

501

502         # Exclude pathnames.

503         if exclude is not None:

504             import warnings

505             warnings.warn("use the filter argument instead",

506                     DeprecationWarning, 2)

507             if exclude(name):

508                 self._dbg(2, "tarfile: Excluded %r" % name)

509                 return

510

511         # Skip if somebody tries to archive the archive...

512         if self.name is not None and os.path.abspath(name) == self.name:

513             self._dbg(2, "tarfile: Skipped %r" % name)

514             return

515

516         self._dbg(1, name)

517

518         # Create a TarInfo object from the file.

519         tarinfo = self.gettarinfo(name, arcname)

520

521         if tarinfo is None:

522             self._dbg(1, "tarfile: Unsupported type %r" % name)

523             return

524

525         # Change or exclude the TarInfo object.

526         if filter is not None:

527             tarinfo = filter(tarinfo)

528             if tarinfo is None:

529                 self._dbg(2, "tarfile: Excluded %r" % name)

530                 return

531

532         # Append the tar header and data to the archive.

533         if tarinfo.isreg():

534             with bltn_open(name, "rb") as f:

535                 self.addfile(tarinfo, f)

536

537         elif tarinfo.isdir():

538             self.addfile(tarinfo)

539             if recursive:

540                 for f in os.listdir(name):

541                     self.add(os.path.join(name, f), os.path.join(arcname, f),

542                             recursive, exclude, filter)

543

544         else:

545             self.addfile(tarinfo)

546

547     def addfile(self, tarinfo, fileobj=None):

548         """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is

549            given, tarinfo.size bytes are read from it and added to the archive.

550            You can create TarInfo objects using gettarinfo().

551            On Windows platforms, `fileobj' should always be opened with mode

552            'rb' to avoid irritation about the file size.

553         """

554         self._check("aw")

555

556         tarinfo = copy.copy(tarinfo)

557

558         buf = tarinfo.tobuf(self.format, self.encoding, self.errors)

559         self.fileobj.write(buf)

560         self.offset += len(buf)

561

562         # If there's data to follow, append it.

563         if fileobj is not None:

564             copyfileobj(fileobj, self.fileobj, tarinfo.size)

565             blocks, remainder = divmod(tarinfo.size, BLOCKSIZE)

566             if remainder > 0:

567                 self.fileobj.write(NUL * (BLOCKSIZE - remainder))

568                 blocks += 1

569             self.offset += blocks * BLOCKSIZE

570

571         self.members.append(tarinfo)

572

573     def extractall(self, path=".", members=None):

574         """Extract all members from the archive to the current working

575            directory and set owner, modification time and permissions on

576            directories afterwards. `path' specifies a different directory

577            to extract to. `members' is optional and must be a subset of the

578            list returned by getmembers().

579         """

580         directories = []

581

582         if members is None:

583             members = self

584

585         for tarinfo in members:

586             if tarinfo.isdir():

587                 # Extract directories with a safe mode.

588                 directories.append(tarinfo)

589                 tarinfo = copy.copy(tarinfo)

590                 tarinfo.mode = 0700

591             self.extract(tarinfo, path)

592

593         # Reverse sort directories.

594         directories.sort(key=operator.attrgetter('name'))

595         directories.reverse()

596

597         # Set correct owner, mtime and filemode on directories.

598         for tarinfo in directories:

599             dirpath = os.path.join(path, tarinfo.name)

600             try:

601                 self.chown(tarinfo, dirpath)

602                 self.utime(tarinfo, dirpath)

603                 self.chmod(tarinfo, dirpath)

604             except ExtractError, e:

605                 if self.errorlevel > 1:

606                     raise

607                 else:

608                     self._dbg(1, "tarfile: %s" % e)

609

610     def extract(self, member, path=""):

611         """Extract a member from the archive to the current working directory,

612            using its full name. Its file information is extracted as accurately

613            as possible. `member' may be a filename or a TarInfo object. You can

614            specify a different directory using `path'.

615         """

616         self._check("r")

617

618         if isinstance(member, basestring):

619             tarinfo = self.getmember(member)

620         else:

621             tarinfo = member

622

623         # Prepare the link target for makelink().

624         if tarinfo.islnk():

625             tarinfo._link_target = os.path.join(path, tarinfo.linkname)

626

627         try:

628             self._extract_member(tarinfo, os.path.join(path, tarinfo.name))

629         except EnvironmentError, e:

630             if self.errorlevel > 0:

631                 raise

632             else:

633                 if e.filename is None:

634                     self._dbg(1, "tarfile: %s" % e.strerror)

635                 else:

636                     self._dbg(1, "tarfile: %s %r" % (e.strerror, e.filename))

637         except ExtractError, e:

638             if self.errorlevel > 1:

639                 raise

640             else:

641                 self._dbg(1, "tarfile: %s" % e)

642

643     def extractfile(self, member):

644         """Extract a member from the archive as a file object. `member' may be

645            a filename or a TarInfo object. If `member' is a regular file, a

646            file-like object is returned. If `member' is a link, a file-like

647            object is constructed from the link's target. If `member' is none of

648            the above, None is returned.

649            The file-like object is read-only and provides the following

650            methods: read(), readline(), readlines(), seek() and tell()

651         """

652         self._check("r")

653

654         if isinstance(member, basestring):

655             tarinfo = self.getmember(member)

656         else:

657             tarinfo = member

658

659         if tarinfo.isreg():

660             return self.fileobject(self, tarinfo)

661

662         elif tarinfo.type not in SUPPORTED_TYPES:

663             # If a member's type is unknown, it is treated as a

664             # regular file.

665             return self.fileobject(self, tarinfo)

666

667         elif tarinfo.islnk() or tarinfo.issym():

668             if isinstance(self.fileobj, _Stream):

669                 # A small but ugly workaround for the case that someone tries

670                 # to extract a (sym)link as a file-object from a non-seekable

671                 # stream of tar blocks.

672                 raise StreamError("cannot extract (sym)link as file object")

673             else:

674                 # A (sym)link's file object is its target's file object.

675                 return self.extractfile(self._find_link_target(tarinfo))

676         else:

677             # If there's no data associated with the member (directory, chrdev,

678             # blkdev, etc.), return None instead of a file object.

679             return None

680

681     def _extract_member(self, tarinfo, targetpath):

682         """Extract the TarInfo object tarinfo to a physical

683            file called targetpath.

684         """

685         # Fetch the TarInfo object for the given name

686         # and build the destination pathname, replacing

687         # forward slashes to platform specific separators.

688         targetpath = targetpath.rstrip("/")

689         targetpath = targetpath.replace("/", os.sep)

690

691         # Create all upper directories.

692         upperdirs = os.path.dirname(targetpath)

693         if upperdirs and not os.path.exists(upperdirs):

694             # Create directories that are not part of the archive with

695             # default permissions.

696             os.makedirs(upperdirs)

697

698         if tarinfo.islnk() or tarinfo.issym():

699             self._dbg(1, "%s -> %s" % (tarinfo.name, tarinfo.linkname))

700         else:

701             self._dbg(1, tarinfo.name)

702

703         if tarinfo.isreg():

704             self.makefile(tarinfo, targetpath)

705         elif tarinfo.isdir():

706             self.makedir(tarinfo, targetpath)

707         elif tarinfo.isfifo():

708             self.makefifo(tarinfo, targetpath)

709         elif tarinfo.ischr() or tarinfo.isblk():

710             self.makedev(tarinfo, targetpath)

711         elif tarinfo.islnk() or tarinfo.issym():

712             self.makelink(tarinfo, targetpath)

713         elif tarinfo.type not in SUPPORTED_TYPES:

714             self.makeunknown(tarinfo, targetpath)

715         else:

716             self.makefile(tarinfo, targetpath)

717

718         self.chown(tarinfo, targetpath)

719         if not tarinfo.issym():

720             self.chmod(tarinfo, targetpath)

721             self.utime(tarinfo, targetpath)

722

723     #--------------------------------------------------------------------------

724     # Below are the different file methods. They are called via

725     # _extract_member() when extract() is called. They can be replaced in a

726     # subclass to implement other functionality.

727

728     def makedir(self, tarinfo, targetpath):

729         """Make a directory called targetpath.

730         """

731         try:

732             # Use a safe mode for the directory, the real mode is set

733             # later in _extract_member().

734             os.mkdir(targetpath, 0700)

735         except EnvironmentError, e:

736             if e.errno != errno.EEXIST:

737                 raise

738

739     def makefile(self, tarinfo, targetpath):

740         """Make a file called targetpath.

741         """

742         source = self.extractfile(tarinfo)

743         try:

744             with bltn_open(targetpath, "wb") as target:

745                 copyfileobj(source, target)

746         finally:

747             source.close()

748

749     def makeunknown(self, tarinfo, targetpath):

750         """Make a file from a TarInfo object with an unknown type

751            at targetpath.

752         """

753         self.makefile(tarinfo, targetpath)

754         self._dbg(1, "tarfile: Unknown file type %r, " \

755                      "extracted as regular file." % tarinfo.type)

756

757     def makefifo(self, tarinfo, targetpath):

758         """Make a fifo called targetpath.

759         """

760         if hasattr(os, "mkfifo"):

761             os.mkfifo(targetpath)

762         else:

763             raise ExtractError("fifo not supported by system")

764

765     def makedev(self, tarinfo, targetpath):

766         """Make a character or block device called targetpath.

767         """

768         if not hasattr(os, "mknod") or not hasattr(os, "makedev"):

769             raise ExtractError("special devices not supported by system")

770

771         mode = tarinfo.mode

772         if tarinfo.isblk():

773             mode |= stat.S_IFBLK

774         else:

775             mode |= stat.S_IFCHR

776

777         os.mknod(targetpath, mode,

778                  os.makedev(tarinfo.devmajor, tarinfo.devminor))

779

780     def makelink(self, tarinfo, targetpath):

781         """Make a (symbolic) link called targetpath. If it cannot be created

782           (platform limitation), we try to make a copy of the referenced file

783           instead of a link.

784         """

785         if hasattr(os, "symlink") and hasattr(os, "link"):

786             # For systems that support symbolic and hard links.

787             if tarinfo.issym():

788                 if os.path.lexists(targetpath):

789                     os.unlink(targetpath)

790                 os.symlink(tarinfo.linkname, targetpath)

791             else:

792                 # See extract().

793                 if os.path.exists(tarinfo._link_target):

794                     if os.path.lexists(targetpath):

795                         os.unlink(targetpath)

796                     os.link(tarinfo._link_target, targetpath)

797                 else:

798                     self._extract_member(self._find_link_target(tarinfo), targetpath)

799         else:

800             try:

801                 self._extract_member(self._find_link_target(tarinfo), targetpath)

802             except KeyError:

803                 raise ExtractError("unable to resolve link inside archive")

804

805     def chown(self, tarinfo, targetpath):

806         """Set owner of targetpath according to tarinfo.

807         """

808         if pwd and hasattr(os, "geteuid") and os.geteuid() == 0:

809             # We have to be root to do so.

810             try:

811                 g = grp.getgrnam(tarinfo.gname)[2]

812             except KeyError:

813                 g = tarinfo.gid

814             try:

815                 u = pwd.getpwnam(tarinfo.uname)[2]

816             except KeyError:

817                 u = tarinfo.uid

818             try:

819                 if tarinfo.issym() and hasattr(os, "lchown"):

820                     os.lchown(targetpath, u, g)

821                 else:

822                     if sys.platform != "os2emx":

823                         os.chown(targetpath, u, g)

824             except EnvironmentError, e:

825                 raise ExtractError("could not change owner")

826

827     def chmod(self, tarinfo, targetpath):

828         """Set file permissions of targetpath according to tarinfo.

829         """

830         if hasattr(os, 'chmod'):

831             try:

832                 os.chmod(targetpath, tarinfo.mode)

833             except EnvironmentError, e:

834                 raise ExtractError("could not change mode")

835

836     def utime(self, tarinfo, targetpath):

837         """Set modification time of targetpath according to tarinfo.

838         """

839         if not hasattr(os, 'utime'):

840             return

841         try:

842             os.utime(targetpath, (tarinfo.mtime, tarinfo.mtime))

843         except EnvironmentError, e:

844             raise ExtractError("could not change modification time")

845

846     #--------------------------------------------------------------------------

847     def next(self):

848         """Return the next member of the archive as a TarInfo object, when

849            TarFile is opened for reading. Return None if there is no more

850            available.

851         """

852         self._check("ra")

853         if self.firstmember is not None:

854             m = self.firstmember

855             self.firstmember = None

856             return m

857

858         # Read the next block.

859         self.fileobj.seek(self.offset)

860         tarinfo = None

861         while True:

862             try:

863                 tarinfo = self.tarinfo.fromtarfile(self)

864             except EOFHeaderError, e:

865                 if self.ignore_zeros:

866                     self._dbg(2, "0x%X: %s" % (self.offset, e))

867                     self.offset += BLOCKSIZE

868                     continue

869             except InvalidHeaderError, e:

870                 if self.ignore_zeros:

871                     self._dbg(2, "0x%X: %s" % (self.offset, e))

872                     self.offset += BLOCKSIZE

873                     continue

874                 elif self.offset == 0:

875                     raise ReadError(str(e))

876             except EmptyHeaderError:

877                 if self.offset == 0:

878                     raise ReadError("empty file")

879             except TruncatedHeaderError, e:

880                 if self.offset == 0:

881                     raise ReadError(str(e))

882             except SubsequentHeaderError, e:

883                 raise ReadError(str(e))

884             break

885

886         if tarinfo is not None:

887             self.members.append(tarinfo)

888         else:

889             self._loaded = True

890

891         return tarinfo

892

893     #--------------------------------------------------------------------------

894     # Little helper methods:

895

896     def _getmember(self, name, tarinfo=None, normalize=False):

897         """Find an archive member by name from bottom to top.

898            If tarinfo is given, it is used as the starting point.

899         """

900         # Ensure that all members have been loaded.

901         members = self.getmembers()

902

903         # Limit the member search list up to tarinfo.

904         if tarinfo is not None:

905             members = members[:members.index(tarinfo)]

906

907         if normalize:

908             name = os.path.normpath(name)

909

910         for member in reversed(members):

911             if normalize:

912                 member_name = os.path.normpath(member.name)

913             else:

914                 member_name = member.name

915

916             if name == member_name:

917                 return member

918

919     def _load(self):

920         """Read through the entire archive file and look for readable

921            members.

922         """

923         while True:

924             tarinfo = self.next()

925             if tarinfo is None:

926                 break

927         self._loaded = True

928

929     def _check(self, mode=None):

930         """Check if TarFile is still open, and if the operation's mode

931            corresponds to TarFile's mode.

932         """

933         if self.closed:

934             raise IOError("%s is closed" % self.__class__.__name__)

935         if mode is not None and self.mode not in mode:

936             raise IOError("bad operation for mode %r" % self.mode)

937

938     def _find_link_target(self, tarinfo):

939         """Find the target member of a symlink or hardlink member in the

940            archive.

941         """

942         if tarinfo.issym():

943             # Always search the entire archive.

944             linkname = "/".join(filter(None, (os.path.dirname(tarinfo.name), tarinfo.linkname)))

945             limit = None

946         else:

947             # Search the archive before the link, because a hard link is

948             # just a reference to an already archived file.

949             linkname = tarinfo.linkname

950             limit = tarinfo

951

952         member = self._getmember(linkname, tarinfo=limit, normalize=True)

953         if member is None:

954             raise KeyError("linkname %r not found" % linkname)

955         return member

956

957     def __iter__(self):

958         """Provide an iterator object.

959         """

960         if self._loaded:

961             return iter(self.members)

962         else:

963             return TarIter(self)

964

965     def _dbg(self, level, msg):

966         """Write debugging output to sys.stderr.

967         """

968         if level <= self.debug:

969             print >> sys.stderr, msg

970

971     def __enter__(self):

972         self._check()

973         return self

974

975     def __exit__(self, type, value, traceback):

976         if type is None:

977             self.close()

978         else:

979             # An exception occurred. We must not call close() because

980             # it would try to write end-of-archive blocks and padding.

981             if not self._extfileobj:

982                 self.fileobj.close()

983             self.closed = True

984 # class TarFile

Tarfile
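The listing above shows the internals of TarFile. As a minimal sketch of how its public methods (add, getnames, extractfile) are typically called, the example below creates a small gzip-compressed archive and reads one member back. It is written for Python 3; the temporary directory and the names hello.txt, demo.tar.gz and docs/hello.txt are assumptions made up for illustration.

```python
import os
import tarfile
import tempfile


def roundtrip_demo():
    """Create a small archive, list it, and read one member back."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical source file used only for this demonstration.
        src = os.path.join(workdir, "hello.txt")
        with open(src, "w", encoding="utf-8") as f:
            f.write("hello tar")

        archive = os.path.join(workdir, "demo.tar.gz")
        # "w:gz" selects the gzip-compressed writer.
        with tarfile.open(archive, "w:gz") as tar:
            # arcname stores the file under a different name inside the archive.
            tar.add(src, arcname="docs/hello.txt")

        with tarfile.open(archive, "r:gz") as tar:
            names = tar.getnames()
            # extractfile() returns a read-only file-like object for regular files.
            member = tar.extractfile("docs/hello.txt")
            content = member.read().decode("utf-8")
    return names, content


names, content = roundtrip_demo()
print(names, content)
```

Using the archive as a context manager closes it on exit, which in write mode also appends the end-of-archive blocks described in close().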

6. shelve module

The shelve module is a simple key-value store that persists in-memory data to a file. It can store any Python object that pickle can serialize.

import shelve

d = shelve.open('shelve_test')  # open (or create) the shelf file

class Test(object):
    def __init__(self, n):
        self.n = n


t = Test(123)
t2 = Test(123334)

name = ["alex", "rain", "test"]
d["test"] = name  # persist a list
d["t1"] = t       # persist a class instance
d["t2"] = t2

d.close()
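The example above only writes to the shelf. A minimal sketch of the full round trip, assuming a temporary directory and built-in value types only, could look like this. Instances of custom classes such as Test can be read back the same way, provided the class definition is importable when the shelf is opened.

```python
import os
import shelve
import tempfile


def shelve_roundtrip():
    """Write two values to a shelf, reopen it, and read them back."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical shelf location inside a throwaway directory.
        path = os.path.join(workdir, "shelve_test")

        # The shelf can be used as a context manager, which closes it on exit.
        with shelve.open(path) as d:
            d["test"] = ["alex", "rain", "test"]
            d["info"] = {"age": 22, "job": "it"}

        # Reopening the same path gives access to the persisted values.
        with shelve.open(path) as d:
            names = d["test"]
            info = d["info"]
            keys = sorted(d.keys())
    return names, info, keys


names, info, keys = shelve_roundtrip()
print(names, info, keys)
```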

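7. XML module

XML is a format for exchanging data between different languages or programs, much like JSON, although JSON is simpler to use. Before JSON existed, XML was the only practical choice, and many systems in traditional industries such as finance still expose XML interfaces today.

An XML document looks like this; nested <> elements (nodes) define the data structure: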

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank updated="yes">69</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

XML is supported in virtually every language. In Python, you can work with it using the following module:

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()
print(root.tag)

# Walk the whole XML document
for child in root:
    print(child.tag, child.attrib)
    for i in child:
        print(i.tag, i.text)

# Iterate over the year nodes only
for node in root.iter('year'):
    print(node.tag, node.text)
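The traversal above can also be tried without a file on disk. The following sketch parses a shortened copy of the sample document from a string with ET.fromstring and collects each country's year and neighbor names; the summarize helper and the XML_TEXT constant are names invented for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample document, kept inline so the sketch is self-contained.
XML_TEXT = """<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank updated="yes">5</rank>
        <year>2011</year>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""


def summarize(xml_text):
    """Map each country name to (year, list of neighbor names)."""
    root = ET.fromstring(xml_text)
    summary = {}
    for country in root.findall("country"):
        # findtext() returns the text of the first matching child element.
        year = int(country.findtext("year"))
        # get() reads an attribute value from an element.
        neighbors = [n.get("name") for n in country.findall("neighbor")]
        summary[country.get("name")] = (year, neighbors)
    return summary


summary = summarize(XML_TEXT)
print(summary)
```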

Modifying and deleting XML content

import xml.etree.ElementTree as ET

tree = ET.parse("xmltest.xml")
root = tree.getroot()

# Modify
for node in root.iter('year'):
    new_year = int(node.text) + 1
    node.text = str(new_year)
    node.set("updated", "yes")

tree.write("xmltest.xml")


# Delete nodes
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)

tree.write('output.xml')
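As a sketch of the same modify-and-delete steps performed purely in memory, the function below works on an inline two-country document instead of xmltest.xml. The function name bump_years_and_prune and its max_rank parameter are assumptions introduced for this example.

```python
import xml.etree.ElementTree as ET

# Minimal inline document; the real xmltest.xml has more fields per country.
XML_TEXT = """<data>
    <country name="Liechtenstein"><rank>2</rank><year>2008</year></country>
    <country name="Panama"><rank>69</rank><year>2011</year></country>
</data>"""


def bump_years_and_prune(xml_text, max_rank=50):
    """Increment every year by one and drop countries ranked above max_rank."""
    root = ET.fromstring(xml_text)
    for node in root.iter("year"):
        node.text = str(int(node.text) + 1)
        node.set("updated", "yes")
    # findall() returns a list, so removing children while looping over it is safe.
    for country in root.findall("country"):
        if int(country.find("rank").text) > max_rank:
            root.remove(country)
    return root


root = bump_years_and_prune(XML_TEXT)
remaining = [c.get("name") for c in root.findall("country")]
years = [y.text for y in root.iter("year")]
print(remaining, years)
```

Iterating over the result of findall() rather than over root itself matters here: removing children from an element while iterating over that element directly can skip entries.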

Creating an XML document yourself

import xml.etree.ElementTree as ET


new_xml = ET.Element("namelist")
name = ET.SubElement(new_xml, "name", attrib={"enrolled": "yes"})
age = ET.SubElement(name, "age", attrib={"checked": "no"})
sex = ET.SubElement(name, "sex")
sex.text = '33'
name2 = ET.SubElement(new_xml, "name", attrib={"enrolled": "no"})
age = ET.SubElement(name2, "age")
age.text = '19'

et = ET.ElementTree(new_xml)  # build the document object
et.write("test.xml", encoding="utf-8", xml_declaration=True)

ET.dump(new_xml)  # print the generated XML

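8. ConfigParser module

Used to generate and modify common configuration files. In Python 3.x the module was renamed to configparser.

Many programs use a configuration file format like the following: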

[DEFAULT]
ServerAliveInterval = 45
Compression = yes
CompressionLevel = 9
ForwardX11 = yes

[bitbucket.org]
User = hg

[topsecret.server.com]
Port = 50022
ForwardX11 = no

How would you generate a file like this with Python?

import configparser

config = configparser.ConfigParser()
config["DEFAULT"] = {'ServerAliveInterval': '45',
                     'Compression': 'yes',
                     'CompressionLevel': '9'}

config['bitbucket.org'] = {}
config['bitbucket.org']['User'] = 'hg'
config['topsecret.server.com'] = {}
topsecret = config['topsecret.server.com']
topsecret['Port'] = '50022'     # mutates the parser
topsecret['ForwardX11'] = 'no'  # same here
config['DEFAULT']['ForwardX11'] = 'yes'
with open('example.ini', 'w') as configfile:
    config.write(configfile)

Once written, the file can be read back:

>>> import configparser
>>> config = configparser.ConfigParser()
>>> config.sections()
[]
>>> config.read('example.ini')
['example.ini']
>>> config.sections()
['bitbucket.org', 'topsecret.server.com']
>>> 'bitbucket.org' in config
True
>>> 'bytebong.com' in config
False
>>> config['bitbucket.org']['User']
'hg'
>>> config['DEFAULT']['Compression']
'yes'
>>> topsecret = config['topsecret.server.com']
>>> topsecret['ForwardX11']
'no'
>>> topsecret['Port']
'50022'
>>> for key in config['bitbucket.org']: print(key)
...
user
serveraliveinterval
compression
compressionlevel
forwardx11
>>> config['bitbucket.org']['ForwardX11']
'yes'

configparser syntax for reading, adding, updating and deleting

Assume a file i.cfg with this content:

[section1]
k1 = v1
k2:v2

[section2]
k1 = v1

The calls below read and modify it (uncomment the ones you need):

import configparser

config = configparser.ConfigParser()
config.read('i.cfg')

# ########## read ##########
#secs = config.sections()
#print(secs)
#options = config.options('section2')
#print(options)

#item_list = config.items('section2')
#print(item_list)

#val = config.get('section1', 'k1')
#val = config.getint('section1', 'k1')   # raises ValueError unless the value is an integer

# ########## modify ##########
#sec = config.remove_section('section1')
#config.write(open('i.cfg', "w"))

#sec = config.has_section('wupeiqi')
#sec = config.add_section('wupeiqi')
#config.write(open('i.cfg', "w"))

#config.set('section2', 'k1', '11111')   # values must be strings in Python 3
#config.write(open('i.cfg', "w"))

#config.remove_option('section2', 'k1')
#config.write(open('i.cfg', "w"))
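The read, update, add and delete calls above can be tried without touching the disk. The following is a minimal sketch, assuming the sample content is held in a string (the name SAMPLE and the in-memory buffer are illustrative choices, not part of the original listing):

```python
import configparser
import io

# Sample content using both "=" and ":" as key/value delimiters.
SAMPLE = """\
[section1]
k1 = v1
k2:v2

[section2]
k1 = v1
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

# Read
sections = config.sections()
options = config.options("section1")
value = config.get("section1", "k2")

# Update: values must be strings in Python 3
config.set("section2", "k1", "11111")
number = config.getint("section2", "k1")

# Add and delete
if not config.has_section("wupeiqi"):
    config.add_section("wupeiqi")
config.remove_option("section1", "k2")
config.remove_section("section2")

# Write to an in-memory buffer instead of a file
buffer = io.StringIO()
config.write(buffer)
result = buffer.getvalue()
print(result)
```

To persist the changes, pass a file opened with `open('i.cfg', 'w')` inside a `with` block to `config.write()` instead of the buffer.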

The hashlib module

Used for hashing operations. In 3.x it replaces the md5 and sha modules and mainly provides the SHA1, SHA224, SHA256, SHA384, SHA512 and MD5 algorithms.

import hashlib

m = hashlib.md5()
m.update(b"Hello")
m.update(b"It's me")
print(m.digest())
m.update(b"It's been a long time since last time we ...")

print(m.digest())          # hash as raw bytes
print(m.hexdigest())       # hash as a hexadecimal string
print(len(m.hexdigest()))  # 32 hex characters for MD5
'''
def digest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of binary data. """
    pass

def hexdigest(self, *args, **kwargs): # real signature unknown
    """ Return the digest value as a string of hexadecimal digits. """
    pass

'''
import hashlib

# In Python 3, update() accepts bytes only, so the input is a bytes literal.

# ######## md5 ########

h = hashlib.md5()
h.update(b'admin')
print(h.hexdigest())

# ######## sha1 ########

h = hashlib.sha1()
h.update(b'admin')
print(h.hexdigest())

# ######## sha256 ########

h = hashlib.sha256()
h.update(b'admin')
print(h.hexdigest())

# ######## sha384 ########

h = hashlib.sha384()
h.update(b'admin')
print(h.hexdigest())

# ######## sha512 ########

h = hashlib.sha512()
h.update(b'admin')
print(h.hexdigest())
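One point the listing above implies but does not state: calling update() several times is equivalent to hashing the concatenated data once. A small sketch, using SHA-256 as an arbitrary choice and made-up variable names:

```python
import hashlib

chunks = [b"Hello", b"It's me"]

# Feed the data piece by piece...
incremental = hashlib.sha256()
for chunk in chunks:
    incremental.update(chunk)

# ...or all at once; both give the same digest.
one_shot = hashlib.sha256(b"".join(chunks))
same = incremental.hexdigest() == one_shot.hexdigest()
print(same)

# digest() returns raw bytes, hexdigest() returns a str twice as long.
raw_length = len(incremental.digest())
hex_length = len(incremental.hexdigest())
print(raw_length, hex_length)

# A str must be encoded to bytes before it can be hashed.
text_digest = hashlib.sha256("admin".encode("utf-8")).hexdigest()
print(text_digest)
```

Incremental updates are useful when hashing a large file in blocks, since the whole file never has to be in memory.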

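Python also has an hmac module, which internally combines the key and the content we supply before hashing them.

A hash-based message authentication code (HMAC) is an authentication mechanism built on the message authentication code (MAC). With HMAC, the two parties to a communication judge whether a message is genuine by checking it against a secret key K that is mixed into the computation.

It is typically used to authenticate messages in network communication (it does not encrypt them). The two parties first agree on a key, much like a secret password. The sender computes an HMAC of the message with the key and sends it along with the message. The receiver computes the HMAC of the received plaintext with the same key and compares the result with the sender's value. If the two are equal, the message is authentic and the sender is legitimate.

import hmac

# bytes literals cannot contain non-ASCII characters, so encode the strings.
# digestmod is required since Python 3.8 (older versions defaulted to MD5).
h = hmac.new('天王盖地虎'.encode('utf-8'), '宝塔镇河妖'.encode('utf-8'), digestmod='md5')
print(h.hexdigest())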
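The sender/receiver exchange described above can be sketched as follows. The helper names sign and verify are hypothetical, and SHA-256 is an assumed choice of digest; hmac.compare_digest is used for the comparison because it takes the same time whether or not the values match.

```python
import hashlib
import hmac

key = "天王盖地虎".encode("utf-8")      # shared secret agreed on in advance
message = "宝塔镇河妖".encode("utf-8")  # message to authenticate


def sign(key, message):
    """Sender side: compute the HMAC of the message (hypothetical helper)."""
    return hmac.new(key, message, digestmod=hashlib.sha256).hexdigest()


def verify(key, message, received_mac):
    """Receiver side: recompute the HMAC and compare in constant time."""
    expected = sign(key, message)
    return hmac.compare_digest(expected, received_mac)


mac = sign(key, message)
ok = verify(key, message, mac)
tampered = verify(key, message + b"!", mac)
wrong_key = verify(b"wrong key", message, mac)
print(ok, tampered, wrong_key)
```

Only someone holding the key can produce a matching value, so a changed message or a wrong key both fail the check.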

See https://www.tbs-certificates.co.uk/FAQ/en/sha256.html

Python can also handle YAML documents easily, but this requires installing an extra module. Reference: http://pyyaml.org/wiki/PyYAMLDocumentation

9. The subprocess module

>>> import subprocess

# Run a command and return its exit status (0 or non-zero)
>>> retcode = subprocess.call(["ls", "-l"])

# Run a command; return normally if the exit status is 0, otherwise raise an exception
>>> subprocess.check_call(["ls", "-l"])
0

# Take a command string and return a tuple: (exit status, command output)
>>> subprocess.getstatusoutput('ls /bin/ls')
(0, '/bin/ls')

# Take a command string and return its output
>>> subprocess.getoutput('ls /bin/ls')
'/bin/ls'

# Run a command and return its output. The output is returned, not printed; here it is stored in res
>>> res = subprocess.check_output(['ls', '-l'])
>>> res
b'total 0\ndrwxr-xr-x 12 alex staff 408 Nov 2 11:05 OldBoyCRM\n'

All of the functions above are wrappers around subprocess.Popen. A Popen object offers these methods and attributes:

poll()	Check if the child process has terminated. Returns the returncode attribute (None while it is still running).
wait()	Wait for the child process to terminate. Returns the returncode attribute.
terminate()	Terminate the child process.
communicate()	Send any input to stdin, read stdout and stderr, and wait for the process to finish.
stdin	Standard input
stdout	Standard output
stderr	Standard error
pid	The process ID of the child process.

Example:

>>> p = subprocess.Popen("df -h|grep disk",stdin=subprocess.PIPE,stdout=subprocess.PIPE,shell=True)
>>> p.stdout.read()
b'/dev/disk1 465Gi 64Gi 400Gi 14% 16901472 104938142 14% /\n'

>>> subprocess.run(["ls", "-l"])  # doesn't capture output
CompletedProcess(args=['ls', '-l'], returncode=0)

>>> subprocess.run("exit 1", shell=True, check=True)
Traceback (most recent call last):
  ...
subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1

>>> subprocess.run(["ls", "-l", "/dev/null"], stdout=subprocess.PIPE)
CompletedProcess(args=['ls', '-l', '/dev/null'], returncode=0,
stdout=b'crw-rw-rw- 1 root root 1, 3 Jan 23 16:23 /dev/null\n')
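The session above relies on ls, so it only works on Unix-like systems. As a portable sketch, the child process below is the current Python interpreter itself (sys.executable), which is an assumption made for the sake of the example; the variable names are illustrative too.

```python
import subprocess
import sys

# Capture stdout and stderr as text and raise if the exit status is non-zero.
completed = subprocess.run(
    [sys.executable, "-c", "print('hello from child')"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,  # decode the pipes to str instead of bytes
    check=True,
)
print(completed.returncode)
print(completed.stdout.strip())

# With check=True a failing command raises CalledProcessError.
failed_code = None
try:
    subprocess.run([sys.executable, "-c", "import sys; sys.exit(3)"], check=True)
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
    print("child failed with exit status", failed_code)
```

Passing the command as a list avoids the shell entirely, which sidesteps quoting problems when arguments contain spaces.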

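Calling subprocess.run(...) is the recommended approach and covers most needs. If you need more complex interaction with the system, you can also use subprocess.Popen() directly, like this:

# A raw string keeps the backslash in "\;" intact for the shell
p = subprocess.Popen(r"find / -size +1000000 -exec ls -shl {} \;", shell=True, stdout=subprocess.PIPE)
print(p.stdout.read())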

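Available parameters:

args: the command to run, either a string or a sequence (e.g. a list or tuple)
bufsize: buffering policy. 0 means unbuffered, 1 means line-buffered, any other positive value is the buffer size, and a negative value uses the system default
stdin, stdout, stderr: the child program's standard input, output and error handles
preexec_fn: Unix only. A callable object that is invoked in the child process just before the command runs
close_fds: on Windows, if close_fds is True the child process does not inherit the parent's input, output and error pipes, so in older Python versions you could not set close_fds to True and redirect the child's stdin, stdout or stderr at the same time (this restriction was lifted in Python 3.7)
shell: if True, the command is executed through the shell
cwd: sets the child's current working directory
env: the child's environment variables. If env is None, the child inherits the parent's environment
universal_newlines: line endings differ between systems; True normalizes them to \n and opens the pipes in text mode
startupinfo and creationflags: Windows only. They are passed to the underlying CreateProcess() function to set properties of the child process, such as the appearance of the main window or the process priority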

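Commands typed in a terminal fall into two categories:

Commands that produce output right away, e.g. ifconfig
Commands that enter an environment and then wait for further input, e.g. python

Example of a command that needs interaction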

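import subprocess
import sys

# sys.executable starts the same interpreter that runs this script.
# universal_newlines=True opens the pipes in text mode, so str can be written.
obj = subprocess.Popen([sys.executable], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                       stderr=subprocess.PIPE, universal_newlines=True)
obj.stdin.write('print(1)\n')
obj.stdin.write('print(2)\n')
obj.stdin.write('print(3)\n')
obj.stdin.write('print(4)\n')

# communicate() closes stdin and returns a (stdout, stderr) tuple
out_error_list = obj.communicate(timeout=10)
print(out_error_list)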

Entering the sudo password automatically with subprocess

import subprocess

def mypass():
    mypass = '123'  # or get the password from anywhere
    return mypass

# echo writes the password to a pipe ...
echo = subprocess.Popen(['echo', mypass()],
                        stdout=subprocess.PIPE,
                        )

# ... and sudo -S reads the password from stdin, which is that pipe
sudo = subprocess.Popen(['sudo', '-S', 'iptables', '-L'],
                        stdin=echo.stdout,
                        stdout=subprocess.PIPE,
                        )

end_of_pipe = sudo.stdout

print("Password ok \n Iptables Chains %s" % end_of_pipe.read().decode())

10. The logging module

Many programs need to record logs, and those logs contain both normal access records and output such as errors and warnings. Python's logging module provides a standard logging interface through which you can store logs in various formats. Logging has five levels: debug(), info(), warning(), error() and critical(). Let's see how to use it.

Simplest usage

import logging

logging.warning("user [alex] attempted wrong password more than 3 times")
logging.critical("server is down")

# Output
WARNING:root:user [alex] attempted wrong password more than 3 times
CRITICAL:root:server is down

Here is what each log level means:

Level	When it’s used
`DEBUG`	Detailed information, typically of interest only when diagnosing problems.
`INFO`	Confirmation that things are working as expected.
`WARNING`	An indication that something unexpected happened, or indicative of some problem in the near future (e.g. ‘disk space low’). The software is still working as expected.
`ERROR`	Due to a more serious problem, the software has not been able to perform some function.
`CRITICAL`	A serious error, indicating that the program itself may be unable to continue running.

Writing the log to a file is just as simple:

import logging

logging.basicConfig(filename='example.log',level=logging.INFO)
logging.debug('This message should go to the log file')
logging.info('So should this')
logging.warning('And this, too')

In the line below, level=logging.INFO sets the logging level to INFO, which means only messages at INFO level or higher are written to the file. In this example the first message (the debug one) is not recorded. To record debug messages as well, change the level to DEBUG.

logging.basicConfig(filename='example.log',level=logging.INFO)

The log format above is missing a timestamp, and a log without times is not much use, so let's add one:

import logging
logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p')
logging.warning('is when this event was logged.')

# Output
12/12/2010 11:46:36 AM is when this event was logged.

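Log format attributes

%(name)s	Name of the logger
%(levelno)s	Log level as a number
%(levelname)s	Log level as text
%(pathname)s	Full path of the source file containing the logging call (may be unavailable)
%(filename)s	File name of the source file containing the logging call
%(module)s	Name of the module containing the logging call
%(funcName)s	Name of the function containing the logging call
%(lineno)d	Source line number of the logging call
%(created)f	Time the record was created, as a UNIX timestamp float
%(relativeCreated)d	Time in milliseconds when the record was created, relative to when the logging module was loaded
%(asctime)s	Time the record was created, as a string. The default format is "2003-07-08 16:49:45,896"; the number after the comma is milliseconds
%(thread)d	Thread ID (may be unavailable)
%(threadName)s	Thread name (may be unavailable)
%(process)d	Process ID (may be unavailable)
%(message)s	The message supplied by the user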
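To see a few of these attributes in action, here is a small sketch that formats records into an in-memory stream so the result can be inspected. The logger name "format-demo" and the login function are invented for the illustration.

```python
import io
import logging

# Send formatted records to an in-memory stream instead of the console.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(levelname)s %(name)s %(funcName)s: %(message)s")
)

logger = logging.getLogger("format-demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the records away from the root logger


def login(user):
    logger.debug("checking %s", user)  # below INFO, so it is dropped
    logger.warning("user [%s] entered a wrong password", user)


login("alex")
output = stream.getvalue()
print(output, end="")
```

Attributes such as %(asctime)s or %(lineno)d can be added to the format string in the same way; they are left out here only because their values change from run to run.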

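Logging with Python's logging module involves four main classes. The summary in the official documentation describes them best:

Loggers expose the interface that application code uses directly.

Handlers send the log records (created by loggers) to the appropriate destination.

Filters provide a finer-grained facility for deciding which log records to output.

Formatters specify the final layout of log records.

logger

Every program obtains a Logger before it outputs anything. A Logger usually corresponds to a module of the program. For example, the GUI module of a chat tool could get its Logger like this:

LOG=logging.getLogger("chat.gui")

And the core module like this:

LOG=logging.getLogger("chat.kernel")

Logger.setLevel(lel): sets the lowest level to handle; messages below lel are ignored. debug is the lowest built-in level and critical the highest

Logger.addFilter(filt), Logger.removeFilter(filt): add or remove the given filter

Logger.addHandler(hdlr), Logger.removeHandler(hdlr): add or remove the given handler

Logger.debug(), Logger.info(), Logger.warning(), Logger.error(), Logger.critical(): log a message at the corresponding level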

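handler

A handler object is responsible for sending log messages to a given destination. Python's logging system comes with several kinds of Handler. Some write to the console, some write to files, and some send messages over the network. If those are not enough, you can write your own Handler. Several handlers can be attached with the addHandler() method.

Handler.setLevel(lel): sets the level to handle; messages below lel are ignored

Handler.setFormatter(): chooses a format for this handler

Handler.addFilter(filt), Handler.removeFilter(filt): add or remove a filter object

Each Logger can have several Handlers attached. Here are some commonly used ones:

1) logging.StreamHandler

This Handler writes to any file-like object, such as sys.stdout or sys.stderr. Its constructor is:

StreamHandler([strm])

The strm argument is a file object. The default is sys.stderr

2) logging.FileHandler

Similar to StreamHandler, but writes log messages to a file, and FileHandler opens the file for you. Its constructor is:

FileHandler(filename[,mode])

filename is the file name and is required.

mode is the mode in which the file is opened; see the built-in open() function. The default is 'a', which appends to the end of the file.

3) logging.handlers.RotatingFileHandler

This Handler is like FileHandler above, but it also manages file size. When the file reaches a certain size, the handler renames the current log file and creates a new one with the original name to continue logging. For example, if the log file is chat.log, then once chat.log reaches the size limit RotatingFileHandler renames it to chat.log.1. If chat.log.1 already exists, it is first renamed to chat.log.2, and so on. Finally chat.log is created again and logging continues. Its constructor is:

RotatingFileHandler( filename[, mode[, maxBytes[, backupCount]]])

The filename and mode arguments are the same as for FileHandler.

maxBytes sets the maximum size of the log file. If maxBytes is 0, the log file can grow without limit and the renaming described above never happens.

backupCount sets how many backup files to keep. For example, with a value of 2, when the renaming described above takes place the existing chat.log.2 is not renamed but deleted.

4) logging.handlers.TimedRotatingFileHandler

This Handler is similar to RotatingFileHandler, except that it does not look at file size to decide when to start a new log file. Instead it creates a new log file at fixed time intervals. Renaming works much as in RotatingFileHandler, but the suffix added to the old file is a timestamp rather than a number. Its constructor is:

TimedRotatingFileHandler( filename [,when [,interval [,backupCount]]])

The filename and backupCount arguments have the same meaning as in RotatingFileHandler.

interval is the time interval.

when is a string giving the unit of the interval. It is case-insensitive and takes these values:

S seconds

M minutes

H hours

D days

W0-W6 a day of the week (W0 is Monday)

midnight roll over at midnight every day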

import logging

# create logger
logger = logging.getLogger('TEST-LOG')
logger.setLevel(logging.DEBUG)

# create console handler and set level to debug
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)

# create file handler and set level to warning
fh = logging.FileHandler("access.log")
fh.setLevel(logging.WARNING)

# create formatter
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')

# add formatter to ch and fh
ch.setFormatter(formatter)
fh.setFormatter(formatter)

# add ch and fh to logger
logger.addHandler(ch)
logger.addHandler(fh)

# 'application' code
logger.debug('debug message')
logger.info('info message')
logger.warning('warn message')  # logger.warn() is a deprecated alias
logger.error('error message')
logger.critical('critical message')

Log rotation example

import logging

from logging import handlers

logger = logging.getLogger(__name__)

log_file = "timelog.log"
# Rotate by size: a new file once the log reaches 10 bytes, keeping 3 backups
#fh = handlers.RotatingFileHandler(filename=log_file,maxBytes=10,backupCount=3)
# Rotate by time: a new file every 5 seconds, keeping 3 backups
fh = handlers.TimedRotatingFileHandler(filename=log_file,when="S",interval=5,backupCount=3)

formatter = logging.Formatter('%(asctime)s %(module)s:%(lineno)d %(message)s')

fh.setFormatter(formatter)

logger.addHandler(fh)

logger.warning("test1")
logger.warning("test12")
logger.warning("test13")
logger.warning("test14")
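The size-based variant is easier to observe than the time-based one because it does not depend on the clock. The sketch below writes ten short messages through a RotatingFileHandler in a temporary directory and lists the files left behind. The file name demo.log, the logger name and the tiny maxBytes value are assumptions chosen to force several rollovers.

```python
import logging
import os
import tempfile
from logging import handlers

with tempfile.TemporaryDirectory() as tmp_dir:
    log_file = os.path.join(tmp_dir, "demo.log")

    # Roll over before the file would reach 25 bytes; keep two backups.
    fh = handlers.RotatingFileHandler(log_file, maxBytes=25, backupCount=2)

    logger = logging.getLogger("rotation-demo")
    logger.propagate = False
    logger.addHandler(fh)

    for i in range(10):
        logger.warning("message %d", i)

    # Detach and close the handler so the directory can be cleaned up.
    logger.removeHandler(fh)
    fh.close()

    files = sorted(os.listdir(tmp_dir))

print(files)
```

However many messages are written, no more than backupCount backup files remain next to the active log, because the oldest backup is discarded at each rollover.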

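11. The re module

Common regular expression symbols

'.'     By default matches any single character except \n; with the DOTALL flag it matches any character, including a newline
'^'     Matches the start of the string; with the MULTILINE flag it also matches at the start of each line, e.g. re.search(r"^a","\nabc\neee",flags=re.MULTILINE)
'$'     Matches the end of the string; with MULTILINE it also matches at the end of each line, e.g. re.search("foo$","bfoo\nsdfsf",flags=re.MULTILINE).group()
'*'     Matches the preceding character 0 or more times; re.findall("ab*","cabb3abcbbac") gives ['abb', 'ab', 'a']
'+'     Matches the preceding character 1 or more times; re.findall("ab+","ab+cd+abb+bba") gives ['ab', 'abb']
'?'     Matches the preceding character 0 or 1 times
'{m}'   Matches the preceding character exactly m times
'{n,m}' Matches the preceding character n to m times; re.findall("ab{1,3}","abb abc abbcbbb") gives ['abb', 'ab', 'abb']
'|'     Matches the pattern on the left or on the right of |; re.search("abc|ABC","ABCBabcCD").group() gives 'ABC'
'(...)' Grouping; re.search("(abc){2}a(123|456)c", "abcabca456c").group() gives 'abcabca456c'

'\A'    Matches only at the start of the string; re.search(r"\Aabc","alexabc") finds nothing
'\Z'    Matches only at the end of the string, similar to $
'\d'    Matches a digit 0-9
'\D'    Matches a non-digit
'\w'    Matches [A-Za-z0-9_]
'\W'    Matches anything not in [A-Za-z0-9_]
'\s'    Matches whitespace such as space, \t, \n, \r; re.search(r"\s+","ab\tc1\n3").group() gives '\t'

'(?P<name>...)' Named group; re.search("(?P<province>[0-9]{4})(?P<city>[0-9]{2})(?P<birthday>[0-9]{4})","371481199306143242").groupdict() gives {'province': '3714', 'city': '81', 'birthday': '1993'}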

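Most commonly used functions

re.match matches from the start of the string

re.search matches anywhere in the string

re.findall returns all matches as the elements of a list

re.split splits the string, using the matches as separators

re.sub replaces the matches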
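The five functions can be compared side by side on one input. A minimal sketch with a made-up sample string:

```python
import re

text = "alex22jack23rain31"

# match only succeeds at the very start of the string
first_word = re.match(r"[a-z]+", text).group()
no_match = re.match(r"\d+", text)

# search finds the first occurrence anywhere in the string
first_number = re.search(r"\d+", text).group()

# findall returns every match as a list
all_numbers = re.findall(r"\d+", text)

# split uses the matches as separators; the trailing match leaves an empty string
names = re.split(r"\d+", text)

# sub replaces the matches; count limits the number of replacements
masked = re.sub(r"\d+", "#", text)
masked_once = re.sub(r"\d+", "#", text, count=1)

print(first_word, no_match, first_number)
print(all_numbers, names)
print(masked, masked_once)
```

Note that re.match returning None is the normal "no match" result, so check the return value before calling .group() on it.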

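The backslash plague
As in most programming languages, regular expressions use "\" as the escape character, and this can lead to a plague of backslashes. Suppose you need to match a literal "\" in the text. Written as an ordinary string in the programming language, the regular expression needs four backslashes, "\\\\": the first two and the last two each become one backslash after the language's own escaping, and the resulting two backslashes are then interpreted by the regular expression engine as one literal backslash. Python's raw strings solve this neatly: the regular expression in this example can be written as r"\\". Likewise, "\\d", which matches a digit, can be written as r"\d". With raw strings you no longer have to worry about missing a backslash, and the expressions you write are easier to read.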
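The equivalence between the two spellings can be checked directly. A short sketch; the sample path is invented for the illustration:

```python
import re

path = r"C:\temp"  # a 7-character string containing one backslash

pattern_plain = "\\\\"  # four backslashes in source code, two in the string
pattern_raw = r"\\"     # raw string: the same two characters

same_pattern = pattern_plain == pattern_raw
pattern_length = len(pattern_raw)

# Both patterns match a single literal backslash.
found = re.findall(pattern_raw, path)

# The same applies to \d: "\\d" and r"\d" are the same pattern.
same_digits = re.findall("\\d", "a1b2") == re.findall(r"\d", "a1b2")

print(same_pattern, pattern_length, found, same_digits)
```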

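re.I (re.IGNORECASE): ignore case (the full name is given in parentheses, likewise below)

re.M (re.MULTILINE): multi-line mode; changes the behavior of '^' and '$' (see the symbol table above)

re.S (re.DOTALL): dot-matches-all mode; changes the behavior of '.'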
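A short sketch of how each flag changes the result, using an invented three-line sample string. Flags are combined with the | operator.

```python
import re

text = "Foo\nfoo bar\nFOO"

# Without flags the match is case-sensitive.
plain = re.findall(r"foo", text)

# re.I ignores case.
ignore_case = re.findall(r"foo", text, flags=re.I)

# Without re.M, ^ matches only at the start of the whole string.
start_only = re.findall(r"^foo", text, flags=re.I)

# With re.M, ^ also matches at the start of every line.
every_line = re.findall(r"^foo", text, flags=re.I | re.M)

# Without re.S the dot does not match a newline.
dot_default = re.search(r"Foo.foo", text)

# With re.S the dot matches any character, including a newline.
dot_all = re.search(r"Foo.foo", text, flags=re.S).group()

print(plain, ignore_case, start_only, every_line, dot_default, repr(dot_all))
```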
