What is a module:

A module is a collection of related functionality.

Modules are like Lego bricks: you combine a set of them into one model, and you can reuse the same bricks together with others to build something new.

Types of modules:

1. Built-in modules (shipped with Python, e.g. os and sys)

2. Custom modules that you write yourself

3. Third-party modules

Importing modules:

  1. import module #import the module itself; access its names as module.name
  2. from module.xx.xx import * #import all public names from the module
  3. from module.xx.xx import xx #import a specific name from the module
  4. from module.xx.xx import xx as rename #import a name and give it an alias
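The four styles above can be sketched with the standard library's os module (the names are just illustrative):

```python
import os                              # import the whole module
from os import path                    # import one attribute of a module
from os.path import join as path_join  # import a name and give it an alias

print(os.getcwd())           # qualified access: module.name
print(path.sep)              # direct access to the imported attribute
print(path_join('a', 'b'))   # the alias works like the original name
```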

Built-in modules

1. os: system-level operations

  1. os.getcwd() returns the current working directory, i.e. the directory the Python script is running in
  2. os.chdir("dirname") changes the working directory; like the shell's cd
  3. os.curdir the string for the current directory: '.'
  4. os.pardir the string for the parent directory: '..'
  5. os.makedirs('dirname1/dirname2') creates a multi-level directory tree recursively
  6. os.removedirs('dirname1') removes the directory if it is empty, then recursively removes each empty parent in turn
  7. os.mkdir('dirname') creates a single directory; like the shell's mkdir dirname
  8. os.rmdir('dirname') removes a single empty directory; raises an error if it is not empty; like the shell's rmdir dirname
  9. os.listdir('dirname') lists all files and subdirectories under the given directory, including hidden ones, as a list
  10. os.remove() deletes a file
  11. os.rename("oldname","newname") renames a file or directory
  12. os.stat('path/filename') returns file/directory metadata
  13. os.sep the OS-specific path separator: "\\" on Windows, "/" on Linux
  14. os.linesep the platform's line terminator: "\r\n" on Windows, "\n" on Linux
  15. os.pathsep the separator used between paths (e.g. in PATH)
  16. os.name a string identifying the platform: Windows -> 'nt'; Linux -> 'posix'
  17. os.system("bash command") runs a shell command, printing its output directly
  18. os.environ the process's environment variables
  19. os.path.abspath(path) returns the normalized absolute version of path
  20. os.path.split(path) splits path into a (directory, filename) tuple
  21. os.path.dirname(path) returns the directory part of path, i.e. the first element of os.path.split(path)
  22. os.path.basename(path) returns the final component of path; if path ends with / or \, it returns an empty string. It is the second element of os.path.split(path)
  23. os.path.exists(path) returns True if path exists, False otherwise
  24. os.path.isabs(path) returns True if path is an absolute path
  25. os.path.isfile(path) returns True if path is an existing file, else False
  26. os.path.isdir(path) returns True if path is an existing directory, else False
  27. os.path.join(path1[, path2[, ...]]) joins paths; components before the last absolute path are discarded
  28. os.path.getatime(path) returns the last access time of the file or directory at path
  29. os.path.getmtime(path) returns the last modification time of the file or directory at path
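A few of the path helpers above combined in a short sketch (demo.txt is a hypothetical file name):

```python
import os

cwd = os.getcwd()                      # current working directory
full = os.path.join(cwd, 'demo.txt')   # build a path portably
head, tail = os.path.split(full)       # split into (directory, filename)
print(tail)                            # demo.txt
print(os.path.isabs(full))             # True
print(os.path.exists(full))            # False unless the file really exists
```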

2. sys: interpreter-related operations

  1. sys.argv the command-line argument list; the first element is the path of the program itself
  2. sys.exit(n) exits the program; exit(0) signals a normal exit
  3. sys.version the Python interpreter's version string
  4. sys.maxint the largest int (Python 2 only; Python 3 has sys.maxsize instead)
  5. sys.path the module search path, initialized from the PYTHONPATH environment variable
  6. sys.platform the platform name
  7. sys.stdout.write('please:')
  8. val = sys.stdin.readline()[:-1]
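A small sketch of these (Python 3, where sys.maxint is gone and sys.maxsize is the closest equivalent):

```python
import sys

print(sys.platform)              # e.g. 'linux' or 'win32'
print(sys.version.split()[0])    # interpreter version, e.g. '3.11.4'
print(sys.argv[0])               # path of the running script
print(sys.maxsize)               # largest native container index
```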

3. hashlib

    import hashlib

    # ######## md5 ########

    hash = hashlib.md5()
    hash.update('shuaige')
    print hash.hexdigest()

    >>> import hashlib
    >>> hash = hashlib.md5()
    >>> hash.update('shuaige')
    >>> print hash.hexdigest()
    37d2b9990df5a6843caf19352fee42a6

    # ######## sha1 ########

    hash = hashlib.sha1()
    hash.update('shuaige')
    print hash.hexdigest()

    >>> hash = hashlib.sha1()
    >>> hash.update('shuaige')
    >>> print hash.hexdigest()
    fdb58cf91e7291b67815440281e4154e87747b68

    # ######## sha256 ########

    hash = hashlib.sha256()
    hash.update('shuaige')
    print hash.hexdigest()

    >>> hash = hashlib.sha256()
    >>> hash.update('shuaige')
    >>> print hash.hexdigest()
    0dc6e2b03447ac1fde5a8ae5f9d609b2b37f26f5c8aeec5d244dde6184fde90d

    # ######## sha384 ########

    hash = hashlib.sha384()
    hash.update('shuaige')
    print hash.hexdigest()

    >>> hash = hashlib.sha384()
    >>> hash.update('shuaige')
    >>> print hash.hexdigest()
    6dd79c38dc27c8f69411c2e77face2209606e08702fcbe7c5f73bb9e6a9ef1f58890156604ad6c71581dc5b6f7aea85e

    # ######## sha512 ########

    hash = hashlib.sha512()
    hash.update('shuaige')
    print hash.hexdigest()

    >>> hash = hashlib.sha512()
    >>> hash.update('shuaige')
    >>> print hash.hexdigest()
    e9c94882cb0d9c61919f4d4c539a8bafe5f5a0708d214fbd50343c1e96a01ebb732883d0b0b36bff1e542cff69071395f511650944561807488700c71fb06338
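The sessions above are Python 2. Under Python 3, update() rejects str and requires bytes; a minimal Python 3 sketch:

```python
import hashlib

h = hashlib.md5()
h.update('shuaige'.encode('utf-8'))   # Python 3: update() takes bytes, not str
print(h.hexdigest())

# one-shot form: pass the data directly to the constructor
print(hashlib.sha256(b'shuaige').hexdigest())
```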

The algorithms above are still strong, but they have a weakness: hashes of common inputs can be reversed by lookup against precomputed tables. It is therefore worth mixing a custom key (a salt) into the hash.

    hash = hashlib.md5('898oaFs09f')
    hash.update('shuaige')
    print hash.hexdigest()

    -----------------------------------------------------------------------------------

    >>> hash = hashlib.md5('898oaFs09f') #seed the hash with a custom key, then hash as before
    >>> hash.update('shuaige')
    >>> print hash.hexdigest()
    6d1233c4e14a52379c6bc7a045411dc3

An even stronger option: Python also ships an hmac module, which internally combines the key and the message before hashing.

    import hmac
    h = hmac.new('shuaige')
    h.update('hello laoshi')
    print h.hexdigest()
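The snippet above is Python 2; in Python 3 the key and message must be bytes, and the digest algorithm must be named explicitly. A sketch:

```python
import hashlib
import hmac

h = hmac.new(b'shuaige', digestmod=hashlib.md5)  # Python 3: bytes key, explicit digestmod
h.update(b'hello laoshi')
print(h.hexdigest())

# one-shot form
print(hmac.new(b'shuaige', b'hello laoshi', hashlib.md5).hexdigest())
```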

4. json and pickle

Two modules for serialization:

json converts between strings and Python data types.
pickle converts between Python-specific types and Python data types.

The json module provides four functions: dumps, dump, loads, load.
The pickle module provides the same four: dumps, dump, loads, load.

For json: dumps converts a data type to a string; dump converts a data type to a string and writes it to a file; loads converts a string back to a data type; load reads a file and converts its string contents back to a data type.

pickle works the same way.

Consider exchanging data between machines. The crude approach is to transfer files; with dumps, server A can serialize what is in its memory and send it straight to another server, say server B.
In many scenarios pickle would require both A's and B's programs to be Python, which is unrealistic: data is usually exchanged between programs written in different languages. That is where json comes in (a cross-language data format, somewhat like HTML is for markup).
json's output is also more readable. So why keep pickle at all? Because json can only serialize the common data types (lists, tuples, dicts, strings, numbers); dates, class instances, and other Python-specific objects are beyond it.
Why can't json serialize those? Precisely because json has to stay cross-language.
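To make the difference concrete, a small sketch: pickle round-trips a datetime, while json refuses it with a TypeError.

```python
import datetime
import json
import pickle

data = {'when': datetime.datetime(2015, 12, 7, 22, 38)}

blob = pickle.dumps(data)            # pickle handles Python-only types...
print(pickle.loads(blob) == data)    # True

try:
    json.dumps(data)                 # ...but json does not
except TypeError as e:
    print('json refused:', e)
```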

Example:

    #!/usr/bin/env python
    #-*- coding:utf-8 -*-
    __author__ = 'luotianshuai'

    import json

    test_dic = {'name':'luotianshuai','age':18}
    print 'type before dumps:', type(test_dic)
    #dumps serializes the data into a string that every language can parse
    json_str = json.dumps(test_dic)
    print 'type after dumps:', type(json_str)

    #loads converts the string back into a Python data type

    new_dic = json.loads(json_str)
    print 'type after loads:', type(new_dic)

    print '*' * 50

    #dump serializes the data into a string and writes it to a file
    with open('test.txt','w') as openfile:
        json.dump(new_dic, openfile)
    print 'dump to file done!'
    #load reads the string back from a file and converts it to a Python data type

    with open('test.txt','rb') as loadfile:
        load_dic = json.load(loadfile)
    print 'type after load:', type(load_dic)
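pickle mirrors the json example above with the same four functions; a minimal Python 3 sketch (test.pkl is an arbitrary file name; note that pickle needs binary file modes):

```python
import pickle

test_dic = {'name': 'luotianshuai', 'age': 18}

pkl_bytes = pickle.dumps(test_dic)       # dumps: object -> bytes
print(type(pkl_bytes))                   # <class 'bytes'>

with open('test.pkl', 'wb') as f:        # dump: object -> file (binary mode!)
    pickle.dump(test_dic, f)

with open('test.pkl', 'rb') as f:        # load: file -> object
    print(pickle.load(f) == test_dic)    # True
```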

5. ConfigParser

For reading and writing configuration files in a specific format. In Python 3.x the module was renamed to configparser.

    # comment 1
    ; comment 2

    [section1]
    k1 = v1
    k2:v2

    [section2]
    k1 = v1

    import ConfigParser

    config = ConfigParser.ConfigParser()
    config.read('i.cfg')

    # ########## read ##########
    #secs = config.sections()
    #print secs
    #options = config.options('section2')
    #print options

    #item_list = config.items('section2')
    #print item_list

    #val = config.get('section1','k1')
    #val = config.getint('section1','k1')

    # ########## modify ##########
    #sec = config.remove_section('section1')
    #config.write(open('i.cfg', "w"))

    #sec = config.has_section('shuaige')
    #sec = config.add_section('shuaige')
    #config.write(open('i.cfg', "w"))

    #config.set('section2','k1',11111)
    #config.write(open('i.cfg', "w"))

    #config.remove_option('section2','k1')
    #config.write(open('i.cfg', "w"))
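The same configuration can be written and read back with the Python 3 spelling, configparser. A sketch that recreates the i.cfg above:

```python
import configparser

# Build the i.cfg from the example above, then read it back.
config = configparser.ConfigParser()
config['section1'] = {'k1': 'v1', 'k2': 'v2'}
config['section2'] = {'k1': 'v1'}
with open('i.cfg', 'w') as f:
    config.write(f)

config = configparser.ConfigParser()
config.read('i.cfg')
print(config.sections())              # ['section1', 'section2']
print(config.get('section1', 'k1'))   # v1
```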

6. Running system commands:

Modules and functions that can run shell commands:

  • os.system
  • os.spawn*
  • os.popen*          --deprecated
  • popen2.*           --deprecated
  • commands.*         --deprecated, removed in 3.x
    #!/usr/bin/env python
    #-*- coding:utf-8 -*-

    import commands
    testmodel = commands.getoutput('fdisk -l ') #capture the command's output (stored as a string)
    type(testmodel)
    #<type 'str'>

    commands.getstatus('/etc/passwd') #returns the `ls -ld` output for the path; a missing path yields the ls error message
    #'-rw-r--r-- 1 root root 1196 Oct 19 20:15 /etc/passwd'
    commands.getstatus('/etc/passwdsd')
    #'ls: cannot access /etc/passwdsd: No such file or directory'

    result = commands.getstatusoutput('cmd') #returns (status, output); a status of 0 means success
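In Python 3 the commands functions moved onto subprocess itself; a sketch, assuming a POSIX shell with echo available:

```python
import subprocess

# getoutput/getstatusoutput are the Python 3 replacements for commands.*
print(subprocess.getoutput('echo hello'))           # hello
status, out = subprocess.getstatusoutput('echo hello')
print(status, out)                                  # 0 hello
```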

Everything the modules and functions above do is also implemented in the subprocess module, with richer functionality. Prefer subprocess for running system commands from now on:

call: runs the command; output goes to the terminal and the status code is returned (0 on success, a value greater than 0 on error).

    import subprocess
    subprocess.call(["ls",'-l','/etc/'],shell=False) #shell=False: Python executes the program directly
    subprocess.call('ls -l /etc/',shell=True) #shell=True: run the command through the native shell
    #as a rule, prefer shell=False; fall back to the native shell only for features Python cannot express

check_call: runs the command; returns 0 if the exit status is 0, otherwise raises an exception.

    import subprocess

    >>> subprocess.call(["ls",'-l','/etc/'],shell=False) #success: returns status code 0
    >>> subprocess.call(["ls",'-l','/etc/sdfsdf'],shell=False) #failure: call() just returns the non-zero status; check_call() would raise

    >>> subprocess.check_call('exit 1' ,shell=True)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
        raise CalledProcessError(retcode, cmd)
    subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1
    >>> subprocess.check_call('exit 0' ,shell=True)
    0
    >>>

check_output: runs the command; returns its output if the exit status is 0, otherwise raises an exception.

    >>> subprocess.check_output(["echo", "Hello World!"]) #status 0: the output is returned directly
    'Hello World!\n'
    >>> subprocess.check_output(["echo1", "Hello World!"]) #non-zero status: an exception is raised
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/subprocess.py", line 566, in check_output
        process = Popen(stdout=PIPE, *popenargs, **kwargs)
      File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
        errread, errwrite)
      File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
        raise child_exception
    OSError: [Errno 2] No such file or directory

subprocess.Popen(...): for running complex system commands

    '''
    Parameters:
    args: the shell command, as a string or a sequence type (e.g. list, tuple)
    bufsize: buffering. 0 unbuffered, 1 line-buffered, other positive values the buffer size, negative values the system default
    stdin, stdout, stderr: the program's standard input, output, and error handles
    preexec_fn: Unix only; a callable object invoked in the child process just before it runs
    close_fds: on Windows, if close_fds is True the new child process does not inherit the parent's input, output, and error pipes;
    you therefore cannot set close_fds to True and also redirect the child's stdin, stdout, or stderr.
    shell: as above
    cwd: the child process's working directory
    env: the child process's environment variables; if env = None, they are inherited from the parent
    universal_newlines: line endings differ between systems; True normalizes them to \n
    startupinfo and creationflags are Windows only;
    they are passed to the underlying CreateProcess() to set child-process properties such as main-window appearance and process priority
    '''

    import subprocess
    ret1 = subprocess.Popen(["mkdir","t1"])
    ret2 = subprocess.Popen("mkdir t2", shell=True)
    import subprocess

    obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) #start an interactive program with its own standard input, output, and error, like a pipe
    #stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE act as pipes
    #this lets Python keep an external program running long-term
    obj.stdin.write('print 1 \n ')
    obj.stdin.write('print 2 \n ')
    obj.stdin.write('print 3 \n ')
    obj.stdin.write('print 4 \n ')
    obj.stdin.close() #close standard input

    #input is done; now read the output back
    cmd_out = obj.stdout.read() #read the child process's standard output
    obj.stdout.close() #close standard output
    cmd_error = obj.stderr.read() #read the child process's standard error
    obj.stderr.close() #close the child's standard error

    print cmd_out #print standard output
    print cmd_error #print standard error

    '''
    #>>> obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    #>>> obj.stdin.write('print 1 \n ')
    #>>> obj.stdin.write('print 2 \n ')
    #>>> obj.stdin.write('print 3 \n ')
    #>>> obj.stdin.write('print 4 \n ')
    #Traceback (most recent call last):
    # File "<stdin>", line 1, in <module>
    #IOError: [Errno 32] Broken pipe #a pipe can hold at most 64 KB; beyond that, writes fail. The communicate() method below buffers the output in memory instead
    '''

    #tim@tim:~$ ps -ef |grep -i python
    #root 2290 2280 0 21:38 pts/0 00:00:00 python
    #root 2313 2290 0 21:47 pts/0 00:00:00 [python] <defunct> #a zombie process is left behind; reap it with obj.wait() — see below for why
    #tim 2317 2292 0 21:48 pts/3 00:00:00 grep --color=auto -i python
    #tim@tim:~$

    import subprocess
    obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    obj.stdin.write('print 1 \n ')
    obj.stdin.write('print 2 \n ')
    obj.stdin.write('print 3 \n ')
    obj.stdin.write('print 4 \n ')

    out_error_list = obj.communicate()
    print out_error_list

    import subprocess
    obj = subprocess.Popen(["python"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_error_list = obj.communicate('print "hello"')
    print out_error_list
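A Python 3 sketch of the same pipe idea (POSIX assumed): communicate() feeds stdin, closes it, and gathers stdout/stderr in one step, sidestepping the 64 KB pipe problem; sys.executable simply points at the current interpreter.

```python
import subprocess
import sys

# Piping a script into sys.executable mimics the ["python"] child above.
proc = subprocess.Popen([sys.executable],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
out, err = proc.communicate(b'print(1)\nprint(2)\n')
print(out)   # b'1\n2\n'
print(err)
```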

7. shutil: high-level handling of files, directories, and archives

shutil.copyfileobj(fsrc, fdst[, length])
Copies the contents of one file-like object into another; length sets the copy buffer size.

    >>> s = file('test.py','rb')
    >>> d = file('new.py','wb')
    >>> shutil.copy
    shutil.copy(        shutil.copyfile(    shutil.copymode(    shutil.copytree(
    shutil.copy2(       shutil.copyfileobj( shutil.copystat(
    >>> shutil.copyfileobj(s,d)
    >>> d.close()
    >>> exit()
    root@tim:/opt# ls
    new.py test.py
    root@tim:/opt#

    '''
    def copyfileobj(fsrc, fdst, length=16*1024):
        """copy data from file-like object fsrc to file-like object fdst"""
        while 1:
            buf = fsrc.read(length)
            if not buf:
                break
            fdst.write(buf)
    '''
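copyfileobj works on any file-like objects, not only real files; a sketch using in-memory buffers:

```python
import io
import shutil

src = io.BytesIO(b'hello shutil')        # any object with .read() works
dst = io.BytesIO()                       # any object with .write() works
shutil.copyfileobj(src, dst, length=4)   # copy in 4-byte chunks
print(dst.getvalue())                    # b'hello shutil'
```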

shutil.copyfile(src, dst)

Copies a file's contents.

    >>> import shutil
    >>> shutil.copyfile('new.py','newnew.py')
    >>> import subprocess
    >>> subprocess.Popen(['ls','-l'])
    <subprocess.Popen object at 0x7f533f61c090>
    >>> total 12
    -rw-r--r-- 1 root root 53 Dec 7 22:38 newnew.py
    -rw-r--r-- 1 root root 53 Dec 7 22:35 new.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    '''
    def copyfile(src, dst):
        """Copy data from src to dst"""
        if _samefile(src, dst):
            raise Error("`%s` and `%s` are the same file" % (src, dst))

        for fn in [src, dst]:
            try:
                st = os.stat(fn)
            except OSError:
                # File most likely does not exist
                pass
            else:
                # XXX What about other special files? (sockets, devices...)
                if stat.S_ISFIFO(st.st_mode):
                    raise SpecialFileError("`%s` is a named pipe" % fn)

        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
                copyfileobj(fsrc, fdst)
    '''

shutil.copymode(src, dst)

Copies only the permission bits; contents, owner, and group are unchanged.

    >>> subprocess.Popen("ls -l ",shell=True)
    <subprocess.Popen object at 0x7f533f606fd0>
    >>> total 12
    -rw-r--r-- 1 root root 53 Dec 7 22:38 newnew.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:35 new.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    >>> shutil.copymode('new.py','newnew.py')
    >>> subprocess.Popen("ls -l ",shell=True)
    <subprocess.Popen object at 0x7f533f61c050>
    >>> total 12
    -rwxr-xr-x 1 root root 53 Dec 7 22:38 newnew.py #permissions now match new.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:35 new.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    >>>

    '''
    def copymode(src, dst):
        """Copy mode bits from src to dst"""
        if hasattr(os, 'chmod'):
            st = os.stat(src)
            mode = stat.S_IMODE(st.st_mode)
            os.chmod(dst, mode)
    '''

shutil.copystat(src, dst)

Copies the stat info: mode bits, atime, mtime, flags.

    >>> subprocess.Popen("ls -l ",shell=True)
    <subprocess.Popen object at 0x7f533f606fd0>
    >>> total 12
    -rwxr-xr-x 1 root root 53 Dec 7 22:38 newnew.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:35 new.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    >>> shutil.copystat('test.py','new.py')
    >>> subprocess.Popen("ls -l ",shell=True)
    <subprocess.Popen object at 0x7f533f61c050>
    >>> total 12
    -rwxr-xr-x 1 root root 53 Dec 7 22:38 newnew.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 new.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    '''
    def copystat(src, dst):
        """Copy all stat info (mode bits, atime, mtime, flags) from src to dst"""
        st = os.stat(src)
        mode = stat.S_IMODE(st.st_mode)
        if hasattr(os, 'utime'):
            os.utime(dst, (st.st_atime, st.st_mtime))
        if hasattr(os, 'chmod'):
            os.chmod(dst, mode)
        if hasattr(os, 'chflags') and hasattr(st, 'st_flags'):
            try:
                os.chflags(dst, st.st_flags)
            except OSError, why:
                for err in 'EOPNOTSUPP', 'ENOTSUP':
                    if hasattr(errno, err) and why.errno == getattr(errno, err):
                        break
                else:
                    raise
    '''

shutil.copy(src, dst)

Copies the file and its permission bits.

    >>> subprocess.Popen(['touch','shuaige.py'])
    <subprocess.Popen object at 0x7f533f606fd0>
    >>> subprocess.Popen(['ls','-l'])
    <subprocess.Popen object at 0x7f533f61c310>
    >>> total 12
    -rwxr-xr-x 1 root root 53 Dec 7 22:38 newnew.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 new.py
    -rw-r--r-- 1 root root 0 Dec 7 22:54 shuaige.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    >>> shutil.copy('shuaige.py','shuaigenew.py')
    >>> subprocess.Popen(['ls','-l'])
    <subprocess.Popen object at 0x7f533f606fd0>
    >>> total 12
    -rwxr-xr-x 1 root root 53 Dec 7 22:38 newnew.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 new.py
    -rw-r--r-- 1 root root 0 Dec 7 22:54 shuaigenew.py
    -rw-r--r-- 1 root root 0 Dec 7 22:54 shuaige.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    '''
    def copy(src, dst):
        """Copy data and mode bits ("cp src dst").

        The destination may be a directory.

        """
        if os.path.isdir(dst):
            dst = os.path.join(dst, os.path.basename(src))
        copyfile(src, dst)
        copymode(src, dst)
    '''
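A sketch of copy() preserving permission bits, using a throwaway temp directory (POSIX permissions assumed):

```python
import os
import shutil
import stat
import tempfile

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, 'src.py')
with open(src, 'w') as f:
    f.write('print("hi")\n')
os.chmod(src, 0o755)                   # make the source executable

dst = os.path.join(tmp, 'dst.py')
shutil.copy(src, dst)                  # copies the data *and* the mode bits
print(oct(stat.S_IMODE(os.stat(dst).st_mode)))   # 0o755
shutil.rmtree(tmp)                     # clean up
```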

shutil.copy2(src, dst)

Copies the file and its stat info.

    >>> subprocess.Popen(['ls','-l'])
    <subprocess.Popen object at 0x7f533f606fd0>
    >>> total 12
    -rwxr-xr-x 1 root root 53 Dec 7 22:38 newnew.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 new.py
    -rw-r--r-- 1 root root 0 Dec 7 22:54 shuaigenew.py
    -rw-r--r-- 1 root root 0 Dec 7 22:54 shuaige.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    >>> shutil.copy2('newnew.py','newcopy2.py')
    >>> subprocess.Popen(['ls','-l'])
    <subprocess.Popen object at 0x7f533f61c310>
    >>> total 16
    -rwxr-xr-x 1 root root 53 Dec 7 22:38 newcopy2.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:38 newnew.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 new.py
    -rw-r--r-- 1 root root 0 Dec 7 22:54 shuaigenew.py
    -rw-r--r-- 1 root root 0 Dec 7 22:54 shuaige.py
    -rwxr-xr-x 1 root root 53 Dec 7 22:32 test.py

    >>> subprocess.Popen(['date'])
    <subprocess.Popen object at 0x7f533f606fd0>
    >>> Mon Dec 7 22:57:38 CST 2015
    '''
    def copy2(src, dst):
        """Copy data and all stat info ("cp -p src dst").

        The destination may be a directory.

        """
        if os.path.isdir(dst):
            dst = os.path.join(dst, os.path.basename(src))
        copyfile(src, dst)
        copystat(src, dst)
    '''

shutil.ignore_patterns(*patterns)
shutil.copytree(src, dst, symlinks=False, ignore=None)

Recursively copies a directory tree, e.g.: copytree(source, destination, ignore=ignore_patterns('*.pyc', 'tmp*'))

    #ubuntu may not ship tree by default; install it with apt-get install tree
    #
    root@tim:/opt# tree 1/
    1/
    └── 2
        └── 3
            └── 4
                └── 5

    >>> shutil.copytree("","")
    root@tim:/opt# tree 0
    0
    └── 2
        └── 3
            └── 4
                └── 5

    4 directories, 0 files
    def ignore_patterns(*patterns):
        """Function that can be used as copytree() ignore parameter.

        Patterns is a sequence of glob-style patterns
        that are used to exclude files"""
        def _ignore_patterns(path, names):
            ignored_names = []
            for pattern in patterns:
                ignored_names.extend(fnmatch.filter(names, pattern))
            return set(ignored_names)
        return _ignore_patterns

    def copytree(src, dst, symlinks=False, ignore=None):
        """Recursively copy a directory tree using copy2().

        The destination directory must not already exist.
        If exception(s) occur, an Error is raised with a list of reasons.

        If the optional symlinks flag is true, symbolic links in the
        source tree result in symbolic links in the destination tree; if
        it is false, the contents of the files pointed to by symbolic
        links are copied.

        The optional ignore argument is a callable. If given, it
        is called with the `src` parameter, which is the directory
        being visited by copytree(), and `names` which is the list of
        `src` contents, as returned by os.listdir():

            callable(src, names) -> ignored_names

        Since copytree() is called recursively, the callable will be
        called once for each directory that is copied. It returns a
        list of names relative to the `src` directory that should
        not be copied.

        XXX Consider this example code rather than the ultimate tool.

        """
        names = os.listdir(src)
        if ignore is not None:
            ignored_names = ignore(src, names)
        else:
            ignored_names = set()

        os.makedirs(dst)
        errors = []
        for name in names:
            if name in ignored_names:
                continue
            srcname = os.path.join(src, name)
            dstname = os.path.join(dst, name)
            try:
                if symlinks and os.path.islink(srcname):
                    linkto = os.readlink(srcname)
                    os.symlink(linkto, dstname)
                elif os.path.isdir(srcname):
                    copytree(srcname, dstname, symlinks, ignore)
                else:
                    # Will raise a SpecialFileError for unsupported file types
                    copy2(srcname, dstname)
            # catch the Error from the recursive copytree so that we can
            # continue with other files
            except Error, err:
                errors.extend(err.args[0])
            except EnvironmentError, why:
                errors.append((srcname, dstname, str(why)))
        try:
            copystat(src, dst)
        except OSError, why:
            if WindowsError is not None and isinstance(why, WindowsError):
                # Copying file access times may fail on Windows
                pass
            else:
                errors.append((src, dst, str(why)))
        if errors:
            raise Error, errors


shutil.rmtree(path[, ignore_errors[, onerror]])

Recursively deletes a directory tree.

    >>> subprocess.Popen(["mkdir","1/2/3/4/5/6"]
    ... )
    <subprocess.Popen object at 0x7f0c175c6e10>
    >>> mkdir: cannot create directory '1/2/3/4/5/6': No such file or directory

    >>> subprocess.Popen(["mkdir","-p","1/2/3/4/5/6"]
    ...
    ... )
    <subprocess.Popen object at 0x7f0c1754ef90>
    >>> subprocess.P
    subprocess.PIPE subprocess.Popen(
    >>> subprocess.Popen(["tree",""])
    <subprocess.Popen object at 0x7f0c175c6e10>
    >>> 1
    └── 2
        └── 3
            └── 4
                └── 5
                    └── 6

    5 directories, 0 files

    >>> shutil.rmtree("")
    >>> subprocess.Popen(['ls','-l'])
    <subprocess.Popen object at 0x7f0c1754ef90>
    >>> total 28
    -rw-r--r-- 1 root root 82 Oct 22 03:27 newshuaige
    -rw-r--r-- 1 root root 82 Oct 22 03:23 shuaige
    drwxr-xr-x 2 root root 4096 Oct 22 02:12 t2
    drwxr-xr-x 2 root root 4096 Oct 22 02:14 t3
    drwxr-xr-x 2 root root 4096 Oct 22 02:14 t5
    drwxr-xr-x 2 root root 4096 Oct 22 02:14 t6
    drwxrwxrwx 3 root root 4096 Oct 21 21:22 tim
    def rmtree(path, ignore_errors=False, onerror=None):
        """Recursively delete a directory tree.

        If ignore_errors is set, errors are ignored; otherwise, if onerror
        is set, it is called to handle the error with arguments (func,
        path, exc_info) where func is os.listdir, os.remove, or os.rmdir;
        path is the argument to that function that caused it to fail; and
        exc_info is a tuple returned by sys.exc_info(). If ignore_errors
        is false and onerror is None, an exception is raised.

        """
        if ignore_errors:
            def onerror(*args):
                pass
        elif onerror is None:
            def onerror(*args):
                raise
        try:
            if os.path.islink(path):
                # symlinks to directories are forbidden, see bug #1669
                raise OSError("Cannot call rmtree on a symbolic link")
        except OSError:
            onerror(os.path.islink, path, sys.exc_info())
            # can't continue even if onerror hook returns
            return
        names = []
        try:
            names = os.listdir(path)
        except os.error, err:
            onerror(os.listdir, path, sys.exc_info())
        for name in names:
            fullname = os.path.join(path, name)
            try:
                mode = os.lstat(fullname).st_mode
            except os.error:
                mode = 0
            if stat.S_ISDIR(mode):
                rmtree(fullname, ignore_errors, onerror)
            else:
                try:
                    os.remove(fullname)
                except os.error, err:
                    onerror(os.remove, fullname, sys.exc_info())
        try:
            os.rmdir(path)
        except os.error:
            onerror(os.rmdir, path, sys.exc_info())


shutil.move(src, dst)
Recursively moves a file or directory.

    >>> shutil.move("shuaige","tianshuai") #a file
    >>> shutil.move("t2","testd") #a directory
    >>> subprocess.Popen(['ls','-l'])
    <subprocess.Popen object at 0x7f0c17562050>
    >>> total 28
    -rw-r--r-- 1 root root 82 Oct 22 03:27 newshuaige
    drwxr-xr-x 2 root root 4096 Oct 22 02:14 t3
    drwxr-xr-x 2 root root 4096 Oct 22 02:14 t5
    drwxr-xr-x 2 root root 4096 Oct 22 02:14 t6
    drwxr-xr-x 2 root root 4096 Oct 22 02:12 testd
    -rw-r--r-- 1 root root 82 Oct 22 03:23 tianshuai
    drwxrwxrwx 3 root root 4096 Oct 21 21:22 tim
    def move(src, dst):
        """Recursively move a file or directory to another location. This is
        similar to the Unix "mv" command.

        If the destination is a directory or a symlink to a directory, the source
        is moved inside the directory. The destination path must not already
        exist.

        If the destination already exists but is not a directory, it may be
        overwritten depending on os.rename() semantics.

        If the destination is on our current filesystem, then rename() is used.
        Otherwise, src is copied to the destination and then removed.
        A lot more could be done here... A look at a mv.c shows a lot of
        the issues this implementation glosses over.

        """
        real_dst = dst
        if os.path.isdir(dst):
            if _samefile(src, dst):
                # We might be on a case insensitive filesystem,
                # perform the rename anyway.
                os.rename(src, dst)
                return

            real_dst = os.path.join(dst, _basename(src))
            if os.path.exists(real_dst):
                raise Error, "Destination path '%s' already exists" % real_dst
        try:
            os.rename(src, real_dst)
        except OSError:
            if os.path.isdir(src):
                if _destinsrc(src, dst):
                    raise Error, "Cannot move a directory '%s' into itself '%s'." % (src, dst)
                copytree(src, real_dst, symlinks=True)
                rmtree(src)
            else:
                copy2(src, real_dst)
                os.unlink(src)


shutil.make_archive(base_name, format,...)

Creates an archive (e.g. zip or tar) and returns the file path.

base_name: the archive's file name, or a path to it. A bare name is saved in the current directory; a path is saved at that location,
    e.g. www                  => saved in the current directory
    e.g. /Users/wupeiqi/www   => saved under /Users/wupeiqi/
format: the archive type: "zip", "tar", "bztar", "gztar"
root_dir: the directory to archive (defaults to the current directory)
owner: owner; defaults to the current user
group: group; defaults to the current group
logger: for logging; usually a logging.Logger object

    >>> shutil.make_archive("compress",format="zip",root_dir="")
    '/opt/compress.zip'
    >>>

    root@tim:/opt# ls
    1 compress.zip newshuaige t3 t5 t6 testd tianshuai tim

    root@tim:/opt# unzip compress.zip
    Archive: compress.zip
    warning [compress.zip]: zipfile is empty

    #with the zip format, shutil skips empty directories by default!

    root@tim:/opt# tree 1
    1
    └── 2
        └── 3
            └── 4
                └── 5
                    ├── 6
                    │   └── 7
                    └── testzip

    6 directories, 1 file

    >>> shutil.make_archive("compress",format="zip",root_dir="")
    '/opt/compress.zip'
    >>>

    root@tim:/opt# unzip compress.zip
    Archive: compress.zip
    inflating: 2/3/4/5/testzip
    root@tim:/opt#
    def make_archive(base_name, format, root_dir=None, base_dir=None, verbose=0,
                     dry_run=0, owner=None, group=None, logger=None):
        """Create an archive file (eg. zip or tar).

        'base_name' is the name of the file to create, minus any format-specific
        extension; 'format' is the archive format: one of "zip", "tar", "bztar"
        or "gztar".

        'root_dir' is a directory that will be the root directory of the
        archive; ie. we typically chdir into 'root_dir' before creating the
        archive. 'base_dir' is the directory where we start archiving from;
        ie. 'base_dir' will be the common prefix of all files and
        directories in the archive. 'root_dir' and 'base_dir' both default
        to the current directory. Returns the name of the archive file.

        'owner' and 'group' are used when creating a tar archive. By default,
        uses the current owner and group.
        """
        save_cwd = os.getcwd()
        if root_dir is not None:
            if logger is not None:
                logger.debug("changing into '%s'", root_dir)
            base_name = os.path.abspath(base_name)
            if not dry_run:
                os.chdir(root_dir)

        if base_dir is None:
            base_dir = os.curdir

        kwargs = {'dry_run': dry_run, 'logger': logger}

        try:
            format_info = _ARCHIVE_FORMATS[format]
        except KeyError:
            raise ValueError, "unknown archive format '%s'" % format

        func = format_info[0]
        for arg, val in format_info[1]:
            kwargs[arg] = val

        if format != 'zip':
            kwargs['owner'] = owner
            kwargs['group'] = group

        try:
            filename = func(base_name, base_dir, **kwargs)
        finally:
            if root_dir is not None:
                if logger is not None:
                    logger.debug("changing back to '%s'", save_cwd)
                os.chdir(save_cwd)

        return filename
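The same make_archive call can be sketched against a throwaway temp directory instead of /opt:

```python
import os
import shutil
import tempfile

tmp = tempfile.mkdtemp()
datadir = os.path.join(tmp, 'data')
os.makedirs(datadir)
with open(os.path.join(datadir, 'testzip'), 'w') as f:
    f.write('x')

# base_name carries no extension; make_archive appends .zip and returns the full path
archive = shutil.make_archive(os.path.join(tmp, 'compress'), 'zip', root_dir=datadir)
print(os.path.basename(archive))   # compress.zip
print(os.path.exists(archive))     # True
shutil.rmtree(tmp)                 # clean up
```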


shutil handles archives by delegating to the ZipFile and TarFile classes; in more detail:

    import zipfile

    # compress
    z = zipfile.ZipFile('laxi.zip', 'w')
    z.write('a.log')
    z.write('data.data')
    z.close()

    # extract
    z = zipfile.ZipFile('laxi.zip', 'r')
    z.extractall()
    z.close()

    import tarfile

    # compress
    tar = tarfile.open('your.tar','w')
    tar.add('/Users/wupeiqi/PycharmProjects/bbs2.zip', arcname='bbs2.zip')
    tar.add('/Users/wupeiqi/PycharmProjects/cmdb.zip', arcname='cmdb.zip')
    tar.close()

    # extract
    tar = tarfile.open('your.tar','r')
    tar.extractall() # an extraction path can be passed here
    tar.close()
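A Python 3 sketch of the zipfile round trip above, using a temp directory and ZipFile as a context manager:

```python
import os
import tempfile
import zipfile

tmp = tempfile.mkdtemp()
log = os.path.join(tmp, 'a.log')
with open(log, 'w') as f:
    f.write('log line\n')

zpath = os.path.join(tmp, 'laxi.zip')
with zipfile.ZipFile(zpath, 'w') as z:   # ZipFile supports the with statement
    z.write(log, arcname='a.log')        # arcname stores it without the temp path

with zipfile.ZipFile(zpath) as z:
    print(z.namelist())                  # ['a.log']
    print(z.read('a.log'))               # b'log line\n'
```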
    class ZipFile(object):
        """ Class with methods to open, read, write, close, list zip files.

        z = ZipFile(file, mode="r", compression=ZIP_STORED, allowZip64=False)

        file: Either the path to the file, or a file-like object.
              If it is a path, the file will be opened and closed by ZipFile.
        mode: The mode can be either read "r", write "w" or append "a".
        compression: ZIP_STORED (no compression) or ZIP_DEFLATED (requires zlib).
        allowZip64: if True ZipFile will create files with ZIP64 extensions when
                    needed, otherwise it will raise an exception when this would
                    be necessary.

        """

        fp = None # Set here since __del__ checks it

        def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=False):
            """Open the ZIP file with mode read "r", write "w" or append "a"."""
            if mode not in ("r", "w", "a"):
                raise RuntimeError('ZipFile() requires mode "r", "w", or "a"')

            if compression == ZIP_STORED:
                pass
            elif compression == ZIP_DEFLATED:
                if not zlib:
                    raise RuntimeError,\
                          "Compression requires the (missing) zlib module"
            else:
                raise RuntimeError, "That compression method is not supported"

            self._allowZip64 = allowZip64
            self._didModify = False
            self.debug = 0 # Level of printing: 0 through 3
            self.NameToInfo = {} # Find file info given name
            self.filelist = [] # List of ZipInfo instances for archive
            self.compression = compression # Method of compression
            self.mode = key = mode.replace('b', '')[0]
            self.pwd = None
            self._comment = ''

            # Check if we were passed a file-like object
            if isinstance(file, basestring):
                self._filePassed = 0
                self.filename = file
                modeDict = {'r' : 'rb', 'w': 'wb', 'a' : 'r+b'}
                try:
                    self.fp = open(file, modeDict[mode])
                except IOError:
                    if mode == 'a':
                        mode = key = 'w'
                        self.fp = open(file, modeDict[mode])
                    else:
                        raise
            else:
                self._filePassed = 1
                self.fp = file
                self.filename = getattr(file, 'name', None)

            try:
                if key == 'r':
                    self._RealGetContents()
                elif key == 'w':
                    # set the modified flag so central directory gets written
                    # even if no files are added to the archive
                    self._didModify = True
                elif key == 'a':
                    try:
                        # See if file is a zip file
                        self._RealGetContents()
                        # seek to start of directory and overwrite
                        self.fp.seek(self.start_dir, 0)
                    except BadZipfile:
                        # file is not a zip file, just append
                        self.fp.seek(0, 2)

                        # set the modified flag so central directory gets written
                        # even if no files are added to the archive
                        self._didModify = True
                else:
                    raise RuntimeError('Mode must be "r", "w" or "a"')
            except:
                fp = self.fp
                self.fp = None
                if not self._filePassed:
                    fp.close()
                raise

        def __enter__(self):
            return self

        def __exit__(self, type, value, traceback):
            self.close()

        def _RealGetContents(self):
            """Read in the table of contents for the ZIP file."""
            fp = self.fp
            try:
                endrec = _EndRecData(fp)
            except IOError:
                raise BadZipfile("File is not a zip file")
            if not endrec:
                raise BadZipfile, "File is not a zip file"
            if self.debug > 1:
                print endrec
            size_cd = endrec[_ECD_SIZE]             # bytes in central directory
            offset_cd = endrec[_ECD_OFFSET]         # offset of central directory
            self._comment = endrec[_ECD_COMMENT]    # archive comment

            # "concat" is zero, unless zip was concatenated to another file
  111. concat = endrec[_ECD_LOCATION] - size_cd - offset_cd
  112. if endrec[_ECD_SIGNATURE] == stringEndArchive64:
  113. # If Zip64 extension structures are present, account for them
  114. concat -= (sizeEndCentDir64 + sizeEndCentDir64Locator)
  115.  
  116. if self.debug > 2:
  117. inferred = concat + offset_cd
  118. print "given, inferred, offset", offset_cd, inferred, concat
  119. # self.start_dir: Position of start of central directory
  120. self.start_dir = offset_cd + concat
  121. fp.seek(self.start_dir, 0)
  122. data = fp.read(size_cd)
  123. fp = cStringIO.StringIO(data)
  124. total = 0
  125. while total < size_cd:
  126. centdir = fp.read(sizeCentralDir)
  127. if len(centdir) != sizeCentralDir:
  128. raise BadZipfile("Truncated central directory")
  129. centdir = struct.unpack(structCentralDir, centdir)
  130. if centdir[_CD_SIGNATURE] != stringCentralDir:
  131. raise BadZipfile("Bad magic number for central directory")
  132. if self.debug > 2:
  133. print centdir
  134. filename = fp.read(centdir[_CD_FILENAME_LENGTH])
  135. # Create ZipInfo instance to store file information
  136. x = ZipInfo(filename)
  137. x.extra = fp.read(centdir[_CD_EXTRA_FIELD_LENGTH])
  138. x.comment = fp.read(centdir[_CD_COMMENT_LENGTH])
  139. x.header_offset = centdir[_CD_LOCAL_HEADER_OFFSET]
  140. (x.create_version, x.create_system, x.extract_version, x.reserved,
  141. x.flag_bits, x.compress_type, t, d,
  142. x.CRC, x.compress_size, x.file_size) = centdir[1:12]
  143. x.volume, x.internal_attr, x.external_attr = centdir[15:18]
  144. # Convert date/time code to (year, month, day, hour, min, sec)
  145. x._raw_time = t
  146. x.date_time = ( (d>>9)+1980, (d>>5)&0xF, d&0x1F,
  147. t>>11, (t>>5)&0x3F, (t&0x1F) * 2 )
  148.  
  149. x._decodeExtra()
  150. x.header_offset = x.header_offset + concat
  151. x.filename = x._decodeFilename()
  152. self.filelist.append(x)
  153. self.NameToInfo[x.filename] = x
  154.  
  155. # update total bytes read from central directory
  156. total = (total + sizeCentralDir + centdir[_CD_FILENAME_LENGTH]
  157. + centdir[_CD_EXTRA_FIELD_LENGTH]
  158. + centdir[_CD_COMMENT_LENGTH])
  159.  
  160. if self.debug > 2:
  161. print "total", total
  162.  
  163. def namelist(self):
  164. """Return a list of file names in the archive."""
  165. l = []
  166. for data in self.filelist:
  167. l.append(data.filename)
  168. return l
  169.  
  170. def infolist(self):
  171. """Return a list of class ZipInfo instances for files in the
  172. archive."""
  173. return self.filelist
  174.  
  175. def printdir(self):
  176. """Print a table of contents for the zip file."""
  177. print "%-46s %19s %12s" % ("File Name", "Modified ", "Size")
  178. for zinfo in self.filelist:
  179. date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time[:6]
  180. print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size)
  181.  
  182. def testzip(self):
  183. """Read all the files and check the CRC."""
  184. chunk_size = 2 ** 20
  185. for zinfo in self.filelist:
  186. try:
  187. # Read by chunks, to avoid an OverflowError or a
  188. # MemoryError with very large embedded files.
  189. with self.open(zinfo.filename, "r") as f:
  190. while f.read(chunk_size): # Check CRC-32
  191. pass
  192. except BadZipfile:
  193. return zinfo.filename
  194.  
  195. def getinfo(self, name):
  196. """Return the instance of ZipInfo given 'name'."""
  197. info = self.NameToInfo.get(name)
  198. if info is None:
  199. raise KeyError(
  200. 'There is no item named %r in the archive' % name)
  201.  
  202. return info
  203.  
  204. def setpassword(self, pwd):
  205. """Set default password for encrypted files."""
  206. self.pwd = pwd
  207.  
  208. @property
  209. def comment(self):
  210. """The comment text associated with the ZIP file."""
  211. return self._comment
  212.  
  213. @comment.setter
  214. def comment(self, comment):
  215. # check for valid comment length
  216. if len(comment) > ZIP_MAX_COMMENT:
  217. import warnings
  218. warnings.warn('Archive comment is too long; truncating to %d bytes'
  219. % ZIP_MAX_COMMENT, stacklevel=2)
  220. comment = comment[:ZIP_MAX_COMMENT]
  221. self._comment = comment
  222. self._didModify = True
  223.  
  224. def read(self, name, pwd=None):
  225. """Return file bytes (as a string) for name."""
  226. return self.open(name, "r", pwd).read()
  227.  
  228. def open(self, name, mode="r", pwd=None):
  229. """Return file-like object for 'name'."""
  230. if mode not in ("r", "U", "rU"):
  231. raise RuntimeError, 'open() requires mode "r", "U", or "rU"'
  232. if not self.fp:
  233. raise RuntimeError, \
  234. "Attempt to read ZIP archive that was already closed"
  235.  
  236. # Only open a new file for instances where we were not
  237. # given a file object in the constructor
  238. if self._filePassed:
  239. zef_file = self.fp
  240. should_close = False
  241. else:
  242. zef_file = open(self.filename, 'rb')
  243. should_close = True
  244.  
  245. try:
  246. # Make sure we have an info object
  247. if isinstance(name, ZipInfo):
  248. # 'name' is already an info object
  249. zinfo = name
  250. else:
  251. # Get info object for name
  252. zinfo = self.getinfo(name)
  253.  
  254. zef_file.seek(zinfo.header_offset, 0)
  255.  
  256. # Skip the file header:
  257. fheader = zef_file.read(sizeFileHeader)
  258. if len(fheader) != sizeFileHeader:
  259. raise BadZipfile("Truncated file header")
  260. fheader = struct.unpack(structFileHeader, fheader)
  261. if fheader[_FH_SIGNATURE] != stringFileHeader:
  262. raise BadZipfile("Bad magic number for file header")
  263.  
  264. fname = zef_file.read(fheader[_FH_FILENAME_LENGTH])
  265. if fheader[_FH_EXTRA_FIELD_LENGTH]:
  266. zef_file.read(fheader[_FH_EXTRA_FIELD_LENGTH])
  267.  
  268. if fname != zinfo.orig_filename:
  269. raise BadZipfile, \
  270. 'File name in directory "%s" and header "%s" differ.' % (
  271. zinfo.orig_filename, fname)
  272.  
  273. # check for encrypted flag & handle password
  274. is_encrypted = zinfo.flag_bits & 0x1
  275. zd = None
  276. if is_encrypted:
  277. if not pwd:
  278. pwd = self.pwd
  279. if not pwd:
  280. raise RuntimeError, "File %s is encrypted, " \
  281. "password required for extraction" % name
  282.  
  283. zd = _ZipDecrypter(pwd)
  284. # The first 12 bytes in the cypher stream is an encryption header
  285. # used to strengthen the algorithm. The first 11 bytes are
  286. # completely random, while the 12th contains the MSB of the CRC,
  287. # or the MSB of the file time depending on the header type
  288. # and is used to check the correctness of the password.
  289. bytes = zef_file.read(12)
  290. h = map(zd, bytes[0:12])
  291. if zinfo.flag_bits & 0x8:
  292. # compare against the file type from extended local headers
  293. check_byte = (zinfo._raw_time >> 8) & 0xff
  294. else:
  295. # compare against the CRC otherwise
  296. check_byte = (zinfo.CRC >> 24) & 0xff
  297. if ord(h[11]) != check_byte:
  298. raise RuntimeError("Bad password for file", name)
  299.  
  300. return ZipExtFile(zef_file, mode, zinfo, zd,
  301. close_fileobj=should_close)
  302. except:
  303. if should_close:
  304. zef_file.close()
  305. raise
  306.  
  307. def extract(self, member, path=None, pwd=None):
  308. """Extract a member from the archive to the current working directory,
  309. using its full name. Its file information is extracted as accurately
  310. as possible. `member' may be a filename or a ZipInfo object. You can
  311. specify a different directory using `path'.
  312. """
  313. if not isinstance(member, ZipInfo):
  314. member = self.getinfo(member)
  315.  
  316. if path is None:
  317. path = os.getcwd()
  318.  
  319. return self._extract_member(member, path, pwd)
  320.  
  321. def extractall(self, path=None, members=None, pwd=None):
  322. """Extract all members from the archive to the current working
  323. directory. `path' specifies a different directory to extract to.
  324. `members' is optional and must be a subset of the list returned
  325. by namelist().
  326. """
  327. if members is None:
  328. members = self.namelist()
  329.  
  330. for zipinfo in members:
  331. self.extract(zipinfo, path, pwd)
  332.  
  333. def _extract_member(self, member, targetpath, pwd):
  334. """Extract the ZipInfo object 'member' to a physical
  335. file on the path targetpath.
  336. """
  337. # build the destination pathname, replacing
  338. # forward slashes to platform specific separators.
  339. arcname = member.filename.replace('/', os.path.sep)
  340.  
  341. if os.path.altsep:
  342. arcname = arcname.replace(os.path.altsep, os.path.sep)
  343. # interpret absolute pathname as relative, remove drive letter or
  344. # UNC path, redundant separators, "." and ".." components.
  345. arcname = os.path.splitdrive(arcname)[1]
  346. arcname = os.path.sep.join(x for x in arcname.split(os.path.sep)
  347. if x not in ('', os.path.curdir, os.path.pardir))
  348. if os.path.sep == '\\':
  349. # filter illegal characters on Windows
  350. illegal = ':<>|"?*'
  351. if isinstance(arcname, unicode):
  352. table = {ord(c): ord('_') for c in illegal}
  353. else:
  354. table = string.maketrans(illegal, '_' * len(illegal))
  355. arcname = arcname.translate(table)
  356. # remove trailing dots
  357. arcname = (x.rstrip('.') for x in arcname.split(os.path.sep))
  358. arcname = os.path.sep.join(x for x in arcname if x)
  359.  
  360. targetpath = os.path.join(targetpath, arcname)
  361. targetpath = os.path.normpath(targetpath)
  362.  
  363. # Create all upper directories if necessary.
  364. upperdirs = os.path.dirname(targetpath)
  365. if upperdirs and not os.path.exists(upperdirs):
  366. os.makedirs(upperdirs)
  367.  
  368. if member.filename[-1] == '/':
  369. if not os.path.isdir(targetpath):
  370. os.mkdir(targetpath)
  371. return targetpath
  372.  
  373. with self.open(member, pwd=pwd) as source, \
  374. file(targetpath, "wb") as target:
  375. shutil.copyfileobj(source, target)
  376.  
  377. return targetpath
  378.  
  379. def _writecheck(self, zinfo):
  380. """Check for errors before writing a file to the archive."""
  381. if zinfo.filename in self.NameToInfo:
  382. import warnings
  383. warnings.warn('Duplicate name: %r' % zinfo.filename, stacklevel=3)
  384. if self.mode not in ("w", "a"):
  385. raise RuntimeError, 'write() requires mode "w" or "a"'
  386. if not self.fp:
  387. raise RuntimeError, \
  388. "Attempt to write ZIP archive that was already closed"
  389. if zinfo.compress_type == ZIP_DEFLATED and not zlib:
  390. raise RuntimeError, \
  391. "Compression requires the (missing) zlib module"
  392. if zinfo.compress_type not in (ZIP_STORED, ZIP_DEFLATED):
  393. raise RuntimeError, \
  394. "That compression method is not supported"
  395. if not self._allowZip64:
  396. requires_zip64 = None
  397. if len(self.filelist) >= ZIP_FILECOUNT_LIMIT:
  398. requires_zip64 = "Files count"
  399. elif zinfo.file_size > ZIP64_LIMIT:
  400. requires_zip64 = "Filesize"
  401. elif zinfo.header_offset > ZIP64_LIMIT:
  402. requires_zip64 = "Zipfile size"
  403. if requires_zip64:
  404. raise LargeZipFile(requires_zip64 +
  405. " would require ZIP64 extensions")
  406.  
  407. def write(self, filename, arcname=None, compress_type=None):
  408. """Put the bytes from filename into the archive under the name
  409. arcname."""
  410. if not self.fp:
  411. raise RuntimeError(
  412. "Attempt to write to ZIP archive that was already closed")
  413.  
  414. st = os.stat(filename)
  415. isdir = stat.S_ISDIR(st.st_mode)
  416. mtime = time.localtime(st.st_mtime)
  417. date_time = mtime[0:6]
  418. # Create ZipInfo instance to store file information
  419. if arcname is None:
  420. arcname = filename
  421. arcname = os.path.normpath(os.path.splitdrive(arcname)[1])
  422. while arcname[0] in (os.sep, os.altsep):
  423. arcname = arcname[1:]
  424. if isdir:
  425. arcname += '/'
  426. zinfo = ZipInfo(arcname, date_time)
  427. zinfo.external_attr = (st[0] & 0xFFFF) << 16L # Unix attributes
  428. if compress_type is None:
  429. zinfo.compress_type = self.compression
  430. else:
  431. zinfo.compress_type = compress_type
  432.  
  433. zinfo.file_size = st.st_size
  434. zinfo.flag_bits = 0x00
  435. zinfo.header_offset = self.fp.tell() # Start of header bytes
  436.  
  437. self._writecheck(zinfo)
  438. self._didModify = True
  439.  
  440. if isdir:
  441. zinfo.file_size = 0
  442. zinfo.compress_size = 0
  443. zinfo.CRC = 0
  444. zinfo.external_attr |= 0x10 # MS-DOS directory flag
  445. self.filelist.append(zinfo)
  446. self.NameToInfo[zinfo.filename] = zinfo
  447. self.fp.write(zinfo.FileHeader(False))
  448. return
  449.  
  450. with open(filename, "rb") as fp:
  451. # Must overwrite CRC and sizes with correct data later
  452. zinfo.CRC = CRC = 0
  453. zinfo.compress_size = compress_size = 0
  454. # Compressed size can be larger than uncompressed size
  455. zip64 = self._allowZip64 and \
  456. zinfo.file_size * 1.05 > ZIP64_LIMIT
  457. self.fp.write(zinfo.FileHeader(zip64))
  458. if zinfo.compress_type == ZIP_DEFLATED:
  459. cmpr = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
  460. zlib.DEFLATED, -15)
  461. else:
  462. cmpr = None
  463. file_size = 0
  464. while 1:
  465. buf = fp.read(1024 * 8)
  466. if not buf:
  467. break
  468. file_size = file_size + len(buf)
  469. CRC = crc32(buf, CRC) & 0xffffffff
  470. if cmpr:
  471. buf = cmpr.compress(buf)
  472. compress_size = compress_size + len(buf)
  473. self.fp.write(buf)
  474. if cmpr:
  475. buf = cmpr.flush()
  476. compress_size = compress_size + len(buf)
  477. self.fp.write(buf)
  478. zinfo.compress_size = compress_size
  479. else:
  480. zinfo.compress_size = file_size
  481. zinfo.CRC = CRC
  482. zinfo.file_size = file_size
  483. if not zip64 and self._allowZip64:
  484. if file_size > ZIP64_LIMIT:
  485. raise RuntimeError('File size has increased during compressing')
  486. if compress_size > ZIP64_LIMIT:
  487. raise RuntimeError('Compressed size larger than uncompressed size')
  488. # Seek backwards and write file header (which will now include
  489. # correct CRC and file sizes)
  490. position = self.fp.tell() # Preserve current position in file
  491. self.fp.seek(zinfo.header_offset, 0)
  492. self.fp.write(zinfo.FileHeader(zip64))
  493. self.fp.seek(position, 0)
  494. self.filelist.append(zinfo)
  495. self.NameToInfo[zinfo.filename] = zinfo
  496.  
  497. def writestr(self, zinfo_or_arcname, bytes, compress_type=None):
  498. """Write a file into the archive. The contents is the string
  499. 'bytes'. 'zinfo_or_arcname' is either a ZipInfo instance or
  500. the name of the file in the archive."""
  501. if not isinstance(zinfo_or_arcname, ZipInfo):
  502. zinfo = ZipInfo(filename=zinfo_or_arcname,
  503. date_time=time.localtime(time.time())[:6])
  504.  
  505. zinfo.compress_type = self.compression
  506. if zinfo.filename[-1] == '/':
  507. zinfo.external_attr = 0o40775 << 16 # drwxrwxr-x
  508. zinfo.external_attr |= 0x10 # MS-DOS directory flag
  509. else:
  510. zinfo.external_attr = 0o600 << 16 # ?rw-------
  511. else:
  512. zinfo = zinfo_or_arcname
  513.  
  514. if not self.fp:
  515. raise RuntimeError(
  516. "Attempt to write to ZIP archive that was already closed")
  517.  
  518. if compress_type is not None:
  519. zinfo.compress_type = compress_type
  520.  
  521. zinfo.file_size = len(bytes) # Uncompressed size
  522. zinfo.header_offset = self.fp.tell() # Start of header bytes
  523. self._writecheck(zinfo)
  524. self._didModify = True
  525. zinfo.CRC = crc32(bytes) & 0xffffffff # CRC-32 checksum
  526. if zinfo.compress_type == ZIP_DEFLATED:
  527. co = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION,
  528. zlib.DEFLATED, -15)
  529. bytes = co.compress(bytes) + co.flush()
  530. zinfo.compress_size = len(bytes) # Compressed size
  531. else:
  532. zinfo.compress_size = zinfo.file_size
  533. zip64 = zinfo.file_size > ZIP64_LIMIT or \
  534. zinfo.compress_size > ZIP64_LIMIT
  535. if zip64 and not self._allowZip64:
  536. raise LargeZipFile("Filesize would require ZIP64 extensions")
  537. self.fp.write(zinfo.FileHeader(zip64))
  538. self.fp.write(bytes)
  539. if zinfo.flag_bits & 0x08:
  540. # Write CRC and file sizes after the file data
  541. fmt = '<LQQ' if zip64 else '<LLL'
  542. self.fp.write(struct.pack(fmt, zinfo.CRC, zinfo.compress_size,
  543. zinfo.file_size))
  544. self.fp.flush()
  545. self.filelist.append(zinfo)
  546. self.NameToInfo[zinfo.filename] = zinfo
  547.  
  548. def __del__(self):
  549. """Call the "close()" method in case the user forgot."""
  550. self.close()
  551.  
  552. def close(self):
  553. """Close the file, and for mode "w" and "a" write the ending
  554. records."""
  555. if self.fp is None:
  556. return
  557.  
  558. try:
  559. if self.mode in ("w", "a") and self._didModify: # write ending records
  560. pos1 = self.fp.tell()
  561. for zinfo in self.filelist: # write central directory
  562. dt = zinfo.date_time
  563. dosdate = (dt[0] - 1980) << 9 | dt[1] << 5 | dt[2]
  564. dostime = dt[3] << 11 | dt[4] << 5 | (dt[5] // 2)
  565. extra = []
  566. if zinfo.file_size > ZIP64_LIMIT \
  567. or zinfo.compress_size > ZIP64_LIMIT:
  568. extra.append(zinfo.file_size)
  569. extra.append(zinfo.compress_size)
  570. file_size = 0xffffffff
  571. compress_size = 0xffffffff
  572. else:
  573. file_size = zinfo.file_size
  574. compress_size = zinfo.compress_size
  575.  
  576. if zinfo.header_offset > ZIP64_LIMIT:
  577. extra.append(zinfo.header_offset)
  578. header_offset = 0xffffffffL
  579. else:
  580. header_offset = zinfo.header_offset
  581.  
  582. extra_data = zinfo.extra
  583. if extra:
  584. # Append a ZIP64 field to the extra's
  585. extra_data = struct.pack(
  586. '<HH' + 'Q'*len(extra),
  587. 1, 8*len(extra), *extra) + extra_data
  588.  
  589. extract_version = max(45, zinfo.extract_version)
  590. create_version = max(45, zinfo.create_version)
  591. else:
  592. extract_version = zinfo.extract_version
  593. create_version = zinfo.create_version
  594.  
  595. try:
  596. filename, flag_bits = zinfo._encodeFilenameFlags()
  597. centdir = struct.pack(structCentralDir,
  598. stringCentralDir, create_version,
  599. zinfo.create_system, extract_version, zinfo.reserved,
  600. flag_bits, zinfo.compress_type, dostime, dosdate,
  601. zinfo.CRC, compress_size, file_size,
  602. len(filename), len(extra_data), len(zinfo.comment),
  603. 0, zinfo.internal_attr, zinfo.external_attr,
  604. header_offset)
  605. except DeprecationWarning:
  606. print >>sys.stderr, (structCentralDir,
  607. stringCentralDir, create_version,
  608. zinfo.create_system, extract_version, zinfo.reserved,
  609. zinfo.flag_bits, zinfo.compress_type, dostime, dosdate,
  610. zinfo.CRC, compress_size, file_size,
  611. len(zinfo.filename), len(extra_data), len(zinfo.comment),
  612. 0, zinfo.internal_attr, zinfo.external_attr,
  613. header_offset)
  614. raise
  615. self.fp.write(centdir)
  616. self.fp.write(filename)
  617. self.fp.write(extra_data)
  618. self.fp.write(zinfo.comment)
  619.  
  620. pos2 = self.fp.tell()
  621. # Write end-of-zip-archive record
  622. centDirCount = len(self.filelist)
  623. centDirSize = pos2 - pos1
  624. centDirOffset = pos1
  625. requires_zip64 = None
  626. if centDirCount > ZIP_FILECOUNT_LIMIT:
  627. requires_zip64 = "Files count"
  628. elif centDirOffset > ZIP64_LIMIT:
  629. requires_zip64 = "Central directory offset"
  630. elif centDirSize > ZIP64_LIMIT:
  631. requires_zip64 = "Central directory size"
  632. if requires_zip64:
  633. # Need to write the ZIP64 end-of-archive records
  634. if not self._allowZip64:
  635. raise LargeZipFile(requires_zip64 +
  636. " would require ZIP64 extensions")
  637. zip64endrec = struct.pack(
  638. structEndArchive64, stringEndArchive64,
  639. 44, 45, 45, 0, 0, centDirCount, centDirCount,
  640. centDirSize, centDirOffset)
  641. self.fp.write(zip64endrec)
  642.  
  643. zip64locrec = struct.pack(
  644. structEndArchive64Locator,
  645. stringEndArchive64Locator, 0, pos2, 1)
  646. self.fp.write(zip64locrec)
  647. centDirCount = min(centDirCount, 0xFFFF)
  648. centDirSize = min(centDirSize, 0xFFFFFFFF)
  649. centDirOffset = min(centDirOffset, 0xFFFFFFFF)
  650.  
  651. endrec = struct.pack(structEndArchive, stringEndArchive,
  652. 0, 0, centDirCount, centDirCount,
  653. centDirSize, centDirOffset, len(self._comment))
  654. self.fp.write(endrec)
  655. self.fp.write(self._comment)
  656. self.fp.flush()
  657. finally:
  658. fp = self.fp
  659. self.fp = None
  660. if not self._filePassed:
  661. fp.close()
  662.  
ZipFile
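The `ZipFile` source above is easier to follow with a small usage sketch alongside it. This is a minimal, self-contained example (temporary paths are generated at runtime; the member name `notes/readme.txt` is a placeholder) exercising the `writestr()`, `namelist()`, `getinfo()`, `read()`, and `extractall()` methods defined in the class:

```python
import os
import tempfile
import zipfile

workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, 'demo.zip')

# Write: ZIP_DEFLATED needs the zlib module; ZIP_STORED (the default) stores
# members uncompressed -- see the compression check in __init__ above
z = zipfile.ZipFile(archive, 'w', zipfile.ZIP_DEFLATED)
z.writestr('notes/readme.txt', 'hello zip')   # add a member straight from a string
z.close()

# Read: namelist()/getinfo()/read() mirror the methods in the class source
z = zipfile.ZipFile(archive, 'r')
print(z.namelist())                           # ['notes/readme.txt']
info = z.getinfo('notes/readme.txt')
print(info.file_size)                         # 9
data = z.read('notes/readme.txt')             # member contents as bytes
z.extractall(path=os.path.join(workdir, 'out'))
z.close()
```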

  1. class TarFile(object):
  2. """The TarFile Class provides an interface to tar archives.
  3. """
  4.  
  5. debug = 0 # May be set from 0 (no msgs) to 3 (all msgs)
  6.  
  7. dereference = False # If true, add content of linked file to the
  8. # tar file, else the link.
  9.  
  10. ignore_zeros = False # If true, skips empty or invalid blocks and
  11. # continues processing.
  12.  
  13. errorlevel = 1 # If 0, fatal errors only appear in debug
  14. # messages (if debug >= 0). If > 0, errors
  15. # are passed to the caller as exceptions.
  16.  
  17. format = DEFAULT_FORMAT # The format to use when creating an archive.
  18.  
  19. encoding = ENCODING # Encoding for 8-bit character strings.
  20.  
  21. errors = None # Error handler for unicode conversion.
  22.  
  23. tarinfo = TarInfo # The default TarInfo class to use.
  24.  
  25. fileobject = ExFileObject # The default ExFileObject class to use.
  26.  
  27. def __init__(self, name=None, mode="r", fileobj=None, format=None,
  28. tarinfo=None, dereference=None, ignore_zeros=None, encoding=None,
  29. errors=None, pax_headers=None, debug=None, errorlevel=None):
  30. """Open an (uncompressed) tar archive `name'. `mode' is either 'r' to
  31. read from an existing archive, 'a' to append data to an existing
  32. file or 'w' to create a new file overwriting an existing one. `mode'
  33. defaults to 'r'.
  34. If `fileobj' is given, it is used for reading or writing data. If it
  35. can be determined, `mode' is overridden by `fileobj's mode.
  36. `fileobj' is not closed, when TarFile is closed.
  37. """
  38. modes = {"r": "rb", "a": "r+b", "w": "wb"}
  39. if mode not in modes:
  40. raise ValueError("mode must be 'r', 'a' or 'w'")
  41. self.mode = mode
  42. self._mode = modes[mode]
  43.  
  44. if not fileobj:
  45. if self.mode == "a" and not os.path.exists(name):
  46. # Create nonexistent files in append mode.
  47. self.mode = "w"
  48. self._mode = "wb"
  49. fileobj = bltn_open(name, self._mode)
  50. self._extfileobj = False
  51. else:
  52. if name is None and hasattr(fileobj, "name"):
  53. name = fileobj.name
  54. if hasattr(fileobj, "mode"):
  55. self._mode = fileobj.mode
  56. self._extfileobj = True
  57. self.name = os.path.abspath(name) if name else None
  58. self.fileobj = fileobj
  59.  
  60. # Init attributes.
  61. if format is not None:
  62. self.format = format
  63. if tarinfo is not None:
  64. self.tarinfo = tarinfo
  65. if dereference is not None:
  66. self.dereference = dereference
  67. if ignore_zeros is not None:
  68. self.ignore_zeros = ignore_zeros
  69. if encoding is not None:
  70. self.encoding = encoding
  71.  
  72. if errors is not None:
  73. self.errors = errors
  74. elif mode == "r":
  75. self.errors = "utf-8"
  76. else:
  77. self.errors = "strict"
  78.  
  79. if pax_headers is not None and self.format == PAX_FORMAT:
  80. self.pax_headers = pax_headers
  81. else:
  82. self.pax_headers = {}
  83.  
  84. if debug is not None:
  85. self.debug = debug
  86. if errorlevel is not None:
  87. self.errorlevel = errorlevel
  88.  
  89. # Init datastructures.
  90. self.closed = False
  91. self.members = [] # list of members as TarInfo objects
  92. self._loaded = False # flag if all members have been read
  93. self.offset = self.fileobj.tell()
  94. # current position in the archive file
  95. self.inodes = {} # dictionary caching the inodes of
  96. # archive members already added
  97.  
  98. try:
  99. if self.mode == "r":
  100. self.firstmember = None
  101. self.firstmember = self.next()
  102.  
  103. if self.mode == "a":
  104. # Move to the end of the archive,
  105. # before the first empty block.
  106. while True:
  107. self.fileobj.seek(self.offset)
  108. try:
  109. tarinfo = self.tarinfo.fromtarfile(self)
  110. self.members.append(tarinfo)
  111. except EOFHeaderError:
  112. self.fileobj.seek(self.offset)
  113. break
  114. except HeaderError, e:
  115. raise ReadError(str(e))
  116.  
  117. if self.mode in "aw":
  118. self._loaded = True
  119.  
  120. if self.pax_headers:
  121. buf = self.tarinfo.create_pax_global_header(self.pax_headers.copy())
  122. self.fileobj.write(buf)
  123. self.offset += len(buf)
  124. except:
  125. if not self._extfileobj:
  126. self.fileobj.close()
  127. self.closed = True
  128. raise
  129.  
  130. def _getposix(self):
  131. return self.format == USTAR_FORMAT
  132. def _setposix(self, value):
  133. import warnings
  134. warnings.warn("use the format attribute instead", DeprecationWarning,
  135. 2)
  136. if value:
  137. self.format = USTAR_FORMAT
  138. else:
  139. self.format = GNU_FORMAT
  140. posix = property(_getposix, _setposix)
  141.  
  142. #--------------------------------------------------------------------------
  143. # Below are the classmethods which act as alternate constructors to the
  144. # TarFile class. The open() method is the only one that is needed for
  145. # public use; it is the "super"-constructor and is able to select an
  146. # adequate "sub"-constructor for a particular compression using the mapping
  147. # from OPEN_METH.
  148. #
  149. # This concept allows one to subclass TarFile without losing the comfort of
  150. # the super-constructor. A sub-constructor is registered and made available
  151. # by adding it to the mapping in OPEN_METH.
  152.  
  153. @classmethod
  154. def open(cls, name=None, mode="r", fileobj=None, bufsize=RECORDSIZE, **kwargs):
  155. """Open a tar archive for reading, writing or appending. Return
  156. an appropriate TarFile class.
  157.  
  158. mode:
  159. 'r' or 'r:*' open for reading with transparent compression
  160. 'r:' open for reading exclusively uncompressed
  161. 'r:gz' open for reading with gzip compression
  162. 'r:bz2' open for reading with bzip2 compression
  163. 'a' or 'a:' open for appending, creating the file if necessary
  164. 'w' or 'w:' open for writing without compression
  165. 'w:gz' open for writing with gzip compression
  166. 'w:bz2' open for writing with bzip2 compression
  167.  
  168. 'r|*' open a stream of tar blocks with transparent compression
  169. 'r|' open an uncompressed stream of tar blocks for reading
  170. 'r|gz' open a gzip compressed stream of tar blocks
  171. 'r|bz2' open a bzip2 compressed stream of tar blocks
  172. 'w|' open an uncompressed stream for writing
  173. 'w|gz' open a gzip compressed stream for writing
  174. 'w|bz2' open a bzip2 compressed stream for writing
  175. """
  176.  
  177. if not name and not fileobj:
  178. raise ValueError("nothing to open")
  179.  
  180. if mode in ("r", "r:*"):
  181. # Find out which *open() is appropriate for opening the file.
  182. for comptype in cls.OPEN_METH:
  183. func = getattr(cls, cls.OPEN_METH[comptype])
  184. if fileobj is not None:
  185. saved_pos = fileobj.tell()
  186. try:
  187. return func(name, "r", fileobj, **kwargs)
  188. except (ReadError, CompressionError), e:
  189. if fileobj is not None:
  190. fileobj.seek(saved_pos)
  191. continue
  192. raise ReadError("file could not be opened successfully")
  193.  
  194. elif ":" in mode:
  195. filemode, comptype = mode.split(":", 1)
  196. filemode = filemode or "r"
  197. comptype = comptype or "tar"
  198.  
  199. # Select the *open() function according to
  200. # given compression.
  201. if comptype in cls.OPEN_METH:
  202. func = getattr(cls, cls.OPEN_METH[comptype])
  203. else:
  204. raise CompressionError("unknown compression type %r" % comptype)
  205. return func(name, filemode, fileobj, **kwargs)
  206.  
  207. elif "|" in mode:
  208. filemode, comptype = mode.split("|", 1)
  209. filemode = filemode or "r"
  210. comptype = comptype or "tar"
  211.  
  212. if filemode not in ("r", "w"):
  213. raise ValueError("mode must be 'r' or 'w'")
  214.  
  215. stream = _Stream(name, filemode, comptype, fileobj, bufsize)
  216. try:
  217. t = cls(name, filemode, stream, **kwargs)
  218. except:
  219. stream.close()
  220. raise
  221. t._extfileobj = False
  222. return t
  223.  
  224. elif mode in ("a", "w"):
  225. return cls.taropen(name, mode, fileobj, **kwargs)
  226.  
  227. raise ValueError("undiscernible mode")
  228.  
  229. @classmethod
  230. def taropen(cls, name, mode="r", fileobj=None, **kwargs):
  231. """Open uncompressed tar archive name for reading or writing.
  232. """
  233. if mode not in ("r", "a", "w"):
  234. raise ValueError("mode must be 'r', 'a' or 'w'")
  235. return cls(name, mode, fileobj, **kwargs)
  236.  
  237. @classmethod
  238. def gzopen(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
  239. """Open gzip compressed tar archive name for reading or writing.
  240. Appending is not allowed.
  241. """
  242. if mode not in ("r", "w"):
  243. raise ValueError("mode must be 'r' or 'w'")
  244.  
  245. try:
  246. import gzip
  247. gzip.GzipFile
  248. except (ImportError, AttributeError):
  249. raise CompressionError("gzip module is not available")
  250.  
  251. try:
  252. fileobj = gzip.GzipFile(name, mode, compresslevel, fileobj)
  253. except OSError:
  254. if fileobj is not None and mode == 'r':
  255. raise ReadError("not a gzip file")
  256. raise
  257.  
  258. try:
  259. t = cls.taropen(name, mode, fileobj, **kwargs)
  260. except IOError:
  261. fileobj.close()
  262. if mode == 'r':
  263. raise ReadError("not a gzip file")
  264. raise
  265. except:
  266. fileobj.close()
  267. raise
  268. t._extfileobj = False
  269. return t
  270.  
  271. @classmethod
  272. def bz2open(cls, name, mode="r", fileobj=None, compresslevel=9, **kwargs):
  273. """Open bzip2 compressed tar archive name for reading or writing.
  274. Appending is not allowed.
  275. """
  276. if mode not in ("r", "w"):
  277. raise ValueError("mode must be 'r' or 'w'.")
  278.  
  279. try:
  280. import bz2
  281. except ImportError:
  282. raise CompressionError("bz2 module is not available")
  283.  
  284. if fileobj is not None:
  285. fileobj = _BZ2Proxy(fileobj, mode)
  286. else:
  287. fileobj = bz2.BZ2File(name, mode, compresslevel=compresslevel)
  288.  
  289. try:
  290. t = cls.taropen(name, mode, fileobj, **kwargs)
  291. except (IOError, EOFError):
  292. fileobj.close()
  293. if mode == 'r':
  294. raise ReadError("not a bzip2 file")
  295. raise
  296. except:
  297. fileobj.close()
  298. raise
  299. t._extfileobj = False
  300. return t
  301.  
  302. # All *open() methods are registered here.
  303. OPEN_METH = {
  304. "tar": "taropen", # uncompressed tar
  305. "gz": "gzopen", # gzip compressed tar
  306. "bz2": "bz2open" # bzip2 compressed tar
  307. }
  308.  
  309. #--------------------------------------------------------------------------
  310. # The public methods which TarFile provides:
  311.  
  312. def close(self):
  313. """Close the TarFile. In write-mode, two finishing zero blocks are
  314. appended to the archive.
  315. """
  316. if self.closed:
  317. return
  318.  
  319. if self.mode in "aw":
  320. self.fileobj.write(NUL * (BLOCKSIZE * 2))
  321. self.offset += (BLOCKSIZE * 2)
  322. # fill up the end with zero-blocks
  323. # (like option -b20 for tar does)
  324. blocks, remainder = divmod(self.offset, RECORDSIZE)
  325. if remainder > 0:
  326. self.fileobj.write(NUL * (RECORDSIZE - remainder))
  327.  
  328. if not self._extfileobj:
  329. self.fileobj.close()
  330. self.closed = True
  331.  
  332. def getmember(self, name):
  333. """Return a TarInfo object for member `name'. If `name' can not be
  334. found in the archive, KeyError is raised. If a member occurs more
  335. than once in the archive, its last occurrence is assumed to be the
  336. most up-to-date version.
  337. """
  338. tarinfo = self._getmember(name)
  339. if tarinfo is None:
  340. raise KeyError("filename %r not found" % name)
  341. return tarinfo
  342.  
  343. def getmembers(self):
  344. """Return the members of the archive as a list of TarInfo objects. The
  345. list has the same order as the members in the archive.
  346. """
  347. self._check()
  348. if not self._loaded: # if we want to obtain a list of
  349. self._load() # all members, we first have to
  350. # scan the whole archive.
  351. return self.members
  352.  
  353. def getnames(self):
  354. """Return the members of the archive as a list of their names. It has
  355. the same order as the list returned by getmembers().
  356. """
  357. return [tarinfo.name for tarinfo in self.getmembers()]
  358.  
  359. def gettarinfo(self, name=None, arcname=None, fileobj=None):
  360. """Create a TarInfo object for either the file `name' or the file
  361. object `fileobj' (using os.fstat on its file descriptor). You can
  362. modify some of the TarInfo's attributes before you add it using
  363. addfile(). If given, `arcname' specifies an alternative name for the
  364. file in the archive.
  365. """
  366. self._check("aw")
  367.  
  368. # When fileobj is given, replace name by
  369. # fileobj's real name.
  370. if fileobj is not None:
  371. name = fileobj.name
  372.  
  373. # Building the name of the member in the archive.
  374. # Backward slashes are converted to forward slashes,
  375. # Absolute paths are turned to relative paths.
  376. if arcname is None:
  377. arcname = name
  378. drv, arcname = os.path.splitdrive(arcname)
  379. arcname = arcname.replace(os.sep, "/")
  380. arcname = arcname.lstrip("/")
  381.  
  382. # Now, fill the TarInfo object with
  383. # information specific for the file.
  384. tarinfo = self.tarinfo()
  385. tarinfo.tarfile = self
  386.  
  387. # Use os.stat or os.lstat, depending on platform
  388. # and if symlinks shall be resolved.
  389. if fileobj is None:
  390. if hasattr(os, "lstat") and not self.dereference:
  391. statres = os.lstat(name)
  392. else:
  393. statres = os.stat(name)
  394. else:
  395. statres = os.fstat(fileobj.fileno())
  396. linkname = ""
  397.  
  398. stmd = statres.st_mode
  399. if stat.S_ISREG(stmd):
  400. inode = (statres.st_ino, statres.st_dev)
  401. if not self.dereference and statres.st_nlink > 1 and \
  402. inode in self.inodes and arcname != self.inodes[inode]:
  403. # Is it a hardlink to an already
  404. # archived file?
  405. type = LNKTYPE
  406. linkname = self.inodes[inode]
  407. else:
  408. # The inode is added only if its valid.
  409. # For win32 it is always 0.
  410. type = REGTYPE
  411. if inode[0]:
  412. self.inodes[inode] = arcname
  413. elif stat.S_ISDIR(stmd):
  414. type = DIRTYPE
  415. elif stat.S_ISFIFO(stmd):
  416. type = FIFOTYPE
  417. elif stat.S_ISLNK(stmd):
  418. type = SYMTYPE
  419. linkname = os.readlink(name)
  420. elif stat.S_ISCHR(stmd):
  421. type = CHRTYPE
  422. elif stat.S_ISBLK(stmd):
  423. type = BLKTYPE
  424. else:
  425. return None
  426.  
  427. # Fill the TarInfo object with all
  428. # information we can get.
  429. tarinfo.name = arcname
  430. tarinfo.mode = stmd
  431. tarinfo.uid = statres.st_uid
  432. tarinfo.gid = statres.st_gid
  433. if type == REGTYPE:
  434. tarinfo.size = statres.st_size
  435. else:
  436. tarinfo.size = 0L
  437. tarinfo.mtime = statres.st_mtime
  438. tarinfo.type = type
  439. tarinfo.linkname = linkname
  440. if pwd:
  441. try:
  442. tarinfo.uname = pwd.getpwuid(tarinfo.uid)[0]
  443. except KeyError:
  444. pass
  445. if grp:
  446. try:
  447. tarinfo.gname = grp.getgrgid(tarinfo.gid)[0]
  448. except KeyError:
  449. pass
  450.  
  451. if type in (CHRTYPE, BLKTYPE):
  452. if hasattr(os, "major") and hasattr(os, "minor"):
  453. tarinfo.devmajor = os.major(statres.st_rdev)
  454. tarinfo.devminor = os.minor(statres.st_rdev)
  455. return tarinfo
  456.  
  457. def list(self, verbose=True):
  458. """Print a table of contents to sys.stdout. If `verbose' is False, only
  459. the names of the members are printed. If it is True, an `ls -l'-like
  460. output is produced.
  461. """
  462. self._check()
  463.  
  464. for tarinfo in self:
  465. if verbose:
  466. print filemode(tarinfo.mode),
  467. print "%s/%s" % (tarinfo.uname or tarinfo.uid,
  468. tarinfo.gname or tarinfo.gid),
  469. if tarinfo.ischr() or tarinfo.isblk():
  470. print "%10s" % ("%d,%d" \
  471. % (tarinfo.devmajor, tarinfo.devminor)),
  472. else:
  473. print "%10d" % tarinfo.size,
  474. print "%d-%02d-%02d %02d:%02d:%02d" \
  475. % time.localtime(tarinfo.mtime)[:6],
  476.  
  477. print tarinfo.name + ("/" if tarinfo.isdir() else ""),
  478.  
  479. if verbose:
  480. if tarinfo.issym():
  481. print "->", tarinfo.linkname,
  482. if tarinfo.islnk():
  483. print "link to", tarinfo.linkname,
  484. print
  485.  
  486. def add(self, name, arcname=None, recursive=True, exclude=None, filter=None):
  487. """Add the file `name' to the archive. `name' may be any type of file
  488. (directory, fifo, symbolic link, etc.). If given, `arcname'
  489. specifies an alternative name for the file in the archive.
  490. Directories are added recursively by default. This can be avoided by
  491. setting `recursive' to False. `exclude' is a function that should
  492. return True for each filename to be excluded. `filter' is a function
  493. that expects a TarInfo object argument and returns the changed
  494. TarInfo object, if it returns None the TarInfo object will be
  495. excluded from the archive.
  496. """
  497. self._check("aw")
  498.  
  499. if arcname is None:
  500. arcname = name
  501.  
  502. # Exclude pathnames.
  503. if exclude is not None:
  504. import warnings
  505. warnings.warn("use the filter argument instead",
  506. DeprecationWarning, 2)
  507. if exclude(name):
  508. self._dbg(2, "tarfile: Excluded %r" % name)
  509. return
  510.  
  511. # Skip if somebody tries to archive the archive...
  512. if self.name is not None and os.path.abspath(name) == self.name:
  513. self._dbg(2, "tarfile: Skipped %r" % name)
  514. return
  515.  
  516. self._dbg(1, name)
  517.  
  518. # Create a TarInfo object from the file.
  519. tarinfo = self.gettarinfo(name, arcname)
  520.  
  521. if tarinfo is None:
  522. self._dbg(1, "tarfile: Unsupported type %r" % name)
  523. return
  524.  
  525. # Change or exclude the TarInfo object.
  526. if filter is not None:
  527. tarinfo = filter(tarinfo)
  528. if tarinfo is None:
  529. self._dbg(2, "tarfile: Excluded %r" % name)
  530. return
  531.  
  532. # Append the tar header and data to the archive.
  533. if tarinfo.isreg():
  534. with bltn_open(name, "rb") as f:
  535. self.addfile(tarinfo, f)
  536.  
  537. elif tarinfo.isdir():
  538. self.addfile(tarinfo)
  539. if recursive:
  540. for f in os.listdir(name):
  541. self.add(os.path.join(name, f), os.path.join(arcname, f),
  542. recursive, exclude, filter)
  543.  
  544. else:
  545. self.addfile(tarinfo)
  546.  
  547. def addfile(self, tarinfo, fileobj=None):
  548. """Add the TarInfo object `tarinfo' to the archive. If `fileobj' is
  549. given, tarinfo.size bytes are read from it and added to the archive.
  550. You can create TarInfo objects using gettarinfo().
  551. On Windows platforms, `fileobj' should always be opened with mode
  552. 'rb' to avoid irritation about the file size.
  553. """
  554. self._check("aw")
  555.  
  556. tarinfo = copy.copy(tarinfo)
  557.  
  558. buf = tarinfo.tobuf(self.format, self.encoding, self.errors)
  559. self.fileobj.write(buf)
  560. self.offset += len(buf)
  561.  
  562. # If there's data to follow, append it.
  563. if fileobj is not None:
  564. copyfileobj(fileobj, self.fileobj, tarinfo.size)
  565. blocks, remainder = divmod(tarinfo.size, BLOCKSIZE)
  566. if remainder > 0:
  567. self.fileobj.write(NUL * (BLOCKSIZE - remainder))
  568. blocks += 1
  569. self.offset += blocks * BLOCKSIZE
  570.  
  571. self.members.append(tarinfo)
  572.  
  573. def extractall(self, path=".", members=None):
  574. """Extract all members from the archive to the current working
  575. directory and set owner, modification time and permissions on
  576. directories afterwards. `path' specifies a different directory
  577. to extract to. `members' is optional and must be a subset of the
  578. list returned by getmembers().
  579. """
  580. directories = []
  581.  
  582. if members is None:
  583. members = self
  584.  
  585. for tarinfo in members:
  586. if tarinfo.isdir():
  587. # Extract directories with a safe mode.
  588. directories.append(tarinfo)
  589. tarinfo = copy.copy(tarinfo)
  590. tarinfo.mode = 0700
  591. self.extract(tarinfo, path)
  592.  
  593. # Reverse sort directories.
  594. directories.sort(key=operator.attrgetter('name'))
  595. directories.reverse()
  596.  
  597. # Set correct owner, mtime and filemode on directories.
  598. for tarinfo in directories:
  599. dirpath = os.path.join(path, tarinfo.name)
  600. try:
  601. self.chown(tarinfo, dirpath)
  602. self.utime(tarinfo, dirpath)
  603. self.chmod(tarinfo, dirpath)
  604. except ExtractError, e:
  605. if self.errorlevel > 1:
  606. raise
  607. else:
  608. self._dbg(1, "tarfile: %s" % e)
  609.  
  610. def extract(self, member, path=""):
  611. """Extract a member from the archive to the current working directory,
  612. using its full name. Its file information is extracted as accurately
  613. as possible. `member' may be a filename or a TarInfo object. You can
  614. specify a different directory using `path'.
  615. """
  616. self._check("r")
  617.  
  618. if isinstance(member, basestring):
  619. tarinfo = self.getmember(member)
  620. else:
  621. tarinfo = member
  622.  
  623. # Prepare the link target for makelink().
  624. if tarinfo.islnk():
  625. tarinfo._link_target = os.path.join(path, tarinfo.linkname)
  626.  
  627. try:
  628. self._extract_member(tarinfo, os.path.join(path, tarinfo.name))
  629. except EnvironmentError, e:
  630. if self.errorlevel > 0:
  631. raise
  632. else:
  633. if e.filename is None:
  634. self._dbg(1, "tarfile: %s" % e.strerror)
  635. else:
  636. self._dbg(1, "tarfile: %s %r" % (e.strerror, e.filename))
  637. except ExtractError, e:
  638. if self.errorlevel > 1:
  639. raise
  640. else:
  641. self._dbg(1, "tarfile: %s" % e)
  642.  
  643. def extractfile(self, member):
  644. """Extract a member from the archive as a file object. `member' may be
  645. a filename or a TarInfo object. If `member' is a regular file, a
  646. file-like object is returned. If `member' is a link, a file-like
  647. object is constructed from the link's target. If `member' is none of
  648. the above, None is returned.
  649. The file-like object is read-only and provides the following
  650. methods: read(), readline(), readlines(), seek() and tell()
  651. """
  652. self._check("r")
  653.  
  654. if isinstance(member, basestring):
  655. tarinfo = self.getmember(member)
  656. else:
  657. tarinfo = member
  658.  
  659. if tarinfo.isreg():
  660. return self.fileobject(self, tarinfo)
  661.  
  662. elif tarinfo.type not in SUPPORTED_TYPES:
  663. # If a member's type is unknown, it is treated as a
  664. # regular file.
  665. return self.fileobject(self, tarinfo)
  666.  
  667. elif tarinfo.islnk() or tarinfo.issym():
  668. if isinstance(self.fileobj, _Stream):
  669. # A small but ugly workaround for the case that someone tries
  670. # to extract a (sym)link as a file-object from a non-seekable
  671. # stream of tar blocks.
  672. raise StreamError("cannot extract (sym)link as file object")
  673. else:
  674. # A (sym)link's file object is its target's file object.
  675. return self.extractfile(self._find_link_target(tarinfo))
  676. else:
  677. # If there's no data associated with the member (directory, chrdev,
  678. # blkdev, etc.), return None instead of a file object.
  679. return None
  680.  
  681. def _extract_member(self, tarinfo, targetpath):
  682. """Extract the TarInfo object tarinfo to a physical
  683. file called targetpath.
  684. """
  685. # Fetch the TarInfo object for the given name
  686. # and build the destination pathname, replacing
  687. # forward slashes to platform specific separators.
  688. targetpath = targetpath.rstrip("/")
  689. targetpath = targetpath.replace("/", os.sep)
  690.  
  691. # Create all upper directories.
  692. upperdirs = os.path.dirname(targetpath)
  693. if upperdirs and not os.path.exists(upperdirs):
  694. # Create directories that are not part of the archive with
  695. # default permissions.
  696. os.makedirs(upperdirs)
  697.  
  698. if tarinfo.islnk() or tarinfo.issym():
  699. self._dbg(1, "%s -> %s" % (tarinfo.name, tarinfo.linkname))
  700. else:
  701. self._dbg(1, tarinfo.name)
  702.  
  703. if tarinfo.isreg():
  704. self.makefile(tarinfo, targetpath)
  705. elif tarinfo.isdir():
  706. self.makedir(tarinfo, targetpath)
  707. elif tarinfo.isfifo():
  708. self.makefifo(tarinfo, targetpath)
  709. elif tarinfo.ischr() or tarinfo.isblk():
  710. self.makedev(tarinfo, targetpath)
  711. elif tarinfo.islnk() or tarinfo.issym():
  712. self.makelink(tarinfo, targetpath)
  713. elif tarinfo.type not in SUPPORTED_TYPES:
  714. self.makeunknown(tarinfo, targetpath)
  715. else:
  716. self.makefile(tarinfo, targetpath)
  717.  
  718. self.chown(tarinfo, targetpath)
  719. if not tarinfo.issym():
  720. self.chmod(tarinfo, targetpath)
  721. self.utime(tarinfo, targetpath)
  722.  
  723. #--------------------------------------------------------------------------
  724. # Below are the different file methods. They are called via
  725. # _extract_member() when extract() is called. They can be replaced in a
  726. # subclass to implement other functionality.
  727.  
  728. def makedir(self, tarinfo, targetpath):
  729. """Make a directory called targetpath.
  730. """
  731. try:
  732. # Use a safe mode for the directory, the real mode is set
  733. # later in _extract_member().
  734. os.mkdir(targetpath, 0700)
  735. except EnvironmentError, e:
  736. if e.errno != errno.EEXIST:
  737. raise
  738.  
  739. def makefile(self, tarinfo, targetpath):
  740. """Make a file called targetpath.
  741. """
  742. source = self.extractfile(tarinfo)
  743. try:
  744. with bltn_open(targetpath, "wb") as target:
  745. copyfileobj(source, target)
  746. finally:
  747. source.close()
  748.  
  749. def makeunknown(self, tarinfo, targetpath):
  750. """Make a file from a TarInfo object with an unknown type
  751. at targetpath.
  752. """
  753. self.makefile(tarinfo, targetpath)
  754. self._dbg(1, "tarfile: Unknown file type %r, " \
  755. "extracted as regular file." % tarinfo.type)
  756.  
  757. def makefifo(self, tarinfo, targetpath):
  758. """Make a fifo called targetpath.
  759. """
  760. if hasattr(os, "mkfifo"):
  761. os.mkfifo(targetpath)
  762. else:
  763. raise ExtractError("fifo not supported by system")
  764.  
  765. def makedev(self, tarinfo, targetpath):
  766. """Make a character or block device called targetpath.
  767. """
  768. if not hasattr(os, "mknod") or not hasattr(os, "makedev"):
  769. raise ExtractError("special devices not supported by system")
  770.  
  771. mode = tarinfo.mode
  772. if tarinfo.isblk():
  773. mode |= stat.S_IFBLK
  774. else:
  775. mode |= stat.S_IFCHR
  776.  
  777. os.mknod(targetpath, mode,
  778. os.makedev(tarinfo.devmajor, tarinfo.devminor))
  779.  
  780. def makelink(self, tarinfo, targetpath):
  781. """Make a (symbolic) link called targetpath. If it cannot be created
  782. (platform limitation), we try to make a copy of the referenced file
  783. instead of a link.
  784. """
  785. if hasattr(os, "symlink") and hasattr(os, "link"):
  786. # For systems that support symbolic and hard links.
  787. if tarinfo.issym():
  788. if os.path.lexists(targetpath):
  789. os.unlink(targetpath)
  790. os.symlink(tarinfo.linkname, targetpath)
  791. else:
  792. # See extract().
  793. if os.path.exists(tarinfo._link_target):
  794. if os.path.lexists(targetpath):
  795. os.unlink(targetpath)
  796. os.link(tarinfo._link_target, targetpath)
  797. else:
  798. self._extract_member(self._find_link_target(tarinfo), targetpath)
  799. else:
  800. try:
  801. self._extract_member(self._find_link_target(tarinfo), targetpath)
  802. except KeyError:
  803. raise ExtractError("unable to resolve link inside archive")
  804.  
  805. def chown(self, tarinfo, targetpath):
  806. """Set owner of targetpath according to tarinfo.
  807. """
  808. if pwd and hasattr(os, "geteuid") and os.geteuid() == 0:
  809. # We have to be root to do so.
  810. try:
  811. g = grp.getgrnam(tarinfo.gname)[2]
  812. except KeyError:
  813. g = tarinfo.gid
  814. try:
  815. u = pwd.getpwnam(tarinfo.uname)[2]
  816. except KeyError:
  817. u = tarinfo.uid
  818. try:
  819. if tarinfo.issym() and hasattr(os, "lchown"):
  820. os.lchown(targetpath, u, g)
  821. else:
  822. if sys.platform != "os2emx":
  823. os.chown(targetpath, u, g)
  824. except EnvironmentError, e:
  825. raise ExtractError("could not change owner")
  826.  
  827. def chmod(self, tarinfo, targetpath):
  828. """Set file permissions of targetpath according to tarinfo.
  829. """
  830. if hasattr(os, 'chmod'):
  831. try:
  832. os.chmod(targetpath, tarinfo.mode)
  833. except EnvironmentError, e:
  834. raise ExtractError("could not change mode")
  835.  
  836. def utime(self, tarinfo, targetpath):
  837. """Set modification time of targetpath according to tarinfo.
  838. """
  839. if not hasattr(os, 'utime'):
  840. return
  841. try:
  842. os.utime(targetpath, (tarinfo.mtime, tarinfo.mtime))
  843. except EnvironmentError, e:
  844. raise ExtractError("could not change modification time")
  845.  
  846. #--------------------------------------------------------------------------
  847. def next(self):
  848. """Return the next member of the archive as a TarInfo object, when
  849. TarFile is opened for reading. Return None if there is no more
  850. available.
  851. """
  852. self._check("ra")
  853. if self.firstmember is not None:
  854. m = self.firstmember
  855. self.firstmember = None
  856. return m
  857.  
  858. # Read the next block.
  859. self.fileobj.seek(self.offset)
  860. tarinfo = None
  861. while True:
  862. try:
  863. tarinfo = self.tarinfo.fromtarfile(self)
  864. except EOFHeaderError, e:
  865. if self.ignore_zeros:
  866. self._dbg(2, "0x%X: %s" % (self.offset, e))
  867. self.offset += BLOCKSIZE
  868. continue
  869. except InvalidHeaderError, e:
  870. if self.ignore_zeros:
  871. self._dbg(2, "0x%X: %s" % (self.offset, e))
  872. self.offset += BLOCKSIZE
  873. continue
  874. elif self.offset == 0:
  875. raise ReadError(str(e))
  876. except EmptyHeaderError:
  877. if self.offset == 0:
  878. raise ReadError("empty file")
  879. except TruncatedHeaderError, e:
  880. if self.offset == 0:
  881. raise ReadError(str(e))
  882. except SubsequentHeaderError, e:
  883. raise ReadError(str(e))
  884. break
  885.  
  886. if tarinfo is not None:
  887. self.members.append(tarinfo)
  888. else:
  889. self._loaded = True
  890.  
  891. return tarinfo
  892.  
  893. #--------------------------------------------------------------------------
  894. # Little helper methods:
  895.  
  896. def _getmember(self, name, tarinfo=None, normalize=False):
  897. """Find an archive member by name from bottom to top.
  898. If tarinfo is given, it is used as the starting point.
  899. """
  900. # Ensure that all members have been loaded.
  901. members = self.getmembers()
  902.  
  903. # Limit the member search list up to tarinfo.
  904. if tarinfo is not None:
  905. members = members[:members.index(tarinfo)]
  906.  
  907. if normalize:
  908. name = os.path.normpath(name)
  909.  
  910. for member in reversed(members):
  911. if normalize:
  912. member_name = os.path.normpath(member.name)
  913. else:
  914. member_name = member.name
  915.  
  916. if name == member_name:
  917. return member
  918.  
  919. def _load(self):
  920. """Read through the entire archive file and look for readable
  921. members.
  922. """
  923. while True:
  924. tarinfo = self.next()
  925. if tarinfo is None:
  926. break
  927. self._loaded = True
  928.  
  929. def _check(self, mode=None):
  930. """Check if TarFile is still open, and if the operation's mode
  931. corresponds to TarFile's mode.
  932. """
  933. if self.closed:
  934. raise IOError("%s is closed" % self.__class__.__name__)
  935. if mode is not None and self.mode not in mode:
  936. raise IOError("bad operation for mode %r" % self.mode)
  937.  
  938. def _find_link_target(self, tarinfo):
  939. """Find the target member of a symlink or hardlink member in the
  940. archive.
  941. """
  942. if tarinfo.issym():
  943. # Always search the entire archive.
  944. linkname = "/".join(filter(None, (os.path.dirname(tarinfo.name), tarinfo.linkname)))
  945. limit = None
  946. else:
  947. # Search the archive before the link, because a hard link is
  948. # just a reference to an already archived file.
  949. linkname = tarinfo.linkname
  950. limit = tarinfo
  951.  
  952. member = self._getmember(linkname, tarinfo=limit, normalize=True)
  953. if member is None:
  954. raise KeyError("linkname %r not found" % linkname)
  955. return member
  956.  
  957. def __iter__(self):
  958. """Provide an iterator object.
  959. """
  960. if self._loaded:
  961. return iter(self.members)
  962. else:
  963. return TarIter(self)
  964.  
  965. def _dbg(self, level, msg):
  966. """Write debugging output to sys.stderr.
  967. """
  968. if level <= self.debug:
  969. print >> sys.stderr, msg
  970.  
  971. def __enter__(self):
  972. self._check()
  973. return self
  974.  
  975. def __exit__(self, type, value, traceback):
  976. if type is None:
  977. self.close()
  978. else:
  979. # An exception occurred. We must not call close() because
  980. # it would try to write end-of-archive blocks and padding.
  981. if not self._extfileobj:
  982. self.fileobj.close()
  983. self.closed = True
  984. # class TarFile
  985.  
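The listing above is an excerpt of the standard library tarfile source; in day-to-day code you only need its high-level API. A minimal sketch (the file names here are made up) that packs a file into a gzip-compressed archive and reads it back:

```python
import os
import tarfile
import tempfile

# Write a small file, add it to a .tar.gz archive, then list the archive.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "hello.txt")
with open(src, "w") as f:
    f.write("hello tar")

archive = os.path.join(workdir, "demo.tar.gz")
with tarfile.open(archive, "w:gz") as tar:   # "gz" is looked up in OPEN_METH -> gzopen()
    tar.add(src, arcname="hello.txt")

with tarfile.open(archive, "r:gz") as tar:
    names = tar.getnames()

print(names)   # ['hello.txt']
```

The `mode` string `"w:gz"` is exactly what the `open()` classmethod above parses: the part before the colon is the file mode, the part after selects the compression handler from `OPEN_METH`.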

7. The logging module

A module for recording logs conveniently and in a thread-safe way.

  1. #!/usr/bin/env python
  2. #-*- coding:utf-8 -*-
  3. import logging
  4.  
  5. def log_models(logname, infos):
  6. logger = logging.getLogger(logname) # get (or create) the logger with this name
  7. logger.setLevel(logging.DEBUG) # set the logger's overall level
  8.  
  9. ch = logging.StreamHandler() # handler for screen output
  10. ch.setLevel(logging.DEBUG) # screen log level
  11.  
  12. fh = logging.FileHandler('log.txt') # file the log is saved to
  13. fh.setLevel(logging.DEBUG) # file log level
  14.  
  15. formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
  16.  
  17. ch.setFormatter(formatter) # apply the custom format to screen output
  18. fh.setFormatter(formatter) # apply the custom format to file output
  19.  
  20. logger.addHandler(ch) # attach the screen handler to the logger
  21. logger.addHandler(fh) # attach the file handler to the logger
  22.  
  23. logger.debug(infos) # emit a DEBUG record

The log levels:

  1. CRITICAL = 50
  2. FATAL = CRITICAL
  3. ERROR = 40
  4. WARNING = 30
  5. WARN = WARNING
  6. INFO = 20
  7. DEBUG = 10
  8. NOTSET = 0

Only records whose level is greater than or equal to the current log level are recorded.
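To see the filtering in action, here is a small sketch (the logger name and messages are made up) that sets the level to INFO and shows that a DEBUG record is dropped while INFO and ERROR records pass:

```python
import logging
from io import StringIO

# Collect log output in memory so the filtering is easy to inspect.
buf = StringIO()
logger = logging.getLogger("level-demo")
logger.setLevel(logging.INFO)            # INFO = 20: DEBUG (10) is below the bar
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
logger.addHandler(handler)

logger.debug("too low, dropped")         # 10 < 20: not recorded
logger.info("recorded")                  # 20 >= 20: recorded
logger.error("also recorded")            # 40 >= 20: recorded

print(buf.getvalue())
```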

Custom modules

A custom module is usually something you write yourself; it can be a standalone .py file or a folder.

However, an ordinary folder is not a module: only a directory containing an __init__.py file is treated as an importable package.

UlZdcUZeYZgoL30wGtZ7nJFocyPT1jsvTxfngVDmyepa9UKFVT09NINX6B3zNqSS8CXrL3Ia8P+vAaKgTIPQN0dHVfzZTAAHnGGZB6VndWdmNKPrAC4lAa+zHdFUN6EeQZZ8DpWa1ap3SuS2R1VBzK+baO9CLIM86AyrOlpaWaOkWN+lKttuOs6b6ib/L8+BqIQ2nRX+1rP6VQyOhFSEsO0qlQxqTkUpFQb313Jt+5nHxjPM0zUIT+uviM9WdMywL9+5eh8mx+fv5U93TGOBSLRZOIiZ6ufEYvknwPPPXKdyptSsZcKlLTHKmRuNXlT+a7SHomNmJEWu6BrPXnTMsCJVB5NjMzmzkO5ftNx6AcxKHQi6SMeikfs+VSIbekeYZhGXNDZas/d1oWKIHKs7t3J8g4lC3VtR5950V9Z6u+s9Xer3izdRTEodCLZPUgUy6VHJ4ROO7PlEglh8c50rJACVSeDY/cAXEoZ/U3/l47nDEOhV4kaQl5pqm0KRlzqeTqz0g1aRdqNAvT6pekpWWBHqg8M/fdPN0zK7GtNeu/zRiH8nwxU3+GYfTxkVqfkkslt2cSMgMQQWA9Le9fvaXVD6ioND9QeXa9o1PXZmhqu9Hb2QTiUP4Ifr49e2xj5gu76WR/1ymlUsb2YdDJPetRUWlZoPLs8pW2louXvrmgNbY3UHEoMnn9S+8hc/tp7g8ph2eVlpYFHs/oNChkTz3VVBzKb1h1g4LTngyQzbMKTMsCp2cIoYE8Q3AB8gzBBcgzBBcgzxBcAI9n2eID0BPbQgB+z1AcihCAzbP0lSgORQjk9ex/Ruh87VX93xQAAAAASUVORK5CYII=" alt="" />

Example:

I write a login module myself; in other programs I can then call it simply with the module name (the Python file name) plus the function inside it.
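As a sketch of that idea (the module and function names here are invented): write a login.py to disk, put its directory on sys.path, and call it as module name plus function name:

```python
import os
import sys
import tempfile

# Create a tiny "login" module on disk.
moddir = tempfile.mkdtemp()
with open(os.path.join(moddir, "login.py"), "w") as f:
    f.write("def check(user, pwd):\n"
            "    return user == 'admin' and pwd == '123'\n")

sys.path.insert(0, moddir)   # directories on sys.path are searched by import

import login                 # the module name is simply the file name

print(login.check("admin", "123"))   # True
print(login.check("guest", "123"))   # False
```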

Open-source modules

I. Download and install

There are two ways to download and install:

Via a package manager:

  1. yum
  2. pip
  3. apt-get
  4. ...

From source:

  1. Download the source
  2. Unpack it
  3. Enter the directory
  4. Build it: python setup.py build
  5. Install it: python setup.py install

Note: installing from source requires gcc and the Python development headers:

  1. yum install gcc
  2. yum install python-devel

  3. apt-get install python-dev
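The build/install commands above assume the project ships a setup.py. A minimal sketch of such a file, shown only as a config fragment (the project name and version here are made up):

```python
# setup.py - minimal packaging script for a project that ships one module.
from distutils.core import setup

setup(
    name="mymodule",            # hypothetical project name
    version="0.1",
    py_modules=["mymodule"],    # the single mymodule.py file to install
)
```

With a file like this in place, `python setup.py build` and `python setup.py install` work as described in the steps above.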

II. Importing the module

Same as importing a custom module.

III. The paramiko module

1. Installation

On Linux:

  1. yum -y install pip
  2. pip install paramiko

On Windows:

  1. 1. Download pip:
  2. https://pypi.python.org/pypi/pip#downloads
  3. 2. Unpack the pip package, then install it:
  4. python setup.py install
  5. 3. Add the scripts directory to the PATH environment variable:
  6. C:\Python34\Scripts; (adjust to your own Python path)
  7. 4. Install paramiko:
  8. pip install paramiko
  1. #!/usr/bin/env python
  2. #coding:utf-8
  3.  
  4. import paramiko
  5.  
  6. ssh = paramiko.SSHClient()
  7. ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
  8. ssh.connect('192.168.1.108', 22, 'shuaige', '')
  9. stdin, stdout, stderr = ssh.exec_command('df')
  10. print stdout.read()
  11. ssh.close()

Executing commands - via username and password

  1. import paramiko
  2.  
  3. private_key_path = '/home/auto/.ssh/id_rsa'
  4. key = paramiko.RSAKey.from_private_key_file(private_key_path)
  5.  
  6. ssh = paramiko.SSHClient()
  7. ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
  8. ssh.connect('hostname', 22, username='username', pkey=key)
  9.  
  10. stdin, stdout, stderr = ssh.exec_command('df')
  11. print stdout.read()
  12. ssh.close()

Executing commands - connecting to the server via key

  1. import os,sys
  2. import paramiko
  3.  
  4. t = paramiko.Transport(('182.92.219.86',22))
  5. t.connect(username='shuaige',password='')
  6. sftp = paramiko.SFTPClient.from_transport(t)
  7. sftp.put('/tmp/test.py','/tmp/test.py')
  8. t.close()
  9.  
  10. import os,sys
  11. import paramiko
  12.  
  13. t = paramiko.Transport(('182.92.219.86',22))
  14. t.connect(username='shuaige',password='')
  15. sftp = paramiko.SFTPClient.from_transport(t)
  16. sftp.get('/tmp/test.py','/tmp/test2.py')
  17. t.close()

Upload and download - via username and password

  1. import paramiko
  2.  
  3. private_key_path = '/home/auto/.ssh/id_rsa'
  4. key = paramiko.RSAKey.from_private_key_file(private_key_path)
  5.  
  6. t = paramiko.Transport(('182.92.219.86',22))
  7. t.connect(username='shuaige',pkey=key)
  8.  
  9. sftp = paramiko.SFTPClient.from_transport(t)
  10. sftp.put('/tmp/test3.py','/tmp/test3.py')
  11.  
  12. t.close()
  13.  
  14. import paramiko
  15.  
  16. private_key_path = '/home/auto/.ssh/id_rsa'
  17. key = paramiko.RSAKey.from_private_key_file(private_key_path)
  18.  
  19. t = paramiko.Transport(('182.92.219.86',22))
  20. t.connect(username='shuaige',pkey=key)
  21.  
  22. sftp = paramiko.SFTPClient.from_transport(t)
  23. sftp.get('/tmp/test3.py','/tmp/test4.py')
  24.  
  25. t.close()

Upload and download - via key

For more, see: http://www.cnblogs.com/wupeiqi/


  8. Django TemplateSyntaxError Could not parse the remainder: '()'

    返回的数据是列表集合,如 n [5]: a = set() In [6]: a.add((1, 3)) In [7]: a Out[7]: {(1, 3)} 在模板中使用方式如下: {% for ar ...

  9. PhyLab2.0设计分析阶段任务大纲(α)

    任务概述 由于接手软剑攻城队的PhyLab项目,省去了用户需求分析.团队编码规范.用户界面原型设计和后端逻辑设计的大部分环节,因此前期的主要任务落在了用户使用反馈.功能优化增改方向.用户体验优化以及源 ...

  10. 【项目】UICollectionViewFlowlayout再一次自定义

    项目中好友列表需要使用UICollection完成,加入了长按点击颤抖删除按钮