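0x1. Introduction

In on-site forensics, the need to analyze packet captures by hand comes up fairly rarely. Traffic-analysis appliances work by capturing all the data and parsing it, so much of what an analyst could do manually has already been handed over to the machine.

Tools such as Colasoft (科来) and Wireshark are both very capable for PCAP analysis. While looking for code that parses pcap files programmatically, I found that many people have already written Python scripts to assist with parsing, and that there are also frameworks that extract the information in a pcap and present it through a UI.

This article is a learning summary of using Python's Scapy library to extract protocol five-tuple information. It has not been used in real casework, because in practice I found that reading, unpacking, and searching a PCAP this way is too slow.

0x2. Reference libraries

Common Python libraries for parsing pcap files include Scapy, dpkt, and Pyshark. Source code and tools worth consulting: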

  1. https://github.com/thepacketgeek/cloud-pcap
  2. https://github.com/madpowah/ForensicPCAP
  3. https://github.com/le4f/pcap-analyzer
  4. https://github.com/HatBoy/Pcap-Analyzer
  5. https://github.com/caesar0301/awesome-pcaptools
  6. https://asecuritysite.com/forensics/pcap
  7. https://github.com/DanMcInerney/net-creds

Websites that offer pcap parsing as a service: https://packettotal.com/, https://www.capanalysis.net/ca/, https://asecuritysite.com/forensics/pcap?infile=smtp.pcap&infile=smtp.pcap, https://app.any.run

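0x3. Mail protocols

To extract a sender's mailbox address, you first need to recognize the ports that the mail protocols use.

  1. Port 25 is opened for the SMTP (Simple Mail Transfer Protocol) service and is used for sending mail.
  2. Port 110 is opened for the POP3 (Post Office Protocol version 3) service. POP2 and POP3 are both used mainly for receiving mail. POP3 is the more widely used today, and many servers support both POP2 and POP3.
  3. Port 143 is used mainly by the Internet Message Access Protocol (IMAP). Like POP3, it is a protocol for receiving e-mail.
  4. Port 465 (SMTPS) is opened for SMTP-over-SSL, a variant of SMTP layered on top of SSL/TLS. The content is protected from the very start of the connection, so the plaintext cannot be seen in a capture. It inherits the security of SSL's asymmetric encryption and helps prevent mail from leaking.
  5. Port 587 uses STARTTLS. This is also TLS, but only the traffic sent after the STARTTLS command has been executed is protected.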
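The port list above can be turned into a small lookup used when triaging TCP flows. This is a minimal Python 3 sketch. The names `MAIL_PORTS` and `classify_mail_port` are hypothetical and do not come from any of the tools discussed here.

```python
# Hypothetical helper: label a flow by the well-known mail port it uses.
MAIL_PORTS = {
    25: "SMTP",
    110: "POP3",
    143: "IMAP",
    465: "SMTPS (implicit TLS)",
    587: "SMTP submission (STARTTLS)",
}


def classify_mail_port(sport, dport):
    """Return a protocol label if either endpoint is a known mail port, else None."""
    # Check the destination first: that is normally the server side.
    for port in (dport, sport):
        if port in MAIL_PORTS:
            return MAIL_PORTS[port]
    return None
```

A port match is only a hint. Mail servers can listen on non-standard ports, so the payload still needs to be checked.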

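Common SMTP command syntax

In Wireshark, SMTP protocol data appears once the TCP three-way handshake has completed. The commonly used commands are:

  1. SMTP command verbs are case-insensitive, but their arguments may be case-sensitive. See RFC 821 (since superseded by RFC 5321) for details.
  2. HELO <domain> <CRLF>. Identifies the client to the server. The sender can lie about its identity, but in general the server can detect this.
  3. MAIL FROM:<reverse-path> <CRLF>. <reverse-path> is the sender's address. This command starts a mail transaction, resetting all state and buffers.
  4. RCPT TO:<forward-path> <CRLF>. <forward-path> identifies the address of a recipient. It follows MAIL FROM and may be repeated for multiple recipients.
  5. DATA <CRLF>. Everything that follows is sent as message data, terminated by <CRLF>.<CRLF>.
  6. RSET <CRLF>. Resets the session. The current transaction is cancelled.
  7. NOOP <CRLF>. Asks the server to return an OK reply. Generally used for testing.
  8. QUIT <CRLF>. Ends the session.
  9. VRFY <string> <CRLF>. Verifies whether the given mailbox exists. For security reasons, most servers disable this command.
  10. EXPN <string> <CRLF>. Asks the server to confirm that the argument names a mailing list and, if so, to return its members. For security reasons, most servers disable this command.
  11. HELP <CRLF>. Asks which commands the server supports.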
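To show how the MAIL FROM and RCPT TO commands listed above lead to the sender and recipient addresses, here is a minimal Python 3 sketch. It assumes the client side of the session has already been reassembled into text lines. `extract_envelope` and the sample session are hypothetical.

```python
import re

# Command verbs are case-insensitive, so match them with IGNORECASE.
_ENVELOPE_RE = re.compile(r"^(MAIL FROM|RCPT TO):\s*<([^>]*)>", re.IGNORECASE)


def extract_envelope(client_lines):
    """Return (sender, [recipients]) found in a list of SMTP client lines."""
    sender = None
    recipients = []
    for line in client_lines:
        match = _ENVELOPE_RE.match(line.strip())
        if not match:
            continue
        if match.group(1).upper() == "MAIL FROM":
            sender = match.group(2)
        else:
            recipients.append(match.group(2))
    return sender, recipients


# Made-up session, for illustration only.
session = [
    "HELO client.example.org",
    "mail from:<alice@example.org>",
    "RCPT TO:<bob@example.com>",
    "RCPT TO: <carol@example.com>",
    "DATA",
]
sender, recipients = extract_envelope(session)
```

This only covers cleartext SMTP. On port 465, or after STARTTLS on port 587, the commands are encrypted and cannot be read from the capture.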

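Code

forensicPCAP parses a PCAP file and then runs an interactive loop built on a Cmd-style module. Typing a command invokes the corresponding function. The core idea is:

Using Scapy, find the packets whose source or destination port is 110 or 143 and save them as mail data. Note that these are the POP3 and IMAP ports, so this captures mail retrieval, not SMTP on port 25.

The key code is as follows:

self.pcap is loaded ahead of time in the constructor with self.pcap = rdpcap(namefile) and is then iterated with enumerate(). Packets whose destination or source port is 110 or 143 are selected.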

    def do_mail(self, arg, opts=None):
        """Print the number of mail's requests and store its
        Usage :
        - mail"""
        sys.stdout.write(bcolors.TXT + "## Searching mail's request ... ")
        sys.stdout.flush()
        con = []
        mailpkts = []
        for i,packet in enumerate(self.pcap):
            # TCP packets only
            if TCP in packet:
                # Keep packets whose source or destination port is 110 or 143
                if packet.getlayer('TCP').dport == 110 or packet.getlayer('TCP').sport == 110 or packet.getlayer('TCP').dport == 143 or packet.getlayer('TCP').sport == 143 :
                    if packet.getlayer('TCP').flags == 2:
                        con.append(i)
                    mailpkts.append(packet)
        sys.stdout.write("OK.\n")
        print "## Result : Mail's request : " + str(len(con))
        sys.stdout.write(bcolors.TXT + "## Saving mails ... ")
        sys.stdout.flush()
        res = ""
        for packet in mailpkts:
            if packet.getlayer('TCP').flags == 24:
                res = res + packet.getlayer('Raw').load
                sys.stdout.write(".")
                sys.stdout.flush()
        sys.stdout.write("OK\n")
        sys.stdout.flush()
        self.cmd = "mail"
        self.last = res
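The magic numbers 2 and 24 in the code above are TCP flag values: 2 is a bare SYN (a new connection request) and 24 is PSH+ACK (a segment carrying data). A minimal Python 3 sketch that decodes a flag value makes this explicit. `decode_tcp_flags` is a hypothetical helper, not part of forensicPCAP or Scapy.

```python
# Bit positions of the TCP flags, from least significant bit upward.
TCP_FLAG_BITS = [
    (0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"), (0x08, "PSH"),
    (0x10, "ACK"), (0x20, "URG"), (0x40, "ECE"), (0x80, "CWR"),
]


def decode_tcp_flags(value):
    """Return the names of the flags set in a TCP flags byte."""
    return [name for bit, name in TCP_FLAG_BITS if value & bit]


syn_only = decode_tcp_flags(2)    # connection request counted by do_mail
push_ack = decode_tcp_flags(24)   # data segment whose payload is saved
```

Because the original code tests for exact equality, a data segment with any extra flag set (for example FIN+PSH+ACK) would be skipped. A bitmask test such as `flags & 0x08` would be a looser alternative.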

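0x4. DNS resolution

This is the DNS-parsing part of forensicPCAP. bcolors.TXT sets the color used to display the text, and res = packet.getlayer('DNS').qd.qname retrieves the domain name being queried in the DNS record.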

    ############# do_dns() ###########
    def do_dns(self, arg, opts=None):
        """Print all DNS requests in the PCAP file
        Usage :
        - dns"""
        sys.stdout.write(bcolors.TXT + "## Listing all DNS requests ...")
        sys.stdout.flush()
        dns = []
        dns.append([])
        # Enumerate the packets in the PCAP
        for i,packet in enumerate(self.pcap):
            if DNS in packet:
                # Get the queried domain name
                res = packet.getlayer('DNS').qd.qname
                if res[len(res) - 1] == '.':
                    res = res[:-1]
                # Save the domain name
                dns.append([i, res])
        sys.stdout.write("OK.\n")
        # Count the DNS requests
        print bcolors.TXT + "## Result : " + str(len(dns) - 1) + " DNS request(s)" + bcolors.ENDC
        self.last = dns
        self.cmd = "dns"
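The trailing-dot check above is written for Python 2 strings. In Python 3, indexing a `bytes` object yields an integer, so `res[len(res) - 1] == '.'` would never be true if the name arrives as bytes (which, as far as I know, is what recent Scapy releases return under Python 3). Here is a minimal sketch of a normalizer that accepts either type. `normalize_qname` is a hypothetical helper.

```python
def normalize_qname(qname):
    """Return a DNS query name as text, without the trailing root dot."""
    if isinstance(qname, bytes):
        # DNS names are normally ASCII; replace anything unexpected.
        qname = qname.decode("ascii", errors="replace")
    return qname[:-1] if qname.endswith(".") else qname
```

Using `endswith` also avoids the IndexError that the index-based check would raise on an empty name.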

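The core code of pcap-analyzer borrows from forensicPCAP. When it displays the DNS requests it has parsed, it reports the ten most frequently queried names. The values counted are the qname domain names, not IP addresses.

Code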

    def get_dns(file):
        dns = []
        # Open the PCAP file
        pcap = rdpcap(UPLOAD_FOLDER+file)
        for packet in pcap:
            if DNS in packet:
                # Core code
                res = packet.getlayer('DNS').qd.qname
                if res[len(res) - 1] == '.':
                    res = res[:-1]
                dns.append(res)
        # Count the DNS queries and keep the ten most frequent names
        dns = Counter(dns).most_common(10)
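The ranking step relies on `collections.Counter.most_common`, which returns `(value, count)` pairs sorted by descending count. Here is a small Python 3 illustration on made-up query names. `top_domains` is a hypothetical wrapper.

```python
from collections import Counter


def top_domains(names, n=10):
    """Return the n most frequent names as (name, count) pairs."""
    return Counter(names).most_common(n)


# Made-up query names, for illustration only.
queries = ["a.example", "b.example", "a.example",
           "c.example", "a.example", "b.example"]
ranking = top_domains(queries, 2)
```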

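0x5. Extracting password information

net-creds is a Scapy-based script that parses a PCAP for credentials.

Code

The main logic lives in other_parser(). It splits out the HTTP (and other) content of each packet, then searches for authentication-related keywords to pick out account names and passwords.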

    def other_parser(src_ip_port, dst_ip_port, full_load, ack, seq, pkt, verbose):
        '''
        Pull out pertinent info from the parsed HTTP packet data
        '''
        user_passwd = None
        http_url_req = None
        method = None
        http_methods = ['GET ', 'POST ', 'CONNECT ', 'TRACE ', 'TRACK ', 'PUT ', 'DELETE ', 'HEAD ']
        http_line, header_lines, body = parse_http_load(full_load, http_methods)
        headers = headers_to_dict(header_lines)
        if 'host' in headers:
            host = headers['host']
        else:
            host = ''

        if http_line != None:
            method, path = parse_http_line(http_line, http_methods)
            http_url_req = get_http_url(method, host, path, headers)
            if http_url_req != None:
                if verbose == False:
                    if len(http_url_req) > 98:
                        http_url_req = http_url_req[:99] + '...'
                printer(src_ip_port, None, http_url_req)

            # Print search terms
            searched = get_http_searches(http_url_req, body, host)
            if searched:
                printer(src_ip_port, dst_ip_port, searched)

            # Print user/pwds
            if body != '':
                user_passwd = get_login_pass(body)
                if user_passwd != None:
                    try:
                        http_user = user_passwd[0].decode('utf8')
                        http_pass = user_passwd[1].decode('utf8')
                        # Set a limit on how long they can be prevent false+
                        if len(http_user) > 75 or len(http_pass) > 75:
                            return
                        user_msg = 'HTTP username: %s' % http_user
                        printer(src_ip_port, dst_ip_port, user_msg)
                        pass_msg = 'HTTP password: %s' % http_pass
                        printer(src_ip_port, dst_ip_port, pass_msg)
                    except UnicodeDecodeError:
                        pass

            # Print POST loads
            # ocsp is a common SSL post load that's never interesting
            if method == 'POST' and 'ocsp.' not in host:
                try:
                    if verbose == False and len(body) > 99:
                        # If it can't decode to utf8 we're probably not interested in it
                        msg = 'POST load: %s...' % body[:99].encode('utf8')
                    else:
                        msg = 'POST load: %s' % body.encode('utf8')
                    printer(src_ip_port, None, msg)
                except UnicodeDecodeError:
                    pass

        # Kerberos over TCP
        decoded = Decode_Ip_Packet(str(pkt)[14:])
        kerb_hash = ParseMSKerbv5TCP(decoded['data'][20:])
        if kerb_hash:
            printer(src_ip_port, dst_ip_port, kerb_hash)

        # Non-NETNTLM NTLM hashes (MSSQL, DCE-RPC,SMBv1/2,LDAP, MSSQL)
        NTLMSSP2 = re.search(NTLMSSP2_re, full_load, re.DOTALL)
        NTLMSSP3 = re.search(NTLMSSP3_re, full_load, re.DOTALL)
        if NTLMSSP2:
            parse_ntlm_chal(NTLMSSP2.group(), ack)
        if NTLMSSP3:
            ntlm_resp_found = parse_ntlm_resp(NTLMSSP3.group(), seq)
            if ntlm_resp_found != None:
                printer(src_ip_port, dst_ip_port, ntlm_resp_found)

        # Look for authentication headers
        if len(headers) == 0:
            authenticate_header = None
            authorization_header = None
        for header in headers:
            authenticate_header = re.match(authenticate_re, header)
            authorization_header = re.match(authorization_re, header)
            if authenticate_header or authorization_header:
                break

        if authorization_header or authenticate_header:
            # NETNTLM
            netntlm_found = parse_netntlm(authenticate_header, authorization_header, headers, ack, seq)
            if netntlm_found != None:
                printer(src_ip_port, dst_ip_port, netntlm_found)

            # Basic Auth
            parse_basic_auth(src_ip_port, dst_ip_port, headers, authorization_header)
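The last step above hands the Authorization header to net-creds' `parse_basic_auth`, whose body is not shown here. To illustrate what recovering HTTP Basic credentials involves, here is a minimal Python 3 sketch. `decode_basic_auth` is a hypothetical stand-in, not the net-creds implementation.

```python
import base64


def decode_basic_auth(header_value):
    """Decode an 'Authorization: Basic <base64>' value into (user, password).

    Returns None when the scheme is not Basic or the token is malformed.
    """
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.lower() != "basic" or not token:
        return None
    try:
        decoded = base64.b64decode(token.strip(), validate=True).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        # binascii.Error is a ValueError subclass: the token was not valid base64.
        return None
    user, sep, password = decoded.partition(":")
    return (user, password) if sep else None
```

Basic authentication is only base64-encoded, not encrypted, which is why it can be recovered from any cleartext HTTP capture.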

The keyword patterns (regular expressions) it matches against:

    # Regexs
    authenticate_re = '(www-|proxy-)?authenticate'
    authorization_re = '(www-|proxy-)?authorization'
    ftp_user_re = r'USER (.+)\r\n'
    ftp_pw_re = r'PASS (.+)\r\n'
    irc_user_re = r'NICK (.+?)((\r)?\n|\s)'
    irc_pw_re = r'NS IDENTIFY (.+)'
    irc_pw_re2 = 'nickserv :identify (.+)'
    mail_auth_re = '(\d+ )?(auth|authenticate) (login|plain)'
    mail_auth_re1 = '(\d+ )?login '
    NTLMSSP2_re = 'NTLMSSP\x00\x02\x00\x00\x00.+'
    NTLMSSP3_re = 'NTLMSSP\x00\x03\x00\x00\x00.+'
    # Prone to false+ but prefer that to false-
    http_search_re = '((search|query|&q|\?q|search\?p|searchterm|keywords|keyword|command|terms|keys|question|kwd|searchPhrase)=([^&][^&]*))'
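To see how patterns like these are applied to a payload, here is a minimal Python 3 sketch. It reuses the two FTP patterns exactly as listed above. The function `find_ftp_creds` and the sample payload are hypothetical.

```python
import re

# The two FTP patterns, copied from the list above.
ftp_user_re = r'USER (.+)\r\n'
ftp_pw_re = r'PASS (.+)\r\n'


def find_ftp_creds(payload):
    """Return (user, password) found in a cleartext FTP control payload."""
    user = re.search(ftp_user_re, payload)
    password = re.search(ftp_pw_re, payload)
    return (user.group(1) if user else None,
            password.group(1) if password else None)


# Made-up control-channel payload, for illustration only.
sample = "USER anonymous\r\nPASS guest@example.org\r\n"
creds = find_ftp_creds(sample)
```

In practice the USER and PASS commands usually arrive in separate TCP segments, so the two matches have to be correlated per connection.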

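0x6. Elements to focus on when extracting data

  • Source IP, destination IP, source port, destination port, protocol, packet size
    - Correlate these to identify the victims and the jump hosts controlled by the attacker.
  • Protocols: HTTP, FTP, mail protocols
    - Look at HTTP and HTTPS traffic first, then at packet lengths.
    - Extract the related IP information.
    - Decide whether the traffic involves a remote-access trojan, account credentials, or a suspicious IP.
  • Account and password fields
    - In penetration testing, analyze sniffed packets for account and password information.
    - IPs performing brute-force attacks.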
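The first bullet above is the classic five-tuple. As a library-independent illustration of where those fields sit in a packet, here is a minimal Python 3 sketch that pulls the five-tuple out of a raw Ethernet frame using only `struct` and `socket`. It assumes untagged Ethernet II carrying IPv4 with TCP or UDP. `five_tuple` and the hand-built sample frame are hypothetical, and Scapy or dpkt would normally do this decoding for you.

```python
import socket
import struct


def five_tuple(frame):
    """Return (src_ip, dst_ip, src_port, dst_port, proto) or None.

    Assumes an untagged Ethernet II frame carrying IPv4 + TCP/UDP.
    """
    if len(frame) < 34:                      # 14 (Ethernet) + 20 (minimal IPv4)
        return None
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:                  # not IPv4
        return None
    ihl = (frame[14] & 0x0F) * 4             # IPv4 header length in bytes
    proto = frame[23]                        # protocol field (offset 9 in the IP header)
    if proto not in (6, 17):                 # 6 = TCP, 17 = UDP
        return None
    src_ip = socket.inet_ntoa(frame[26:30])
    dst_ip = socket.inet_ntoa(frame[30:34])
    l4 = 14 + ihl                            # start of the TCP/UDP header
    if len(frame) < l4 + 4:
        return None
    sport, dport = struct.unpack("!HH", frame[l4:l4 + 4])
    return (src_ip, dst_ip, sport, dport, "TCP" if proto == 6 else "UDP")


# Hand-built sample frame (illustrative values only).
eth = b"\x00" * 12 + b"\x08\x00"
ip = (bytes([0x45, 0, 0, 40, 0, 0, 0x40, 0, 64, 6, 0, 0])
      + socket.inet_aton("192.168.1.103") + socket.inet_aton("10.0.0.1"))
tcp = struct.pack("!HH", 51000, 80) + b"\x00" * 16
frame = eth + ip + tcp
flow = five_tuple(frame)
```

Feeding the resulting tuples into a `collections.Counter` is a quick way to see which flows dominate a capture.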

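0x7. Testing dpkt for pcap parsing

I had hoped to use dpkt to parse mail, DNS, and HTTP traffic as an aid to pcap analysis, but after reading up on it I found it less convenient than Scapy.

dpkt is a Python module for fast, simple packet creation and parsing, with support for the basic TCP/IP protocols.

dpkt documentation

https://dpkt.readthedocs.io/en/latest/

dpkt download

https://pypi.org/project/dpkt/

According to the official documentation, dpkt reads each packet in the pcap, uses isinstance to check whether it contains an IP layer, and then determines which protocol it belongs to. Each supported protocol is already wrapped in its own class, so when a packet matches one of them, the corresponding field values can be printed.

Extending this code requires learning what the relevant protocol fields mean.

API reference:

https://dpkt.readthedocs.io/en/latest/api/api_auto.html#module-dpkt.qq

The documentation points to example code on GitHub for some of the APIs, which is a useful reference.

https://github.com/jeffsilverm/dpkt_doc

Examples from the documentation

The code below comes from the examples in the documentation. I found that inet_pton could not be used directly in my environment, so I modified the examples following a workaround found online, which re-implements inet_pton and inet_ntop on Windows through ctypes.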
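On Python 3 the ctypes workaround should not be needed for this purpose, because the standard `ipaddress` module can turn a packed address into text on any platform. This is a minimal sketch, assuming Python 3. `inet_to_str` has the same name as the helper in the examples, but it is a replacement I wrote, not dpkt's code.

```python
import ipaddress


def inet_to_str(packed):
    """Convert a packed 4-byte (IPv4) or 16-byte (IPv6) address to text.

    ipaddress.ip_address raises ValueError for any other length.
    """
    return str(ipaddress.ip_address(packed))
```

Unlike the version in the examples, which is fixed to `socket.AF_INET`, this helper handles both address families by looking at the length of the input.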

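Printing packets

Use dpkt to read a pcap file and print the contents of its packets, showing the fields of the Ethernet frame and the IP packet.

Python 2 test code: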

    #!/usr/bin/env python
    """
    Use DPKT to read in a pcap file and print out the contents of the packets
    This example is focused on the fields in the Ethernet Frame and IP packet
    """
    import dpkt
    import datetime
    import socket
    from dpkt.compat import compat_ord
    import ctypes
    import os

    def mac_addr(address):
        """Convert a MAC address to a readable/printable string
        Args:
            address (str): a MAC address in hex form (e.g. '\x01\x02\x03\x04\x05\x06')
        Returns:
            str: Printable/readable MAC address
        """
        return ':'.join('%02x' % compat_ord(b) for b in address)

    class sockaddr(ctypes.Structure):
        _fields_ = [("sa_family", ctypes.c_short),
                    ("__pad1", ctypes.c_ushort),
                    ("ipv4_addr", ctypes.c_byte * 4),
                    ("ipv6_addr", ctypes.c_byte * 16),
                    ("__pad2", ctypes.c_ulong)]

    if hasattr(ctypes, 'windll'):
        WSAStringToAddressA = ctypes.windll.ws2_32.WSAStringToAddressA
        WSAAddressToStringA = ctypes.windll.ws2_32.WSAAddressToStringA
    else:
        def not_windows():
            raise SystemError(
                "Invalid platform. ctypes.windll must be available."
            )
        WSAStringToAddressA = not_windows
        WSAAddressToStringA = not_windows

    def inet_pton(address_family, ip_string):
        addr = sockaddr()
        addr.sa_family = address_family
        addr_size = ctypes.c_int(ctypes.sizeof(addr))
        if WSAStringToAddressA(
                ip_string,
                address_family,
                None,
                ctypes.byref(addr),
                ctypes.byref(addr_size)
        ) != 0:
            raise socket.error(ctypes.FormatError())
        if address_family == socket.AF_INET:
            return ctypes.string_at(addr.ipv4_addr, 4)
        if address_family == socket.AF_INET6:
            return ctypes.string_at(addr.ipv6_addr, 16)
        raise socket.error('unknown address family')

    def inet_ntop(address_family, packed_ip):
        addr = sockaddr()
        addr.sa_family = address_family
        addr_size = ctypes.c_int(ctypes.sizeof(addr))
        ip_string = ctypes.create_string_buffer(128)
        ip_string_size = ctypes.c_int(ctypes.sizeof(ip_string))
        if address_family == socket.AF_INET:
            if len(packed_ip) != ctypes.sizeof(addr.ipv4_addr):
                raise socket.error('packed IP wrong length for inet_ntoa')
            ctypes.memmove(addr.ipv4_addr, packed_ip, 4)
        elif address_family == socket.AF_INET6:
            if len(packed_ip) != ctypes.sizeof(addr.ipv6_addr):
                raise socket.error('packed IP wrong length for inet_ntoa')
            ctypes.memmove(addr.ipv6_addr, packed_ip, 16)
        else:
            raise socket.error('unknown address family')
        if WSAAddressToStringA(
                ctypes.byref(addr),
                addr_size,
                None,
                ip_string,
                ctypes.byref(ip_string_size)
        ) != 0:
            raise socket.error(ctypes.FormatError())
        return ip_string[:ip_string_size.value - 1]

    # Adding our two functions to the socket library
    if os.name == 'nt':
        socket.inet_pton = inet_pton
        socket.inet_ntop = inet_ntop

    def inet_to_str(inet):
        return socket.inet_ntop(socket.AF_INET, inet)

    def print_packets(pcap):
        """Print out information about each packet in a pcap
        Args:
            pcap: dpkt pcap reader object (dpkt.pcap.Reader)
        """
        # packet num count
        r_num = 0
        # For each packet in the pcap process the contents
        for timestamp, buf in pcap:
            r_num = r_num + 1
            print('packet num count :', r_num)
            # Print out the timestamp in UTC
            print('Timestamp: ', str(datetime.datetime.utcfromtimestamp(timestamp)))
            # Unpack the Ethernet frame (mac src/dst, ethertype)
            eth = dpkt.ethernet.Ethernet(buf)
            print('Ethernet Frame: ', mac_addr(eth.src), mac_addr(eth.dst), eth.type)
            # Make sure the Ethernet data contains an IP packet
            if not isinstance(eth.data, dpkt.ip.IP):
                print('Non IP Packet type not supported %s\n' % eth.data.__class__.__name__)
                continue
            # Now unpack the data within the Ethernet frame (the IP packet)
            # Pulling out src, dst, length, fragment info, TTL, and Protocol
            ip = eth.data
            # Pull out fragment information (flags and offset all packed into off field, so use bitmasks)
            do_not_fragment = bool(ip.off & dpkt.ip.IP_DF)
            more_fragments = bool(ip.off & dpkt.ip.IP_MF)
            fragment_offset = ip.off & dpkt.ip.IP_OFFMASK
            # Print out the info
            print('IP: %s -> %s (len=%d ttl=%d DF=%d MF=%d offset=%d)\n' % \
                  (inet_to_str(ip.src), inet_to_str(ip.dst), ip.len, ip.ttl, do_not_fragment, more_fragments, fragment_offset))

    def test():
        """Open up a test pcap file and print out the packets"""
        with open('pcap222.pcap', 'rb') as f:
            pcap = dpkt.pcap.Reader(f)
            print_packets(pcap)

    if __name__ == '__main__':
        test()

Output:

    ('packet num count :', 4474)
    ('Timestamp: ', '2017-08-01 03:55:03.314832')
    ('Ethernet Frame: ', '9c:5c:8e:76:bf:24', 'ec:88:8f:86:14:5c', 2048)
    IP: 192.168.1.103 -> 211.90.25.31 (len=52 ttl=64 DF=1 MF=0 offset=0)
    ('packet num count :', 4475)
    ('Timestamp: ', '2017-08-01 03:55:03.485679')
    ('Ethernet Frame: ', '9c:5c:8e:76:bf:24', 'ec:88:8f:86:14:5c', 2048)
    IP: 192.168.1.103 -> 180.97.33.12 (len=114 ttl=64 DF=0 MF=0 offset=0)
    ('packet num count :', 4476)
    ('Timestamp: ', '2017-08-01 03:55:03.486141')
    ('Ethernet Frame: ', '9c:5c:8e:76:bf:24', 'ec:88:8f:86:14:5c', 2048)
    IP: 192.168.1.103 -> 119.75.222.122 (len=52 ttl=64 DF=1 MF=0 offset=0)
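The DF, MF, and offset values in this output all come from a single 16-bit IPv4 header field, which the script splits with bitmasks. This is a minimal Python 3 sketch of the same split without dpkt. The constants follow the standard IPv4 header layout, and `split_frag_field` is a hypothetical helper.

```python
IP_DF = 0x4000       # "don't fragment" flag
IP_MF = 0x2000       # "more fragments" flag
IP_OFFMASK = 0x1FFF  # low 13 bits: fragment offset, in 8-byte units


def split_frag_field(off):
    """Split the IPv4 flags/fragment-offset field into (DF, MF, offset)."""
    return bool(off & IP_DF), bool(off & IP_MF), off & IP_OFFMASK


df_only = split_frag_field(0x4000)          # typical for TCP segments
mid_fragment = split_frag_field(0x2000 | 185)
```

The offset is counted in 8-byte units, so an offset value of 185 means the fragment's data starts 1480 bytes into the original datagram.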

Printing ICMP

Check for ICMP packets and display their contents.

    #!/usr/bin/env python
    """
    Use DPKT to read in a pcap file and print out the contents of the packets
    This example is focused on the fields in the Ethernet Frame and IP packet
    """
    import dpkt
    import datetime
    import socket
    from dpkt.compat import compat_ord
    import ctypes
    import os

    def mac_addr(address):
        """Convert a MAC address to a readable/printable string
        Args:
            address (str): a MAC address in hex form (e.g. '\x01\x02\x03\x04\x05\x06')
        Returns:
            str: Printable/readable MAC address
        """
        return ':'.join('%02x' % compat_ord(b) for b in address)

    class sockaddr(ctypes.Structure):
        _fields_ = [("sa_family", ctypes.c_short),
                    ("__pad1", ctypes.c_ushort),
                    ("ipv4_addr", ctypes.c_byte * 4),
                    ("ipv6_addr", ctypes.c_byte * 16),
                    ("__pad2", ctypes.c_ulong)]

    if hasattr(ctypes, 'windll'):
        WSAStringToAddressA = ctypes.windll.ws2_32.WSAStringToAddressA
        WSAAddressToStringA = ctypes.windll.ws2_32.WSAAddressToStringA
    else:
        def not_windows():
            raise SystemError(
                "Invalid platform. ctypes.windll must be available."
            )
        WSAStringToAddressA = not_windows
        WSAAddressToStringA = not_windows

    def inet_pton(address_family, ip_string):
        addr = sockaddr()
        addr.sa_family = address_family
        addr_size = ctypes.c_int(ctypes.sizeof(addr))
        if WSAStringToAddressA(
                ip_string,
                address_family,
                None,
                ctypes.byref(addr),
                ctypes.byref(addr_size)
        ) != 0:
            raise socket.error(ctypes.FormatError())
        if address_family == socket.AF_INET:
            return ctypes.string_at(addr.ipv4_addr, 4)
        if address_family == socket.AF_INET6:
            return ctypes.string_at(addr.ipv6_addr, 16)
        raise socket.error('unknown address family')

    def inet_ntop(address_family, packed_ip):
        addr = sockaddr()
        addr.sa_family = address_family
        addr_size = ctypes.c_int(ctypes.sizeof(addr))
        ip_string = ctypes.create_string_buffer(128)
        ip_string_size = ctypes.c_int(ctypes.sizeof(ip_string))
        if address_family == socket.AF_INET:
            if len(packed_ip) != ctypes.sizeof(addr.ipv4_addr):
                raise socket.error('packed IP wrong length for inet_ntoa')
            ctypes.memmove(addr.ipv4_addr, packed_ip, 4)
        elif address_family == socket.AF_INET6:
            if len(packed_ip) != ctypes.sizeof(addr.ipv6_addr):
                raise socket.error('packed IP wrong length for inet_ntoa')
            ctypes.memmove(addr.ipv6_addr, packed_ip, 16)
        else:
            raise socket.error('unknown address family')
        if WSAAddressToStringA(
                ctypes.byref(addr),
                addr_size,
                None,
                ip_string,
                ctypes.byref(ip_string_size)
        ) != 0:
            raise socket.error(ctypes.FormatError())
        return ip_string[:ip_string_size.value - 1]

    # Adding our two functions to the socket library
    if os.name == 'nt':
        socket.inet_pton = inet_pton
        socket.inet_ntop = inet_ntop

    def inet_to_str(inet):
        return socket.inet_ntop(socket.AF_INET, inet)

    def print_icmp(pcap):
        """Print out information about each packet in a pcap
        Args:
            pcap: dpkt pcap reader object (dpkt.pcap.Reader)
        """
        # packet num count
        r_num = 0
        # For each packet in the pcap process the contents
        for timestamp, buf in pcap:
            r_num = r_num + 1
            print('packet num count :', r_num)
            # Unpack the Ethernet frame (mac src/dst, ethertype)
            eth = dpkt.ethernet.Ethernet(buf)
            # Make sure the Ethernet data contains an IP packet
            if not isinstance(eth.data, dpkt.ip.IP):
                print('Non IP Packet type not supported %s\n' % eth.data.__class__.__name__)
                continue
            # Now grab the data within the Ethernet frame (the IP packet)
            ip = eth.data
            # Now check if this is an ICMP packet
            if isinstance(ip.data, dpkt.icmp.ICMP):
                icmp = ip.data
                # Pull out fragment information (flags and offset all packed into off field, so use bitmasks)
                do_not_fragment = bool(ip.off & dpkt.ip.IP_DF)
                more_fragments = bool(ip.off & dpkt.ip.IP_MF)
                fragment_offset = ip.off & dpkt.ip.IP_OFFMASK
                # Print out the info
                print('Timestamp: ', str(datetime.datetime.utcfromtimestamp(timestamp)))
                print('Ethernet Frame: ', mac_addr(eth.src), mac_addr(eth.dst), eth.type)
                print('IP: %s -> %s (len=%d ttl=%d DF=%d MF=%d offset=%d)' % \
                      (inet_to_str(ip.src), inet_to_str(ip.dst), ip.len, ip.ttl, do_not_fragment, more_fragments, fragment_offset))
                print('ICMP: type:%d code:%d checksum:%d data: %s\n' % (icmp.type, icmp.code, icmp.sum, repr(icmp.data)))

    def test():
        """Open up a test pcap file and print out the packets"""
        with open('pcap222.pcap', 'rb') as f:
            pcap = dpkt.pcap.Reader(f)
            print_icmp(pcap)

    if __name__ == '__main__':
        test()

Output:

    ('packet num count :', 377)
    ('Timestamp: ', '2017-08-01 03:45:56.403640')
    ('Ethernet Frame: ', 'ec:88:8f:86:14:5c', '9c:5c:8e:76:bf:24', 2048)
    IP: 202.118.168.73 -> 192.168.1.103 (len=56 ttl=253 DF=0 MF=0 offset=0)
    ICMP: type:3 code:13 checksum:52074 data: Unreach(data=IP(len=28, id=2556, off=16384, ttl=61, p=6, sum=36831, src='\xc0\xa8\x01g', dst='\xcal\x17q', opts='', data='n\xb1\x00P\x85)=]'))
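The ICMP line in this output shows type 3, code 13, which is "Destination Unreachable: Communication Administratively Prohibited". That message is typically sent by a filtering router or firewall. Here is a minimal Python 3 sketch of a lookup for a few common values. The tables are deliberately partial, and `describe_icmp` is a hypothetical helper.

```python
# Partial tables: only a handful of common ICMP types and unreachable codes.
ICMP_TYPES = {
    0: "Echo Reply",
    3: "Destination Unreachable",
    8: "Echo Request",
    11: "Time Exceeded",
}
UNREACH_CODES = {
    0: "Network Unreachable",
    1: "Host Unreachable",
    2: "Protocol Unreachable",
    3: "Port Unreachable",
    4: "Fragmentation Needed and DF Set",
    13: "Communication Administratively Prohibited",
}


def describe_icmp(icmp_type, code):
    """Return a human-readable description of an ICMP type/code pair."""
    name = ICMP_TYPES.get(icmp_type, "type %d" % icmp_type)
    if icmp_type == 3:
        return "%s: %s" % (name, UNREACH_CODES.get(code, "code %d" % code))
    return name
```

The embedded IP header in the Unreach payload (p=6, i.e. TCP) identifies the original packet that triggered the error.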

Printing HTTP requests

    #!/usr/bin/env python
    """
    Use DPKT to read in a pcap file and print out the contents of the packets
    This example is focused on the fields in the Ethernet Frame and IP packet
    """
    import dpkt
    import datetime
    import socket
    from dpkt.compat import compat_ord
    import ctypes
    import os

    def mac_addr(address):
        """Convert a MAC address to a readable/printable string
        Args:
            address (str): a MAC address in hex form (e.g. '\x01\x02\x03\x04\x05\x06')
        Returns:
            str: Printable/readable MAC address
        """
        return ':'.join('%02x' % compat_ord(b) for b in address)

    class sockaddr(ctypes.Structure):
        _fields_ = [("sa_family", ctypes.c_short),
                    ("__pad1", ctypes.c_ushort),
                    ("ipv4_addr", ctypes.c_byte * 4),
                    ("ipv6_addr", ctypes.c_byte * 16),
                    ("__pad2", ctypes.c_ulong)]

    if hasattr(ctypes, 'windll'):
        WSAStringToAddressA = ctypes.windll.ws2_32.WSAStringToAddressA
        WSAAddressToStringA = ctypes.windll.ws2_32.WSAAddressToStringA
    else:
        def not_windows():
            raise SystemError(
                "Invalid platform. ctypes.windll must be available."
            )
        WSAStringToAddressA = not_windows
        WSAAddressToStringA = not_windows

    def inet_pton(address_family, ip_string):
        addr = sockaddr()
        addr.sa_family = address_family
        addr_size = ctypes.c_int(ctypes.sizeof(addr))
        if WSAStringToAddressA(
                ip_string,
                address_family,
                None,
                ctypes.byref(addr),
                ctypes.byref(addr_size)
        ) != 0:
            raise socket.error(ctypes.FormatError())
        if address_family == socket.AF_INET:
            return ctypes.string_at(addr.ipv4_addr, 4)
        if address_family == socket.AF_INET6:
            return ctypes.string_at(addr.ipv6_addr, 16)
        raise socket.error('unknown address family')

    def inet_ntop(address_family, packed_ip):
        addr = sockaddr()
        addr.sa_family = address_family
        addr_size = ctypes.c_int(ctypes.sizeof(addr))
        ip_string = ctypes.create_string_buffer(128)
        ip_string_size = ctypes.c_int(ctypes.sizeof(ip_string))
        if address_family == socket.AF_INET:
            if len(packed_ip) != ctypes.sizeof(addr.ipv4_addr):
                raise socket.error('packed IP wrong length for inet_ntoa')
            ctypes.memmove(addr.ipv4_addr, packed_ip, 4)
        elif address_family == socket.AF_INET6:
            if len(packed_ip) != ctypes.sizeof(addr.ipv6_addr):
                raise socket.error('packed IP wrong length for inet_ntoa')
            ctypes.memmove(addr.ipv6_addr, packed_ip, 16)
        else:
            raise socket.error('unknown address family')
        if WSAAddressToStringA(
                ctypes.byref(addr),
                addr_size,
                None,
                ip_string,
                ctypes.byref(ip_string_size)
        ) != 0:
            raise socket.error(ctypes.FormatError())
        return ip_string[:ip_string_size.value - 1]

    # Adding our two functions to the socket library
    if os.name == 'nt':
        socket.inet_pton = inet_pton
        socket.inet_ntop = inet_ntop

    def inet_to_str(inet):
        return socket.inet_ntop(socket.AF_INET, inet)

    def print_http_requests(pcap):
        """Print out information about each packet in a pcap
        Args:
            pcap: dpkt pcap reader object (dpkt.pcap.Reader)
        """
        # packet num count
        r_num = 0
        # For each packet in the pcap process the contents
        for timestamp, buf in pcap:
            r_num = r_num + 1
            print('packet num count :', r_num)
            # Unpack the Ethernet frame (mac src/dst, ethertype)
            eth = dpkt.ethernet.Ethernet(buf)
            # Make sure the Ethernet data contains an IP packet
            if not isinstance(eth.data, dpkt.ip.IP):
                print('Non IP Packet type not supported %s\n' % eth.data.__class__.__name__)
                continue
            # Now grab the data within the Ethernet frame (the IP packet)
            ip = eth.data
            # Check for TCP in the transport layer
            if isinstance(ip.data, dpkt.tcp.TCP):
                # Set the TCP data
                tcp = ip.data
                # Now see if we can parse the contents as a HTTP request
                try:
                    request = dpkt.http.Request(tcp.data)
                except (dpkt.dpkt.NeedData, dpkt.dpkt.UnpackError):
                    continue
                # Pull out fragment information (flags and offset all packed into off field, so use bitmasks)
                do_not_fragment = bool(ip.off & dpkt.ip.IP_DF)
                more_fragments = bool(ip.off & dpkt.ip.IP_MF)
                fragment_offset = ip.off & dpkt.ip.IP_OFFMASK
                # Print out the info
                print('Timestamp: ', str(datetime.datetime.utcfromtimestamp(timestamp)))
                print('Ethernet Frame: ', mac_addr(eth.src), mac_addr(eth.dst), eth.type)
                print('IP: %s -> %s (len=%d ttl=%d DF=%d MF=%d offset=%d)' %
                      (inet_to_str(ip.src), inet_to_str(ip.dst), ip.len, ip.ttl, do_not_fragment, more_fragments, fragment_offset))
                print('HTTP request: %s\n' % repr(request))
                # Check for Header spanning acrossed TCP segments
                if not tcp.data.endswith(b'\r\n'):
                    print('\nHEADER TRUNCATED! Reassemble TCP segments!\n')

    def test():
        """Open up a test pcap file and print out the packets"""
        with open('pcap222.pcap', 'rb') as f:
            pcap = dpkt.pcap.Reader(f)
            print_http_requests(pcap)

    if __name__ == '__main__':
        test()

Output:

    Timestamp: 2004-05-13 10:17:08.222534
    Ethernet Frame: 00:00:01:00:00:00 fe:ff:20:00:01:00 2048
    IP: 145.254.160.237 -> 65.208.228.223 (len=519 ttl=128 DF=1 MF=0 offset=0)
    HTTP request: Request(body='', uri='/download.html', headers={'accept-language': 'en-us,en;q=0.5', 'accept-encoding': 'gzip,deflate', 'connection': 'keep-alive', 'keep-alive': '300', 'accept': 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1', 'user-agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113', 'accept-charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7', 'host': 'www.ethereal.com', 'referer': 'http://www.ethereal.com/development.html'}, version='1.1', data='', method='GET')
    Timestamp: 2004-05-13 10:17:10.295515
    Ethernet Frame: 00:00:01:00:00:00 fe:ff:20:00:01:00 2048
    IP: 145.254.160.237 -> 216.239.59.99 (len=761 ttl=128 DF=1 MF=0 offset=0)
    HTTP request: Request(body='', uri='/pagead/ads?client=ca-pub-2309191948673629&random=1084443430285&lmt=1082467020&format=468x60_as&output=html&url=http%3A%2F%2Fwww.ethereal.com%2Fdownload.html&color_bg=FFFFFF&color_text=333333&color_link=000000&color_url=666633&color_border=666633', headers={'accept-language': 'en-us,en;q=0.5', 'accept-encoding': 'gzip,deflate', 'connection': 'keep-alive', 'keep-alive': '300', 'accept': 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1', 'user-agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113', 'accept-charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7', 'host': 'pagead2.googlesyndication.com', 'referer': 'http://www.ethereal.com/download.html'}, version='1.1', data='', method='GET')
    ...
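The Request objects printed above carry a method, a URI, a version, and lower-cased header names. To make that structure concrete without depending on dpkt, here is a minimal Python 3 sketch that parses a single-segment request from raw bytes. `parse_http_request` is a hypothetical helper and is far less thorough than a real HTTP parser.

```python
def parse_http_request(data):
    """Parse the request line and headers from raw bytes.

    Returns None if the header block is incomplete or the data is not HTTP.
    """
    head, sep, body = data.partition(b"\r\n\r\n")
    if not sep:
        return None  # headers continue in a later TCP segment
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None
    method, uri, version = parts
    headers = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        if colon:
            headers[name.strip().lower()] = value.strip()
    return {"method": method, "uri": uri, "version": version[5:],
            "headers": headers, "body": body}


# Made-up request resembling the first one in the output above.
raw = b"GET /download.html HTTP/1.1\r\nHost: www.ethereal.com\r\n\r\n"
request = parse_http_request(raw)
```

Like the dpkt example, this works on one segment at a time. A request whose headers span several TCP segments has to be reassembled first.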

Printing the IP addresses in Ethernet frames

Parsing a 594 MB pcap took 127 seconds.

    # coding=utf-8
    import dpkt
    import socket
    import time
    import ctypes
    import os
    import datetime

    # Measure how long dpkt takes to extract IP addresses
    # Use dpkt to get the timestamp, source IP, and destination IP
    class sockaddr(ctypes.Structure):
        _fields_ = [("sa_family", ctypes.c_short),
                    ("__pad1", ctypes.c_ushort),
                    ("ipv4_addr", ctypes.c_byte * 4),
                    ("ipv6_addr", ctypes.c_byte * 16),
                    ("__pad2", ctypes.c_ulong)]

    if hasattr(ctypes, 'windll'):
        WSAStringToAddressA = ctypes.windll.ws2_32.WSAStringToAddressA
        WSAAddressToStringA = ctypes.windll.ws2_32.WSAAddressToStringA
    else:
        def not_windows():
            raise SystemError(
                "Invalid platform. ctypes.windll must be available."
            )
        WSAStringToAddressA = not_windows
        WSAAddressToStringA = not_windows

    def inet_pton(address_family, ip_string):
        addr = sockaddr()
        addr.sa_family = address_family
        addr_size = ctypes.c_int(ctypes.sizeof(addr))
        if WSAStringToAddressA(
                ip_string,
                address_family,
                None,
                ctypes.byref(addr),
                ctypes.byref(addr_size)
        ) != 0:
            raise socket.error(ctypes.FormatError())
        if address_family == socket.AF_INET:
            return ctypes.string_at(addr.ipv4_addr, 4)
        if address_family == socket.AF_INET6:
            return ctypes.string_at(addr.ipv6_addr, 16)
        raise socket.error('unknown address family')

    def inet_ntop(address_family, packed_ip):
        addr = sockaddr()
        addr.sa_family = address_family
        addr_size = ctypes.c_int(ctypes.sizeof(addr))
        ip_string = ctypes.create_string_buffer(128)
        ip_string_size = ctypes.c_int(ctypes.sizeof(ip_string))
        if address_family == socket.AF_INET:
            if len(packed_ip) != ctypes.sizeof(addr.ipv4_addr):
                raise socket.error('packed IP wrong length for inet_ntoa')
            ctypes.memmove(addr.ipv4_addr, packed_ip, 4)
        elif address_family == socket.AF_INET6:
            if len(packed_ip) != ctypes.sizeof(addr.ipv6_addr):
                raise socket.error('packed IP wrong length for inet_ntoa')
            ctypes.memmove(addr.ipv6_addr, packed_ip, 16)
        else:
            raise socket.error('unknown address family')
        if WSAAddressToStringA(
                ctypes.byref(addr),
                addr_size,
                None,
                ip_string,
                ctypes.byref(ip_string_size)
        ) != 0:
            raise socket.error(ctypes.FormatError())
        return ip_string[:ip_string_size.value - 1]

    # Adding our two functions to the socket library
    if os.name == 'nt':
        socket.inet_pton = inet_pton
        socket.inet_ntop = inet_ntop

    def inet_to_str(inet):
        return socket.inet_ntop(socket.AF_INET, inet)

    def getip(pcap):
        Num = 0
        for timestamp, buf in pcap:
            eth = dpkt.ethernet.Ethernet(buf)
            # Skip frames that do not carry an IP packet
            if eth.type != dpkt.ethernet.ETH_TYPE_IP:
                continue
            ip = eth.data
            ip_src = inet_to_str(ip.src)
            ip_dst = inet_to_str(ip.dst)
            # Print the timestamp and source --> destination
            #print(ts + " " + ip_src + "-->" + ip_dst)
            Num = Num + 1
            print ('{0}\ttime:{1}\tsrc:{2}-->dst:{3} '.format(Num, timestamp, ip_src, ip_dst))
            if eth.data.__class__.__name__ == 'IP':
                ip = '%d.%d.%d.%d' % tuple(map(ord, list(eth.data.dst)))
                if eth.data.data.__class__.__name__ == 'TCP':
                    if eth.data.data.dport == 80:
                        print eth.data.data.data  # data of the HTTP request

    if __name__ == '__main__':
        starttime = datetime.datetime.now()
        # Open in 'rb' mode; opening with 'r' raises an error
        with open('pcap222.pcap', 'rb') as f:
            pcap = dpkt.pcap.Reader(f)
            getip(pcap)
        endtime = datetime.datetime.now()
        print ('time : {0} seconds '.format((endtime - starttime).seconds))

Output:

    1290064 time:1501562988.75 src:113.142.85.151-->dst:192.168.1.103
    1290065 time:1501562988.75 src:192.168.1.103-->dst:113.142.85.151
    1290066 time:1501562988.75 src:192.168.1.103-->dst:113.142.85.151
    1290067 time:1501562988.75 src:113.142.85.151-->dst:192.168.1.103
    1290068 time:1501562988.75 src:192.168.1.103-->dst:113.142.85.151
    1290069 time:1501562988.76 src:192.168.1.103-->dst:113.142.85.151
    1290070 time:1501562988.76 src:122.228.91.14-->dst:192.168.1.103
    1290071 time:1501562988.76 src:192.168.1.103-->dst:113.142.85.151
    1290072 time:1501562988.76 src:113.142.85.151-->dst:192.168.1.103
    1290073 time:1501562988.76 src:192.168.1.103-->dst:113.142.85.151
    1290074 time:1501562988.76 src:192.168.1.103-->dst:113.142.85.151
    GET / HTTP/1.1
    Accept: application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
    Accept-Language: zh-cn
    User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Mac_PowerPC; en) Opera 9.24
    Referer: -
    Connection: Keep-Alive
    Host: win7.shangshai-qibao.cn
0x8. References

SMTP protocol analysis: https://yq.aliyun.com/wenji/262429

Summary of parsing pcap files with Scapy: https://www.cnblogs.com/14061216chen/p/8093441.html

Python string formatting (format): https://www.cnblogs.com/benric/p/4965224.html
