Python XML Parsing and Processing

movies.xml

<collection shelf = "New Arrivals">

<movie title = "Enemy Behind">

   <type>War, Thriller</type>

   <format>DVD</format>

   <year></year>

   <rating>PG</rating>

   <stars></stars>

   <description>Talk about a US-Japan war</description>

</movie>

<movie title = "Transformers">

   <type>Anime, Science Fiction</type>

   <format>DVD</format>

   <year></year>

   <rating>R</rating>

   <stars></stars>

   <description>A schientific fiction</description>

</movie>

<movie title = "Trigun">

   <type>Anime, Action</type>

   <format>DVD</format>

   <episodes></episodes>

   <rating>PG</rating>

   <stars></stars>

   <description>Vash the Stampede!</description>

</movie>

<movie title = "Ishtar">

   <type>Comedy</type>

   <format>VHS</format>

   <rating>PG</rating>

   <stars></stars>

   <description>Viewable boredom</description>

</movie>

</collection>

Parsing XML with the SAX API

#!/usr/bin/python3

import xml.sax

class MovieHandler( xml.sax.ContentHandler ):

   def __init__(self):

      self.CurrentData = ""

      self.type = ""

      self.format = ""

      self.year = ""

      self.rating = ""

      self.stars = ""

      self.description = ""

   # Call when an element starts

   def startElement(self, tag, attributes):

      self.CurrentData = tag

      if tag == "movie":

         print ("*****Movie*****")

         title = attributes["title"]

         print ("Title:", title)

   # Call when an element ends

   def endElement(self, tag):

      if self.CurrentData == "type":

         print ("Type:", self.type)

      elif self.CurrentData == "format":

         print ("Format:", self.format)

      elif self.CurrentData == "year":

         print ("Year:", self.year)

      elif self.CurrentData == "rating":

         print ("Rating:", self.rating)

      elif self.CurrentData == "stars":

         print ("Stars:", self.stars)

      elif self.CurrentData == "description":

         print ("Description:", self.description)

      self.CurrentData = ""

   # Call when character data is read

   def characters(self, content):

      if self.CurrentData == "type":

         self.type = content

      elif self.CurrentData == "format":

         self.format = content

      elif self.CurrentData == "year":

         self.year = content

      elif self.CurrentData == "rating":

         self.rating = content

      elif self.CurrentData == "stars":

         self.stars = content

      elif self.CurrentData == "description":

         self.description = content

if ( __name__ == "__main__"):

   # create an XMLReader

   parser = xml.sax.make_parser()

   # turn off namespaces

   parser.setFeature(xml.sax.handler.feature_namespaces, 0)

   # override the default ContentHandler

   Handler = MovieHandler()

   parser.setContentHandler( Handler )

   parser.parse("movies.xml")

Output

*****Movie*****

Title: Enemy Behind

Type: War, Thriller

Format: DVD

Year:

Rating: PG

Stars:

Description: Talk about a US-Japan war

*****Movie*****

Title: Transformers

Type: Anime, Science Fiction

Format: DVD

Year:

Rating: R

Stars:

Description: A schientific fiction

*****Movie*****

Title: Trigun

Type: Anime, Action

Format: DVD

Rating: PG

Stars:

Description: Vash the Stampede!

*****Movie*****

Title: Ishtar

Type: Comedy

Format: VHS

Rating: PG

Stars:

Description: Viewable boredom
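
A SAX parser is allowed to deliver the text of one element in several characters() calls, so assigning the content directly can keep only the last chunk. The following is a minimal sketch of one way to guard against that, not the original author's code: a hypothetical MovieCollector handler buffers the chunks and stores each movie in a dict. The inline SAMPLE string is an assumption that stands in for movies.xml so the example runs on its own.

```python
import xml.sax

# Assumed stand-in for movies.xml, shortened to two movies.
SAMPLE = b"""<collection shelf="New Arrivals">
<movie title="Enemy Behind">
   <type>War, Thriller</type>
   <format>DVD</format>
   <year></year>
   <rating>PG</rating>
</movie>
<movie title="Ishtar">
   <type>Comedy</type>
   <format>VHS</format>
   <rating>PG</rating>
</movie>
</collection>"""


class MovieCollector(xml.sax.ContentHandler):
    """Hypothetical handler: collect each <movie> into a dict."""

    def __init__(self):
        super().__init__()
        self.movies = []      # finished movies
        self._movie = None    # movie currently being read
        self._buffer = []     # text chunks of the current element

    def startElement(self, tag, attributes):
        self._buffer = []
        if tag == "movie":
            self._movie = {"title": attributes.get("title", "")}

    def characters(self, content):
        # May be called several times for one element, so append.
        self._buffer.append(content)

    def endElement(self, tag):
        if tag == "movie":
            self.movies.append(self._movie)
            self._movie = None
        elif self._movie is not None:
            # Join the chunks; an empty element yields an empty string.
            self._movie[tag] = "".join(self._buffer).strip()
        self._buffer = []


handler = MovieCollector()
xml.sax.parseString(SAMPLE, handler)
for movie in handler.movies:
    print(movie)
```

Collecting into dicts instead of printing inside the callbacks also keeps the parsed data available after the parse has finished.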

Parsing XML with the DOM API

#!/usr/bin/python3

import xml.dom.minidom

# Open XML document using minidom parser

DOMTree = xml.dom.minidom.parse("movies.xml")

collection = DOMTree.documentElement

if collection.hasAttribute("shelf"):

   print ("Root element : %s" % collection.getAttribute("shelf"))

# Get all the movies in the collection

movies = collection.getElementsByTagName("movie")

# Print detail of each movie.

for movie in movies:

   print ("*****Movie*****")

   if movie.hasAttribute("title"):

      print ("Title: %s" % movie.getAttribute("title"))

   type = movie.getElementsByTagName('type')[0]

   print ("Type: %s" % type.childNodes[0].data)

   format = movie.getElementsByTagName('format')[0]

   print ("Format: %s" % format.childNodes[0].data)

   rating = movie.getElementsByTagName('rating')[0]

   print ("Rating: %s" % rating.childNodes[0].data)

   description = movie.getElementsByTagName('description')[0]

   print ("Description: %s" % description.childNodes[0].data)

输出

Root element : New Arrivals

*****Movie*****

Title: Enemy Behind

Type: War, Thriller

Format: DVD

Rating: PG

Description: Talk about a US-Japan war

*****Movie*****

Title: Transformers

Type: Anime, Science Fiction

Format: DVD

Rating: R

Description: A schientific fiction

*****Movie*****

Title: Trigun

Type: Anime, Action

Format: DVD

Rating: PG

Description: Vash the Stampede!

*****Movie*****

Title: Ishtar

Type: Comedy

Format: VHS

Rating: PG

Description: Viewable boredom
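
Indexing getElementsByTagName(...) and childNodes directly raises IndexError when an element is missing or has no text, as is the case for the empty year, episodes and stars elements in movies.xml. The sketch below shows one possible way to read such fields defensively. The helper name child_text and the "n/a" default are assumptions made for illustration, and the inline SAMPLE string stands in for the file.

```python
from xml.dom.minidom import parseString

# Assumed stand-in for movies.xml, shortened to one movie.
SAMPLE = """<collection shelf="New Arrivals">
<movie title="Trigun">
   <type>Anime, Action</type>
   <episodes></episodes>
   <rating>PG</rating>
</movie>
</collection>"""


def child_text(parent, tag, default=""):
    """Return the text of the first <tag> under parent, or default."""
    elements = parent.getElementsByTagName(tag)
    if not elements:
        return default  # the element is absent
    # Join all text nodes; an empty element has no child nodes.
    text = "".join(
        node.data for node in elements[0].childNodes
        if node.nodeType == node.TEXT_NODE
    )
    return text.strip() or default


collection = parseString(SAMPLE).documentElement
for movie in collection.getElementsByTagName("movie"):
    print("Title:", movie.getAttribute("title"))
    print("Type:", child_text(movie, "type"))
    print("Episodes:", child_text(movie, "episodes", default="n/a"))
    print("Year:", child_text(movie, "year", default="n/a"))
```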
