Parsing XML Files in Python: The DOM Approach

Environment

Python: 3.4.4

Preparing the XML file

First, create a new XML file named countries.xml. Its content is taken from the official Python documentation.

<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

Preparing the Python file

Create a new file, test_DOM.py, to parse the XML file.

#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import xml.dom.minidom

# Parse the whole file into an in-memory DOM tree.
DOMTree = xml.dom.minidom.parse("countries.xml")
# The root element, which is <data> in countries.xml.
collection = DOMTree.documentElement

# "data" is the root's tag name, not an attribute, so this check is
# False for countries.xml and nothing is printed here.
if collection.hasAttribute("data"):
    print("Root element : %s" % collection.getAttribute("data"))

# Every <country> element below the root.
countries = collection.getElementsByTagName("country")

for country in countries:
    print("*****Country*****")
    if country.hasAttribute("name"):
        print("Name: %s" % country.getAttribute("name"))

    # Each of these elements holds one text node; .data is its text.
    rank = country.getElementsByTagName('rank')[0]
    print("Rank: %s" % rank.childNodes[0].data)
    year = country.getElementsByTagName('year')[0]
    print("Year: %s" % year.childNodes[0].data)
    gdppc = country.getElementsByTagName('gdppc')[0]
    print("Gdppc: %s" % gdppc.childNodes[0].data)

    # <neighbor> elements carry their data in attributes.
    neighbors = country.getElementsByTagName('neighbor')
    for neighbor in neighbors:
        print("Neighbor:", neighbor.getAttribute("name"), neighbor.getAttribute("direction"))

Output

>python test_DOM.py
*****Country*****
Name: Liechtenstein
Rank: 1
Year: 2008
Gdppc: 141100
Neighbor: Austria E
Neighbor: Switzerland W
*****Country*****
Name: Singapore
Rank: 4
Year: 2011
Gdppc: 59900
Neighbor: Malaysia N
*****Country*****
Name: Panama
Rank: 68
Year: 2011
Gdppc: 13600
Neighbor: Costa Rica W
Neighbor: Colombia E

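Notes

DOM (Document Object Model)

DOM is a cross-language API from the W3C for reading and modifying XML documents.

When a DOM parser parses an XML document, it reads the whole document in one pass and stores every element in a tree structure in memory. That tree can then be read or modified, and the modified tree can be written back out to an XML file.

See: https://docs.python.org/2/library/xml.dom.html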
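The read, modify, and write-back cycle described above can be sketched as follows. This is only an illustration: SAMPLE_XML is a trimmed-down, hypothetical stand-in for countries.xml, and the "updated" attribute is invented for the example.

```python
from xml.dom.minidom import parseString

# Hypothetical, shortened stand-in for countries.xml.
SAMPLE_XML = '<data><country name="Panama"><rank>68</rank></country></data>'

# Parsing loads the whole document into an in-memory tree.
dom = parseString(SAMPLE_XML)

# Read: locate the <rank> element and the text node inside it.
rank = dom.getElementsByTagName("rank")[0]
old_value = rank.firstChild.data

# Modify: change the text in place and add an attribute.
rank.firstChild.data = "69"
rank.setAttribute("updated", "yes")

# Serialize: toxml() returns the modified tree as a string, which could
# then be written to a file with an ordinary open(..., "w") call.
new_xml = dom.documentElement.toxml()
print(old_value)
print(new_xml)
```

Because the whole tree lives in memory, this style is convenient for small files but can be costly for very large documents.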

DOMTree = xml.dom.minidom.parse("countries.xml")

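This opens countries.xml with the xml.dom.minidom parser and returns a Document object, that is, the tree structure. The Document object represents the entire XML document, including its elements, attributes, processing instructions, and comments.

See: https://docs.python.org/2/library/xml.dom.minidom.html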

Return a Document from the given input. filename_or_file may be either a file name, or a file-like object. parser, if given, must be a SAX2 parser object. This function will change the document handler of the parser and activate namespace support; other parser configuration (like setting an entity resolver) must have been done in advance.
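As the quoted documentation says, parse() accepts either a file name or a file-like object. A minimal sketch of the file-like case, assuming an in-memory io.BytesIO in place of a real file (XML_BYTES is made-up sample data), alongside parseString() for data that is already in memory:

```python
import io
from xml.dom.minidom import parse, parseString

# Made-up sample data standing in for the contents of a file.
XML_BYTES = b'<?xml version="1.0"?><data><country name="Singapore"/></data>'

# parse() with a file-like object instead of a file name.
doc_from_file = parse(io.BytesIO(XML_BYTES))

# parseString() is the entry point for XML already held in memory.
doc_from_string = parseString(XML_BYTES)

for doc in (doc_from_file, doc_from_string):
    country = doc.getElementsByTagName("country")[0]
    print(country.getAttribute("name"))
```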

collection = DOMTree.documentElement

Returns the root element of DOMTree.

Document.documentElement

The one and only root element of the document.
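A short sketch of inspecting the root element, assuming a one-country sample document rather than the full countries.xml. It shows that "data" is the root's tag name, available through tagName, and that the root carries no attributes.

```python
from xml.dom.minidom import parseString

# Shortened sample document; the real file has three countries.
doc = parseString('<data><country name="Liechtenstein"/></data>')
root = doc.documentElement

root_tag = root.tagName           # tag name of the root element
has_attrs = root.hasAttributes()  # True only if the element has attributes

print("Root element:", root_tag)
print("Has attributes:", has_attrs)
```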

rank = country.getElementsByTagName('rank')[0]

Searches downward from country for all element nodes whose tag name is "rank" and assigns the first one found to rank.

Document.getElementsByTagName(tagName)

Search for all descendants (direct children, children’s children, etc.) with a particular element type name.
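Because the search covers all descendants, it can return elements nested more deeply than expected. The sketch below uses a hypothetical document in which a second rank element sits inside an invented stats element, and contrasts getElementsByTagName with a manual filter over direct children.

```python
from xml.dom.minidom import parseString

# Hypothetical document with <rank> elements at two different depths.
doc = parseString(
    "<data>"
    "<country><rank>1</rank><stats><rank>7</rank></stats></country>"
    "</data>"
)
country = doc.getElementsByTagName("country")[0]

# All descendants named "rank", in document order.
all_ranks = [node.firstChild.data
             for node in country.getElementsByTagName("rank")]

# Only direct children named "rank": filter childNodes by hand.
direct_ranks = [
    child.firstChild.data
    for child in country.childNodes
    if child.nodeType == child.ELEMENT_NODE and child.tagName == "rank"
]

print(all_ranks)
print(direct_ranks)
```

In countries.xml each country has exactly one rank, so taking index [0] is safe there; with nested data the manual filter may be the better fit.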

collection.getAttribute("data")

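Gets and returns the value of the "data" attribute of collection. If collection has no "data" attribute, an empty string is returned. In countries.xml, data is the root element's tag name rather than an attribute, so the hasAttribute("data") check in the script is false and the "Root element" line is never printed.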

Element.getAttribute(name)

Return the value of the attribute named by name as a string. If no such attribute exists, an empty string is returned, as if the attribute had no value.
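One consequence of this behavior is that getAttribute() alone cannot tell a missing attribute from one that is present but empty. A small sketch, assuming a made-up country element with an empty "note" attribute and no "capital" attribute:

```python
from xml.dom.minidom import parseString

# Made-up element: "note" exists but is empty, "capital" does not exist.
doc = parseString('<data><country name="Panama" note=""/></data>')
country = doc.getElementsByTagName("country")[0]

missing_value = country.getAttribute("capital")  # attribute absent
empty_value = country.getAttribute("note")       # attribute present, empty

# hasAttribute() is what distinguishes the two cases.
has_capital = country.hasAttribute("capital")
has_note = country.hasAttribute("note")

print(repr(missing_value), has_capital)
print(repr(empty_value), has_note)
```

This is why the script calls hasAttribute() before getAttribute().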
