【BUG】xml读取异常Invalid byte 1 of 1-byte UTF-8 sequence
来自http://blog.csdn.net/chenyanbo/article/details/6866941
xml读取异常Invalid byte 1 of 1-byte UTF-8 sequence
说简单点当你解析别人的xml格式出现这个错误可能就是别人在生成xml时没有保存为utf-8的字符编码格式。
在中文版的window下java的默认的编码为GBK,也就是所虽然我们标识了要将xml保存为utf-8格式但实际上文件是以GBK格式来保存的,所以这也就是为什么能够我们使用GBK、GB2312编码来生成xml文件能正确的被解析,而以UTF-8格式生成的文件不能被xml解析器所解析的原因。
xml解析时遇到的编码异常:
- org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence. Nested exception: Invalid byte 1 of 1-byte UTF-8 sequence.
- at org.dom4j.io.SAXReader.read(SAXReader.java:484)
- at org.dom4j.io.SAXReader.read(SAXReader.java:321)
- at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
- at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
- at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
- Nested exception:
- com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
- at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
- at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
- at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
- at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2687)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
- at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
- at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
- at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
- at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
- at org.dom4j.io.SAXReader.read(SAXReader.java:465)
- at org.dom4j.io.SAXReader.read(SAXReader.java:321)
- at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
- at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
- at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
- Nested exception: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
- at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
- at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
- at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
- at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2687)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
- at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
- at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
- at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
- at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
- at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
- at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
- at org.dom4j.io.SAXReader.read(SAXReader.java:465)
- at org.dom4j.io.SAXReader.read(SAXReader.java:321)
- at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
- at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
- at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence. Nested exception: Invalid byte 1 of 1-byte UTF-8 sequence.
at org.dom4j.io.SAXReader.read(SAXReader.java:484)
at org.dom4j.io.SAXReader.read(SAXReader.java:321)
at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
Nested exception:
com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2687)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
at org.dom4j.io.SAXReader.read(SAXReader.java:465)
at org.dom4j.io.SAXReader.read(SAXReader.java:321)
at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
Nested exception: com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)
at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.peekChar(XMLEntityScanner.java:487)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2687)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:140)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
at org.dom4j.io.SAXReader.read(SAXReader.java:465)
at org.dom4j.io.SAXReader.read(SAXReader.java:321)
at com.dataoperate.PaseXml.pXml(PaseXml.java:28)
at com.dataoperate.JdbcOp.insertDb(JdbcOp.java:30)
at com.dataoperate.JdbcOp.main(JdbcOp.java:89)
解决:
1、最简单就是把<?xml version="1.0" encoding="UTF-8"?>改成<?xml version="1.0" encoding="gbk"?>
2、或者把xml打开另存的时候把字符集改为UTF-8后保存
3、在代码解析的时候先把xml重新写一遍
- SAXReader reader = new SAXReader();
- org.dom4j.Document document = reader.read("D:\\ha.xml");
- OutputFormat of = new OutputFormat();
- of.setEncoding("UTF-8"); //改变编码方式
- XMLWriter writer = new XMLWriter(new FileWriter "d:\\dom4j.xml"), of);
SAXReader reader = new SAXReader();
org.dom4j.Document document = reader.read("D:\\ha.xml");
OutputFormat of = new OutputFormat();
of.setEncoding("UTF-8"); //改变编码方式
XMLWriter writer = new XMLWriter(new FileWriter "d:\\dom4j.xml"), of);
4、直接dom4j读取的时候用io来读,修改字符编码
- FileInputStream in = new FileInputStream(new File(fileName));
- Reader read = new InputStreamReader(in,"gbk");
- Document document = reader.read(read);
【BUG】xml读取异常Invalid byte 1 of 1-byte UTF-8 sequence的更多相关文章
- Xml读取异常--Invalid byte 1 of 1-byte UTF-8 sequence
xml读取异常Invalid byte 1 of 1-byte UTF-8 sequence org.dom4j.DocumentException: Invalid byte 1 of 1-byte ...
- mybatis异常invalid comparison: java.util.Date and java.lang.String
原文链接:http://blog.csdn.net/wanghailong_qd/article/details/50673144 mybatis异常invalid comparison: java. ...
- xml 读取递归算法
xml 读取递归算法:
- paip.获取proxool的配置 xml读取通过jdk xml 初始化c3c0在代码中总结
paip.获取proxool的配置 xml读取通过jdk xml 初始化c3c0在代码中 xml读取通过jdk xml 初始化c3c0在代码中.. ... 作者Attilax 艾龙, EMAI ...
- Java基础知识强化之IO流笔记27:FileInputStream读取数据一次一个字节数组byte[ ]
1. FileInputStream读取数据一次一个字节数组byte[ ] 使用FileInputStream一次读取一个字节数组: int read(byte[] b) 返回值:返回值其实是实际 ...
- Linq to XML 读取XML 备忘笔记
本文转载:http://www.cnblogs.com/infozero/archive/2010/07/13/1776383.html Linq to XML 读取XML 备忘笔记 最近一个项目中有 ...
- C#基础笔记---浅谈XML读取以及简单的ORM实现
背景: 在开发ASP.NETMVC4 项目中,虽然web.config配置满足了大部分需求,不过对于某些特定业务,我们有时候需要添加新的配置文件来记录配置信息,那么XML文件配置无疑是我们选择的一个方 ...
- python 读取文件时报错UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 205: illegal multibyte sequence
python读取文件时提示"UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 205: illegal m ...
- svn checkout The XML response contains invalid XML
svn checkout 报错:The XML response contains invalid XML 待解决? ---目前没有找到好的解决方法,svn数据库中存的log入手应该可以,有时间再去看 ...
随机推荐
- Luogu3804 【模板】后缀自动机(后缀自动机)
建出parent树统计即可.开始memcpy处写的是sizeof(son[y]),然后就T掉了……还是少用这种东西吧. 同时也有SA做法.答案子串一定是名次数组中相邻两个串的lcp.单调栈统计其是几个 ...
- Leading and Trailing LightOJ - 1282 (取数的前三位和后三位)
题意: 求n的k次方的前三位 和 后三位 ...刚开始用 Java的大数写的...果然超时... 好吧 这题用快速幂取模求后三位 然后用一个技巧求前三位 ...orz... 任何一个数n均可以表示 ...
- [AHOI2014/JSOI2014] 解题报告
[AHOI2014/JSOI2014] 奇怪的计算器 一个很关键的结论,任何时候每个数的相对大小是不变的. 于是可以把这个相对大小当成线段树的权值,每次只需要维护一下区间极值和tag就好了,关于操作四 ...
- .net连接ORACLE数据库
这段时间维护客户的一个系统,该系统使用的是ORACLE数据库,之前开发的时候用的都是MSSQL,并没有使用过ORACLE.这两种数据库虽然都是关系型数据库,但是具体的操作大有不同,这里作下记录. 连接 ...
- centos6.5重新调整/home和跟目录/大小
0. 说明 系统刚刚安装完之后,默认到/home有1.5TiB,而根分区只有200G.现在是要将VolGroup-lv_home缩小到200G,并将剩余的空间添加给VolGroup-lv_root. ...
- C# ADO.NET与面向对象
软件开发的三层:界面层,业务逻辑层,数据访问层: 数据访问层:项目添加App_Code文件夹: 实体类:根据数据库表结构,类名和数据库表名一致: 每个成员变量要与数据库表的列相对应,对象正好可以列为一 ...
- 洛谷P5206 数树
题意: task0,给定两棵树T1,T2,取它们公共边(两端点相同)加入一张新的图,记新图连通块个数为x,求yx. task1,给定T1,求所有T2的task0之和. task2,求所有T1的task ...
- 搭建一个简单的node.js服务器
第一步:安装node.js.可以去官网:https://nodejs.org/en/进行下载. 查看是否成功,只需在控制台输入 node -v.出现版本号的话,就证明成功了. 第二步:编写node.j ...
- MATLAB:图像裁切(imcrop函数)
对图像进行裁切可用imcrop函数,实现过程如下: close all; %关闭当前所有图形窗口,清空工作空间变量,清除工作空间所有变量 clear all; clc; [A,map]=imread( ...
- MATLAB:图像的与、或、非、异或逻辑运算(&、|、~、xor)
图像的与.或.非.异或逻辑运算涉及到了&.|.~和xor符号 close all;%关闭当前所有图形窗口,清空工作空间变量,清除工作空间所有变量 clc; clear all; I=imrea ...