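Parsing XML Files with SAX
DOM parsing reads the entire document into memory, builds a DOM tree, and then works on the XML through the interfaces DOM provides. That is convenient for small files, but for a large XML file it costs a lot of memory and time, which is where SAX parsing comes in. SAX is event-driven: the parser turns the XML document into a series of events and hands each one to a separate event handler, which decides what to do with it. Parsing happens while the document is being read. The rest of this post walks through how SAX parsing is implemented.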
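To make the event-driven idea concrete, here is a minimal sketch that records the callbacks in the order the parser fires them. The class name SaxEventTrace and the helper trace are made up for this illustration, and the input is an in-memory string rather than a file.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of the original post.
final class SaxEventTrace {

    /** Parses the given XML text and returns the events in the order they were reported. */
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        StringBuilder text = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() {
                events.add("startDocument");
            }

            @Override
            public void endDocument() {
                events.add("endDocument");
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                flushText();
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                events.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks, so buffer it until the next tag.
                text.append(ch, start, length);
            }

            private void flushText() {
                if (text.length() > 0) {
                    events.add("text:" + text);
                    text.setLength(0);
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(trace("<a><b>hi</b></a>"));
    }
}
```

Nothing is kept in memory except what the handler chooses to store, which is the main difference from DOM.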
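1. Main objects
SAXParserFactory: the parser factory
SAXParser: the parser, obtained from the parser factory
ContentHandler, DTDHandler, ErrorHandler, EntityResolver: the event handler interfaces
DefaultHandler: implements all four handler interfaces above with default method bodies; in practice you extend DefaultHandler and override only the callbacks you need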
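The objects listed above fit together roughly as follows. This is only a sketch: ElementCounter and countElements are hypothetical names, and it goes through the underlying XMLReader to show that the content handler and the error handler are registered separately, even though one DefaultHandler subclass serves as both.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts the elements in a document.
final class ElementCounter extends DefaultHandler {
    private int count;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        count++; // one startElement event per element
    }

    static int countElements(String xml) {
        ElementCounter handler = new ElementCounter();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance(); // the factory
            XMLReader reader = factory.newSAXParser().getXMLReader();  // the parser's reader
            reader.setContentHandler(handler);                         // ContentHandler role
            reader.setErrorHandler(handler);                           // ErrorHandler role
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.count;
    }

    public static void main(String[] args) {
        System.out.println(countElements("<world><c/><c/></world>"));
    }
}
```

Calling SAXParser.parse(source, handler) directly registers the same handler for all four roles in one step, which is usually the shorter route.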
2. The XML document
This is the same XML file used in the earlier post on DOM parsing:
<?xml version="1.0" encoding="UTF-8"?>
<world>
<comuntry id="1">
<name>China</name>
<capital>Beijing</capital>
<population>1234</population>
<area>960</area>
</comuntry>
<comuntry id="2">
<name id="">America</name>
<capital>Washington</capital>
<population>234</population>
<area>900</area>
</comuntry>
<comuntry id="3">
<name >Japan</name>
<capital>Tokyo</capital>
<population>234</population>
<area>60</area>
</comuntry>
<comuntry id="4">
<name >Russia</name>
<capital>Moscow</capital>
<population>34</population>
<area>1960</area>
</comuntry>
</world>
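As a rough idea of what a handler for this document could look like, the sketch below maps each id attribute to the text of the nested name element. The class CountryNames is hypothetical, the embedded XML is a trimmed copy of the file above, and the element name comuntry is spelled exactly as in that file.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original post.
final class CountryNames extends DefaultHandler {
    // Trimmed copy of the sample document (two of the four entries).
    static final String SAMPLE =
            "<world>"
            + "<comuntry id=\"1\"><name>China</name><capital>Beijing</capital></comuntry>"
            + "<comuntry id=\"2\"><name id=\"\">America</name><capital>Washington</capital></comuntry>"
            + "</world>";

    private final Map<String, String> namesById = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    private String currentId;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("comuntry".equals(qName)) {
            currentId = atts.getValue("id"); // remember which entry we are inside
        }
        text.setLength(0); // start collecting text for this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("name".equals(qName) && currentId != null) {
            namesById.put(currentId, text.toString().trim());
        }
    }

    static Map<String, String> parse(String xml) {
        CountryNames handler = new CountryNames();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.namesById;
    }

    public static void main(String[] args) {
        System.out.println(parse(SAMPLE));
    }
}
```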
3. A look at the main interfaces
EntityResolver:
package org.xml.sax;

import java.io.IOException;

public interface EntityResolver {

/**
* Allow the application to resolve external entities.
*
* <p>The parser will call this method before opening any external
* entity except the top-level document entity. Such entities include
* the external DTD subset and external parameter entities referenced
* within the DTD (in either case, only if the parser reads external
* parameter entities), and external general entities referenced
* within the document element (if the parser reads external general
* entities). The application may request that the parser locate
* the entity itself, that it use an alternative URI, or that it
* use data provided by the application (as a character or byte
* input stream).</p>
*
* <p>Application writers can use this method to redirect external
* system identifiers to secure and/or local URIs, to look up
* public identifiers in a catalogue, or to read an entity from a
* database or other input source (including, for example, a dialog
* box). Neither XML nor SAX specifies a preferred policy for using
* public or system IDs to resolve resources. However, SAX specifies
* how to interpret any InputSource returned by this method, and that
* if none is returned, then the system ID will be dereferenced as
* a URL. </p>
*
* <p>If the system identifier is a URL, the SAX parser must
* resolve it fully before reporting it to the application.</p>
*
* @param publicId The public identifier of the external entity
* being referenced, or null if none was supplied.
* @param systemId The system identifier of the external entity
* being referenced.
* @return An InputSource object describing the new input source,
* or null to request that the parser open a regular
* URI connection to the system identifier.
* @exception org.xml.sax.SAXException Any SAX exception, possibly
* wrapping another exception.
* @exception java.io.IOException A Java-specific IO exception,
* possibly the result of creating a new InputStream
* or Reader for the InputSource.
* @see org.xml.sax.InputSource
*/
public abstract InputSource resolveEntity (String publicId,
String systemId)
throws SAXException, IOException;

}
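One common use of resolveEntity is to keep the parser from fetching an external DTD. The sketch below is an illustration under assumptions: OfflineResolverDemo is a made-up class, and it answers every request with an empty input source, which is only acceptable when the document does not depend on declarations from the DTD.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original post.
final class OfflineResolverDemo {

    /** Parses the XML and returns the system IDs the parser asked the application to resolve. */
    static List<String> parse(String xml) {
        List<String> requested = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                requested.add(systemId);
                // Supply an empty replacement instead of letting the parser open the URI.
                return new InputSource(new StringReader(""));
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return requested;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE r SYSTEM \"http://example.invalid/r.dtd\"><r/>";
        System.out.println(parse(xml));
    }
}
```

Returning null instead would ask the parser to open the system identifier itself, as the Javadoc above describes.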
DTDHandler:
package org.xml.sax;

/**
* Receive notification of basic DTD-related events.
*
* <blockquote>
* <em>This module, both source code and documentation, is in the
* Public Domain, and comes with <strong>NO WARRANTY</strong>.</em>
* See <a href='http://www.saxproject.org'>http://www.saxproject.org</a>
* for further information.
* </blockquote>
*
* <p>If a SAX application needs information about notations and
* unparsed entities, then the application implements this
* interface and registers an instance with the SAX parser using
* the parser's setDTDHandler method. The parser uses the
* instance to report notation and unparsed entity declarations to
* the application.</p>
*
* <p>Note that this interface includes only those DTD events that
* the XML recommendation <em>requires</em> processors to report:
* notation and unparsed entity declarations.</p>
*
* <p>The SAX parser may report these events in any order, regardless
* of the order in which the notations and unparsed entities were
* declared; however, all DTD events must be reported after the
* document handler's startDocument event, and before the first
* startElement event.
* (If the {@link org.xml.sax.ext.LexicalHandler LexicalHandler} is
* used, these events must also be reported before the endDTD event.)
* </p>
*
* <p>It is up to the application to store the information for
* future use (perhaps in a hash table or object tree).
* If the application encounters attributes of type "NOTATION",
* "ENTITY", or "ENTITIES", it can use the information that it
* obtained through this interface to find the entity and/or
* notation corresponding with the attribute value.</p>
*
* @since SAX 1.0
* @author David Megginson
* @see org.xml.sax.XMLReader#setDTDHandler
*/
public interface DTDHandler {

/**
* Receive notification of a notation declaration event.
*
* <p>It is up to the application to record the notation for later
* reference, if necessary;
* notations may appear as attribute values and in unparsed entity
* declarations, and are sometime used with processing instruction
* target names.</p>
*
* <p>At least one of publicId and systemId must be non-null.
* If a system identifier is present, and it is a URL, the SAX
* parser must resolve it fully before passing it to the
* application through this event.</p>
*
* <p>There is no guarantee that the notation declaration will be
* reported before any unparsed entities that use it.</p>
*
* @param name The notation name.
* @param publicId The notation's public identifier, or null if
* none was given.
* @param systemId The notation's system identifier, or null if
* none was given.
* @exception org.xml.sax.SAXException Any SAX exception, possibly
* wrapping another exception.
* @see #unparsedEntityDecl
* @see org.xml.sax.Attributes
*/
public abstract void notationDecl (String name,
String publicId,
String systemId)
throws SAXException;

/**
* Receive notification of an unparsed entity declaration event.
*
* <p>Note that the notation name corresponds to a notation
* reported by the {@link #notationDecl notationDecl} event.
* It is up to the application to record the entity for later
* reference, if necessary;
* unparsed entities may appear as attribute values.
* </p>
*
* <p>If the system identifier is a URL, the parser must resolve it
* fully before passing it to the application.</p>
*
* @exception org.xml.sax.SAXException Any SAX exception, possibly
* wrapping another exception.
* @param name The unparsed entity's name.
* @param publicId The entity's public identifier, or null if none
* was given.
* @param systemId The entity's system identifier.
* @param notationName The name of the associated notation.
* @see #notationDecl
* @see org.xml.sax.Attributes
*/
public abstract void unparsedEntityDecl (String name,
String publicId,
String systemId,
String notationName)
throws SAXException;

}
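The two DTDHandler callbacks can be tried out with a small internal DTD subset. This is a sketch with assumed names (DtdDeclarations, collect); it records only the declaration names, since the reported system identifiers may be resolved by the parser before they reach the application.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original post.
final class DtdDeclarations extends DefaultHandler {
    private final List<String> seen = new ArrayList<>();

    @Override
    public void notationDecl(String name, String publicId, String systemId) {
        seen.add("notation:" + name);
    }

    @Override
    public void unparsedEntityDecl(String name, String publicId,
                                   String systemId, String notationName) {
        seen.add("entity:" + name + "->" + notationName);
    }

    static List<String> collect(String xml) {
        DtdDeclarations handler = new DtdDeclarations();
        try {
            // SAXParser.parse registers the DefaultHandler as the DTD handler too.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.seen;
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE r ["
                + "<!NOTATION gif PUBLIC \"image/gif\">"
                + "<!ENTITY pic SYSTEM \"http://example.invalid/pic.gif\" NDATA gif>"
                + "]><r/>";
        System.out.println(collect(xml));
    }
}
```

Because the Javadoc allows these events in any order, code that links an entity to its notation should store both and match them up afterwards.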
ContentHandler:
package org.xml.sax;

/**
* Receive notification of the logical content of a document.
*
* <blockquote>
* <em>This module, both source code and documentation, is in the
* Public Domain, and comes with <strong>NO WARRANTY</strong>.</em>
* See <a href='http://www.saxproject.org'>http://www.saxproject.org</a>
* for further information.
* </blockquote>
*
* <p>This is the main interface that most SAX applications
* implement: if the application needs to be informed of basic parsing
* events, it implements this interface and registers an instance with
* the SAX parser using the {@link org.xml.sax.XMLReader#setContentHandler
* setContentHandler} method. The parser uses the instance to report
* basic document-related events like the start and end of elements
* and character data.</p>
*
* <p>The order of events in this interface is very important, and
* mirrors the order of information in the document itself. For
* example, all of an element's content (character data, processing
* instructions, and/or subelements) will appear, in order, between
* the startElement event and the corresponding endElement event.</p>
*
* <p>This interface is similar to the now-deprecated SAX 1.0
* DocumentHandler interface, but it adds support for Namespaces
* and for reporting skipped entities (in non-validating XML
* processors).</p>
*
* <p>Implementors should note that there is also a
* <code>ContentHandler</code> class in the <code>java.net</code>
* package; that means that it's probably a bad idea to do</p>
*
* <pre>import java.net.*;
* import org.xml.sax.*;
* </pre>
*
* <p>In fact, "import ...*" is usually a sign of sloppy programming
* anyway, so the user should consider this a feature rather than a
* bug.</p>
*
* @since SAX 2.0
* @author David Megginson
* @see org.xml.sax.XMLReader
* @see org.xml.sax.DTDHandler
* @see org.xml.sax.ErrorHandler
*/
public interface ContentHandler {

/**
* Receive an object for locating the origin of SAX document events.
*
* <p>SAX parsers are strongly encouraged (though not absolutely
* required) to supply a locator: if it does so, it must supply
* the locator to the application by invoking this method before
* invoking any of the other methods in the ContentHandler
* interface.</p>
*
* <p>The locator allows the application to determine the end
* position of any document-related event, even if the parser is
* not reporting an error. Typically, the application will
* use this information for reporting its own errors (such as
* character content that does not match an application's
* business rules). The information returned by the locator
* is probably not sufficient for use with a search engine.</p>
*
* <p>Note that the locator will return correct information only
* during the invocation SAX event callbacks after
* {@link #startDocument startDocument} returns and before
* {@link #endDocument endDocument} is called. The
* application should not attempt to use it at any other time.</p>
*
* @param locator an object that can return the location of
* any SAX document event
* @see org.xml.sax.Locator
*/
public void setDocumentLocator (Locator locator);

/**
* Receive notification of the beginning of a document.
*
* <p>The SAX parser will invoke this method only once, before any
* other event callbacks (except for {@link #setDocumentLocator
* setDocumentLocator}).</p>
*
* @throws org.xml.sax.SAXException any SAX exception, possibly
* wrapping another exception
* @see #endDocument
*/
public void startDocument ()
throws SAXException;

/**
* Receive notification of the end of a document.
*
* <p><strong>There is an apparent contradiction between the
* documentation for this method and the documentation for {@link
* org.xml.sax.ErrorHandler#fatalError}. Until this ambiguity is
* resolved in a future major release, clients should make no
* assumptions about whether endDocument() will or will not be
* invoked when the parser has reported a fatalError() or thrown
* an exception.</strong></p>
*
* <p>The SAX parser will invoke this method only once, and it will
* be the last method invoked during the parse. The parser shall
* not invoke this method until it has either abandoned parsing
* (because of an unrecoverable error) or reached the end of
* input.</p>
*
* @throws org.xml.sax.SAXException any SAX exception, possibly
* wrapping another exception
* @see #startDocument
*/
public void endDocument()
throws SAXException;

/**
* Begin the scope of a prefix-URI Namespace mapping.
*
* <p>The information from this event is not necessary for
* normal Namespace processing: the SAX XML reader will
* automatically replace prefixes for element and attribute
* names when the <code>http://xml.org/sax/features/namespaces</code>
* feature is <var>true</var> (the default).</p>
*
* <p>There are cases, however, when applications need to
* use prefixes in character data or in attribute values,
* where they cannot safely be expanded automatically; the
* start/endPrefixMapping event supplies the information
* to the application to expand prefixes in those contexts
* itself, if necessary.</p>
*
* <p>Note that start/endPrefixMapping events are not
* guaranteed to be properly nested relative to each other:
* all startPrefixMapping events will occur immediately before the
* corresponding {@link #startElement startElement} event,
* and all {@link #endPrefixMapping endPrefixMapping}
* events will occur immediately after the corresponding
* {@link #endElement endElement} event,
* but their order is not otherwise
* guaranteed.</p>
*
* <p>There should never be start/endPrefixMapping events for the
* "xml" prefix, since it is predeclared and immutable.</p>
*
* @param prefix the Namespace prefix being declared.
* An empty string is used for the default element namespace,
* which has no prefix.
* @param uri the Namespace URI the prefix is mapped to
* @throws org.xml.sax.SAXException the client may throw
* an exception during processing
* @see #endPrefixMapping
* @see #startElement
*/
public void startPrefixMapping (String prefix, String uri)
throws SAXException;

/**
* End the scope of a prefix-URI mapping.
*
* <p>See {@link #startPrefixMapping startPrefixMapping} for
* details. These events will always occur immediately after the
* corresponding {@link #endElement endElement} event, but the order of
* {@link #endPrefixMapping endPrefixMapping} events is not otherwise
* guaranteed.</p>
*
* @param prefix the prefix that was being mapped.
* This is the empty string when a default mapping scope ends.
* @throws org.xml.sax.SAXException the client may throw
* an exception during processing
* @see #startPrefixMapping
* @see #endElement
*/
public void endPrefixMapping (String prefix)
throws SAXException;

/**
* Receive notification of the beginning of an element.
*
* <p>The Parser will invoke this method at the beginning of every
* element in the XML document; there will be a corresponding
* {@link #endElement endElement} event for every startElement event
* (even when the element is empty). All of the element's content will be
* reported, in order, before the corresponding endElement
* event.</p>
*
* <p>This event allows up to three name components for each
* element:</p>
*
* <ol>
* <li>the Namespace URI;</li>
* <li>the local name; and</li>
* <li>the qualified (prefixed) name.</li>
* </ol>
*
* <p>Any or all of these may be provided, depending on the
* values of the <var>http://xml.org/sax/features/namespaces</var>
* and the <var>http://xml.org/sax/features/namespace-prefixes</var>
* properties:</p>
*
* <ul>
* <li>the Namespace URI and local name are required when
* the namespaces property is <var>true</var> (the default), and are
* optional when the namespaces property is <var>false</var> (if one is
* specified, both must be);</li>
* <li>the qualified name is required when the namespace-prefixes property
* is <var>true</var>, and is optional when the namespace-prefixes property
* is <var>false</var> (the default).</li>
* </ul>
*
* <p>Note that the attribute list provided will contain only
* attributes with explicit values (specified or defaulted):
* #IMPLIED attributes will be omitted. The attribute list
* will contain attributes used for Namespace declarations
* (xmlns* attributes) only if the
* <code>http://xml.org/sax/features/namespace-prefixes</code>
* property is true (it is false by default, and support for a
* true value is optional).</p>
*
* <p>Like {@link #characters characters()}, attribute values may have
* characters that need more than one <code>char</code> value. </p>
*
* @param uri the Namespace URI, or the empty string if the
* element has no Namespace URI or if Namespace
* processing is not being performed
* @param localName the local name (without prefix), or the
* empty string if Namespace processing is not being
* performed
* @param qName the qualified name (with prefix), or the
* empty string if qualified names are not available
* @param atts the attributes attached to the element. If
* there are no attributes, it shall be an empty
* Attributes object. The value of this object after
* startElement returns is undefined
* @throws org.xml.sax.SAXException any SAX exception, possibly
* wrapping another exception
* @see #endElement
* @see org.xml.sax.Attributes
* @see org.xml.sax.helpers.AttributesImpl
*/
public void startElement (String uri, String localName,
String qName, Attributes atts)
throws SAXException;

/**
* Receive notification of the end of an element.
*
* <p>The SAX parser will invoke this method at the end of every
* element in the XML document; there will be a corresponding
* {@link #startElement startElement} event for every endElement
* event (even when the element is empty).</p>
*
* <p>For information on the names, see startElement.</p>
*
* @param uri the Namespace URI, or the empty string if the
* element has no Namespace URI or if Namespace
* processing is not being performed
* @param localName the local name (without prefix), or the
* empty string if Namespace processing is not being
* performed
* @param qName the qualified XML name (with prefix), or the
* empty string if qualified names are not available
* @throws org.xml.sax.SAXException any SAX exception, possibly
* wrapping another exception
*/
public void endElement (String uri, String localName,
String qName)
throws SAXException;

/**
* Receive notification of character data.
*
* <p>The Parser will call this method to report each chunk of
* character data. SAX parsers may return all contiguous character
* data in a single chunk, or they may split it into several
* chunks; however, all of the characters in any single event
* must come from the same external entity so that the Locator
* provides useful information.</p>
*
* <p>The application must not attempt to read from the array
* outside of the specified range.</p>
*
* <p>Individual characters may consist of more than one Java
* <code>char</code> value. There are two important cases where this
* happens, because characters can't be represented in just sixteen bits.
* In one case, characters are represented in a <em>Surrogate Pair</em>,
* using two special Unicode values. Such characters are in the so-called
* "Astral Planes", with a code point above U+FFFF. A second case involves
* composite characters, such as a base character combining with one or
* more accent characters. </p>
*
* <p> Your code should not assume that algorithms using
* <code>char</code>-at-a-time idioms will be working in character
* units; in some cases they will split characters. This is relevant
* wherever XML permits arbitrary characters, such as attribute values,
* processing instruction data, and comments as well as in data reported
* from this method. It's also generally relevant whenever Java code
* manipulates internationalized text; the issue isn't unique to XML.</p>
*
* <p>Note that some parsers will report whitespace in element
* content using the {@link #ignorableWhitespace ignorableWhitespace}
* method rather than this one (validating parsers <em>must</em>
* do so).</p>
*
* @param ch the characters from the XML document
* @param start the start position in the array
* @param length the number of characters to read from the array
* @throws org.xml.sax.SAXException any SAX exception, possibly
* wrapping another exception
* @see #ignorableWhitespace
* @see org.xml.sax.Locator
*/
public void characters (char ch[], int start, int length)
throws SAXException;

/**
* Receive notification of ignorable whitespace in element content.
*
* <p>Validating Parsers must use this method to report each chunk
* of whitespace in element content (see the W3C XML 1.0
* recommendation, section 2.10): non-validating parsers may also
* use this method if they are capable of parsing and using
* content models.</p>
*
* <p>SAX parsers may return all contiguous whitespace in a single
* chunk, or they may split it into several chunks; however, all of
* the characters in any single event must come from the same
* external entity, so that the Locator provides useful
* information.</p>
*
* <p>The application must not attempt to read from the array
* outside of the specified range.</p>
*
* @param ch the characters from the XML document
* @param start the start position in the array
* @param length the number of characters to read from the array
* @throws org.xml.sax.SAXException any SAX exception, possibly
* wrapping another exception
* @see #characters
*/
public void ignorableWhitespace (char ch[], int start, int length)
throws SAXException;

/**
* Receive notification of a processing instruction.
*
* <p>The Parser will invoke this method once for each processing
* instruction found: note that processing instructions may occur
* before or after the main document element.</p>
*
* <p>A SAX parser must never report an XML declaration (XML 1.0,
* section 2.8) or a text declaration (XML 1.0, section 4.3.1)
* using this method.</p>
*
* <p>Like {@link #characters characters()}, processing instruction
* data may have characters that need more than one <code>char</code>
* value. </p>
*
* @param target the processing instruction target
* @param data the processing instruction data, or null if
* none was supplied. The data does not include any
* whitespace separating it from the target
* @throws org.xml.sax.SAXException any SAX exception, possibly
* wrapping another exception
*/
public void processingInstruction (String target, String data)
throws SAXException;

/**
* Receive notification of a skipped entity.
* This is not called for entity references within markup constructs
* such as element start tags or markup declarations. (The XML
* recommendation requires reporting skipped external entities.
* SAX also reports internal entity expansion/non-expansion, except
* within markup constructs.)
*
* <p>The Parser will invoke this method each time the entity is
* skipped. Non-validating processors may skip entities if they
* have not seen the declarations (because, for example, the
* entity was declared in an external DTD subset). All processors
* may skip external entities, depending on the values of the
* <code>http://xml.org/sax/features/external-general-entities</code>
* and the
* <code>http://xml.org/sax/features/external-parameter-entities</code>
* properties.</p>
*
* @param name the name of the skipped entity. If it is a
* parameter entity, the name will begin with '%', and if
* it is the external DTD subset, it will be the string
* "[dtd]"
* @throws org.xml.sax.SAXException any SAX exception, possibly
* wrapping another exception
*/
public void skippedEntity (String name)
throws SAXException;
}
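The setDocumentLocator contract described above can be sketched like this: keep the Locator when the parser supplies one and query it inside later callbacks. ElementLines is a hypothetical class, and the sketch assumes the parser in use actually provides a locator, which the interface encourages but does not require.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original post.
final class ElementLines extends DefaultHandler {
    private final Map<String, Integer> lineByElement = new LinkedHashMap<>();
    private Locator locator; // stays null if the parser supplies none

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator; // called before startDocument
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (locator != null) {
            // The locator reports where the current event ends.
            lineByElement.put(qName, locator.getLineNumber());
        }
    }

    static Map<String, Integer> parse(String xml) {
        ElementLines handler = new ElementLines();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler.lineByElement;
    }

    public static void main(String[] args) {
        System.out.println(parse("<a>\n\n<b/>\n</a>"));
    }
}
```

The locator is only meaningful during a callback, so the sketch copies the line number out immediately rather than holding on to the position for later.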
ErrorHandler:
package org.xml.sax;

/**
* Basic interface for SAX error handlers.
*
* <blockquote>
* <em>This module, both source code and documentation, is in the
* Public Domain, and comes with <strong>NO WARRANTY</strong>.</em>
* See <a href='http://www.saxproject.org'>http://www.saxproject.org</a>
* for further information.
* </blockquote>
*
* <p>If a SAX application needs to implement customized error
* handling, it must implement this interface and then register an
* instance with the XML reader using the
* {@link org.xml.sax.XMLReader#setErrorHandler setErrorHandler}
* method. The parser will then report all errors and warnings
* through this interface.</p>
*
* <p><strong>WARNING:</strong> If an application does <em>not</em>
* register an ErrorHandler, XML parsing errors will go unreported,
* except that <em>SAXParseException</em>s will be thrown for fatal errors.
* In order to detect validity errors, an ErrorHandler that does something
* with {@link #error error()} calls must be registered.</p>
*
* <p>For XML processing errors, a SAX driver must use this interface
* in preference to throwing an exception: it is up to the application
* to decide whether to throw an exception for different types of
* errors and warnings. Note, however, that there is no requirement that
* the parser continue to report additional errors after a call to
* {@link #fatalError fatalError}. In other words, a SAX driver class
* may throw an exception after reporting any fatalError.
* Also parsers may throw appropriate exceptions for non-XML errors.
* For example, {@link XMLReader#parse XMLReader.parse()} would throw
* an IOException for errors accessing entities or the document.</p>
*
* @since SAX 1.0
* @author David Megginson
* @see org.xml.sax.XMLReader#setErrorHandler
* @see org.xml.sax.SAXParseException
*/
public interface ErrorHandler {

/**
* Receive notification of a warning.
*
* <p>SAX parsers will use this method to report conditions that
* are not errors or fatal errors as defined by the XML
* recommendation. The default behaviour is to take no
* action.</p>
*
* <p>The SAX parser must continue to provide normal parsing events
* after invoking this method: it should still be possible for the
* application to process the document through to the end.</p>
*
* <p>Filters may use this method to report other, non-XML warnings
* as well.</p>
*
* @param exception The warning information encapsulated in a
* SAX parse exception.
* @exception org.xml.sax.SAXException Any SAX exception, possibly
* wrapping another exception.
* @see org.xml.sax.SAXParseException
*/
public abstract void warning (SAXParseException exception)
throws SAXException;

/**
* Receive notification of a recoverable error.
*
* <p>This corresponds to the definition of "error" in section 1.2
* of the W3C XML 1.0 Recommendation. For example, a validating
* parser would use this callback to report the violation of a
* validity constraint. The default behaviour is to take no
* action.</p>
*
* <p>The SAX parser must continue to provide normal parsing
* events after invoking this method: it should still be possible
* for the application to process the document through to the end.
* If the application cannot do so, then the parser should report
* a fatal error even if the XML recommendation does not require
* it to do so.</p>
*
* <p>Filters may use this method to report other, non-XML errors
* as well.</p>
*
* @param exception The error information encapsulated in a
* SAX parse exception.
* @exception org.xml.sax.SAXException Any SAX exception, possibly
* wrapping another exception.
* @see org.xml.sax.SAXParseException
*/
public abstract void error (SAXParseException exception)
throws SAXException;

/**
* Receive notification of a non-recoverable error.
*
* <p><strong>There is an apparent contradiction between the
* documentation for this method and the documentation for {@link
* org.xml.sax.ContentHandler#endDocument}. Until this ambiguity
* is resolved in a future major release, clients should make no
* assumptions about whether endDocument() will or will not be
* invoked when the parser has reported a fatalError() or thrown
* an exception.</strong></p>
*
* <p>This corresponds to the definition of "fatal error" in
* section 1.2 of the W3C XML 1.0 Recommendation. For example, a
* parser would use this callback to report the violation of a
* well-formedness constraint.</p>
*
* <p>The application must assume that the document is unusable
* after the parser has invoked this method, and should continue
* (if at all) only for the sake of collecting additional error
* messages: in fact, SAX parsers are free to stop reporting any
* other events once this method has been invoked.</p>
*
* @param exception The error information encapsulated in a
* SAX parse exception.
* @exception org.xml.sax.SAXException Any SAX exception, possibly
* wrapping another exception.
* @see org.xml.sax.SAXParseException
*/
public abstract void fatalError (SAXParseException exception)
throws SAXException;

}
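A small sketch of the three severity levels in use: the handler below records every report and rethrows fatal errors so the caller can stop. CollectingErrors and check are assumed names for this illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of the original post.
final class CollectingErrors extends DefaultHandler {
    private final List<String> reports = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        reports.add("warning: " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        reports.add("error: " + e.getMessage()); // recoverable, parsing continues
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        reports.add("fatalError: " + e.getMessage());
        throw e; // the document is unusable from here on
    }

    /** Parses the XML and returns everything that was reported through the error callbacks. */
    static List<String> check(String xml) {
        CollectingErrors handler = new CollectingErrors();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Already recorded by fatalError; nothing more to do.
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException("could not run the parser", e);
        }
        return handler.reports;
    }

    public static void main(String[] args) {
        System.out.println(check("<a><b></a>")); // mismatched tags: not well-formed
    }
}
```

Validity problems arrive through error() only when validation is switched on; a well-formedness problem such as the mismatched tag here always arrives through fatalError().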
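Those are the sources of the four basic event-handling interfaces; reading the Javadoc shows what each callback is responsible for.
4. Implementing SAX parsing. There are two parts: defining the parsing rules (the event handler) and reading the file.
The event handler, MyHandler.java: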
import java.io.IOException;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

public class MyHandler extends DefaultHandler {

    /**
     * Begins the scope of a prefix-URI namespace mapping.
     * This event is not needed for normal namespace processing: when the
     * http://xml.org/sax/features/namespaces feature is true (the default),
     * the SAX XML reader replaces prefixes in element and attribute names automatically.
     *
     * @param prefix the namespace prefix being declared
     * @param uri    the namespace URI the prefix is mapped to
     */
    @Override
    public void startPrefixMapping(String prefix, String uri)
            throws SAXException {
        System.out.println("(startPrefixMapping)start prefix_mapping : xmlns:" + prefix + " = "
                + "\"" + uri + "\"");
    }

    /**
     * Ends the scope of a prefix-URI mapping.
     *
     * @param prefix the prefix that was being mapped
     */
    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        System.out.println("(endPrefixMapping)end prefix_mapping : " + prefix);
    }

    /**
     * Receives notification of the end of the document.
     */
    @Override
    public void endDocument() throws SAXException {
        System.out.println("(endDocument)doument is ended");
    }

    /**
     * Receives notification of the end of an element.
     *
     * @param uri       the namespace URI of the element
     * @param localName the local name of the element (without prefix)
     * @param qName     the qualified name of the element (with prefix)
     */
    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        System.out.println("(endElement)end element : " + qName + "(" + uri + ")");
    }

    /**
     * Receives notification of ignorable whitespace in element content.
     *
     * @param ch     the characters from the XML document
     * @param start  the start position in the array
     * @param length the number of characters to read from the array
     */
    @Override
    public void ignorableWhitespace(char[] ch, int start, int length)
            throws SAXException {
        // Escape control characters so the whitespace is visible in the output.
        StringBuilder buffer = new StringBuilder();
        for (int i = start; i < start + length; i++) {
            switch (ch[i]) {
                case '\\': buffer.append("\\\\"); break;
                case '\r': buffer.append("\\r"); break;
                case '\n': buffer.append("\\n"); break;
                case '\t': buffer.append("\\t"); break;
                case '\"': buffer.append("\\\""); break;
                default: buffer.append(ch[i]);
            }
        }
        System.out.println("(ignorableWhitespace)ignorable whitespace(" + length + "): " + buffer.toString());
    }

    /**
     * Receives an object for locating the origin of SAX document events.
     *
     * @param locator an object that can return the location of any SAX document event
     */
    @Override
    public void setDocumentLocator(Locator locator) {
        System.out.println("(setDocumentLocator)set document_locator : (lineNumber = " + locator.getLineNumber()
                + ",columnNumber = " + locator.getColumnNumber()
                + ",systemId = " + locator.getSystemId()
                + ",publicId = " + locator.getPublicId() + ")");
    }

    /**
     * Receives notification of the beginning of the document.
     */
    @Override
    public void startDocument() throws SAXException {
        System.out.println("(startDocument)document is startting");
    }

    /**
     * Receives notification of the beginning of an element.
     *
     * @param uri        the namespace URI of the element
     * @param localName  the local name of the element (without prefix)
     * @param qName      the qualified name of the element (with prefix)
     * @param attributes the attributes attached to the element
     */
    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
        System.out.println("(startElement)start element : " + qName + "(" + uri + ")");
    }

    /**
     * Receives notification of a notation declaration event.
     *
     * @param name     the notation name
     * @param publicId the notation's public identifier, or null if none was given
     * @param systemId the notation's system identifier, or null if none was given
     */
    @Override
    public void notationDecl(String name, String publicId, String systemId)
            throws SAXException {
        // Each label is paired with the matching argument (the two were swapped before).
        System.out.println("(notationDecl)notation declare : (name = " + name
                + ",systemId = " + systemId
                + ",publicId = " + publicId + ")");
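    }

    /**
     * Allows the application to resolve external entities.
     * The parser calls this method before opening any external entity
     * except the top-level document entity.
     *
     * @param publicId the public identifier of the external entity being referenced,
     *                 or null if none was supplied
     * @param systemId the system identifier of the external entity being referenced
     * @return an InputSource describing the new input source, or null to request
     *         that the parser open a regular URI connection to the system identifier
     */
    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws IOException, SAXException {
        return super.resolveEntity(publicId, systemId);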
    }

    /**
     * Receives notification of a skipped entity.
     *
     * @param name the name of the skipped entity. If it is a parameter entity,
     *             the name begins with '%'; if it is the external DTD subset,
     *             it is the string "[dtd]"
     */
    @Override
    public void skippedEntity(String name) throws SAXException {
        System.out.println("(skippedEntity)the name of the skipped entity : " + name);
    }

    /**
     * Receives notification of an unparsed entity declaration event.
     *
     * @param name         the unparsed entity's name
     * @param publicId     the entity's public identifier, or null if none was given
     * @param systemId     the entity's system identifier
     * @param notationName the name of the associated notation
     */
    @Override
    public void unparsedEntityDecl(String name, String publicId,
            String systemId, String notationName) throws SAXException {
        // Each label is paired with the matching argument (the two were swapped before).
        System.out.println("(unparsedEntityDecl)unparsed entity declare : (name = " + name
                + ",systemId = " + systemId
                + ",publicId = " + publicId
                + ",notationName = " + notationName + ")");
    }

    /**
     * Receives notification of a processing instruction.
     * Parameters:
     * target : the processing instruction target.
     * data   : the processing instruction data, or null if none was given.
     */
    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        System.out.println("(processingInstruction)process instruction : (target = \""
                +target+"\",data = \""+data+"\")");
    }

    /**
     * Receives notification of character data.
     * The range ch[start] .. ch[start + length - 1] corresponds to the node
     * value (nodeValue) of a Text node in DOM. Note that a parser may deliver
     * one contiguous run of text in several calls.
     */
    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        // Escape control characters so that whitespace shows up in the output.
        StringBuilder buffer = new StringBuilder();
        for (int i = start; i < start + length; i++) {
            switch (ch[i]) {
                case '\\': buffer.append("\\\\"); break;
                case '\r': buffer.append("\\r"); break;
                case '\n': buffer.append("\\n"); break;
                case '\t': buffer.append("\\t"); break;
                case '\"': buffer.append("\\\""); break;
                default: buffer.append(ch[i]);
            }
        }
        System.out.println("(characters)characters("+length+"): "+buffer.toString());
    }

    /**
     * Handles a recoverable error.
     */
    @Override
    public void error(SAXParseException e) throws SAXException {
        System.err.println("(error)Error ("+e.getLineNumber()+","
                +e.getColumnNumber()+") : "+e.getMessage());
    }

    /**
     * Handles a fatal, non-recoverable error.
     */
    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        System.err.println("(fatalError)FatalError ("+e.getLineNumber()+","
                +e.getColumnNumber()+") : "+e.getMessage());
    }

    /**
     * Handles a warning.
     */
    @Override
    public void warning(SAXParseException e) throws SAXException {
        System.err.println("(warning)("+e.getLineNumber()+","
                +e.getColumnNumber()+") : "+e.getMessage());
    }
}
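The startElement callback above receives an Attributes argument but only prints the element name, so the id attributes in world.xml never show up. As a minimal sketch of how they could be read, the hypothetical AttributeLister below (the class and its list helper are illustrative names, not part of the original MyHandler) walks the attribute list by index and records each entry:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration: records every attribute seen in
// startElement as "element@attribute=value".
class AttributeLister extends DefaultHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        // Attributes is index-based: getLength() entries, numbered from 0.
        for (int i = 0; i < attributes.getLength(); i++) {
            seen.add(qName + "@" + attributes.getQName(i) + "=" + attributes.getValue(i));
        }
    }

    // Parses an XML string and returns the recorded attribute descriptions.
    static List<String> list(String xml) throws Exception {
        AttributeLister handler = new AttributeLister();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.seen;
    }

    public static void main(String[] args) throws Exception {
        // A shortened copy of world.xml, inlined so the sketch needs no file.
        String xml = "<world><comuntry id=\"1\"><name>China</name></comuntry>"
                + "<comuntry id=\"2\"><name id=\"\">America</name></comuntry></world>";
        list(xml).forEach(System.out::println);
    }
}
```

An attribute can also be looked up directly by name with attributes.getValue("id"), which returns null when the element does not carry it.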
Starting the parse:
SAXParse.java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

/**
 * 1. Obtain a SAX parser factory instance.
 * 2. Obtain a SAX parser from the factory.
 * 3. Wrap the XML document in an input stream so that the SAX parser can read it.
 * 4. Parse the XML document.
 */
public class SAXParse {

    public static void main(String[] args) {
        // Obtain the SAX parser factory.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        // try-with-resources closes the file stream once parsing has finished.
        try (InputStream in = new FileInputStream("world.xml")) {
            // Create the parser and get its underlying XMLReader.
            SAXParser parser = factory.newSAXParser();
            XMLReader xmlReader = parser.getXMLReader();
            InputSource input = new InputSource(in);
            // Register one handler for all four roles; with only
            // setContentHandler, the error, DTD and entity callbacks of
            // MyHandler would never be invoked.
            MyHandler handler = new MyHandler();
            xmlReader.setContentHandler(handler);
            xmlReader.setErrorHandler(handler);
            xmlReader.setDTDHandler(handler);
            xmlReader.setEntityResolver(handler);
            xmlReader.parse(input);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
    }
}
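The factory above is used with its default settings, under which namespace processing is off; that is why the uri printed by startElement comes out empty, as in `world()`. The following sketch, under the assumption that the document uses a namespace (the `urn:example:world` URI, the `w` prefix, and the NamespaceDemo class are all made up for illustration), shows what the callback receives once setNamespaceAware(true) is called on the factory:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original example.
class NamespaceDemo {
    // Parses xml with a namespace-aware parser and returns one
    // "uri|localName|qName" entry per start tag.
    static List<String> startTags(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // false by default
        SAXParser parser = factory.newSAXParser();
        List<String> tags = new ArrayList<>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes attributes) {
                tags.add(uri + "|" + localName + "|" + qName);
            }
        });
        return tags;
    }

    public static void main(String[] args) throws Exception {
        // The namespace URI and prefix are invented for this example.
        String xml = "<w:world xmlns:w=\"urn:example:world\">"
                + "<w:name>China</w:name></w:world>";
        startTags(xml).forEach(System.out::println);
    }
}
```

With namespace processing on, uri carries the namespace URI and localName the name without its prefix, so a handler can match elements on those two values instead of on qName.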
5. Output:
(setDocumentLocator)set document_locator : (lineNumber = 1,columnNumber = 1,systemId = null,publicId = null)
(startDocument)document is startting
(startElement)start element : world()
(characters)characters(2): \n\t
(startElement)start element : comuntry()
(characters)characters(3): \n\t\t
(startElement)start element : name()
(characters)characters(5): China
(endElement)end element : name()
(characters)characters(3): \n\t\t
(startElement)start element : capital()
(characters)characters(7): Beijing
(endElement)end element : capital()
(characters)characters(3): \n\t\t
(startElement)start element : population()
(characters)characters(4): 1234
(endElement)end element : population()
(characters)characters(3): \n\t\t
(startElement)start element : area()
(characters)characters(3): 960
(endElement)end element : area()
(characters)characters(2): \n\t
(endElement)end element : comuntry()
(characters)characters(2): \n\t
(startElement)start element : comuntry()
(characters)characters(3): \n\t\t
(startElement)start element : name()
(characters)characters(7): America
(endElement)end element : name()
(characters)characters(3): \n\t\t
(startElement)start element : capital()
(characters)characters(10): Washington
(endElement)end element : capital()
(characters)characters(3): \n\t\t
(startElement)start element : population()
(characters)characters(3): 234
(endElement)end element : population()
(characters)characters(3): \n\t\t
(startElement)start element : area()
(characters)characters(3): 900
(endElement)end element : area()
(characters)characters(2): \n\t
(endElement)end element : comuntry()
(characters)characters(2): \n\t
(startElement)start element : comuntry()
(characters)characters(3): \n\t\t
(startElement)start element : name()
(characters)characters(5): Japan
(endElement)end element : name()
(characters)characters(3): \n\t\t
(startElement)start element : capital()
(characters)characters(5): Tokyo
(endElement)end element : capital()
(characters)characters(3): \n\t\t
(startElement)start element : population()
(characters)characters(3): 234
(endElement)end element : population()
(characters)characters(3): \n\t\t
(startElement)start element : area()
(characters)characters(2): 60
(endElement)end element : area()
(characters)characters(2): \n\t
(endElement)end element : comuntry()
(characters)characters(2): \n\t
(startElement)start element : comuntry()
(characters)characters(3): \n\t\t
(startElement)start element : name()
(characters)characters(6): Russia
(endElement)end element : name()
(characters)characters(3): \n\t\t
(startElement)start element : capital()
(characters)characters(6): Moscow
(endElement)end element : capital()
(characters)characters(3): \n\t\t
(startElement)start element : population()
(characters)characters(2): 34
(endElement)end element : population()
(characters)characters(3): \n\t\t
(startElement)start element : area()
(characters)characters(4): 1960
(endElement)end element : area()
(characters)characters(2): \n\t
(endElement)end element : comuntry()
(characters)characters(1): \n
(endElement)end element : world()
(endDocument)doument is ended
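6. That completes the SAX parse. This is only a very simple read-through of the document; a real application needs a handler tailored to its own requirements.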
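As one possible direction for such a tailored handler, the sketch below collects each comuntry element into a map instead of printing events. CountryCollector and its collect helper are hypothetical names, and the map-based representation is just one assumption about what an application might want:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: turns each <comuntry> element into a map holding its
// id attribute and the text of its child elements.
class CountryCollector extends DefaultHandler {
    final List<Map<String, String>> countries = new ArrayList<>();
    private Map<String, String> current; // country being built, or null
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        text.setLength(0); // discard whitespace seen between elements
        if ("comuntry".equals(qName)) {
            current = new LinkedHashMap<>();
            current.put("id", attributes.getValue("id"));
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // One text run may arrive in several calls, so always append.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("comuntry".equals(qName)) {
            countries.add(current);
            current = null;
        } else if (current != null) {
            current.put(qName, text.toString().trim());
        }
        text.setLength(0);
    }

    // Parses an XML string and returns the collected countries.
    static List<Map<String, String>> collect(String xml) throws Exception {
        CountryCollector handler = new CountryCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.countries;
    }

    public static void main(String[] args) throws Exception {
        // A shortened copy of world.xml, inlined so the sketch needs no file.
        String xml = "<world>"
                + "<comuntry id=\"1\"><name>China</name><capital>Beijing</capital></comuntry>"
                + "<comuntry id=\"3\"><name>Japan</name><capital>Tokyo</capital></comuntry>"
                + "</world>";
        System.out.println(collect(xml));
    }
}
```

Because SAX keeps no tree, the handler itself has to remember where it is in the document; here a single current map and a text buffer are enough, while deeper structures would typically need a stack.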