Class SAXReader
- java.lang.Object
-
- org.dom4j.io.SAXReader
-
public class SAXReader extends Object
SAXReader creates a DOM4J tree from SAX parsing events.
The actual SAX parser that is used by this class is configurable, so you can use your favourite SAX parser if you wish. DOM4J comes configured with its own SAX parser, so you do not need to worry about configuring the SAX parser.
To explicitly configure the SAX parser that is used via Java code, you can use a constructor or the setXMLReader(XMLReader) or setXMLReaderClassName(String) methods.
If the parser is not specified explicitly, then the standard SAX policy of using the org.xml.sax.driver system property is used to determine the implementation class of XMLReader.
If the org.xml.sax.driver system property is not defined, then JAXP is used via reflection (so that DOM4J is not explicitly dependent on the JAXP classes) to load the JAXP-configured SAXParser. If there is any error creating a JAXP SAXParser, an informational message is output and the default (Aelfred) SAX parser is used instead.
If you are trying to use JAXP to explicitly set your SAX parser and are experiencing problems, you can turn on verbose error reporting by defining the system property org.dom4j.verbose to be "true", which will output a more detailed description of why JAXP could not find a SAX parser.
For more information on JAXP please go to Sun's Java & XML site.
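As a rough illustration of the lookup order described above, the following sketch uses only JDK classes. It is not DOM4J's actual implementation: the class name ReaderLookupSketch and the method locate() are hypothetical, the driver class is instantiated through plain reflection, and the Aelfred fallback is left out.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical helper; it only illustrates the documented lookup order.
final class ReaderLookupSketch {

    static XMLReader locate() throws Exception {
        // Step 1: honour the standard SAX system property if it is defined.
        String driver = System.getProperty("org.xml.sax.driver");
        if (driver != null && !driver.isEmpty()) {
            return (XMLReader) Class.forName(driver)
                    .getDeclaredConstructor()
                    .newInstance();
        }
        // Step 2: otherwise ask JAXP for its configured SAX parser.
        return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    }
}
```

A reader obtained this way could be handed to setXMLReader(XMLReader) when explicit control over the parser is wanted.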
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
protected static class | SAXReader.SAXEntityResolver
-
Constructor Summary
Constructors
Constructor | Description
SAXReader() | This method internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader().
SAXReader(boolean validating) | This method internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader().
SAXReader(String xmlReaderClassName)
SAXReader(String xmlReaderClassName, boolean validating)
SAXReader(DocumentFactory factory) | This method internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader().
SAXReader(DocumentFactory factory, boolean validating) | This method internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader().
SAXReader(XMLReader xmlReader)
SAXReader(XMLReader xmlReader, boolean validating)
-
Method Summary
Modifier and Type | Method | Description
void | addHandler(String path, ElementHandler handler) | Adds the ElementHandler to be called when the specified path is encountered.
protected void | configureReader(XMLReader reader, DefaultHandler handler) | Configures the XMLReader before use.
protected SAXContentHandler | createContentHandler(XMLReader reader) | Factory method to allow user-derived SAXContentHandler objects to be used.
static SAXReader | createDefault()
protected EntityResolver | createDefaultEntityResolver(String systemId)
protected XMLReader | createXMLReader() | Factory method to allow alternate methods of creating and configuring XMLReader objects.
protected org.dom4j.io.DispatchHandler | getDispatchHandler()
DocumentFactory | getDocumentFactory() | Returns the DocumentFactory used to create document objects.
String | getEncoding() | Returns encoding used for InputSource (null means system default encoding).
EntityResolver | getEntityResolver() | Returns the current entity resolver used to resolve entities.
ErrorHandler | getErrorHandler() | Returns the ErrorHandler used by SAX.
XMLFilter | getXMLFilter() | Returns the SAX filter being used to filter SAX events.
XMLReader | getXMLReader() | Returns the XMLReader used to parse SAX events.
protected XMLReader | installXMLFilter(XMLReader reader) | Installs any XMLFilter objects required to allow the SAX event stream to be filtered and preprocessed before it gets to dom4j.
boolean | isIgnoreComments() | Returns whether we should ignore comments or not.
boolean | isIncludeExternalDTDDeclarations() | Returns whether external DTD declarations should be expanded into the DocumentType object or not.
boolean | isIncludeInternalDTDDeclarations() | Returns whether internal DTD declarations should be expanded into the DocumentType object or not.
boolean | isMergeAdjacentText() | Returns whether adjacent text nodes should be merged together.
boolean | isStringInternEnabled() | Returns whether String interning is enabled or disabled for element & attribute names and namespace URIs.
boolean | isStripWhitespaceText() | Returns whether whitespace between element start and end tags should be ignored.
boolean | isValidating() | Returns the validation mode.
Document | read(File file) | Reads a Document from the given File.
Document | read(InputStream in) | Reads a Document from the given stream using SAX.
Document | read(InputStream in, String systemId) | Reads a Document from the given stream using SAX.
Document | read(Reader reader) | Reads a Document from the given Reader using SAX.
Document | read(Reader reader, String systemId) | Reads a Document from the given Reader using SAX.
Document | read(String systemId) | Reads a Document from the given URL or filename using SAX.
Document | read(URL url) | Reads a Document from the given URL using SAX.
Document | read(InputSource in) | Reads a Document from the given InputSource using SAX.
void | removeHandler(String path) | Removes the ElementHandler from the event based processor, for the specified path.
void | resetHandlers() | Clears out all the existing handlers and the default handler, setting things back as if no handler existed.
void | setDefaultHandler(ElementHandler handler) | When multiple ElementHandler instances have been registered, this will set a default ElementHandler to be called for any path which does NOT have a handler registered.
protected void | setDispatchHandler(org.dom4j.io.DispatchHandler dispatchHandler)
void | setDocumentFactory(DocumentFactory documentFactory) | Sets the DocumentFactory used to create new documents.
void | setEncoding(String encoding) | Sets encoding used for InputSource (null means system default encoding).
void | setEntityResolver(EntityResolver entityResolver) | Sets the entity resolver used to resolve entities.
void | setErrorHandler(ErrorHandler errorHandler) | Sets the ErrorHandler used by the SAX XMLReader.
void | setFeature(String name, boolean value) | Sets a SAX feature on the underlying SAX parser.
void | setIgnoreComments(boolean ignoreComments) | Sets whether we should ignore comments or not.
void | setIncludeExternalDTDDeclarations(boolean include) | Sets whether external DTD declarations should be expanded into the DocumentType object or not.
void | setIncludeInternalDTDDeclarations(boolean include) | Sets whether internal DTD declarations should be expanded into the DocumentType object or not.
void | setMergeAdjacentText(boolean mergeAdjacentText) | Sets whether or not adjacent text nodes should be merged together when parsing.
void | setProperty(String name, Object value) | Allows a SAX property to be set on the underlying SAX parser.
void | setStringInternEnabled(boolean stringInternEnabled) | Sets whether String interning is enabled or disabled for element & attribute names and namespace URIs.
void | setStripWhitespaceText(boolean stripWhitespaceText) | Sets whether whitespace between element start and end tags should be ignored.
void | setValidation(boolean validation) | Sets the validation mode.
void | setXMLFilter(XMLFilter filter) | Sets the SAX filter to be used when filtering SAX events.
void | setXMLReader(XMLReader reader) | Sets the XMLReader used to parse SAX events.
void | setXMLReaderClassName(String xmlReaderClassName) | Sets the class name of the XMLReader to be used to parse SAX events.
-
-
-
Constructor Detail
-
SAXReader
public SAXReader()
This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader(). Be sure to configure the returned reader if the default configuration does not suit you. Consider setting the following features:
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
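A minimal sketch of that advice, using only JDK classes: it obtains an XMLReader from JAXP and switches off the three features listed above. The class name HardenedXmlReader is hypothetical. The load-external-dtd feature is Apache-specific, so a parser other than the JDK's bundled one may reject it with a SAXNotRecognizedException.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

// Hypothetical helper that builds an XMLReader with external DTD and
// external entity loading switched off.
final class HardenedXmlReader {

    static XMLReader create() throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        // Do not fetch the external DTD subset when not validating.
        reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        // Do not include external general or parameter entities.
        reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
        reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        return reader;
    }
}
```

The resulting reader can then be passed to the SAXReader(XMLReader xmlReader) constructor or to setXMLReader(XMLReader).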
-
SAXReader
public SAXReader(boolean validating)
This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader(). Be sure to configure the returned reader if the default configuration does not suit you. Consider setting the following features:
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
- Parameters:
validating - whether or not validation should occur
-
SAXReader
public SAXReader(DocumentFactory factory)
This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader(). Be sure to configure the returned reader if the default configuration does not suit you. Consider setting the following features:
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
- Parameters:
factory - the DocumentFactory used to create document objects
-
SAXReader
public SAXReader(DocumentFactory factory, boolean validating)
This constructor internally calls SAXParserFactory.newInstance().newSAXParser().getXMLReader() or XMLReaderFactory.createXMLReader(). Be sure to configure the returned reader if the default configuration does not suit you. Consider setting the following features:
reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
- Parameters:
factory - the DocumentFactory used to create document objects
validating - whether or not validation should occur
-
SAXReader
public SAXReader(XMLReader xmlReader)
-
SAXReader
public SAXReader(XMLReader xmlReader, boolean validating)
-
SAXReader
public SAXReader(String xmlReaderClassName) throws SAXException
- Throws:
SAXException
-
SAXReader
public SAXReader(String xmlReaderClassName, boolean validating) throws SAXException
- Throws:
SAXException
-
-
Method Detail
-
createDefault
public static SAXReader createDefault()
-
setProperty
public void setProperty(String name, Object value) throws SAXException
Allows a SAX property to be set on the underlying SAX parser. This can be useful to set parser-specific properties such as the location of schema or DTD resources. Use this method with caution, however, as it can break the standard behaviour. An alternative to calling this method is to correctly configure an XMLReader object instance and call the setXMLReader(XMLReader) method.
- Parameters:
name - the SAX property name
value - the value of the SAX property
- Throws:
SAXException - if the XMLReader could not be created or the property could not be changed.
-
setFeature
public void setFeature(String name, boolean value) throws SAXException
Sets a SAX feature on the underlying SAX parser. This can be useful to set parser-specific features. Use this method with caution, however, as it can break the standard behaviour. An alternative to calling this method is to correctly configure an XMLReader object instance and call the setXMLReader(XMLReader) method.
- Parameters:
name - the SAX feature name
value - the value of the SAX feature
- Throws:
SAXException - if the XMLReader could not be created or the feature could not be changed.
-
read
public Document read(File file) throws DocumentException
Reads a Document from the given File.
- Parameters:
file - the File to read from.
- Returns:
- the newly created Document instance
- Throws:
DocumentException - if an error occurs during parsing.
-
read
public Document read(URL url) throws DocumentException
Reads a Document from the given URL using SAX.
- Parameters:
url - the URL to read from.
- Returns:
- the newly created Document instance
- Throws:
DocumentException - if an error occurs during parsing.
-
read
public Document read(String systemId) throws DocumentException
Reads a Document from the given URL or filename using SAX.
If the systemId contains a ':' character then it is assumed to be a URL; otherwise it is assumed to be a file name. If you want finer-grained control over this mechanism, explicitly pass in either a URL or a File instance instead of a String to denote the source of the document.
- Parameters:
systemId - a URL for a document or a file name.
- Returns:
- the newly created Document instance
- Throws:
DocumentException - if an error occurs during parsing.
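The ':' rule stated above can be expressed in a couple of lines. This is a sketch of the rule as documented, not DOM4J's source; SystemIdKind and treatedAsUrl are hypothetical names.

```java
// Hypothetical helper mirroring the documented rule for read(String systemId).
final class SystemIdKind {

    // A system ID containing ':' is assumed to be a URL, anything else a file name.
    static boolean treatedAsUrl(String systemId) {
        return systemId.indexOf(':') >= 0;
    }
}
```

Under this rule a Windows path with a drive letter, such as C:\data\doc.xml, also contains ':' and would be classified as a URL, which is one reason to pass a File or URL explicitly when the source type is known.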
-
read
public Document read(InputStream in) throws DocumentException
Reads a Document from the given stream using SAX.
- Parameters:
in - the InputStream to read from.
- Returns:
- the newly created Document instance
- Throws:
DocumentException - if an error occurs during parsing.
-
read
public Document read(Reader reader) throws DocumentException
Reads a Document from the given Reader using SAX.
- Parameters:
reader - the reader for the input
- Returns:
- the newly created Document instance
- Throws:
DocumentException - if an error occurs during parsing.
-
read
public Document read(InputStream in, String systemId) throws DocumentException
Reads a Document from the given stream using SAX.
- Parameters:
in - the InputStream to read from.
systemId - the URI for the input
- Returns:
- the newly created Document instance
- Throws:
DocumentException - if an error occurs during parsing.
-
read
public Document read(Reader reader, String systemId) throws DocumentException
Reads a Document from the given Reader using SAX.
- Parameters:
reader - the reader for the input
systemId - the URI for the input
- Returns:
- the newly created Document instance
- Throws:
DocumentException - if an error occurs during parsing.
-
read
public Document read(InputSource in) throws DocumentException
Reads a Document from the given InputSource using SAX.
- Parameters:
in - the InputSource to read from.
- Returns:
- the newly created Document instance
- Throws:
DocumentException - if an error occurs during parsing.
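When a caller wants to control the byte stream, the system ID and the encoding together, an InputSource can be assembled by hand and passed to read(InputSource). The sketch below uses only JDK classes; the InputSources class and its fromBytes method are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import org.xml.sax.InputSource;

// Hypothetical helper that bundles bytes, system ID and encoding into one InputSource.
final class InputSources {

    static InputSource fromBytes(byte[] xml, String systemId, String encoding) {
        InputSource source = new InputSource(new ByteArrayInputStream(xml));
        // The system ID is the URI for the input, as in read(InputStream, String).
        source.setSystemId(systemId);
        // Leave the encoding unset when the caller passes null.
        if (encoding != null) {
            source.setEncoding(encoding);
        }
        return source;
    }
}
```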
-
isValidating
public boolean isValidating()
Returns the validation mode.
- Returns:
- the validation mode, true if validating will be done otherwise false.
-
setValidation
public void setValidation(boolean validation)
Sets the validation mode.
- Parameters:
validation - indicates whether or not validation should occur.
-
isIncludeInternalDTDDeclarations
public boolean isIncludeInternalDTDDeclarations()
Returns whether internal DTD declarations are expanded into the DocumentType object.
- Returns:
- whether internal DTD declarations should be expanded into the DocumentType object or not.
-
setIncludeInternalDTDDeclarations
public void setIncludeInternalDTDDeclarations(boolean include)
Sets whether internal DTD declarations should be expanded into the DocumentType object or not.
- Parameters:
include - whether or not DTD declarations should be expanded and included into the DocumentType object.
-
isIncludeExternalDTDDeclarations
public boolean isIncludeExternalDTDDeclarations()
Returns whether external DTD declarations are expanded into the DocumentType object.
- Returns:
- whether external DTD declarations should be expanded into the DocumentType object or not.
-
setIncludeExternalDTDDeclarations
public void setIncludeExternalDTDDeclarations(boolean include)
Sets whether external DTD declarations should be expanded into the DocumentType object or not.
- Parameters:
include - whether or not DTD declarations should be expanded and included into the DocumentType object.
-
isStringInternEnabled
public boolean isStringInternEnabled()
Returns whether String interning is enabled or disabled for element & attribute names and namespace URIs. This property is enabled by default.
- Returns:
- true if String interning is enabled, otherwise false.
-
setStringInternEnabled
public void setStringInternEnabled(boolean stringInternEnabled)
Sets whether String interning is enabled or disabled for element & attribute names and namespace URIs.
- Parameters:
stringInternEnabled - true to enable String interning, false to disable it.
-
isMergeAdjacentText
public boolean isMergeAdjacentText()
Returns whether adjacent text nodes should be merged together.
- Returns:
- Value of property mergeAdjacentText.
-
setMergeAdjacentText
public void setMergeAdjacentText(boolean mergeAdjacentText)
Sets whether or not adjacent text nodes should be merged together when parsing.
- Parameters:
mergeAdjacentText - New value of property mergeAdjacentText.
-
isStripWhitespaceText
public boolean isStripWhitespaceText()
Returns whether whitespace between element start and end tags should be ignored.
- Returns:
- Value of property stripWhitespaceText.
-
setStripWhitespaceText
public void setStripWhitespaceText(boolean stripWhitespaceText)
Sets whether whitespace between element start and end tags should be ignored.
- Parameters:
stripWhitespaceText - New value of property stripWhitespaceText.
-
isIgnoreComments
public boolean isIgnoreComments()
Returns whether we should ignore comments or not.
- Returns:
- true if comments should be ignored, otherwise false.
-
setIgnoreComments
public void setIgnoreComments(boolean ignoreComments)
Sets whether we should ignore comments or not.
- Parameters:
ignoreComments - whether we should ignore comments or not.
-
getDocumentFactory
public DocumentFactory getDocumentFactory()
Returns the DocumentFactory used to create document objects.
- Returns:
- the DocumentFactory used to create document objects
-
setDocumentFactory
public void setDocumentFactory(DocumentFactory documentFactory)
Sets the DocumentFactory used to create new documents. This method allows the building of custom DOM4J tree objects to be implemented easily using a custom derivation of DocumentFactory.
- Parameters:
documentFactory - the DocumentFactory used to create DOM4J objects
-
getErrorHandler
public ErrorHandler getErrorHandler()
Returns the ErrorHandler used by SAX.
- Returns:
- the ErrorHandler used by SAX
-
setErrorHandler
public void setErrorHandler(ErrorHandler errorHandler)
Sets the ErrorHandler used by the SAX XMLReader.
- Parameters:
errorHandler - the ErrorHandler used by SAX
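For illustration, a small ErrorHandler that could be supplied here: it records warnings and recoverable errors for later inspection and rethrows fatal errors. It relies only on the JDK's org.xml.sax types; the class name CollectingErrorHandler is hypothetical, and how to treat each severity is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical ErrorHandler that keeps non-fatal problems instead of printing them.
final class CollectingErrorHandler implements ErrorHandler {

    private final List<SAXParseException> problems = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        problems.add(e);
    }

    @Override
    public void error(SAXParseException e) {
        problems.add(e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXParseException {
        // Fatal errors are not recoverable, so propagate them to the caller.
        throw e;
    }

    List<SAXParseException> problems() {
        return problems;
    }
}
```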
-
getEntityResolver
public EntityResolver getEntityResolver()
Returns the current entity resolver used to resolve entities.
- Returns:
- the EntityResolver currently in use
-
setEntityResolver
public void setEntityResolver(EntityResolver entityResolver)
Sets the entity resolver used to resolve entities.
- Parameters:
entityResolver - the EntityResolver used to resolve entities
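One possible use of a custom resolver is to keep the parser from fetching external entities at all. The sketch below answers every request with an empty source; it uses only JDK classes, and NoExternalEntities is a hypothetical name. Documents that depend on entities declared in an external DTD will not resolve them with this resolver, so it is a trade-off rather than a general default.

```java
import java.io.StringReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver that substitutes empty content for every external entity.
final class NoExternalEntities implements EntityResolver {

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        // Returning a non-null source tells the parser not to open systemId itself.
        return new InputSource(new StringReader(""));
    }
}
```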
-
getXMLReader
public XMLReader getXMLReader() throws SAXException
Returns the XMLReader used to parse SAX events.
- Returns:
- the XMLReader used to parse SAX events
- Throws:
SAXException - if the XMLReader could not be created.
-
setXMLReader
public void setXMLReader(XMLReader reader)
Sets the XMLReader used to parse SAX events.
- Parameters:
reader - the XMLReader to parse SAX events
-
getEncoding
public String getEncoding()
Returns encoding used for InputSource (null means system default encoding).
- Returns:
- encoding used for InputSource
-
setEncoding
public void setEncoding(String encoding)
Sets encoding used for InputSource (null means system default encoding).
- Parameters:
encoding - the encoding used for InputSource
-
setXMLReaderClassName
public void setXMLReaderClassName(String xmlReaderClassName) throws SAXException
Sets the class name of the XMLReader to be used to parse SAX events.
- Parameters:
xmlReaderClassName - the class name of the XMLReader to parse SAX events
- Throws:
SAXException - if the XMLReader could not be created.
-
addHandler
public void addHandler(String path, ElementHandler handler)
Adds the ElementHandler to be called when the specified path is encountered.
- Parameters:
path - the path to be handled
handler - the ElementHandler to be called by the event based processor.
-
removeHandler
public void removeHandler(String path)
Removes the ElementHandler from the event based processor, for the specified path.
- Parameters:
path - the path to remove the ElementHandler for.
-
setDefaultHandler
public void setDefaultHandler(ElementHandler handler)
When multiple ElementHandler instances have been registered, this will set a default ElementHandler to be called for any path which does NOT have a handler registered.
- Parameters:
handler - the ElementHandler to be called by the event based processor.
-
resetHandlers
public void resetHandlers()
Clears out all the existing handlers and the default handler, setting things back as if no handler existed. Useful when reusing an object instance.
-
getXMLFilter
public XMLFilter getXMLFilter()
Returns the SAX filter being used to filter SAX events.
- Returns:
- the SAX filter being used or null if no SAX filter is installed
-
setXMLFilter
public void setXMLFilter(XMLFilter filter)
Sets the SAX filter to be used when filtering SAX events.
- Parameters:
filter - the SAX filter to use or null to disable filtering
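As an example of what such a filter might look like, the sketch below extends the JDK's org.xml.sax.helpers.XMLFilterImpl and drops processing instructions while passing every other event through unchanged. The class name DropProcessingInstructions is hypothetical, and the choice of event to suppress is arbitrary.

```java
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical SAX filter: swallows processing instructions, forwards everything else.
final class DropProcessingInstructions extends XMLFilterImpl {

    @Override
    public void processingInstruction(String target, String data) {
        // Intentionally empty: not calling super means the event never
        // reaches the downstream ContentHandler.
    }
}
```

Because XMLFilterImpl forwards all events by default, only the callbacks that should behave differently need to be overridden.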
-
installXMLFilter
protected XMLReader installXMLFilter(XMLReader reader)
Installs any XMLFilter objects required to allow the SAX event stream to be filtered and preprocessed before it gets to dom4j.
- Parameters:
reader - the XMLReader whose events are to be filtered
- Returns:
- the new XMLFilter if applicable or the original XMLReader if no filter is being used.
-
getDispatchHandler
protected org.dom4j.io.DispatchHandler getDispatchHandler()
-
setDispatchHandler
protected void setDispatchHandler(org.dom4j.io.DispatchHandler dispatchHandler)
-
createXMLReader
protected XMLReader createXMLReader() throws SAXException
Factory method to allow alternate methods of creating and configuring XMLReader objects.
- Returns:
- the newly created XMLReader
- Throws:
SAXException - if the XMLReader could not be created.
-
configureReader
protected void configureReader(XMLReader reader, DefaultHandler handler) throws DocumentException
Configures the XMLReader before use.
- Parameters:
reader - the XMLReader to configure
handler - the DefaultHandler to register with the reader
- Throws:
DocumentException - if the reader could not be configured.
-
createContentHandler
protected SAXContentHandler createContentHandler(XMLReader reader)
Factory method to allow user-derived SAXContentHandler objects to be used.
- Parameters:
reader - the XMLReader the content handler will be used with
- Returns:
- the SAXContentHandler to use
-
createDefaultEntityResolver
protected EntityResolver createDefaultEntityResolver(String systemId)
-
-