gnu.xml.pipeline
Class DomConsumer
All Implemented Interfaces: EventConsumer
This consumer builds a DOM Document from its input, acting either as a
pipeline terminus or as an intermediate buffer. When a document's worth
of events has been delivered to this consumer, that document is read with
a DomParser and sent to the next consumer. It is also available
as a read-once property.
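As a rough illustration of what "building a DOM Document from its input" involves, the following sketch turns SAX events into DOM nodes using only standard JAXP and SAX classes. The class name SaxToDomSketch and its build helper are hypothetical; this is not the DomConsumer implementation, and it ignores namespaces, DTD information, and every kind of "noise" node.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stand-in for a DOM-building consumer; not the GNU class. */
public class SaxToDomSketch extends DefaultHandler {
    private final Document doc;
    // Stack of currently open nodes; the document itself sits at the bottom.
    private final Deque<Node> open = new ArrayDeque<>();

    public SaxToDomSketch() throws ParserConfigurationException {
        doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        open.push(doc);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        Element e = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        open.peek().appendChild(e);
        open.push(e);
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        open.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
    }

    // ignorableWhitespace() is inherited as a no-op, so such whitespace is dropped.

    public Document getDocument() {
        return doc;
    }

    /** Parses the given XML text and returns the tree built from its events. */
    public static Document build(String xml) throws Exception {
        SaxToDomSketch handler = new SaxToDomSketch();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.getDocument();
    }
}
```

Calling SaxToDomSketch.build with a small XML string yields a Document whose element and text structure mirrors the events the parser reported.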
The DOM tree is constructed as faithfully as possible. There are some
complications since a DOM should expose behaviors that can't be implemented
without API backdoors into that DOM, and because some SAX parsers don't
report all the information that DOM permits to be exposed. The general
problem areas involve information from the Document Type Declaration (DTD).
DOM only represents a limited subset, but has some behaviors that depend
on much deeper knowledge of a document's DTD. You shouldn't have much to
worry about unless you change the handling of "noise" nodes from its default
setting (which ignores them all); note that if you use JAXP to populate your
DOM trees, it saves "noise" nodes by default. (Such nodes include
ignorable whitespace, comments, entity references, and CDATA boundaries.)
Otherwise, your main worry will be if you use a SAX parser that doesn't
flag ignorable whitespace unless it's validating (few don't).
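To make the remark about JAXP defaults concrete, here is a small sketch that parses the same input twice with the standard DocumentBuilderFactory: once with its defaults, and once with comments ignored and CDATA sections coalesced into text. The class name NoiseNodesSketch and the sample input are illustrative assumptions. Ignorable whitespace and entity references are not covered, since hiding them depends on DTD information.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical demo of how JAXP treats two kinds of "noise" nodes. */
public class NoiseNodesSketch {
    static final String XML = "<a><!--note--><![CDATA[x]]></a>";

    /** Returns the node names of the root element's children, space separated. */
    public static String childNames(boolean hideNoise) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Both settings default to false, so JAXP keeps these nodes unless told otherwise.
        factory.setIgnoringComments(hideNoise);
        factory.setCoalescing(hideNoise);

        NodeList kids = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)))
                .getDocumentElement()
                .getChildNodes();

        StringBuilder names = new StringBuilder();
        for (int i = 0; i < kids.getLength(); i++) {
            if (i > 0) {
                names.append(' ');
            }
            names.append(kids.item(i).getNodeName());
        }
        return names.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("JAXP defaults: " + childNames(false));
        System.out.println("noise hidden:  " + childNames(true));
    }
}
```

With the defaults, the comment and the CDATA section appear as separate child nodes; with both settings enabled, only text remains, which is closer to what this consumer produces by default.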
The SAX2 events used as input must contain XML Names for elements
and attributes, with original prefixes. In SAX2,
this is optional unless the "namespace-prefixes" parser feature is set.
Moreover, many application components won't provide completely correct
structures anyway.
Before you convert a DOM to an output document,
you should plan to postprocess it to create or repair such namespace
information. The NSFilter pipeline stage does such work.
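The "namespace-prefixes" feature mentioned above is a standard SAX2 feature. The sketch below, using only JDK classes, shows one way to enable it so that start-element events carry prefixed names and xmlns declarations as attributes. The class name NamespacePrefixesSketch and its helper are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo of the SAX2 "namespace-prefixes" feature. */
public class NamespacePrefixesSketch {

    /** Returns element qNames, plus attribute qNames prefixed with '@'. */
    public static List<String> namesSeen(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // Ask the parser to report original prefixed names and xmlns attributes.
        reader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);

        final List<String> seen = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                seen.add(qName);
                for (int i = 0; i < atts.getLength(); i++) {
                    seen.add("@" + atts.getQName(i));
                }
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return seen;
    }
}
```

A reader configured this way could then feed a consumer's ContentHandler; events produced by other application components may still need repair by a stage such as NSFilter.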
Note: changes late in the DOM L2 process made it impractical to
attempt to create the DocumentType node in any implementation-neutral way,
much less to populate it (L1 didn't support even creating such nodes).
To create and populate such a node, subclass the inner
DomConsumer.Handler class and teach it about the backdoors into
whatever DOM implementation you want. It's possible that some revised
DOM API (L3?) will make this problem solvable again.
static class | DomConsumer.Handler - Class used to intercept various parsing events and use them to
populate a DOM document.
|
DomConsumer(Class<T> impl) - Configures this pipeline terminus to use the specified implementation
of DOM when constructing its result value.
|
DomConsumer(Class<T> impl, EventConsumer n) - Configures this consumer as a buffer/filter, using the specified
DOM implementation when constructing its result value.
|
ContentHandler | getContentHandler() - Returns the document handler being used.
|
DTDHandler | getDTDHandler() - Returns the DTD handler being used.
|
Document | getDocument() - Returns the document constructed from the preceding
sequence of events.
|
Object | getProperty(String id) - Returns the lexical handler being used.
|
boolean | isHidingCDATA() - Returns true if the consumer is hiding CDATA boundaries
(the default), and false if it is saving them.
|
boolean | isHidingComments() - Returns true if the consumer is hiding comments (the default),
and false if they should be placed into the output document.
|
boolean | isHidingReferences() - Returns true if the consumer is hiding entity reference nodes
(the default), and false if EntityReference nodes should
instead be created.
|
boolean | isHidingWhitespace() - Returns true if the consumer is hiding ignorable whitespace
(the default), and false if such whitespace should be placed
into the output document as children of element nodes.
|
void | setErrorHandler(ErrorHandler handler) - This method provides a filter stage with a handler that abstracts
presentation of warnings and both recoverable and fatal errors.
|
protected void | setHandler(DomConsumer.Handler h) - This is the hook through which a subclass provides a handler
which knows how to access DOM extensions, specific to some
implementation, to record additional data in a DOM.
|
void | setHidingCDATA(boolean flag) - Controls whether the consumer will save CDATA boundaries.
|
void | setHidingComments(boolean flag) - Controls whether the consumer is hiding comments.
|
void | setHidingReferences(boolean flag) - Controls whether the consumer will hide entity expansions,
or will instead mark them with entity reference nodes.
|
void | setHidingWhitespace(boolean flag) - Controls whether the consumer hides ignorable whitespace.
|
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
DomConsumer
public DomConsumer(Class<T> impl)
throws SAXException
Configures this pipeline terminus to use the specified implementation
of DOM when constructing its result value.
impl - class implementing Document which publicly exposes a default constructor
SAXException - when there is a problem creating an
empty DOM document using the specified implementation
DomConsumer
public DomConsumer(Class<T> impl,
EventConsumer n)
throws SAXException
Configures this consumer as a buffer/filter, using the specified
DOM implementation when constructing its result value.
This event consumer acts as a buffer and filter, in that it
builds a DOM tree and then writes it out when endDocument
is invoked. Because of the limitations of DOM, much information
will as a rule not be seen in that replay. To get a full fidelity
copy of the input event stream, use a TeeConsumer.
impl - class implementing Document which publicly exposes a default constructor
n - receives the replayed sequence of events when endDocument is invoked
SAXException - when there is a problem creating an
empty DOM document using the specified DOM implementation
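The buffer-then-replay behavior can be approximated with standard JAXP alone. The sketch below buffers a document as a DOM tree and then replays it as SAX events to a downstream handler through an identity Transformer. This is an assumption-laden analogue, not the mechanism DomConsumer itself uses (it reads the tree with a DomParser); the class name DomReplaySketch is hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo: buffer input as a DOM, then replay it as SAX events. */
public class DomReplaySketch {

    /** Returns the element events that the downstream handler received. */
    public static List<String> replay(String xml) throws Exception {
        // Buffer: build the whole tree before anything is passed on.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document buffered = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // "Next consumer": records only the element events it is given.
        final List<String> events = new ArrayList<>();
        DefaultHandler next = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                events.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                events.add("end:" + qName);
            }
        };

        // Replay: an identity transform walks the DOM and emits SAX events.
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(buffered), new SAXResult(next));
        return events;
    }
}
```

Note that a plain ContentHandler has no callback for comments, so a comment in the input never reaches this downstream handler; that is one small instance of information being lost in the replay.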
getDocument
public final Document getDocument()
Returns the document constructed from the preceding
sequence of events. This method should not be
used again until another sequence of events has been
given to this EventConsumer.
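One way to picture a "read-once" property is a holder that hands over its value and forgets it. The sketch below is purely illustrative: the class ReadOnceSketch is hypothetical, and returning null on a second read is an assumption, since the text above does not say what a repeated call to getDocument yields.

```java
/** Hypothetical holder whose value can be taken only once per set. */
public class ReadOnceSketch<T> {
    private T value;

    /** Stores a freshly built result, for example after endDocument. */
    public synchronized void set(T newValue) {
        value = newValue;
    }

    /** Returns the stored value and clears it (assumed behavior: null afterwards). */
    public synchronized T take() {
        T result = value;
        value = null;
        return result;
    }
}
```

Under this model, a caller keeps its own reference to the result of the first read rather than asking for it again.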
isHidingCDATA
public final boolean isHidingCDATA()
Returns true if the consumer is hiding CDATA boundaries
(the default), and false if it is saving them.
isHidingComments
public final boolean isHidingComments()
Returns true if the consumer is hiding comments (the default),
and false if they should be placed into the output document.
isHidingReferences
public final boolean isHidingReferences()
Returns true if the consumer is hiding entity reference nodes
(the default), and false if EntityReference nodes should
instead be created. Such EntityReference nodes will normally be
empty, unless an implementation arranges to populate them and then
turn them back into readonly objects.
isHidingWhitespace
public final boolean isHidingWhitespace()
Returns true if the consumer is hiding ignorable whitespace
(the default), and false if such whitespace should be placed
into the output document as children of element nodes.
setErrorHandler
public void setErrorHandler(ErrorHandler handler)
This method provides a filter stage with a handler that abstracts
presentation of warnings and both recoverable and fatal errors.
Most pipeline stages should share a single policy and mechanism
for such reports, since application components require consistency
in such activities. Accordingly, typical responses to this method
invocation involve saving the handler for use; filters will pass
it on to any other consumers they use.
Specified by: setErrorHandler in interface EventConsumer
handler - encapsulates error handling policy for this stage
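A shared error policy can be expressed as a single org.xml.sax.ErrorHandler instance handed to every stage. The sketch below is one possible policy, not one prescribed by this package: it records all reports and rethrows only fatal errors. The class name SharedErrorPolicySketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical shared policy: record everything, abort only on fatal errors. */
public class SharedErrorPolicySketch implements ErrorHandler {
    private final List<String> reports = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        reports.add("warning: " + e.getMessage());
    }

    @Override
    public void error(SAXParseException e) {
        // Recoverable errors are recorded, and processing continues.
        reports.add("error: " + e.getMessage());
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        reports.add("fatal: " + e.getMessage());
        throw e;
    }

    /** Reports collected so far, in the order they arrived. */
    public List<String> reports() {
        return reports;
    }
}
```

The same instance would typically be passed both to the parser (XMLReader.setErrorHandler) and to the first pipeline stage's setErrorHandler, which, as described above, passes it on to later stages.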
setHandler
protected void setHandler(DomConsumer.Handler h)
This is the hook through which a subclass provides a handler
which knows how to access DOM extensions, specific to some
implementation, to record additional data in a DOM.
Treat this as part of construction; don't call it except
before (or between) parses.
setHidingCDATA
public final void setHidingCDATA(boolean flag)
Controls whether the consumer hides CDATA boundaries or saves them.
flag - true to hide CDATA boundaries, so that CDATA text is treated
like other text; false to save them
setHidingComments
public final void setHidingComments(boolean flag)
Controls whether the consumer is hiding comments.
setHidingReferences
public final void setHidingReferences(boolean flag)
Controls whether the consumer will hide entity expansions,
or will instead mark them with entity reference nodes.
flag - false if entity reference nodes should appear
setHidingWhitespace
public final void setHidingWhitespace(boolean flag)
Controls whether the consumer hides ignorable whitespace.
DomConsumer.java --
Copyright (C) 1999,2000,2001 Free Software Foundation, Inc.
This file is part of GNU Classpath.
GNU Classpath is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.
GNU Classpath is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.
You should have received a copy of the GNU General Public License
along with GNU Classpath; see the file COPYING. If not, write to the
Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
02110-1301 USA.
Linking this library statically or dynamically with other modules is
making a combined work based on this library. Thus, the terms and
conditions of the GNU General Public License cover the whole
combination.
As a special exception, the copyright holders of this library give you
permission to link this library with independent modules to produce an
executable, regardless of the license terms of these independent
modules, and to copy and distribute the resulting executable under
terms of your choice, provided that you also meet, for each linked
independent module, the terms and conditions of the license of that
module. An independent module is a module which is not derived from
or based on this library. If you modify this library, you may extend
this exception to your version of the library, but you are not
obligated to do so. If you do not wish to do so, delete this
exception statement from your version.