lxml.etree module
The lxml.etree module implements the extended ElementTree API for XML.
- exception lxml.etree.C14NError
Bases:
LxmlError
Error during C14N serialisation.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.DTDError
Bases:
LxmlError
Base class for DTD errors.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.DTDParseError
Bases:
DTDError
Error while parsing a DTD.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.DTDValidateError
Bases:
DTDError
Error while validating an XML document with a DTD.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.DocumentInvalid
Bases:
LxmlError
Validation error.
Raised by all document validators when their assertValid(tree) method fails.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
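A minimal sketch of catching DocumentInvalid, assuming a validator built with etree.XMLSchema (other validators that offer assertValid() should behave the same way). The schema and the element name a are made up for illustration.

```python
from lxml import etree

# Illustrative schema: a single <a> element that must hold an integer.
schema = etree.XMLSchema(etree.XML(
    '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    '<xsd:element name="a" type="xsd:integer"/>'
    '</xsd:schema>'
))

doc = etree.XML("<a>not a number</a>")

caught = None
try:
    schema.assertValid(doc)
except etree.DocumentInvalid as exc:
    # DocumentInvalid derives from LxmlError, so a broader handler works too.
    caught = exc

print("invalid:", caught is not None)
```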
- exception lxml.etree.Error
Bases:
Exception
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.LxmlError
Bases:
Error
Main exception base class for lxml. All other exceptions inherit from this one.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.LxmlRegistryError
Bases:
LxmlError
Base class of lxml registry errors.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.LxmlSyntaxError
Bases:
LxmlError, SyntaxError
Base class for all syntax errors.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- end_lineno
exception end lineno
- end_offset
exception end offset
- filename
exception filename
- lineno
exception lineno
- msg
exception msg
- offset
exception offset
- print_file_and_line
exception print_file_and_line
- text
exception text
- exception lxml.etree.NamespaceRegistryError
Bases:
LxmlRegistryError
Error registering a namespace extension.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.ParseError(message, code, line, column, filename=None)
Bases:
LxmlSyntaxError
Syntax error while parsing an XML document.
For compatibility with ElementTree 1.3 and later.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- end_lineno
exception end lineno
- end_offset
exception end offset
- filename
exception filename
- lineno
exception lineno
- msg
exception msg
- offset
exception offset
- property position
- print_file_and_line
exception print_file_and_line
- text
exception text
- exception lxml.etree.ParserError
Bases:
LxmlError
Internal lxml parser error.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.RelaxNGError
Bases:
LxmlError
Base class for RelaxNG errors.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.RelaxNGParseError
Bases:
RelaxNGError
Error while parsing an XML document as RelaxNG.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.RelaxNGValidateError
Bases:
RelaxNGError
Error while validating an XML document with a RelaxNG schema.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.SchematronError
Bases:
LxmlError
Base class of all Schematron errors.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.SchematronParseError
Bases:
SchematronError
Error while parsing an XML document as Schematron schema.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.SchematronValidateError
Bases:
SchematronError
Error while validating an XML document with a Schematron schema.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.SerialisationError
Bases:
LxmlError
A libxml2 error that occurred during serialisation.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XIncludeError
Bases:
LxmlError
Error during XInclude processing.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XMLSchemaError
Bases:
LxmlError
Base class of all XML Schema errors.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XMLSchemaParseError
Bases:
XMLSchemaError
Error while parsing an XML document as XML Schema.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XMLSchemaValidateError
Bases:
XMLSchemaError
Error while validating an XML document with an XML Schema.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XMLSyntaxAssertionError(message, code, line, column, filename=None)
Bases:
XMLSyntaxError, AssertionError
An XMLSyntaxError that additionally inherits from AssertionError for ElementTree / backwards compatibility reasons.
This class may get replaced by a plain XMLSyntaxError in a future version.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- end_lineno
exception end lineno
- end_offset
exception end offset
- filename
exception filename
- lineno
exception lineno
- msg
exception msg
- offset
exception offset
- property position
- print_file_and_line
exception print_file_and_line
- text
exception text
- exception lxml.etree.XMLSyntaxError(message, code, line, column, filename=None)
Bases:
ParseError
Syntax error while parsing an XML document.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- end_lineno
exception end lineno
- end_offset
exception end offset
- filename
exception filename
- lineno
exception lineno
- msg
exception msg
- offset
exception offset
- property position
- print_file_and_line
exception print_file_and_line
- text
exception text
- exception lxml.etree.XPathError
Bases:
LxmlError
Base class of all XPath errors.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XPathEvalError
Bases:
XPathError
Error during XPath evaluation.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XPathFunctionError
Bases:
XPathEvalError
Internal error looking up an XPath extension function.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XPathResultError
Bases:
XPathEvalError
Error handling an XPath result.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XPathSyntaxError
Bases:
LxmlSyntaxError, XPathError
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- end_lineno
exception end lineno
- end_offset
exception end offset
- filename
exception filename
- lineno
exception lineno
- msg
exception msg
- offset
exception offset
- print_file_and_line
exception print_file_and_line
- text
exception text
- exception lxml.etree.XSLTApplyError
Bases:
XSLTError
Error running an XSL transformation.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XSLTError
Bases:
LxmlError
Base class of all XSLT errors.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XSLTExtensionError
Bases:
XSLTError
Error registering an XSLT extension.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XSLTParseError
Bases:
XSLTError
Error parsing a stylesheet document.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree.XSLTSaveError
Bases:
XSLTError, SerialisationError
Error serialising an XSLT result.
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- exception lxml.etree._TargetParserResult(result)
Bases:
Exception
- add_note()
Exception.add_note(note) – add a note to the exception
- with_traceback()
Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.
- args
- class lxml.etree.AncestorsIterator(self, node, tag=None)
Bases:
_ElementMatchIterator
Iterates over the ancestors of an element (from parent to parent).
- class lxml.etree.AttributeBasedElementClassLookup(self, attribute_name, class_mapping, fallback=None)
Bases:
FallbackElementClassLookup
Checks an attribute of an Element and looks up the value in a class dictionary.
- Arguments:
attribute name - ‘{ns}name’ style string
class mapping - Python dict mapping attribute values to Element classes
fallback - optional fallback lookup mechanism
A None key in the class mapping will be checked if the attribute is missing.
- set_fallback(self, lookup)
Sets the fallback scheme for this lookup method.
- fallback
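As a rough illustration of the lookup described above, the sketch below maps the value of a shape attribute to element classes. The class names, the attribute name and the sample document are hypothetical.

```python
from lxml import etree


class CircleElement(etree.ElementBase):
    """Hypothetical class for elements with shape="circle"."""


class SquareElement(etree.ElementBase):
    """Hypothetical class for elements with shape="square"."""


lookup = etree.AttributeBasedElementClassLookup(
    "shape",  # attribute name, '{ns}name' style (no namespace here)
    {"circle": CircleElement, "square": SquareElement},
)

parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)

root = etree.fromstring(
    '<shapes><item shape="circle"/><item shape="square"/><item/></shapes>',
    parser,
)

# The third item has no shape attribute and no None key exists in the
# mapping, so it falls through to the fallback lookup.
print([type(child).__name__ for child in root])
```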
- class lxml.etree.C14NWriterTarget
Bases:
object
Canonicalization writer target for the XMLParser.
Serialises parse events to XML C14N 2.0.
Configuration options:
with_comments: set to true to include comments
strip_text: set to true to strip whitespace before and after text content
rewrite_prefixes: set to true to replace namespace prefixes by “n{number}”
qname_aware_tags: a set of qname aware tag names in which prefixes should be replaced in text content
qname_aware_attrs: a set of qname aware attribute names in which prefixes should be replaced in text content
exclude_attrs: a set of attribute names that should not be serialised
exclude_tags: a set of tag names that should not be serialised
- _iter_namespaces(ns_stack)
- close()
- comment(text)
- data(data)
- end(tag)
- pi(target, data)
- start(tag, attrs)
- start_ns(prefix, uri)
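A small sketch of using the writer target with an XMLParser. It assumes the constructor takes a write callable as its first argument, as the ElementTree counterpart does; the signature is not shown in this entry, so treat that as an assumption.

```python
from lxml import etree

chunks = []  # collects the serialised pieces passed to the write callable

# Assumption: the first positional argument is a write function.
target = etree.C14NWriterTarget(chunks.append, strip_text=True)
parser = etree.XMLParser(target=target)

parser.feed("<root>  <a b='1'/>  </root>")
parser.close()

canonical = "".join(chunks)
print(canonical)
```

With strip_text enabled, the whitespace-only text between the tags is dropped from the output.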
- class lxml.etree.CDATA(data)
Bases:
object
CDATA factory. This factory creates an opaque data object that can be used to set Element text. The usual way to use it is:
>>> el = Element('content')
>>> el.text = CDATA('a string')
>>> print(el.text)
a string
>>> print(tostring(el, encoding="unicode"))
<content><![CDATA[a string]]></content>
- class lxml.etree.CommentBase
Bases:
_Comment
All custom Comment classes must inherit from this one.
To create an XML Comment instance, use the Comment() factory.
Subclasses must not override __init__ or __new__ as it is absolutely undefined when these objects will be created or destroyed. All persistent state of Comments must be stored in the underlying XML. If you really need to initialize the object after creation, you can implement an _init(self) method that will be called after object creation.
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- append(self, value)
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass keep_tail=True to leave the tail text untouched.
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- get(self, key, default=None)
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then build an ElementTree for the last parent that was returned.
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, value)
- items(self)
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
- keys(self)
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
- set(self, key, value)
- values(self)
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an xpath expression using the element as context node.
- attrib
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- text
- class lxml.etree.CustomElementClassLookup(self, fallback=None)
Bases:
FallbackElementClassLookup
Element class lookup based on a subclass method.
You can inherit from this class and override the method:
lookup(self, type, doc, namespace, name)
to look up the element class for a node. Arguments of the method:
* type: one of ‘element’, ‘comment’, ‘PI’, ‘entity’
* doc: document that the node is in
* namespace: namespace URI of the node (or None for comments/PIs/entities)
* name: name of the element/entity, None for comments, target for PIs
If you return None from this method, the fallback will be called.
- lookup(self, type, doc, namespace, name)
- set_fallback(self, lookup)
Sets the fallback scheme for this lookup method.
- fallback
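The subclassing approach described above could look roughly like this. HonkElement, HonkLookup and the tag name honk are hypothetical names chosen for the example.

```python
from lxml import etree


class HonkElement(etree.ElementBase):
    """Hypothetical custom element class."""


class HonkLookup(etree.CustomElementClassLookup):
    def lookup(self, node_type, document, namespace, name):
        # Only handle plain <honk> elements; returning None hands the
        # decision over to the fallback lookup.
        if node_type == "element" and name == "honk":
            return HonkElement
        return None


parser = etree.XMLParser()
parser.set_element_class_lookup(HonkLookup())

root = etree.fromstring("<root><honk/><other/></root>", parser)
print([type(child).__name__ for child in root])
```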
- class lxml.etree.DTD(self, file=None, external_id=None)
Bases:
_Validator
A DTD validator.
Can load from filesystem directly given a filename or file-like object. Alternatively, pass the keyword parameter external_id to load from a catalog.
- _append_log_message(domain, type, level, line, message, filename)
- _clear_error_log()
- assertValid(self, etree)
Raises DocumentInvalid if the document does not comply with the schema.
- assert_(self, etree)
Raises AssertionError if the document does not comply with the schema.
- elements()
- entities()
- iterelements()
- iterentities()
- validate(self, etree)
Validate the document using this schema.
Returns true if document is valid, false if not.
- error_log
The log of validation errors and warnings.
- external_id
- name
- system_url
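A brief sketch of loading a DTD from a file-like object and checking two documents against it. The DTD text and the sample documents are invented for the example.

```python
from io import StringIO
from lxml import etree

# Illustrative DTD: <root> must contain at least one <item>.
dtd = etree.DTD(StringIO(
    "<!ELEMENT root (item+)>\n"
    "<!ELEMENT item (#PCDATA)>"
))

good = etree.fromstring("<root><item>x</item></root>")
bad = etree.fromstring("<root/>")

is_good = dtd.validate(good)
is_bad = dtd.validate(bad)

# error_log holds the messages of the most recent validation run.
messages = [entry.message for entry in dtd.error_log]
print(is_good, is_bad, len(messages))
```

validate() suits a plain yes/no check, while assertValid() and assert_() raise on failure.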
- class lxml.etree.DocInfo
Bases:
object
Document information provided by parser and DTD.
- clear()
Removes DOCTYPE and internal subset from the document.
- URL
The source URL of the document (or None if unknown).
- doctype
Returns a DOCTYPE declaration string for the document.
- encoding
Returns the encoding name as declared by the document.
- externalDTD
Returns a DTD validator based on the external subset of the document.
- internalDTD
Returns a DTD validator based on the internal subset of the document.
- public_id
Public ID of the DOCTYPE.
Mutable. May be set to a valid string or None. If a DTD does not exist, setting this variable (even to None) will create one.
- root_name
Returns the name of the root node as defined by the DOCTYPE.
- standalone
Returns the standalone flag as declared by the document. The possible values are True (standalone='yes'), False (standalone='no' or flag not provided in the declaration), and None (unknown or no declaration found). Note that a normal truth test on this value will always tell if the standalone flag was set to 'yes' or not.
- system_url
System ID of the DOCTYPE.
Mutable. May be set to a valid string or None. If a DTD does not exist, setting this variable (even to None) will create one.
- xml_version
Returns the XML version as declared by the document.
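To illustrate the properties listed above, this sketch parses a small document and reads its DocInfo through the tree's docinfo property. The document content and the DTD file name root.dtd are made up; the DTD is not loaded by the default parser.

```python
from io import BytesIO
from lxml import etree

xml = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    b'<!DOCTYPE root SYSTEM "root.dtd">\n'
    b'<root/>'
)

tree = etree.parse(BytesIO(xml))
info = tree.docinfo  # a DocInfo instance

print(info.xml_version, info.encoding, info.standalone)
print(info.root_name, info.system_url, info.public_id)
print(info.doctype)
```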
- class lxml.etree.ETCompatXMLParser(self, encoding=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, schema=None, huge_tree=False, remove_blank_text=False, resolve_entities=True, remove_comments=True, remove_pis=True, strip_cdata=True, target=None, compact=True)
Bases:
XMLParser
An XML parser with an ElementTree compatible default setup.
See the XMLParser class for details.
This parser has remove_comments and remove_pis enabled by default and thus ignores comments and processing instructions.
- close(self)
Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.
This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface, all other usage is undefined.
- copy(self)
Create a new parser with the same configuration.
- feed(self, data)
Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.
This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().
The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with this parser.
- setElementClassLookup(lookup)
- Deprecated
Use parser.set_element_class_lookup(lookup) instead.
- set_element_class_lookup(self, lookup=None)
Set a lookup scheme for element classes generated from this parser.
Reset it by passing None or nothing.
- error_log
The error log of the last parser run.
- feed_error_log
The error log of the last (or current) run of the feed parser.
Note that this is local to the feed parser and thus is different from what the error_log property returns.
- resolvers
The custom resolver registry of this parser.
- target
- version
The version of the underlying XML parser.
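A short sketch of the feed interface with this parser: data arrives in chunks and close() returns the root element. The chunk contents are illustrative.

```python
from lxml import etree

parser = etree.ETCompatXMLParser()

# Feed the document in arbitrary pieces, as they might arrive from a socket.
for chunk in (b"<root><!-- note -->", b"<?pi data?><a/>", b"</root>"):
    parser.feed(chunk)

root = parser.close()

# Comments and processing instructions are dropped by default,
# so only the <a> element remains as a child.
children = [child.tag for child in root]
print(root.tag, children)
```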
- class lxml.etree.ETXPath(self, path, extensions=None, regexp=True, smart_strings=True)
Bases:
XPath
Special XPath class that supports the ElementTree {uri} notation for namespaces.
Note that this class does not accept the namespace keyword argument. All namespaces must be passed as part of the path string. Smart strings will be returned for string results unless you pass smart_strings=False.
- evaluate(self, _eval_arg, **_variables)
Evaluate an XPath expression.
Instead of calling this method, you can also call the evaluator object itself.
Variables may be provided as keyword arguments. Note that namespaces are currently not supported for variables.
- Deprecated
call the object, not its method.
- error_log
- path
The literal XPath expression.
- class lxml.etree.ElementBase(*children, attrib=None, nsmap=None, **_extra)
Bases:
_Element
The public Element class. All custom Element classes must inherit from this one. To create an Element, use the Element() factory.
BIG FAT WARNING: Subclasses must not override __init__ or __new__ as it is absolutely undefined when these objects will be created or destroyed. All persistent state of Elements must be stored in the underlying XML. If you really need to initialize the object after creation, you can implement an _init(self) method that will be called directly after object creation.
Subclasses of this class can be instantiated to create a new Element. By default, the tag name will be the class name and the namespace will be empty. You can modify this with the following class attributes:
TAG - the tag name, possibly containing a namespace in Clark notation
NAMESPACE - the default namespace URI, unless provided as part of the TAG attribute.
HTML - flag if the class is an HTML tag, as opposed to an XML tag. This only applies to un-namespaced tags and defaults to false (i.e. XML).
PARSER - the parser that provides the configuration for the newly created document. Providing an HTML parser here will default to creating an HTML element.
In user code, the latter three are commonly inherited in class hierarchies that implement a common namespace.
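The class attributes above might be used along these lines. ShapeBase, Circle, Drawing and the namespace URI are hypothetical names for the sketch.

```python
from lxml import etree


class ShapeBase(etree.ElementBase):
    # Inherited by all subclasses, giving them a common namespace.
    NAMESPACE = "http://example.com/shapes"


class Circle(ShapeBase):
    TAG = "circle"  # explicit tag name


class Drawing(ShapeBase):
    pass  # tag name defaults to the class name


# Positional arguments become children, keyword arguments become attributes.
drawing = Drawing(Circle(r="1"), Circle(r="2"), name="demo")

print(drawing.tag)
print([child.tag for child in drawing])
print(drawing.get("name"))
```

Because persistent state must live in the XML itself, the example stores everything in tags and attributes rather than in Python instance attributes.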
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
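As a sketch of the root-level use case, the snippet below places a processing instruction before the root element and a comment after it. The stylesheet target and comment text are illustrative.

```python
from lxml import etree

root = etree.Element("root")

# Siblings of the root element live at document level.
root.addprevious(etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"'
))
root.addnext(etree.Comment(" generated "))

# Serialise the whole tree so that the root's siblings are included.
document = etree.tostring(root.getroottree(), encoding="unicode")
print(document)
```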
- append(self, element)
Adds a subelement to the end of this element.
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass keep_tail=True to leave the tail text untouched.
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
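The find(), findall() and findtext() methods with a namespaces mapping could be used as in this sketch. The namespace URI, the prefix atom and the element names are made up; the prefix in the mapping need not match the one used in the document.

```python
from lxml import etree

root = etree.fromstring(
    '<feed xmlns:a="http://example.com/atom">'
    '<a:entry><a:title>First</a:title></a:entry>'
    '<a:entry><a:title>Second</a:title></a:entry>'
    '</feed>'
)

ns = {"atom": "http://example.com/atom"}  # prefix-to-namespace mapping

first = root.find("atom:entry", namespaces=ns)
entries = root.findall("atom:entry", namespaces=ns)
title = root.findtext("atom:entry/atom:title", namespaces=ns)
missing = root.findtext("atom:summary", default="(none)", namespaces=ns)

print(first.tag, len(entries), title, missing)
```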
- get(self, key, default=None)
Gets an element attribute.
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then build an ElementTree for the last parent that was returned.
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, element)
Inserts a subelement at the given position in this element.
- items(self)
Gets element attributes, as a sequence. The attributes are returned in an arbitrary order.
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
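The tag forms described above can be sketched like this. The document and the namespace URI are made up for illustration.

```python
from lxml import etree

NS = "http://example.com/ns"  # hypothetical namespace URI

root = etree.fromstring(
    '<root xmlns:n="http://example.com/ns">'
    '<item/><n:item/><!-- note --><n:other/>'
    '</root>'
)

# "item" is equivalent to "{}item": only elements without a namespace.
no_ns_items = [el.tag for el in root.iter("item")]

# "{*}item": local name "item" in any namespace or in none.
any_ns_items = [el.tag for el in root.iter("{*}item")]

# "{ns}*": any local name within one specific namespace.
in_namespace = [el.tag for el in root.iter("{%s}*" % NS)]

# A factory function selects by node type instead of by name.
comments = [c.text for c in root.iter(etree.Comment)]

# Several tags: elements matching any of them, in document order.
several = [el.tag for el in root.iter("item", "{%s}other" % NS)]
```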
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
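To make the differences between the iterators above concrete, here is a small sketch on a made-up tree:

```python
from lxml import etree

root = etree.fromstring("<root><a><a1/></a><b/></root>")
a1 = root[0][0]

# iter() starts with the element itself, iterdescendants() does not.
with_self = [el.tag for el in root.iter()]
without_self = [el.tag for el in root.iterdescendants()]

# iterchildren() only visits direct children, optionally in reverse order.
children_reversed = [el.tag for el in root.iterchildren(reversed=True)]

# iterancestors() walks upwards, from the parent to the root.
ancestors = [el.tag for el in a1.iterancestors()]
```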
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
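The effect of the 'preceding' keyword can be sketched as follows, on a made-up list of four siblings:

```python
from lxml import etree

root = etree.fromstring("<root><a/><b/><c/><d/></root>")
c = root[2]

# Default: following siblings in document order.
following = [el.tag for el in c.itersiblings()]

# preceding=True: preceding siblings, nearest first (reverse document order).
preceding = [el.tag for el in c.itersiblings(preceding=True)]
```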
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
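A brief sketch of itertext() on a made-up mixed-content paragraph. It illustrates that tail text belongs to the element it follows, so skipping tails drops the text between and after child elements.

```python
from lxml import etree

root = etree.fromstring("<p>Hello <b>bold</b> world<i>!</i></p>")

# All text in the subtree, in document order.
all_text = "".join(root.itertext())

# Without tail text: only the .text of each element is yielded,
# so " world" (the tail of <b>) is skipped.
no_tails = list(root.itertext(with_tail=False))

# Restricted to <b> elements and without their tail text.
bold_only = list(root.itertext("b", with_tail=False))
```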
- keys(self)
Gets a list of attribute names. The names are returned in an arbitrary order (just like for an ordinary Python dictionary).
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
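The child-manipulation methods above (insert, index, replace, remove) can be sketched together. The tag names are arbitrary.

```python
from lxml import etree

root = etree.fromstring("<root><a/><c/></root>")

b = etree.Element("b")
root.insert(1, b)                 # <a/>, <b/>, <c/>
position = root.index(b)          # position of <b> among the children

z = etree.Element("z")
root.replace(root[0], z)          # <z/>, <b/>, <c/>

# remove() works by identity: it needs the very element object to drop,
# not just an element that looks the same.
root.remove(b)                    # <z/>, <c/>

remaining = [child.tag for child in root]
```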
- set(self, key, value)
Sets an element attribute. In HTML documents (not XML or XHTML), the value None is allowed and creates an attribute without value (just the attribute name).
- values(self)
Gets element attribute values as a sequence of strings. The attributes are returned in an arbitrary order.
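As a small sketch of the attribute accessors get(), set(), keys(), values() and items() described above (the element and attribute names are made up):

```python
from lxml import etree

el = etree.Element("img", src="a.png")
el.set("alt", "A picture")

src = el.get("src")
width = el.get("width", "auto")   # default for a missing attribute

# Sort or convert to a dict rather than relying on a particular order.
names = sorted(el.keys())
values = sorted(el.values())
pairs = dict(el.items())
```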
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an XPath expression using the element as context node.
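A minimal sketch of xpath() with a namespace mapping and an XPath variable. The document, the namespace URI and the variable name wanted are assumptions made for this example. Keyword arguments other than the named ones are passed to the expression as XPath variables.

```python
from lxml import etree

root = etree.fromstring(
    '<root xmlns:n="http://example.com/ns">'
    '<n:item id="1">one</n:item><n:item id="2">two</n:item>'
    '</root>'
)
ns = {"x": "http://example.com/ns"}  # prefix used in the expressions below

# Node-set results come back as a list of elements.
items = root.xpath("x:item", namespaces=ns)

# $wanted is bound through the keyword argument of the same name.
texts = root.xpath("x:item[@id = $wanted]/text()", namespaces=ns, wanted="2")

# Numeric XPath results are returned as floats.
count = root.xpath("count(x:item)", namespaces=ns)
```

With smart_strings left at its default of True, string results keep a reference to the element they came from, available through their getparent() method.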
- attrib
Element attribute dictionary. Where possible, use get(), set(), keys(), values() and items() to access element attributes.
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
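The following sketch illustrates both points about nsmap: declarations are inherited from the parents, and the returned dict is a copy. All namespace URIs are made up.

```python
from lxml import etree

root = etree.fromstring(
    '<root xmlns="http://example.com/default" xmlns:a="http://example.com/a">'
    '<child xmlns:b="http://example.com/b"/>'
    '</root>'
)
child = root[0]

root_map = root.nsmap     # the default namespace is stored under the key None
child_map = child.nsmap   # includes the declarations inherited from <root>

# Modifying the returned dict does not touch the element.
child_map["z"] = "http://example.com/z"
still_unknown = "z" not in child.nsmap
```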
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
Element tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- text
Text before the first subelement. This is either a string or the value None, if there was no text.
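How text and tail divide up mixed content can be sketched on a made-up paragraph:

```python
from lxml import etree

root = etree.fromstring("<p>Hello <b>bold</b> world</p>")
b = root[0]

before_first_child = root.text   # text of <p> up to the start of <b>
inside_b = b.text                # text inside <b>
after_b = b.tail                 # text after </b>, stored on <b> itself
root_tail = root.tail            # None: nothing follows the root element
```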
- class lxml.etree.ElementChildIterator(self, node, tag=None, reversed=False)
Bases:
_ElementMatchIterator
Iterates over the children of an element.
- class lxml.etree.ElementClassLookup(self)
Bases:
object
Superclass of Element class lookups.
- class lxml.etree.ElementDefaultClassLookup(self, element=None, comment=None, pi=None, entity=None)
Bases:
ElementClassLookup
Element class lookup scheme that always returns the default Element class.
The keyword arguments element, comment, pi and entity accept the respective Element classes.
- comment_class
- element_class
- entity_class
- pi_class
- class lxml.etree.ElementDepthFirstIterator(self, node, tag=None, inclusive=True)
Bases:
object
Iterates over an element and its sub-elements in document order (depth first pre-order).
Note that this also includes comments, entities and processing instructions. To filter them out, check if the tag property of the returned element is a string (i.e. not None and not a factory function), or pass the Element factory for the tag argument to receive only Elements.
If the optional tag argument is not None, the iterator returns only the elements that match the respective name and namespace.
The optional boolean argument ‘inclusive’ defaults to True and can be set to False to exclude the start element itself.
Note that the behaviour of this iterator is completely undefined if the tree it traverses is modified during iteration.
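The two filtering approaches mentioned above can be sketched as follows, on a made-up document containing a comment and a processing instruction. The last line constructs the iterator class directly with the signature shown in this entry. In everyday code, element.iter() and element.iterdescendants() are the usual entry points.

```python
from lxml import etree

root = etree.fromstring("<root><!-- c --><a/><?pi data?><b/></root>")

# Unfiltered: the comment and the processing instruction are included.
everything = list(root.iter())

# Filter 1: keep only nodes whose tag is a string.
by_tag_type = [el.tag for el in root.iter() if isinstance(el.tag, str)]

# Filter 2: pass the Element factory as the tag argument.
by_factory = [el.tag for el in root.iter(etree.Element)]

# Direct use of the class, excluding the start element itself.
below_root = [
    el.tag
    for el in etree.ElementDepthFirstIterator(root, etree.Element, inclusive=False)
]
```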
- class lxml.etree.ElementNamespaceClassLookup(self, fallback=None)
Bases:
FallbackElementClassLookup
Element class lookup scheme that searches the Element class in the Namespace registry.
Usage:
>>> lookup = ElementNamespaceClassLookup()
>>> ns_elements = lookup.get_namespace("http://schema.org/Movie")
>>> @ns_elements
... class movie(ElementBase):
...     "Element implementation for 'movie' tag (using class name) in schema namespace."
>>> @ns_elements("movie")
... class MovieElement(ElementBase):
...     "Element implementation for 'movie' tag (explicit tag name) in schema namespace."
- get_namespace(self, ns_uri)
Retrieve the namespace object associated with the given URI. Pass None for the empty namespace.
Creates a new namespace object if it does not yet exist.
- set_fallback(self, lookup)
Sets the fallback scheme for this lookup method.
- fallback
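The usage snippet above only registers the classes. For them to take effect, the lookup also has to be installed on a parser. The following is a minimal end-to-end sketch under that assumption. The title property and the sample document are made up for illustration.

```python
from lxml import etree

MOVIE_NS = "http://schema.org/Movie"

lookup = etree.ElementNamespaceClassLookup()
parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)   # elements from this parser use the lookup

namespace = lookup.get_namespace(MOVIE_NS)

@namespace("movie")
class MovieElement(etree.ElementBase):
    """Custom class for 'movie' elements in the schema namespace."""

    @property
    def title(self):
        # Hypothetical convenience accessor, not part of lxml.
        return self.findtext("{%s}title" % MOVIE_NS)

root = etree.fromstring(
    '<movie xmlns="http://schema.org/Movie"><title>Example</title></movie>',
    parser,
)
```

Elements without a registered class, such as title here, are handled by the fallback lookup and come back as plain elements.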
- class lxml.etree.ElementTextIterator(self, element, tag=None, with_tail=True)
Bases:
object
Iterates over the text content of a subtree.
You can pass the tag keyword argument to restrict text content to a specific tag name.
You can set the with_tail keyword argument to False to skip over tail text (e.g. if you know that it’s only whitespace from pretty-printing).
- class lxml.etree.EntityBase
Bases:
_Entity
All custom Entity classes must inherit from this one.
To create an XML Entity instance, use the Entity() factory.
Subclasses must not override __init__ or __new__ as it is absolutely undefined when these objects will be created or destroyed. All persistent state of Entities must be stored in the underlying XML. If you really need to initialize the object after creation, you can implement an _init(self) method that will be called after object creation.
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
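The root-level use case described for addnext() and addprevious() can be sketched like this. It is shown on an ordinary root element rather than on an entity, and the stylesheet name is made up.

```python
from lxml import etree

root = etree.fromstring("<root/>")

# Place a processing instruction before the root element ...
root.addprevious(
    etree.ProcessingInstruction("xml-stylesheet", 'type="text/xsl" href="style.xsl"')
)
# ... and a comment after it.
root.addnext(etree.Comment(" end of document "))

# Serialise the whole document (not just the root element) so that the
# root-level siblings are included in the output.
serialized = etree.tostring(root.getroottree())
```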
- append(self, value)
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass keep_tail=True to leave the tail text untouched.
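A short sketch of the difference that keep_tail makes, shown on an ordinary element in a made-up paragraph:

```python
from lxml import etree

root = etree.fromstring('<p><b id="x">bold<i/></b> tail text</p>')
b = root[0]

# Children, attributes and text are removed, the tail is kept.
b.clear(keep_tail=True)
state_kept = (len(b), dict(b.attrib), b.text, b.tail)

# A plain clear() also resets the tail.
b.clear()
tail_after_full_clear = b.tail
```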
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- get(self, key, default=None)
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, value)
- items(self)
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
- keys(self)
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
- set(self, key, value)
- values(self)
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an XPath expression using the element as context node.
- attrib
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- name
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- text
- class lxml.etree.ErrorDomains
Bases:
object
Libxml2 error domains
- _getName(default=None, /)
Return the value for key if key is in the dictionary, else default.
- BUFFER = 29
- C14N = 21
- CATALOG = 20
- CHECK = 24
- DATATYPE = 15
- DTD = 4
- FTP = 9
- HTML = 5
- HTTP = 10
- I18N = 27
- IO = 8
- MEMORY = 6
- MODULE = 26
- NAMESPACE = 3
- NONE = 0
- OUTPUT = 7
- PARSER = 1
- REGEXP = 14
- RELAXNGP = 18
- RELAXNGV = 19
- SCHEMASP = 16
- SCHEMASV = 17
- SCHEMATRONV = 28
- TREE = 2
- URI = 30
- VALID = 23
- WRITER = 25
- XINCLUDE = 11
- XPATH = 12
- XPOINTER = 13
- XSLT = 22
- _names = {0: 'NONE', 1: 'PARSER', 2: 'TREE', 3: 'NAMESPACE', 4: 'DTD', 5: 'HTML', 6: 'MEMORY', 7: 'OUTPUT', 8: 'IO', 9: 'FTP', 10: 'HTTP', 11: 'XINCLUDE', 12: 'XPATH', 13: 'XPOINTER', 14: 'REGEXP', 15: 'DATATYPE', 16: 'SCHEMASP', 17: 'SCHEMASV', 18: 'RELAXNGP', 19: 'RELAXNGV', 20: 'CATALOG', 21: 'C14N', 22: 'XSLT', 23: 'VALID', 24: 'CHECK', 25: 'WRITER', 26: 'MODULE', 27: 'I18N', 28: 'SCHEMATRONV', 29: 'BUFFER', 30: 'URI'}
- class lxml.etree.ErrorLevels
Bases:
object
Libxml2 error levels
- _getName(default=None, /)
Return the value for key if key is in the dictionary, else default.
- ERROR = 2
- FATAL = 3
- NONE = 0
- WARNING = 1
- _names = {0: 'NONE', 1: 'WARNING', 2: 'ERROR', 3: 'FATAL'}
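The constants in ErrorDomains, ErrorLevels and ErrorTypes are meant to be compared against the entries of an lxml error log. The sketch below assumes the entry attributes domain, level, type, line and message and their *_name counterparts, which belong to lxml's error log entries and are not described in this section. The malformed input is made up.

```python
from lxml import etree

parser = etree.XMLParser()
try:
    # Deliberately malformed input: <unclosed> is never closed.
    etree.fromstring("<root><unclosed></root>", parser)
except etree.XMLSyntaxError:
    pass

# The parser keeps the error log of its last run.
first = parser.error_log[0]

# Compare the numeric codes against the constant classes ...
is_parser_error = first.domain == etree.ErrorDomains.PARSER
is_fatal = first.level == etree.ErrorLevels.FATAL

# ... or use the symbolic names for reporting.
summary = "%s/%s/%s at line %d: %s" % (
    first.domain_name, first.level_name, first.type_name,
    first.line, first.message,
)
```

Which error type is reported first can depend on the underlying libxml2 version, so code that branches on ErrorTypes values is best tested against the versions it will run on.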
- class lxml.etree.ErrorTypes
Bases:
object
Libxml2 error types
- _getName(default=None, /)
Return the value for key if key is in the dictionary, else default.
- BUF_OVERFLOW = 7000
- C14N_CREATE_CTXT = 1950
- C14N_CREATE_STACK = 1952
- C14N_INVALID_NODE = 1953
- C14N_RELATIVE_NAMESPACE = 1955
- C14N_REQUIRES_UTF8 = 1951
- C14N_UNKNOW_NODE = 1954
- CATALOG_ENTRY_BROKEN = 1651
- CATALOG_MISSING_ATTR = 1650
- CATALOG_NOT_CATALOG = 1653
- CATALOG_PREFER_VALUE = 1652
- CATALOG_RECURSION = 1654
- CHECK_ENTITY_TYPE = 5012
- CHECK_FOUND_ATTRIBUTE = 5001
- CHECK_FOUND_CDATA = 5003
- CHECK_FOUND_COMMENT = 5007
- CHECK_FOUND_DOCTYPE = 5008
- CHECK_FOUND_ELEMENT = 5000
- CHECK_FOUND_ENTITY = 5005
- CHECK_FOUND_ENTITYREF = 5004
- CHECK_FOUND_FRAGMENT = 5009
- CHECK_FOUND_NOTATION = 5010
- CHECK_FOUND_PI = 5006
- CHECK_FOUND_TEXT = 5002
- CHECK_NAME_NOT_NULL = 5037
- CHECK_NOT_ATTR = 5023
- CHECK_NOT_ATTR_DECL = 5024
- CHECK_NOT_DTD = 5022
- CHECK_NOT_ELEM_DECL = 5025
- CHECK_NOT_ENTITY_DECL = 5026
- CHECK_NOT_NCNAME = 5034
- CHECK_NOT_NS_DECL = 5027
- CHECK_NOT_UTF8 = 5032
- CHECK_NO_DICT = 5033
- CHECK_NO_DOC = 5014
- CHECK_NO_ELEM = 5016
- CHECK_NO_HREF = 5028
- CHECK_NO_NAME = 5015
- CHECK_NO_NEXT = 5020
- CHECK_NO_PARENT = 5013
- CHECK_NO_PREV = 5018
- CHECK_NS_ANCESTOR = 5031
- CHECK_NS_SCOPE = 5030
- CHECK_OUTSIDE_DICT = 5035
- CHECK_UNKNOWN_NODE = 5011
- CHECK_WRONG_DOC = 5017
- CHECK_WRONG_NAME = 5036
- CHECK_WRONG_NEXT = 5021
- CHECK_WRONG_PARENT = 5029
- CHECK_WRONG_PREV = 5019
- DTD_ATTRIBUTE_DEFAULT = 500
- DTD_ATTRIBUTE_REDEFINED = 501
- DTD_ATTRIBUTE_VALUE = 502
- DTD_CONTENT_ERROR = 503
- DTD_CONTENT_MODEL = 504
- DTD_CONTENT_NOT_DETERMINIST = 505
- DTD_DIFFERENT_PREFIX = 506
- DTD_DUP_TOKEN = 541
- DTD_ELEM_DEFAULT_NAMESPACE = 507
- DTD_ELEM_NAMESPACE = 508
- DTD_ELEM_REDEFINED = 509
- DTD_EMPTY_NOTATION = 510
- DTD_ENTITY_TYPE = 511
- DTD_ID_FIXED = 512
- DTD_ID_REDEFINED = 513
- DTD_ID_SUBSET = 514
- DTD_INVALID_CHILD = 515
- DTD_INVALID_DEFAULT = 516
- DTD_LOAD_ERROR = 517
- DTD_MISSING_ATTRIBUTE = 518
- DTD_MIXED_CORRUPT = 519
- DTD_MULTIPLE_ID = 520
- DTD_NOTATION_REDEFINED = 526
- DTD_NOTATION_VALUE = 527
- DTD_NOT_EMPTY = 528
- DTD_NOT_PCDATA = 529
- DTD_NOT_STANDALONE = 530
- DTD_NO_DOC = 521
- DTD_NO_DTD = 522
- DTD_NO_ELEM_NAME = 523
- DTD_NO_PREFIX = 524
- DTD_NO_ROOT = 525
- DTD_ROOT_NAME = 531
- DTD_STANDALONE_DEFAULTED = 538
- DTD_STANDALONE_WHITE_SPACE = 532
- DTD_UNKNOWN_ATTRIBUTE = 533
- DTD_UNKNOWN_ELEM = 534
- DTD_UNKNOWN_ENTITY = 535
- DTD_UNKNOWN_ID = 536
- DTD_UNKNOWN_NOTATION = 537
- DTD_XMLID_TYPE = 540
- DTD_XMLID_VALUE = 539
- ERR_ATTLIST_NOT_FINISHED = 51
- ERR_ATTLIST_NOT_STARTED = 50
- ERR_ATTRIBUTE_NOT_FINISHED = 40
- ERR_ATTRIBUTE_NOT_STARTED = 39
- ERR_ATTRIBUTE_REDEFINED = 42
- ERR_ATTRIBUTE_WITHOUT_VALUE = 41
- ERR_CDATA_NOT_FINISHED = 63
- ERR_CHARREF_AT_EOF = 10
- ERR_CHARREF_IN_DTD = 13
- ERR_CHARREF_IN_EPILOG = 12
- ERR_CHARREF_IN_PROLOG = 11
- ERR_COMMENT_ABRUPTLY_ENDED = 112
- ERR_COMMENT_NOT_FINISHED = 45
- ERR_CONDSEC_INVALID = 83
- ERR_CONDSEC_INVALID_KEYWORD = 95
- ERR_CONDSEC_NOT_FINISHED = 59
- ERR_CONDSEC_NOT_STARTED = 58
- ERR_DOCTYPE_NOT_FINISHED = 61
- ERR_DOCUMENT_EMPTY = 4
- ERR_DOCUMENT_END = 5
- ERR_DOCUMENT_START = 3
- ERR_ELEMCONTENT_NOT_FINISHED = 55
- ERR_ELEMCONTENT_NOT_STARTED = 54
- ERR_ENCODING_NAME = 79
- ERR_ENTITYREF_AT_EOF = 14
- ERR_ENTITYREF_IN_DTD = 17
- ERR_ENTITYREF_IN_EPILOG = 16
- ERR_ENTITYREF_IN_PROLOG = 15
- ERR_ENTITYREF_NO_NAME = 22
- ERR_ENTITYREF_SEMICOL_MISSING = 23
- ERR_ENTITY_BOUNDARY = 90
- ERR_ENTITY_CHAR_ERROR = 87
- ERR_ENTITY_IS_EXTERNAL = 29
- ERR_ENTITY_IS_PARAMETER = 30
- ERR_ENTITY_LOOP = 89
- ERR_ENTITY_NOT_FINISHED = 37
- ERR_ENTITY_NOT_STARTED = 36
- ERR_ENTITY_PE_INTERNAL = 88
- ERR_ENTITY_PROCESSING = 104
- ERR_EQUAL_REQUIRED = 75
- ERR_EXTRA_CONTENT = 86
- ERR_EXT_ENTITY_STANDALONE = 82
- ERR_EXT_SUBSET_NOT_FINISHED = 60
- ERR_GT_REQUIRED = 73
- ERR_HYPHEN_IN_COMMENT = 80
- ERR_INTERNAL_ERROR = 1
- ERR_INVALID_CHAR = 9
- ERR_INVALID_CHARREF = 8
- ERR_INVALID_DEC_CHARREF = 7
- ERR_INVALID_ENCODING = 81
- ERR_INVALID_HEX_CHARREF = 6
- ERR_INVALID_URI = 91
- ERR_LITERAL_NOT_FINISHED = 44
- ERR_LITERAL_NOT_STARTED = 43
- ERR_LTSLASH_REQUIRED = 74
- ERR_LT_IN_ATTRIBUTE = 38
- ERR_LT_REQUIRED = 72
- ERR_MISPLACED_CDATA_END = 62
- ERR_MISSING_ENCODING = 101
- ERR_MIXED_NOT_FINISHED = 53
- ERR_MIXED_NOT_STARTED = 52
- ERR_NAME_REQUIRED = 68
- ERR_NAME_TOO_LONG = 110
- ERR_NMTOKEN_REQUIRED = 67
- ERR_NOTATION_NOT_FINISHED = 49
- ERR_NOTATION_NOT_STARTED = 48
- ERR_NOTATION_PROCESSING = 105
- ERR_NOT_STANDALONE = 103
- ERR_NOT_WELL_BALANCED = 85
- ERR_NO_DTD = 94
- ERR_NO_MEMORY = 2
- ERR_NS_DECL_ERROR = 35
- ERR_OK = 0
- ERR_PCDATA_REQUIRED = 69
- ERR_PEREF_AT_EOF = 18
- ERR_PEREF_IN_EPILOG = 20
- ERR_PEREF_IN_INT_SUBSET = 21
- ERR_PEREF_IN_PROLOG = 19
- ERR_PEREF_NO_NAME = 24
- ERR_PEREF_SEMICOL_MISSING = 25
- ERR_PI_NOT_FINISHED = 47
- ERR_PI_NOT_STARTED = 46
- ERR_PUBID_REQUIRED = 71
- ERR_RESERVED_XML_NAME = 64
- ERR_SEPARATOR_REQUIRED = 66
- ERR_SPACE_REQUIRED = 65
- ERR_STANDALONE_VALUE = 78
- ERR_STRING_NOT_CLOSED = 34
- ERR_STRING_NOT_STARTED = 33
- ERR_TAG_NAME_MISMATCH = 76
- ERR_TAG_NOT_FINISHED = 77
- ERR_UNDECLARED_ENTITY = 26
- ERR_UNKNOWN_ENCODING = 31
- ERR_UNKNOWN_VERSION = 108
- ERR_UNPARSED_ENTITY = 28
- ERR_UNSUPPORTED_ENCODING = 32
- ERR_URI_FRAGMENT = 92
- ERR_URI_REQUIRED = 70
- ERR_USER_STOP = 111
- ERR_VALUE_REQUIRED = 84
- ERR_VERSION_MISMATCH = 109
- ERR_VERSION_MISSING = 96
- ERR_XMLDECL_NOT_FINISHED = 57
- ERR_XMLDECL_NOT_STARTED = 56
- FTP_ACCNT = 2002
- FTP_EPSV_ANSWER = 2001
- FTP_PASV_ANSWER = 2000
- FTP_URL_SYNTAX = 2003
- HTML_STRUCURE_ERROR = 800
- HTML_UNKNOWN_TAG = 801
- HTTP_UNKNOWN_HOST = 2022
- HTTP_URL_SYNTAX = 2020
- HTTP_USE_IP = 2021
- I18N_CONV_FAILED = 6003
- I18N_EXCESS_HANDLER = 6002
- I18N_NO_HANDLER = 6001
- I18N_NO_NAME = 6000
- I18N_NO_OUTPUT = 6004
- IO_BUFFER_FULL = 1548
- IO_EACCES = 1501
- IO_EADDRINUSE = 1554
- IO_EAFNOSUPPORT = 1556
- IO_EAGAIN = 1502
- IO_EALREADY = 1555
- IO_EBADF = 1503
- IO_EBADMSG = 1504
- IO_EBUSY = 1505
- IO_ECANCELED = 1506
- IO_ECHILD = 1507
- IO_ECONNREFUSED = 1552
- IO_EDEADLK = 1508
- IO_EDOM = 1509
- IO_EEXIST = 1510
- IO_EFAULT = 1511
- IO_EFBIG = 1512
- IO_EINPROGRESS = 1513
- IO_EINTR = 1514
- IO_EINVAL = 1515
- IO_EIO = 1516
- IO_EISCONN = 1551
- IO_EISDIR = 1517
- IO_EMFILE = 1518
- IO_EMLINK = 1519
- IO_EMSGSIZE = 1520
- IO_ENAMETOOLONG = 1521
- IO_ENCODER = 1544
- IO_ENETUNREACH = 1553
- IO_ENFILE = 1522
- IO_ENODEV = 1523
- IO_ENOENT = 1524
- IO_ENOEXEC = 1525
- IO_ENOLCK = 1526
- IO_ENOMEM = 1527
- IO_ENOSPC = 1528
- IO_ENOSYS = 1529
- IO_ENOTDIR = 1530
- IO_ENOTEMPTY = 1531
- IO_ENOTSOCK = 1550
- IO_ENOTSUP = 1532
- IO_ENOTTY = 1533
- IO_ENXIO = 1534
- IO_EPERM = 1535
- IO_EPIPE = 1536
- IO_ERANGE = 1537
- IO_EROFS = 1538
- IO_ESPIPE = 1539
- IO_ESRCH = 1540
- IO_ETIMEDOUT = 1541
- IO_EXDEV = 1542
- IO_FLUSH = 1545
- IO_LOAD_ERROR = 1549
- IO_NETWORK_ATTEMPT = 1543
- IO_NO_INPUT = 1547
- IO_UNKNOWN = 1500
- IO_WRITE = 1546
- MODULE_CLOSE = 4901
- MODULE_OPEN = 4900
- NS_ERR_ATTRIBUTE_REDEFINED = 203
- NS_ERR_COLON = 205
- NS_ERR_EMPTY = 204
- NS_ERR_QNAME = 202
- NS_ERR_UNDEFINED_NAMESPACE = 201
- NS_ERR_XML_NAMESPACE = 200
- REGEXP_COMPILE_ERROR = 1450
- RNGP_ANYNAME_ATTR_ANCESTOR = 1000
- RNGP_ATTRIBUTE_CHILDREN = 1002
- RNGP_ATTRIBUTE_CONTENT = 1003
- RNGP_ATTRIBUTE_EMPTY = 1004
- RNGP_ATTRIBUTE_NOOP = 1005
- RNGP_ATTR_CONFLICT = 1001
- RNGP_CHOICE_CONTENT = 1006
- RNGP_CHOICE_EMPTY = 1007
- RNGP_CREATE_FAILURE = 1008
- RNGP_DATA_CONTENT = 1009
- RNGP_DEFINE_CREATE_FAILED = 1011
- RNGP_DEFINE_EMPTY = 1012
- RNGP_DEFINE_MISSING = 1013
- RNGP_DEFINE_NAME_MISSING = 1014
- RNGP_DEF_CHOICE_AND_INTERLEAVE = 1010
- RNGP_ELEMENT_CONTENT = 1018
- RNGP_ELEMENT_EMPTY = 1017
- RNGP_ELEMENT_NAME = 1019
- RNGP_ELEMENT_NO_CONTENT = 1020
- RNGP_ELEM_CONTENT_EMPTY = 1015
- RNGP_ELEM_CONTENT_ERROR = 1016
- RNGP_ELEM_TEXT_CONFLICT = 1021
- RNGP_EMPTY = 1022
- RNGP_EMPTY_CONSTRUCT = 1023
- RNGP_EMPTY_CONTENT = 1024
- RNGP_EMPTY_NOT_EMPTY = 1025
- RNGP_ERROR_TYPE_LIB = 1026
- RNGP_EXCEPT_EMPTY = 1027
- RNGP_EXCEPT_MISSING = 1028
- RNGP_EXCEPT_MULTIPLE = 1029
- RNGP_EXCEPT_NO_CONTENT = 1030
- RNGP_EXTERNALREF_EMTPY = 1031
- RNGP_EXTERNALREF_RECURSE = 1033
- RNGP_EXTERNAL_REF_FAILURE = 1032
- RNGP_FORBIDDEN_ATTRIBUTE = 1034
- RNGP_FOREIGN_ELEMENT = 1035
- RNGP_GRAMMAR_CONTENT = 1036
- RNGP_GRAMMAR_EMPTY = 1037
- RNGP_GRAMMAR_MISSING = 1038
- RNGP_GRAMMAR_NO_START = 1039
- RNGP_GROUP_ATTR_CONFLICT = 1040
- RNGP_HREF_ERROR = 1041
- RNGP_INCLUDE_EMPTY = 1042
- RNGP_INCLUDE_FAILURE = 1043
- RNGP_INCLUDE_RECURSE = 1044
- RNGP_INTERLEAVE_ADD = 1045
- RNGP_INTERLEAVE_CREATE_FAILED = 1046
- RNGP_INTERLEAVE_EMPTY = 1047
- RNGP_INTERLEAVE_NO_CONTENT = 1048
- RNGP_INVALID_DEFINE_NAME = 1049
- RNGP_INVALID_URI = 1050
- RNGP_INVALID_VALUE = 1051
- RNGP_MISSING_HREF = 1052
- RNGP_NAME_MISSING = 1053
- RNGP_NEED_COMBINE = 1054
- RNGP_NOTALLOWED_NOT_EMPTY = 1055
- RNGP_NSNAME_ATTR_ANCESTOR = 1056
- RNGP_NSNAME_NO_NS = 1057
- RNGP_PARAM_FORBIDDEN = 1058
- RNGP_PARAM_NAME_MISSING = 1059
- RNGP_PARENTREF_CREATE_FAILED = 1060
- RNGP_PARENTREF_NAME_INVALID = 1061
- RNGP_PARENTREF_NOT_EMPTY = 1064
- RNGP_PARENTREF_NO_NAME = 1062
- RNGP_PARENTREF_NO_PARENT = 1063
- RNGP_PARSE_ERROR = 1065
- RNGP_PAT_ANYNAME_EXCEPT_ANYNAME = 1066
- RNGP_PAT_ATTR_ATTR = 1067
- RNGP_PAT_ATTR_ELEM = 1068
- RNGP_PAT_DATA_EXCEPT_ATTR = 1069
- RNGP_PAT_DATA_EXCEPT_ELEM = 1070
- RNGP_PAT_DATA_EXCEPT_EMPTY = 1071
- RNGP_PAT_DATA_EXCEPT_GROUP = 1072
- RNGP_PAT_DATA_EXCEPT_INTERLEAVE = 1073
- RNGP_PAT_DATA_EXCEPT_LIST = 1074
- RNGP_PAT_DATA_EXCEPT_ONEMORE = 1075
- RNGP_PAT_DATA_EXCEPT_REF = 1076
- RNGP_PAT_DATA_EXCEPT_TEXT = 1077
- RNGP_PAT_LIST_ATTR = 1078
- RNGP_PAT_LIST_ELEM = 1079
- RNGP_PAT_LIST_INTERLEAVE = 1080
- RNGP_PAT_LIST_LIST = 1081
- RNGP_PAT_LIST_REF = 1082
- RNGP_PAT_LIST_TEXT = 1083
- RNGP_PAT_NSNAME_EXCEPT_ANYNAME = 1084
- RNGP_PAT_NSNAME_EXCEPT_NSNAME = 1085
- RNGP_PAT_ONEMORE_GROUP_ATTR = 1086
- RNGP_PAT_ONEMORE_INTERLEAVE_ATTR = 1087
- RNGP_PAT_START_ATTR = 1088
- RNGP_PAT_START_DATA = 1089
- RNGP_PAT_START_EMPTY = 1090
- RNGP_PAT_START_GROUP = 1091
- RNGP_PAT_START_INTERLEAVE = 1092
- RNGP_PAT_START_LIST = 1093
- RNGP_PAT_START_ONEMORE = 1094
- RNGP_PAT_START_TEXT = 1095
- RNGP_PAT_START_VALUE = 1096
- RNGP_PREFIX_UNDEFINED = 1097
- RNGP_REF_CREATE_FAILED = 1098
- RNGP_REF_CYCLE = 1099
- RNGP_REF_NAME_INVALID = 1100
- RNGP_REF_NOT_EMPTY = 1103
- RNGP_REF_NO_DEF = 1101
- RNGP_REF_NO_NAME = 1102
- RNGP_START_CHOICE_AND_INTERLEAVE = 1104
- RNGP_START_CONTENT = 1105
- RNGP_START_EMPTY = 1106
- RNGP_START_MISSING = 1107
- RNGP_TEXT_EXPECTED = 1108
- RNGP_TEXT_HAS_CHILD = 1109
- RNGP_TYPE_MISSING = 1110
- RNGP_TYPE_NOT_FOUND = 1111
- RNGP_TYPE_VALUE = 1112
- RNGP_UNKNOWN_ATTRIBUTE = 1113
- RNGP_UNKNOWN_COMBINE = 1114
- RNGP_UNKNOWN_CONSTRUCT = 1115
- RNGP_UNKNOWN_TYPE_LIB = 1116
- RNGP_URI_FRAGMENT = 1117
- RNGP_URI_NOT_ABSOLUTE = 1118
- RNGP_VALUE_EMPTY = 1119
- RNGP_VALUE_NO_CONTENT = 1120
- RNGP_XMLNS_NAME = 1121
- RNGP_XML_NS = 1122
- SAVE_CHAR_INVALID = 1401
- SAVE_NOT_UTF8 = 1400
- SAVE_NO_DOCTYPE = 1402
- SAVE_UNKNOWN_ENCODING = 1403
- SCHEMAP_AG_PROPS_CORRECT = 3087
- SCHEMAP_ATTRFORMDEFAULT_VALUE = 1701
- SCHEMAP_ATTRGRP_NONAME_NOREF = 1702
- SCHEMAP_ATTR_NONAME_NOREF = 1703
- SCHEMAP_AU_PROPS_CORRECT = 3089
- SCHEMAP_AU_PROPS_CORRECT_2 = 3078
- SCHEMAP_A_PROPS_CORRECT_2 = 3079
- SCHEMAP_A_PROPS_CORRECT_3 = 3090
- SCHEMAP_COMPLEXTYPE_NONAME_NOREF = 1704
- SCHEMAP_COS_ALL_LIMITED = 3091
- SCHEMAP_COS_CT_EXTENDS_1_1 = 3063
- SCHEMAP_COS_CT_EXTENDS_1_2 = 3088
- SCHEMAP_COS_CT_EXTENDS_1_3 = 1800
- SCHEMAP_COS_ST_DERIVED_OK_2_1 = 3031
- SCHEMAP_COS_ST_DERIVED_OK_2_2 = 3032
- SCHEMAP_COS_ST_RESTRICTS_1_1 = 3011
- SCHEMAP_COS_ST_RESTRICTS_1_2 = 3012
- SCHEMAP_COS_ST_RESTRICTS_1_3_1 = 3013
- SCHEMAP_COS_ST_RESTRICTS_1_3_2 = 3014
- SCHEMAP_COS_ST_RESTRICTS_2_1 = 3015
- SCHEMAP_COS_ST_RESTRICTS_2_3_1_1 = 3016
- SCHEMAP_COS_ST_RESTRICTS_2_3_1_2 = 3017
- SCHEMAP_COS_ST_RESTRICTS_2_3_2_1 = 3018
- SCHEMAP_COS_ST_RESTRICTS_2_3_2_2 = 3019
- SCHEMAP_COS_ST_RESTRICTS_2_3_2_3 = 3020
- SCHEMAP_COS_ST_RESTRICTS_2_3_2_4 = 3021
- SCHEMAP_COS_ST_RESTRICTS_2_3_2_5 = 3022
- SCHEMAP_COS_ST_RESTRICTS_3_1 = 3023
- SCHEMAP_COS_ST_RESTRICTS_3_3_1 = 3024
- SCHEMAP_COS_ST_RESTRICTS_3_3_1_2 = 3025
- SCHEMAP_COS_ST_RESTRICTS_3_3_2_1 = 3027
- SCHEMAP_COS_ST_RESTRICTS_3_3_2_2 = 3026
- SCHEMAP_COS_ST_RESTRICTS_3_3_2_3 = 3028
- SCHEMAP_COS_ST_RESTRICTS_3_3_2_4 = 3029
- SCHEMAP_COS_ST_RESTRICTS_3_3_2_5 = 3030
- SCHEMAP_COS_VALID_DEFAULT_1 = 3058
- SCHEMAP_COS_VALID_DEFAULT_2_1 = 3059
- SCHEMAP_COS_VALID_DEFAULT_2_2_1 = 3060
- SCHEMAP_COS_VALID_DEFAULT_2_2_2 = 3061
- SCHEMAP_CT_PROPS_CORRECT_1 = 1782
- SCHEMAP_CT_PROPS_CORRECT_2 = 1783
- SCHEMAP_CT_PROPS_CORRECT_3 = 1784
- SCHEMAP_CT_PROPS_CORRECT_4 = 1785
- SCHEMAP_CT_PROPS_CORRECT_5 = 1786
- SCHEMAP_CVC_SIMPLE_TYPE = 3062
- SCHEMAP_C_PROPS_CORRECT = 3080
- SCHEMAP_DEF_AND_PREFIX = 1768
- SCHEMAP_DERIVATION_OK_RESTRICTION_1 = 1787
- SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_1 = 1788
- SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_2 = 1789
- SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_3 = 3077
- SCHEMAP_DERIVATION_OK_RESTRICTION_2_2 = 1790
- SCHEMAP_DERIVATION_OK_RESTRICTION_3 = 1791
- SCHEMAP_DERIVATION_OK_RESTRICTION_4_1 = 1797
- SCHEMAP_DERIVATION_OK_RESTRICTION_4_2 = 1798
- SCHEMAP_DERIVATION_OK_RESTRICTION_4_3 = 1799
- SCHEMAP_ELEMFORMDEFAULT_VALUE = 1705
- SCHEMAP_ELEM_DEFAULT_FIXED = 1755
- SCHEMAP_ELEM_NONAME_NOREF = 1706
- SCHEMAP_EXTENSION_NO_BASE = 1707
- SCHEMAP_E_PROPS_CORRECT_2 = 3045
- SCHEMAP_E_PROPS_CORRECT_3 = 3046
- SCHEMAP_E_PROPS_CORRECT_4 = 3047
- SCHEMAP_E_PROPS_CORRECT_5 = 3048
- SCHEMAP_E_PROPS_CORRECT_6 = 3049
- SCHEMAP_FACET_NO_VALUE = 1708
- SCHEMAP_FAILED_BUILD_IMPORT = 1709
- SCHEMAP_FAILED_LOAD = 1757
- SCHEMAP_FAILED_PARSE = 1766
- SCHEMAP_GROUP_NONAME_NOREF = 1710
- SCHEMAP_IMPORT_NAMESPACE_NOT_URI = 1711
- SCHEMAP_IMPORT_REDEFINE_NSNAME = 1712
- SCHEMAP_IMPORT_SCHEMA_NOT_URI = 1713
- SCHEMAP_INCLUDE_SCHEMA_NOT_URI = 1770
- SCHEMAP_INCLUDE_SCHEMA_NO_URI = 1771
- SCHEMAP_INTERNAL = 3069
- SCHEMAP_INTERSECTION_NOT_EXPRESSIBLE = 1793
- SCHEMAP_INVALID_ATTR_COMBINATION = 1777
- SCHEMAP_INVALID_ATTR_INLINE_COMBINATION = 1778
- SCHEMAP_INVALID_ATTR_NAME = 1780
- SCHEMAP_INVALID_ATTR_USE = 1774
- SCHEMAP_INVALID_BOOLEAN = 1714
- SCHEMAP_INVALID_ENUM = 1715
- SCHEMAP_INVALID_FACET = 1716
- SCHEMAP_INVALID_FACET_VALUE = 1717
- SCHEMAP_INVALID_MAXOCCURS = 1718
- SCHEMAP_INVALID_MINOCCURS = 1719
- SCHEMAP_INVALID_REF_AND_SUBTYPE = 1720
- SCHEMAP_INVALID_WHITE_SPACE = 1721
- SCHEMAP_MG_PROPS_CORRECT_1 = 3074
- SCHEMAP_MG_PROPS_CORRECT_2 = 3075
- SCHEMAP_MISSING_SIMPLETYPE_CHILD = 1779
- SCHEMAP_NOATTR_NOREF = 1722
- SCHEMAP_NOROOT = 1759
- SCHEMAP_NOTATION_NO_NAME = 1723
- SCHEMAP_NOTHING_TO_PARSE = 1758
- SCHEMAP_NOTYPE_NOREF = 1724
- SCHEMAP_NOT_DETERMINISTIC = 3070
- SCHEMAP_NOT_SCHEMA = 1772
- SCHEMAP_NO_XMLNS = 3056
- SCHEMAP_NO_XSI = 3057
- SCHEMAP_PREFIX_UNDEFINED = 1700
- SCHEMAP_P_PROPS_CORRECT_1 = 3042
- SCHEMAP_P_PROPS_CORRECT_2_1 = 3043
- SCHEMAP_P_PROPS_CORRECT_2_2 = 3044
- SCHEMAP_RECURSIVE = 1775
- SCHEMAP_REDEFINED_ATTR = 1764
- SCHEMAP_REDEFINED_ATTRGROUP = 1763
- SCHEMAP_REDEFINED_ELEMENT = 1762
- SCHEMAP_REDEFINED_GROUP = 1760
- SCHEMAP_REDEFINED_NOTATION = 1765
- SCHEMAP_REDEFINED_TYPE = 1761
- SCHEMAP_REF_AND_CONTENT = 1781
- SCHEMAP_REF_AND_SUBTYPE = 1725
- SCHEMAP_REGEXP_INVALID = 1756
- SCHEMAP_RESTRICTION_NONAME_NOREF = 1726
- SCHEMAP_S4S_ATTR_INVALID_VALUE = 3037
- SCHEMAP_S4S_ATTR_MISSING = 3036
- SCHEMAP_S4S_ATTR_NOT_ALLOWED = 3035
- SCHEMAP_S4S_ELEM_MISSING = 3034
- SCHEMAP_S4S_ELEM_NOT_ALLOWED = 3033
- SCHEMAP_SIMPLETYPE_NONAME = 1727
- SCHEMAP_SRC_ATTRIBUTE_1 = 3051
- SCHEMAP_SRC_ATTRIBUTE_2 = 3052
- SCHEMAP_SRC_ATTRIBUTE_3_1 = 3053
- SCHEMAP_SRC_ATTRIBUTE_3_2 = 3054
- SCHEMAP_SRC_ATTRIBUTE_4 = 3055
- SCHEMAP_SRC_ATTRIBUTE_GROUP_1 = 3071
- SCHEMAP_SRC_ATTRIBUTE_GROUP_2 = 3072
- SCHEMAP_SRC_ATTRIBUTE_GROUP_3 = 3073
- SCHEMAP_SRC_CT_1 = 3076
- SCHEMAP_SRC_ELEMENT_1 = 3038
- SCHEMAP_SRC_ELEMENT_2_1 = 3039
- SCHEMAP_SRC_ELEMENT_2_2 = 3040
- SCHEMAP_SRC_ELEMENT_3 = 3041
- SCHEMAP_SRC_IMPORT = 3082
- SCHEMAP_SRC_IMPORT_1_1 = 3064
- SCHEMAP_SRC_IMPORT_1_2 = 3065
- SCHEMAP_SRC_IMPORT_2 = 3066
- SCHEMAP_SRC_IMPORT_2_1 = 3067
- SCHEMAP_SRC_IMPORT_2_2 = 3068
- SCHEMAP_SRC_IMPORT_3_1 = 1795
- SCHEMAP_SRC_IMPORT_3_2 = 1796
- SCHEMAP_SRC_INCLUDE = 3050
- SCHEMAP_SRC_LIST_ITEMTYPE_OR_SIMPLETYPE = 3006
- SCHEMAP_SRC_REDEFINE = 3081
- SCHEMAP_SRC_RESOLVE = 3004
- SCHEMAP_SRC_RESTRICTION_BASE_OR_SIMPLETYPE = 3005
- SCHEMAP_SRC_SIMPLE_TYPE_1 = 3000
- SCHEMAP_SRC_SIMPLE_TYPE_2 = 3001
- SCHEMAP_SRC_SIMPLE_TYPE_3 = 3002
- SCHEMAP_SRC_SIMPLE_TYPE_4 = 3003
- SCHEMAP_SRC_UNION_MEMBERTYPES_OR_SIMPLETYPES = 3007
- SCHEMAP_ST_PROPS_CORRECT_1 = 3008
- SCHEMAP_ST_PROPS_CORRECT_2 = 3009
- SCHEMAP_ST_PROPS_CORRECT_3 = 3010
- SCHEMAP_SUPERNUMEROUS_LIST_ITEM_TYPE = 1776
- SCHEMAP_TYPE_AND_SUBTYPE = 1728
- SCHEMAP_UNION_NOT_EXPRESSIBLE = 1794
- SCHEMAP_UNKNOWN_ALL_CHILD = 1729
- SCHEMAP_UNKNOWN_ANYATTRIBUTE_CHILD = 1730
- SCHEMAP_UNKNOWN_ATTRGRP_CHILD = 1732
- SCHEMAP_UNKNOWN_ATTRIBUTE_GROUP = 1733
- SCHEMAP_UNKNOWN_ATTR_CHILD = 1731
- SCHEMAP_UNKNOWN_BASE_TYPE = 1734
- SCHEMAP_UNKNOWN_CHOICE_CHILD = 1735
- SCHEMAP_UNKNOWN_COMPLEXCONTENT_CHILD = 1736
- SCHEMAP_UNKNOWN_COMPLEXTYPE_CHILD = 1737
- SCHEMAP_UNKNOWN_ELEM_CHILD = 1738
- SCHEMAP_UNKNOWN_EXTENSION_CHILD = 1739
- SCHEMAP_UNKNOWN_FACET_CHILD = 1740
- SCHEMAP_UNKNOWN_FACET_TYPE = 1741
- SCHEMAP_UNKNOWN_GROUP_CHILD = 1742
- SCHEMAP_UNKNOWN_IMPORT_CHILD = 1743
- SCHEMAP_UNKNOWN_INCLUDE_CHILD = 1769
- SCHEMAP_UNKNOWN_LIST_CHILD = 1744
- SCHEMAP_UNKNOWN_MEMBER_TYPE = 1773
- SCHEMAP_UNKNOWN_NOTATION_CHILD = 1745
- SCHEMAP_UNKNOWN_PREFIX = 1767
- SCHEMAP_UNKNOWN_PROCESSCONTENT_CHILD = 1746
- SCHEMAP_UNKNOWN_REF = 1747
- SCHEMAP_UNKNOWN_RESTRICTION_CHILD = 1748
- SCHEMAP_UNKNOWN_SCHEMAS_CHILD = 1749
- SCHEMAP_UNKNOWN_SEQUENCE_CHILD = 1750
- SCHEMAP_UNKNOWN_SIMPLECONTENT_CHILD = 1751
- SCHEMAP_UNKNOWN_SIMPLETYPE_CHILD = 1752
- SCHEMAP_UNKNOWN_TYPE = 1753
- SCHEMAP_UNKNOWN_UNION_CHILD = 1754
- SCHEMAP_WARN_ATTR_POINTLESS_PROH = 3086
- SCHEMAP_WARN_ATTR_REDECL_PROH = 3085
- SCHEMAP_WARN_SKIP_SCHEMA = 3083
- SCHEMAP_WARN_UNLOCATED_SCHEMA = 3084
- SCHEMAP_WILDCARD_INVALID_NS_MEMBER = 1792
- SCHEMATRONV_ASSERT = 4000
- SCHEMATRONV_REPORT = 4001
- SCHEMAV_ATTRINVALID = 1821
- SCHEMAV_ATTRUNKNOWN = 1820
- SCHEMAV_CONSTRUCT = 1817
- SCHEMAV_CVC_ATTRIBUTE_1 = 1861
- SCHEMAV_CVC_ATTRIBUTE_2 = 1862
- SCHEMAV_CVC_ATTRIBUTE_3 = 1863
- SCHEMAV_CVC_ATTRIBUTE_4 = 1864
- SCHEMAV_CVC_AU = 1874
- SCHEMAV_CVC_COMPLEX_TYPE_1 = 1873
- SCHEMAV_CVC_COMPLEX_TYPE_2_1 = 1841
- SCHEMAV_CVC_COMPLEX_TYPE_2_2 = 1842
- SCHEMAV_CVC_COMPLEX_TYPE_2_3 = 1843
- SCHEMAV_CVC_COMPLEX_TYPE_2_4 = 1844
- SCHEMAV_CVC_COMPLEX_TYPE_3_1 = 1865
- SCHEMAV_CVC_COMPLEX_TYPE_3_2_1 = 1866
- SCHEMAV_CVC_COMPLEX_TYPE_3_2_2 = 1867
- SCHEMAV_CVC_COMPLEX_TYPE_4 = 1868
- SCHEMAV_CVC_COMPLEX_TYPE_5_1 = 1869
- SCHEMAV_CVC_COMPLEX_TYPE_5_2 = 1870
- SCHEMAV_CVC_DATATYPE_VALID_1_2_1 = 1824
- SCHEMAV_CVC_DATATYPE_VALID_1_2_2 = 1825
- SCHEMAV_CVC_DATATYPE_VALID_1_2_3 = 1826
- SCHEMAV_CVC_ELT_1 = 1845
- SCHEMAV_CVC_ELT_2 = 1846
- SCHEMAV_CVC_ELT_3_1 = 1847
- SCHEMAV_CVC_ELT_3_2_1 = 1848
- SCHEMAV_CVC_ELT_3_2_2 = 1849
- SCHEMAV_CVC_ELT_4_1 = 1850
- SCHEMAV_CVC_ELT_4_2 = 1851
- SCHEMAV_CVC_ELT_4_3 = 1852
- SCHEMAV_CVC_ELT_5_1_1 = 1853
- SCHEMAV_CVC_ELT_5_1_2 = 1854
- SCHEMAV_CVC_ELT_5_2_1 = 1855
- SCHEMAV_CVC_ELT_5_2_2_1 = 1856
- SCHEMAV_CVC_ELT_5_2_2_2_1 = 1857
- SCHEMAV_CVC_ELT_5_2_2_2_2 = 1858
- SCHEMAV_CVC_ELT_6 = 1859
- SCHEMAV_CVC_ELT_7 = 1860
- SCHEMAV_CVC_ENUMERATION_VALID = 1840
- SCHEMAV_CVC_FACET_VALID = 1829
- SCHEMAV_CVC_FRACTIONDIGITS_VALID = 1838
- SCHEMAV_CVC_IDC = 1877
- SCHEMAV_CVC_LENGTH_VALID = 1830
- SCHEMAV_CVC_MAXEXCLUSIVE_VALID = 1836
- SCHEMAV_CVC_MAXINCLUSIVE_VALID = 1834
- SCHEMAV_CVC_MAXLENGTH_VALID = 1832
- SCHEMAV_CVC_MINEXCLUSIVE_VALID = 1835
- SCHEMAV_CVC_MININCLUSIVE_VALID = 1833
- SCHEMAV_CVC_MINLENGTH_VALID = 1831
- SCHEMAV_CVC_PATTERN_VALID = 1839
- SCHEMAV_CVC_TOTALDIGITS_VALID = 1837
- SCHEMAV_CVC_TYPE_1 = 1875
- SCHEMAV_CVC_TYPE_2 = 1876
- SCHEMAV_CVC_TYPE_3_1_1 = 1827
- SCHEMAV_CVC_TYPE_3_1_2 = 1828
- SCHEMAV_CVC_WILDCARD = 1878
- SCHEMAV_DOCUMENT_ELEMENT_MISSING = 1872
- SCHEMAV_ELEMCONT = 1810
- SCHEMAV_ELEMENT_CONTENT = 1871
- SCHEMAV_EXTRACONTENT = 1813
- SCHEMAV_FACET = 1823
- SCHEMAV_HAVEDEFAULT = 1811
- SCHEMAV_INTERNAL = 1818
- SCHEMAV_INVALIDATTR = 1814
- SCHEMAV_INVALIDELEM = 1815
- SCHEMAV_ISABSTRACT = 1808
- SCHEMAV_MISC = 1879
- SCHEMAV_MISSING = 1804
- SCHEMAV_NOROLLBACK = 1807
- SCHEMAV_NOROOT = 1801
- SCHEMAV_NOTDETERMINIST = 1816
- SCHEMAV_NOTEMPTY = 1809
- SCHEMAV_NOTNILLABLE = 1812
- SCHEMAV_NOTSIMPLE = 1819
- SCHEMAV_NOTTOPLEVEL = 1803
- SCHEMAV_NOTYPE = 1806
- SCHEMAV_UNDECLAREDELEM = 1802
- SCHEMAV_VALUE = 1822
- SCHEMAV_WRONGELEM = 1805
- TREE_INVALID_DEC = 1301
- TREE_INVALID_HEX = 1300
- TREE_NOT_UTF8 = 1303
- TREE_UNTERMINATED_ENTITY = 1302
- WAR_CATALOG_PI = 93
- WAR_ENTITY_REDEFINED = 107
- WAR_LANG_VALUE = 98
- WAR_NS_COLUMN = 106
- WAR_NS_URI = 99
- WAR_NS_URI_RELATIVE = 100
- WAR_SPACE_VALUE = 102
- WAR_UNDECLARED_ENTITY = 27
- WAR_UNKNOWN_VERSION = 97
- XINCLUDE_BUILD_FAILED = 1609
- XINCLUDE_DEPRECATED_NS = 1617
- XINCLUDE_ENTITY_DEF_MISMATCH = 1602
- XINCLUDE_FALLBACKS_IN_INCLUDE = 1615
- XINCLUDE_FALLBACK_NOT_IN_INCLUDE = 1616
- XINCLUDE_FRAGMENT_ID = 1618
- XINCLUDE_HREF_URI = 1605
- XINCLUDE_INCLUDE_IN_INCLUDE = 1614
- XINCLUDE_INVALID_CHAR = 1608
- XINCLUDE_MULTIPLE_ROOT = 1611
- XINCLUDE_NO_FALLBACK = 1604
- XINCLUDE_NO_HREF = 1603
- XINCLUDE_PARSE_VALUE = 1601
- XINCLUDE_RECURSION = 1600
- XINCLUDE_TEXT_DOCUMENT = 1607
- XINCLUDE_TEXT_FRAGMENT = 1606
- XINCLUDE_UNKNOWN_ENCODING = 1610
- XINCLUDE_XPTR_FAILED = 1612
- XINCLUDE_XPTR_RESULT = 1613
- XPATH_ENCODING_ERROR = 1220
- XPATH_EXPRESSION_OK = 1200
- XPATH_EXPR_ERROR = 1207
- XPATH_INVALID_ARITY = 1212
- XPATH_INVALID_CHAR_ERROR = 1221
- XPATH_INVALID_CTXT_POSITION = 1214
- XPATH_INVALID_CTXT_SIZE = 1213
- XPATH_INVALID_OPERAND = 1210
- XPATH_INVALID_PREDICATE_ERROR = 1206
- XPATH_INVALID_TYPE = 1211
- XPATH_MEMORY_ERROR = 1215
- XPATH_NUMBER_ERROR = 1201
- XPATH_START_LITERAL_ERROR = 1203
- XPATH_UNCLOSED_ERROR = 1208
- XPATH_UNDEF_PREFIX_ERROR = 1219
- XPATH_UNDEF_VARIABLE_ERROR = 1205
- XPATH_UNFINISHED_LITERAL_ERROR = 1202
- XPATH_UNKNOWN_FUNC_ERROR = 1209
- XPATH_VARIABLE_REF_ERROR = 1204
- XPTR_CHILDSEQ_START = 1901
- XPTR_EVAL_FAILED = 1902
- XPTR_EXTRA_OBJECTS = 1903
- XPTR_RESOURCE_ERROR = 1217
- XPTR_SUB_RESOURCE_ERROR = 1218
- XPTR_SYNTAX_ERROR = 1216
- XPTR_UNKNOWN_SCHEME = 1900
- _names = {0: 'ERR_OK', 1: 'ERR_INTERNAL_ERROR', 2: 'ERR_NO_MEMORY', 3: 'ERR_DOCUMENT_START', 4: 'ERR_DOCUMENT_EMPTY', 5: 'ERR_DOCUMENT_END', 6: 'ERR_INVALID_HEX_CHARREF', 7: 'ERR_INVALID_DEC_CHARREF', 8: 'ERR_INVALID_CHARREF', 9: 'ERR_INVALID_CHAR', 10: 'ERR_CHARREF_AT_EOF', 11: 'ERR_CHARREF_IN_PROLOG', 12: 'ERR_CHARREF_IN_EPILOG', 13: 'ERR_CHARREF_IN_DTD', 14: 'ERR_ENTITYREF_AT_EOF', 15: 'ERR_ENTITYREF_IN_PROLOG', 16: 'ERR_ENTITYREF_IN_EPILOG', 17: 'ERR_ENTITYREF_IN_DTD', 18: 'ERR_PEREF_AT_EOF', 19: 'ERR_PEREF_IN_PROLOG', 20: 'ERR_PEREF_IN_EPILOG', 21: 'ERR_PEREF_IN_INT_SUBSET', 22: 'ERR_ENTITYREF_NO_NAME', 23: 'ERR_ENTITYREF_SEMICOL_MISSING', 24: 'ERR_PEREF_NO_NAME', 25: 'ERR_PEREF_SEMICOL_MISSING', 26: 'ERR_UNDECLARED_ENTITY', 27: 'WAR_UNDECLARED_ENTITY', 28: 'ERR_UNPARSED_ENTITY', 29: 'ERR_ENTITY_IS_EXTERNAL', 30: 'ERR_ENTITY_IS_PARAMETER', 31: 'ERR_UNKNOWN_ENCODING', 32: 'ERR_UNSUPPORTED_ENCODING', 33: 'ERR_STRING_NOT_STARTED', 34: 'ERR_STRING_NOT_CLOSED', 35: 'ERR_NS_DECL_ERROR', 36: 'ERR_ENTITY_NOT_STARTED', 37: 'ERR_ENTITY_NOT_FINISHED', 38: 'ERR_LT_IN_ATTRIBUTE', 39: 'ERR_ATTRIBUTE_NOT_STARTED', 40: 'ERR_ATTRIBUTE_NOT_FINISHED', 41: 'ERR_ATTRIBUTE_WITHOUT_VALUE', 42: 'ERR_ATTRIBUTE_REDEFINED', 43: 'ERR_LITERAL_NOT_STARTED', 44: 'ERR_LITERAL_NOT_FINISHED', 45: 'ERR_COMMENT_NOT_FINISHED', 46: 'ERR_PI_NOT_STARTED', 47: 'ERR_PI_NOT_FINISHED', 48: 'ERR_NOTATION_NOT_STARTED', 49: 'ERR_NOTATION_NOT_FINISHED', 50: 'ERR_ATTLIST_NOT_STARTED', 51: 'ERR_ATTLIST_NOT_FINISHED', 52: 'ERR_MIXED_NOT_STARTED', 53: 'ERR_MIXED_NOT_FINISHED', 54: 'ERR_ELEMCONTENT_NOT_STARTED', 55: 'ERR_ELEMCONTENT_NOT_FINISHED', 56: 'ERR_XMLDECL_NOT_STARTED', 57: 'ERR_XMLDECL_NOT_FINISHED', 58: 'ERR_CONDSEC_NOT_STARTED', 59: 'ERR_CONDSEC_NOT_FINISHED', 60: 'ERR_EXT_SUBSET_NOT_FINISHED', 61: 'ERR_DOCTYPE_NOT_FINISHED', 62: 'ERR_MISPLACED_CDATA_END', 63: 'ERR_CDATA_NOT_FINISHED', 64: 'ERR_RESERVED_XML_NAME', 65: 'ERR_SPACE_REQUIRED', 66: 'ERR_SEPARATOR_REQUIRED', 67: 'ERR_NMTOKEN_REQUIRED', 
68: 'ERR_NAME_REQUIRED', 69: 'ERR_PCDATA_REQUIRED', 70: 'ERR_URI_REQUIRED', 71: 'ERR_PUBID_REQUIRED', 72: 'ERR_LT_REQUIRED', 73: 'ERR_GT_REQUIRED', 74: 'ERR_LTSLASH_REQUIRED', 75: 'ERR_EQUAL_REQUIRED', 76: 'ERR_TAG_NAME_MISMATCH', 77: 'ERR_TAG_NOT_FINISHED', 78: 'ERR_STANDALONE_VALUE', 79: 'ERR_ENCODING_NAME', 80: 'ERR_HYPHEN_IN_COMMENT', 81: 'ERR_INVALID_ENCODING', 82: 'ERR_EXT_ENTITY_STANDALONE', 83: 'ERR_CONDSEC_INVALID', 84: 'ERR_VALUE_REQUIRED', 85: 'ERR_NOT_WELL_BALANCED', 86: 'ERR_EXTRA_CONTENT', 87: 'ERR_ENTITY_CHAR_ERROR', 88: 'ERR_ENTITY_PE_INTERNAL', 89: 'ERR_ENTITY_LOOP', 90: 'ERR_ENTITY_BOUNDARY', 91: 'ERR_INVALID_URI', 92: 'ERR_URI_FRAGMENT', 93: 'WAR_CATALOG_PI', 94: 'ERR_NO_DTD', 95: 'ERR_CONDSEC_INVALID_KEYWORD', 96: 'ERR_VERSION_MISSING', 97: 'WAR_UNKNOWN_VERSION', 98: 'WAR_LANG_VALUE', 99: 'WAR_NS_URI', 100: 'WAR_NS_URI_RELATIVE', 101: 'ERR_MISSING_ENCODING', 102: 'WAR_SPACE_VALUE', 103: 'ERR_NOT_STANDALONE', 104: 'ERR_ENTITY_PROCESSING', 105: 'ERR_NOTATION_PROCESSING', 106: 'WAR_NS_COLUMN', 107: 'WAR_ENTITY_REDEFINED', 108: 'ERR_UNKNOWN_VERSION', 109: 'ERR_VERSION_MISMATCH', 110: 'ERR_NAME_TOO_LONG', 111: 'ERR_USER_STOP', 112: 'ERR_COMMENT_ABRUPTLY_ENDED', 200: 'NS_ERR_XML_NAMESPACE', 201: 'NS_ERR_UNDEFINED_NAMESPACE', 202: 'NS_ERR_QNAME', 203: 'NS_ERR_ATTRIBUTE_REDEFINED', 204: 'NS_ERR_EMPTY', 205: 'NS_ERR_COLON', 500: 'DTD_ATTRIBUTE_DEFAULT', 501: 'DTD_ATTRIBUTE_REDEFINED', 502: 'DTD_ATTRIBUTE_VALUE', 503: 'DTD_CONTENT_ERROR', 504: 'DTD_CONTENT_MODEL', 505: 'DTD_CONTENT_NOT_DETERMINIST', 506: 'DTD_DIFFERENT_PREFIX', 507: 'DTD_ELEM_DEFAULT_NAMESPACE', 508: 'DTD_ELEM_NAMESPACE', 509: 'DTD_ELEM_REDEFINED', 510: 'DTD_EMPTY_NOTATION', 511: 'DTD_ENTITY_TYPE', 512: 'DTD_ID_FIXED', 513: 'DTD_ID_REDEFINED', 514: 'DTD_ID_SUBSET', 515: 'DTD_INVALID_CHILD', 516: 'DTD_INVALID_DEFAULT', 517: 'DTD_LOAD_ERROR', 518: 'DTD_MISSING_ATTRIBUTE', 519: 'DTD_MIXED_CORRUPT', 520: 'DTD_MULTIPLE_ID', 521: 'DTD_NO_DOC', 522: 'DTD_NO_DTD', 523: 'DTD_NO_ELEM_NAME', 524: 
'DTD_NO_PREFIX', 525: 'DTD_NO_ROOT', 526: 'DTD_NOTATION_REDEFINED', 527: 'DTD_NOTATION_VALUE', 528: 'DTD_NOT_EMPTY', 529: 'DTD_NOT_PCDATA', 530: 'DTD_NOT_STANDALONE', 531: 'DTD_ROOT_NAME', 532: 'DTD_STANDALONE_WHITE_SPACE', 533: 'DTD_UNKNOWN_ATTRIBUTE', 534: 'DTD_UNKNOWN_ELEM', 535: 'DTD_UNKNOWN_ENTITY', 536: 'DTD_UNKNOWN_ID', 537: 'DTD_UNKNOWN_NOTATION', 538: 'DTD_STANDALONE_DEFAULTED', 539: 'DTD_XMLID_VALUE', 540: 'DTD_XMLID_TYPE', 541: 'DTD_DUP_TOKEN', 800: 'HTML_STRUCURE_ERROR', 801: 'HTML_UNKNOWN_TAG', 1000: 'RNGP_ANYNAME_ATTR_ANCESTOR', 1001: 'RNGP_ATTR_CONFLICT', 1002: 'RNGP_ATTRIBUTE_CHILDREN', 1003: 'RNGP_ATTRIBUTE_CONTENT', 1004: 'RNGP_ATTRIBUTE_EMPTY', 1005: 'RNGP_ATTRIBUTE_NOOP', 1006: 'RNGP_CHOICE_CONTENT', 1007: 'RNGP_CHOICE_EMPTY', 1008: 'RNGP_CREATE_FAILURE', 1009: 'RNGP_DATA_CONTENT', 1010: 'RNGP_DEF_CHOICE_AND_INTERLEAVE', 1011: 'RNGP_DEFINE_CREATE_FAILED', 1012: 'RNGP_DEFINE_EMPTY', 1013: 'RNGP_DEFINE_MISSING', 1014: 'RNGP_DEFINE_NAME_MISSING', 1015: 'RNGP_ELEM_CONTENT_EMPTY', 1016: 'RNGP_ELEM_CONTENT_ERROR', 1017: 'RNGP_ELEMENT_EMPTY', 1018: 'RNGP_ELEMENT_CONTENT', 1019: 'RNGP_ELEMENT_NAME', 1020: 'RNGP_ELEMENT_NO_CONTENT', 1021: 'RNGP_ELEM_TEXT_CONFLICT', 1022: 'RNGP_EMPTY', 1023: 'RNGP_EMPTY_CONSTRUCT', 1024: 'RNGP_EMPTY_CONTENT', 1025: 'RNGP_EMPTY_NOT_EMPTY', 1026: 'RNGP_ERROR_TYPE_LIB', 1027: 'RNGP_EXCEPT_EMPTY', 1028: 'RNGP_EXCEPT_MISSING', 1029: 'RNGP_EXCEPT_MULTIPLE', 1030: 'RNGP_EXCEPT_NO_CONTENT', 1031: 'RNGP_EXTERNALREF_EMTPY', 1032: 'RNGP_EXTERNAL_REF_FAILURE', 1033: 'RNGP_EXTERNALREF_RECURSE', 1034: 'RNGP_FORBIDDEN_ATTRIBUTE', 1035: 'RNGP_FOREIGN_ELEMENT', 1036: 'RNGP_GRAMMAR_CONTENT', 1037: 'RNGP_GRAMMAR_EMPTY', 1038: 'RNGP_GRAMMAR_MISSING', 1039: 'RNGP_GRAMMAR_NO_START', 1040: 'RNGP_GROUP_ATTR_CONFLICT', 1041: 'RNGP_HREF_ERROR', 1042: 'RNGP_INCLUDE_EMPTY', 1043: 'RNGP_INCLUDE_FAILURE', 1044: 'RNGP_INCLUDE_RECURSE', 1045: 'RNGP_INTERLEAVE_ADD', 1046: 'RNGP_INTERLEAVE_CREATE_FAILED', 1047: 'RNGP_INTERLEAVE_EMPTY', 1048: 
'RNGP_INTERLEAVE_NO_CONTENT', 1049: 'RNGP_INVALID_DEFINE_NAME', 1050: 'RNGP_INVALID_URI', 1051: 'RNGP_INVALID_VALUE', 1052: 'RNGP_MISSING_HREF', 1053: 'RNGP_NAME_MISSING', 1054: 'RNGP_NEED_COMBINE', 1055: 'RNGP_NOTALLOWED_NOT_EMPTY', 1056: 'RNGP_NSNAME_ATTR_ANCESTOR', 1057: 'RNGP_NSNAME_NO_NS', 1058: 'RNGP_PARAM_FORBIDDEN', 1059: 'RNGP_PARAM_NAME_MISSING', 1060: 'RNGP_PARENTREF_CREATE_FAILED', 1061: 'RNGP_PARENTREF_NAME_INVALID', 1062: 'RNGP_PARENTREF_NO_NAME', 1063: 'RNGP_PARENTREF_NO_PARENT', 1064: 'RNGP_PARENTREF_NOT_EMPTY', 1065: 'RNGP_PARSE_ERROR', 1066: 'RNGP_PAT_ANYNAME_EXCEPT_ANYNAME', 1067: 'RNGP_PAT_ATTR_ATTR', 1068: 'RNGP_PAT_ATTR_ELEM', 1069: 'RNGP_PAT_DATA_EXCEPT_ATTR', 1070: 'RNGP_PAT_DATA_EXCEPT_ELEM', 1071: 'RNGP_PAT_DATA_EXCEPT_EMPTY', 1072: 'RNGP_PAT_DATA_EXCEPT_GROUP', 1073: 'RNGP_PAT_DATA_EXCEPT_INTERLEAVE', 1074: 'RNGP_PAT_DATA_EXCEPT_LIST', 1075: 'RNGP_PAT_DATA_EXCEPT_ONEMORE', 1076: 'RNGP_PAT_DATA_EXCEPT_REF', 1077: 'RNGP_PAT_DATA_EXCEPT_TEXT', 1078: 'RNGP_PAT_LIST_ATTR', 1079: 'RNGP_PAT_LIST_ELEM', 1080: 'RNGP_PAT_LIST_INTERLEAVE', 1081: 'RNGP_PAT_LIST_LIST', 1082: 'RNGP_PAT_LIST_REF', 1083: 'RNGP_PAT_LIST_TEXT', 1084: 'RNGP_PAT_NSNAME_EXCEPT_ANYNAME', 1085: 'RNGP_PAT_NSNAME_EXCEPT_NSNAME', 1086: 'RNGP_PAT_ONEMORE_GROUP_ATTR', 1087: 'RNGP_PAT_ONEMORE_INTERLEAVE_ATTR', 1088: 'RNGP_PAT_START_ATTR', 1089: 'RNGP_PAT_START_DATA', 1090: 'RNGP_PAT_START_EMPTY', 1091: 'RNGP_PAT_START_GROUP', 1092: 'RNGP_PAT_START_INTERLEAVE', 1093: 'RNGP_PAT_START_LIST', 1094: 'RNGP_PAT_START_ONEMORE', 1095: 'RNGP_PAT_START_TEXT', 1096: 'RNGP_PAT_START_VALUE', 1097: 'RNGP_PREFIX_UNDEFINED', 1098: 'RNGP_REF_CREATE_FAILED', 1099: 'RNGP_REF_CYCLE', 1100: 'RNGP_REF_NAME_INVALID', 1101: 'RNGP_REF_NO_DEF', 1102: 'RNGP_REF_NO_NAME', 1103: 'RNGP_REF_NOT_EMPTY', 1104: 'RNGP_START_CHOICE_AND_INTERLEAVE', 1105: 'RNGP_START_CONTENT', 1106: 'RNGP_START_EMPTY', 1107: 'RNGP_START_MISSING', 1108: 'RNGP_TEXT_EXPECTED', 1109: 'RNGP_TEXT_HAS_CHILD', 1110: 'RNGP_TYPE_MISSING', 1111: 
'RNGP_TYPE_NOT_FOUND', 1112: 'RNGP_TYPE_VALUE', 1113: 'RNGP_UNKNOWN_ATTRIBUTE', 1114: 'RNGP_UNKNOWN_COMBINE', 1115: 'RNGP_UNKNOWN_CONSTRUCT', 1116: 'RNGP_UNKNOWN_TYPE_LIB', 1117: 'RNGP_URI_FRAGMENT', 1118: 'RNGP_URI_NOT_ABSOLUTE', 1119: 'RNGP_VALUE_EMPTY', 1120: 'RNGP_VALUE_NO_CONTENT', 1121: 'RNGP_XMLNS_NAME', 1122: 'RNGP_XML_NS', 1200: 'XPATH_EXPRESSION_OK', 1201: 'XPATH_NUMBER_ERROR', 1202: 'XPATH_UNFINISHED_LITERAL_ERROR', 1203: 'XPATH_START_LITERAL_ERROR', 1204: 'XPATH_VARIABLE_REF_ERROR', 1205: 'XPATH_UNDEF_VARIABLE_ERROR', 1206: 'XPATH_INVALID_PREDICATE_ERROR', 1207: 'XPATH_EXPR_ERROR', 1208: 'XPATH_UNCLOSED_ERROR', 1209: 'XPATH_UNKNOWN_FUNC_ERROR', 1210: 'XPATH_INVALID_OPERAND', 1211: 'XPATH_INVALID_TYPE', 1212: 'XPATH_INVALID_ARITY', 1213: 'XPATH_INVALID_CTXT_SIZE', 1214: 'XPATH_INVALID_CTXT_POSITION', 1215: 'XPATH_MEMORY_ERROR', 1216: 'XPTR_SYNTAX_ERROR', 1217: 'XPTR_RESOURCE_ERROR', 1218: 'XPTR_SUB_RESOURCE_ERROR', 1219: 'XPATH_UNDEF_PREFIX_ERROR', 1220: 'XPATH_ENCODING_ERROR', 1221: 'XPATH_INVALID_CHAR_ERROR', 1300: 'TREE_INVALID_HEX', 1301: 'TREE_INVALID_DEC', 1302: 'TREE_UNTERMINATED_ENTITY', 1303: 'TREE_NOT_UTF8', 1400: 'SAVE_NOT_UTF8', 1401: 'SAVE_CHAR_INVALID', 1402: 'SAVE_NO_DOCTYPE', 1403: 'SAVE_UNKNOWN_ENCODING', 1450: 'REGEXP_COMPILE_ERROR', 1500: 'IO_UNKNOWN', 1501: 'IO_EACCES', 1502: 'IO_EAGAIN', 1503: 'IO_EBADF', 1504: 'IO_EBADMSG', 1505: 'IO_EBUSY', 1506: 'IO_ECANCELED', 1507: 'IO_ECHILD', 1508: 'IO_EDEADLK', 1509: 'IO_EDOM', 1510: 'IO_EEXIST', 1511: 'IO_EFAULT', 1512: 'IO_EFBIG', 1513: 'IO_EINPROGRESS', 1514: 'IO_EINTR', 1515: 'IO_EINVAL', 1516: 'IO_EIO', 1517: 'IO_EISDIR', 1518: 'IO_EMFILE', 1519: 'IO_EMLINK', 1520: 'IO_EMSGSIZE', 1521: 'IO_ENAMETOOLONG', 1522: 'IO_ENFILE', 1523: 'IO_ENODEV', 1524: 'IO_ENOENT', 1525: 'IO_ENOEXEC', 1526: 'IO_ENOLCK', 1527: 'IO_ENOMEM', 1528: 'IO_ENOSPC', 1529: 'IO_ENOSYS', 1530: 'IO_ENOTDIR', 1531: 'IO_ENOTEMPTY', 1532: 'IO_ENOTSUP', 1533: 'IO_ENOTTY', 1534: 'IO_ENXIO', 1535: 'IO_EPERM', 1536: 'IO_EPIPE', 
1537: 'IO_ERANGE', 1538: 'IO_EROFS', 1539: 'IO_ESPIPE', 1540: 'IO_ESRCH', 1541: 'IO_ETIMEDOUT', 1542: 'IO_EXDEV', 1543: 'IO_NETWORK_ATTEMPT', 1544: 'IO_ENCODER', 1545: 'IO_FLUSH', 1546: 'IO_WRITE', 1547: 'IO_NO_INPUT', 1548: 'IO_BUFFER_FULL', 1549: 'IO_LOAD_ERROR', 1550: 'IO_ENOTSOCK', 1551: 'IO_EISCONN', 1552: 'IO_ECONNREFUSED', 1553: 'IO_ENETUNREACH', 1554: 'IO_EADDRINUSE', 1555: 'IO_EALREADY', 1556: 'IO_EAFNOSUPPORT', 1600: 'XINCLUDE_RECURSION', 1601: 'XINCLUDE_PARSE_VALUE', 1602: 'XINCLUDE_ENTITY_DEF_MISMATCH', 1603: 'XINCLUDE_NO_HREF', 1604: 'XINCLUDE_NO_FALLBACK', 1605: 'XINCLUDE_HREF_URI', 1606: 'XINCLUDE_TEXT_FRAGMENT', 1607: 'XINCLUDE_TEXT_DOCUMENT', 1608: 'XINCLUDE_INVALID_CHAR', 1609: 'XINCLUDE_BUILD_FAILED', 1610: 'XINCLUDE_UNKNOWN_ENCODING', 1611: 'XINCLUDE_MULTIPLE_ROOT', 1612: 'XINCLUDE_XPTR_FAILED', 1613: 'XINCLUDE_XPTR_RESULT', 1614: 'XINCLUDE_INCLUDE_IN_INCLUDE', 1615: 'XINCLUDE_FALLBACKS_IN_INCLUDE', 1616: 'XINCLUDE_FALLBACK_NOT_IN_INCLUDE', 1617: 'XINCLUDE_DEPRECATED_NS', 1618: 'XINCLUDE_FRAGMENT_ID', 1650: 'CATALOG_MISSING_ATTR', 1651: 'CATALOG_ENTRY_BROKEN', 1652: 'CATALOG_PREFER_VALUE', 1653: 'CATALOG_NOT_CATALOG', 1654: 'CATALOG_RECURSION', 1700: 'SCHEMAP_PREFIX_UNDEFINED', 1701: 'SCHEMAP_ATTRFORMDEFAULT_VALUE', 1702: 'SCHEMAP_ATTRGRP_NONAME_NOREF', 1703: 'SCHEMAP_ATTR_NONAME_NOREF', 1704: 'SCHEMAP_COMPLEXTYPE_NONAME_NOREF', 1705: 'SCHEMAP_ELEMFORMDEFAULT_VALUE', 1706: 'SCHEMAP_ELEM_NONAME_NOREF', 1707: 'SCHEMAP_EXTENSION_NO_BASE', 1708: 'SCHEMAP_FACET_NO_VALUE', 1709: 'SCHEMAP_FAILED_BUILD_IMPORT', 1710: 'SCHEMAP_GROUP_NONAME_NOREF', 1711: 'SCHEMAP_IMPORT_NAMESPACE_NOT_URI', 1712: 'SCHEMAP_IMPORT_REDEFINE_NSNAME', 1713: 'SCHEMAP_IMPORT_SCHEMA_NOT_URI', 1714: 'SCHEMAP_INVALID_BOOLEAN', 1715: 'SCHEMAP_INVALID_ENUM', 1716: 'SCHEMAP_INVALID_FACET', 1717: 'SCHEMAP_INVALID_FACET_VALUE', 1718: 'SCHEMAP_INVALID_MAXOCCURS', 1719: 'SCHEMAP_INVALID_MINOCCURS', 1720: 'SCHEMAP_INVALID_REF_AND_SUBTYPE', 1721: 'SCHEMAP_INVALID_WHITE_SPACE', 1722: 
'SCHEMAP_NOATTR_NOREF', 1723: 'SCHEMAP_NOTATION_NO_NAME', 1724: 'SCHEMAP_NOTYPE_NOREF', 1725: 'SCHEMAP_REF_AND_SUBTYPE', 1726: 'SCHEMAP_RESTRICTION_NONAME_NOREF', 1727: 'SCHEMAP_SIMPLETYPE_NONAME', 1728: 'SCHEMAP_TYPE_AND_SUBTYPE', 1729: 'SCHEMAP_UNKNOWN_ALL_CHILD', 1730: 'SCHEMAP_UNKNOWN_ANYATTRIBUTE_CHILD', 1731: 'SCHEMAP_UNKNOWN_ATTR_CHILD', 1732: 'SCHEMAP_UNKNOWN_ATTRGRP_CHILD', 1733: 'SCHEMAP_UNKNOWN_ATTRIBUTE_GROUP', 1734: 'SCHEMAP_UNKNOWN_BASE_TYPE', 1735: 'SCHEMAP_UNKNOWN_CHOICE_CHILD', 1736: 'SCHEMAP_UNKNOWN_COMPLEXCONTENT_CHILD', 1737: 'SCHEMAP_UNKNOWN_COMPLEXTYPE_CHILD', 1738: 'SCHEMAP_UNKNOWN_ELEM_CHILD', 1739: 'SCHEMAP_UNKNOWN_EXTENSION_CHILD', 1740: 'SCHEMAP_UNKNOWN_FACET_CHILD', 1741: 'SCHEMAP_UNKNOWN_FACET_TYPE', 1742: 'SCHEMAP_UNKNOWN_GROUP_CHILD', 1743: 'SCHEMAP_UNKNOWN_IMPORT_CHILD', 1744: 'SCHEMAP_UNKNOWN_LIST_CHILD', 1745: 'SCHEMAP_UNKNOWN_NOTATION_CHILD', 1746: 'SCHEMAP_UNKNOWN_PROCESSCONTENT_CHILD', 1747: 'SCHEMAP_UNKNOWN_REF', 1748: 'SCHEMAP_UNKNOWN_RESTRICTION_CHILD', 1749: 'SCHEMAP_UNKNOWN_SCHEMAS_CHILD', 1750: 'SCHEMAP_UNKNOWN_SEQUENCE_CHILD', 1751: 'SCHEMAP_UNKNOWN_SIMPLECONTENT_CHILD', 1752: 'SCHEMAP_UNKNOWN_SIMPLETYPE_CHILD', 1753: 'SCHEMAP_UNKNOWN_TYPE', 1754: 'SCHEMAP_UNKNOWN_UNION_CHILD', 1755: 'SCHEMAP_ELEM_DEFAULT_FIXED', 1756: 'SCHEMAP_REGEXP_INVALID', 1757: 'SCHEMAP_FAILED_LOAD', 1758: 'SCHEMAP_NOTHING_TO_PARSE', 1759: 'SCHEMAP_NOROOT', 1760: 'SCHEMAP_REDEFINED_GROUP', 1761: 'SCHEMAP_REDEFINED_TYPE', 1762: 'SCHEMAP_REDEFINED_ELEMENT', 1763: 'SCHEMAP_REDEFINED_ATTRGROUP', 1764: 'SCHEMAP_REDEFINED_ATTR', 1765: 'SCHEMAP_REDEFINED_NOTATION', 1766: 'SCHEMAP_FAILED_PARSE', 1767: 'SCHEMAP_UNKNOWN_PREFIX', 1768: 'SCHEMAP_DEF_AND_PREFIX', 1769: 'SCHEMAP_UNKNOWN_INCLUDE_CHILD', 1770: 'SCHEMAP_INCLUDE_SCHEMA_NOT_URI', 1771: 'SCHEMAP_INCLUDE_SCHEMA_NO_URI', 1772: 'SCHEMAP_NOT_SCHEMA', 1773: 'SCHEMAP_UNKNOWN_MEMBER_TYPE', 1774: 'SCHEMAP_INVALID_ATTR_USE', 1775: 'SCHEMAP_RECURSIVE', 1776: 'SCHEMAP_SUPERNUMEROUS_LIST_ITEM_TYPE', 1777: 
'SCHEMAP_INVALID_ATTR_COMBINATION', 1778: 'SCHEMAP_INVALID_ATTR_INLINE_COMBINATION', 1779: 'SCHEMAP_MISSING_SIMPLETYPE_CHILD', 1780: 'SCHEMAP_INVALID_ATTR_NAME', 1781: 'SCHEMAP_REF_AND_CONTENT', 1782: 'SCHEMAP_CT_PROPS_CORRECT_1', 1783: 'SCHEMAP_CT_PROPS_CORRECT_2', 1784: 'SCHEMAP_CT_PROPS_CORRECT_3', 1785: 'SCHEMAP_CT_PROPS_CORRECT_4', 1786: 'SCHEMAP_CT_PROPS_CORRECT_5', 1787: 'SCHEMAP_DERIVATION_OK_RESTRICTION_1', 1788: 'SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_1', 1789: 'SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_2', 1790: 'SCHEMAP_DERIVATION_OK_RESTRICTION_2_2', 1791: 'SCHEMAP_DERIVATION_OK_RESTRICTION_3', 1792: 'SCHEMAP_WILDCARD_INVALID_NS_MEMBER', 1793: 'SCHEMAP_INTERSECTION_NOT_EXPRESSIBLE', 1794: 'SCHEMAP_UNION_NOT_EXPRESSIBLE', 1795: 'SCHEMAP_SRC_IMPORT_3_1', 1796: 'SCHEMAP_SRC_IMPORT_3_2', 1797: 'SCHEMAP_DERIVATION_OK_RESTRICTION_4_1', 1798: 'SCHEMAP_DERIVATION_OK_RESTRICTION_4_2', 1799: 'SCHEMAP_DERIVATION_OK_RESTRICTION_4_3', 1800: 'SCHEMAP_COS_CT_EXTENDS_1_3', 1801: 'SCHEMAV_NOROOT', 1802: 'SCHEMAV_UNDECLAREDELEM', 1803: 'SCHEMAV_NOTTOPLEVEL', 1804: 'SCHEMAV_MISSING', 1805: 'SCHEMAV_WRONGELEM', 1806: 'SCHEMAV_NOTYPE', 1807: 'SCHEMAV_NOROLLBACK', 1808: 'SCHEMAV_ISABSTRACT', 1809: 'SCHEMAV_NOTEMPTY', 1810: 'SCHEMAV_ELEMCONT', 1811: 'SCHEMAV_HAVEDEFAULT', 1812: 'SCHEMAV_NOTNILLABLE', 1813: 'SCHEMAV_EXTRACONTENT', 1814: 'SCHEMAV_INVALIDATTR', 1815: 'SCHEMAV_INVALIDELEM', 1816: 'SCHEMAV_NOTDETERMINIST', 1817: 'SCHEMAV_CONSTRUCT', 1818: 'SCHEMAV_INTERNAL', 1819: 'SCHEMAV_NOTSIMPLE', 1820: 'SCHEMAV_ATTRUNKNOWN', 1821: 'SCHEMAV_ATTRINVALID', 1822: 'SCHEMAV_VALUE', 1823: 'SCHEMAV_FACET', 1824: 'SCHEMAV_CVC_DATATYPE_VALID_1_2_1', 1825: 'SCHEMAV_CVC_DATATYPE_VALID_1_2_2', 1826: 'SCHEMAV_CVC_DATATYPE_VALID_1_2_3', 1827: 'SCHEMAV_CVC_TYPE_3_1_1', 1828: 'SCHEMAV_CVC_TYPE_3_1_2', 1829: 'SCHEMAV_CVC_FACET_VALID', 1830: 'SCHEMAV_CVC_LENGTH_VALID', 1831: 'SCHEMAV_CVC_MINLENGTH_VALID', 1832: 'SCHEMAV_CVC_MAXLENGTH_VALID', 1833: 'SCHEMAV_CVC_MININCLUSIVE_VALID', 1834: 
'SCHEMAV_CVC_MAXINCLUSIVE_VALID', 1835: 'SCHEMAV_CVC_MINEXCLUSIVE_VALID', 1836: 'SCHEMAV_CVC_MAXEXCLUSIVE_VALID', 1837: 'SCHEMAV_CVC_TOTALDIGITS_VALID', 1838: 'SCHEMAV_CVC_FRACTIONDIGITS_VALID', 1839: 'SCHEMAV_CVC_PATTERN_VALID', 1840: 'SCHEMAV_CVC_ENUMERATION_VALID', 1841: 'SCHEMAV_CVC_COMPLEX_TYPE_2_1', 1842: 'SCHEMAV_CVC_COMPLEX_TYPE_2_2', 1843: 'SCHEMAV_CVC_COMPLEX_TYPE_2_3', 1844: 'SCHEMAV_CVC_COMPLEX_TYPE_2_4', 1845: 'SCHEMAV_CVC_ELT_1', 1846: 'SCHEMAV_CVC_ELT_2', 1847: 'SCHEMAV_CVC_ELT_3_1', 1848: 'SCHEMAV_CVC_ELT_3_2_1', 1849: 'SCHEMAV_CVC_ELT_3_2_2', 1850: 'SCHEMAV_CVC_ELT_4_1', 1851: 'SCHEMAV_CVC_ELT_4_2', 1852: 'SCHEMAV_CVC_ELT_4_3', 1853: 'SCHEMAV_CVC_ELT_5_1_1', 1854: 'SCHEMAV_CVC_ELT_5_1_2', 1855: 'SCHEMAV_CVC_ELT_5_2_1', 1856: 'SCHEMAV_CVC_ELT_5_2_2_1', 1857: 'SCHEMAV_CVC_ELT_5_2_2_2_1', 1858: 'SCHEMAV_CVC_ELT_5_2_2_2_2', 1859: 'SCHEMAV_CVC_ELT_6', 1860: 'SCHEMAV_CVC_ELT_7', 1861: 'SCHEMAV_CVC_ATTRIBUTE_1', 1862: 'SCHEMAV_CVC_ATTRIBUTE_2', 1863: 'SCHEMAV_CVC_ATTRIBUTE_3', 1864: 'SCHEMAV_CVC_ATTRIBUTE_4', 1865: 'SCHEMAV_CVC_COMPLEX_TYPE_3_1', 1866: 'SCHEMAV_CVC_COMPLEX_TYPE_3_2_1', 1867: 'SCHEMAV_CVC_COMPLEX_TYPE_3_2_2', 1868: 'SCHEMAV_CVC_COMPLEX_TYPE_4', 1869: 'SCHEMAV_CVC_COMPLEX_TYPE_5_1', 1870: 'SCHEMAV_CVC_COMPLEX_TYPE_5_2', 1871: 'SCHEMAV_ELEMENT_CONTENT', 1872: 'SCHEMAV_DOCUMENT_ELEMENT_MISSING', 1873: 'SCHEMAV_CVC_COMPLEX_TYPE_1', 1874: 'SCHEMAV_CVC_AU', 1875: 'SCHEMAV_CVC_TYPE_1', 1876: 'SCHEMAV_CVC_TYPE_2', 1877: 'SCHEMAV_CVC_IDC', 1878: 'SCHEMAV_CVC_WILDCARD', 1879: 'SCHEMAV_MISC', 1900: 'XPTR_UNKNOWN_SCHEME', 1901: 'XPTR_CHILDSEQ_START', 1902: 'XPTR_EVAL_FAILED', 1903: 'XPTR_EXTRA_OBJECTS', 1950: 'C14N_CREATE_CTXT', 1951: 'C14N_REQUIRES_UTF8', 1952: 'C14N_CREATE_STACK', 1953: 'C14N_INVALID_NODE', 1954: 'C14N_UNKNOW_NODE', 1955: 'C14N_RELATIVE_NAMESPACE', 2000: 'FTP_PASV_ANSWER', 2001: 'FTP_EPSV_ANSWER', 2002: 'FTP_ACCNT', 2003: 'FTP_URL_SYNTAX', 2020: 'HTTP_URL_SYNTAX', 2021: 'HTTP_USE_IP', 2022: 'HTTP_UNKNOWN_HOST', 3000: 
'SCHEMAP_SRC_SIMPLE_TYPE_1', 3001: 'SCHEMAP_SRC_SIMPLE_TYPE_2', 3002: 'SCHEMAP_SRC_SIMPLE_TYPE_3', 3003: 'SCHEMAP_SRC_SIMPLE_TYPE_4', 3004: 'SCHEMAP_SRC_RESOLVE', 3005: 'SCHEMAP_SRC_RESTRICTION_BASE_OR_SIMPLETYPE', 3006: 'SCHEMAP_SRC_LIST_ITEMTYPE_OR_SIMPLETYPE', 3007: 'SCHEMAP_SRC_UNION_MEMBERTYPES_OR_SIMPLETYPES', 3008: 'SCHEMAP_ST_PROPS_CORRECT_1', 3009: 'SCHEMAP_ST_PROPS_CORRECT_2', 3010: 'SCHEMAP_ST_PROPS_CORRECT_3', 3011: 'SCHEMAP_COS_ST_RESTRICTS_1_1', 3012: 'SCHEMAP_COS_ST_RESTRICTS_1_2', 3013: 'SCHEMAP_COS_ST_RESTRICTS_1_3_1', 3014: 'SCHEMAP_COS_ST_RESTRICTS_1_3_2', 3015: 'SCHEMAP_COS_ST_RESTRICTS_2_1', 3016: 'SCHEMAP_COS_ST_RESTRICTS_2_3_1_1', 3017: 'SCHEMAP_COS_ST_RESTRICTS_2_3_1_2', 3018: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_1', 3019: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_2', 3020: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_3', 3021: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_4', 3022: 'SCHEMAP_COS_ST_RESTRICTS_2_3_2_5', 3023: 'SCHEMAP_COS_ST_RESTRICTS_3_1', 3024: 'SCHEMAP_COS_ST_RESTRICTS_3_3_1', 3025: 'SCHEMAP_COS_ST_RESTRICTS_3_3_1_2', 3026: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_2', 3027: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_1', 3028: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_3', 3029: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_4', 3030: 'SCHEMAP_COS_ST_RESTRICTS_3_3_2_5', 3031: 'SCHEMAP_COS_ST_DERIVED_OK_2_1', 3032: 'SCHEMAP_COS_ST_DERIVED_OK_2_2', 3033: 'SCHEMAP_S4S_ELEM_NOT_ALLOWED', 3034: 'SCHEMAP_S4S_ELEM_MISSING', 3035: 'SCHEMAP_S4S_ATTR_NOT_ALLOWED', 3036: 'SCHEMAP_S4S_ATTR_MISSING', 3037: 'SCHEMAP_S4S_ATTR_INVALID_VALUE', 3038: 'SCHEMAP_SRC_ELEMENT_1', 3039: 'SCHEMAP_SRC_ELEMENT_2_1', 3040: 'SCHEMAP_SRC_ELEMENT_2_2', 3041: 'SCHEMAP_SRC_ELEMENT_3', 3042: 'SCHEMAP_P_PROPS_CORRECT_1', 3043: 'SCHEMAP_P_PROPS_CORRECT_2_1', 3044: 'SCHEMAP_P_PROPS_CORRECT_2_2', 3045: 'SCHEMAP_E_PROPS_CORRECT_2', 3046: 'SCHEMAP_E_PROPS_CORRECT_3', 3047: 'SCHEMAP_E_PROPS_CORRECT_4', 3048: 'SCHEMAP_E_PROPS_CORRECT_5', 3049: 'SCHEMAP_E_PROPS_CORRECT_6', 3050: 'SCHEMAP_SRC_INCLUDE', 3051: 'SCHEMAP_SRC_ATTRIBUTE_1', 3052: 
'SCHEMAP_SRC_ATTRIBUTE_2', 3053: 'SCHEMAP_SRC_ATTRIBUTE_3_1', 3054: 'SCHEMAP_SRC_ATTRIBUTE_3_2', 3055: 'SCHEMAP_SRC_ATTRIBUTE_4', 3056: 'SCHEMAP_NO_XMLNS', 3057: 'SCHEMAP_NO_XSI', 3058: 'SCHEMAP_COS_VALID_DEFAULT_1', 3059: 'SCHEMAP_COS_VALID_DEFAULT_2_1', 3060: 'SCHEMAP_COS_VALID_DEFAULT_2_2_1', 3061: 'SCHEMAP_COS_VALID_DEFAULT_2_2_2', 3062: 'SCHEMAP_CVC_SIMPLE_TYPE', 3063: 'SCHEMAP_COS_CT_EXTENDS_1_1', 3064: 'SCHEMAP_SRC_IMPORT_1_1', 3065: 'SCHEMAP_SRC_IMPORT_1_2', 3066: 'SCHEMAP_SRC_IMPORT_2', 3067: 'SCHEMAP_SRC_IMPORT_2_1', 3068: 'SCHEMAP_SRC_IMPORT_2_2', 3069: 'SCHEMAP_INTERNAL', 3070: 'SCHEMAP_NOT_DETERMINISTIC', 3071: 'SCHEMAP_SRC_ATTRIBUTE_GROUP_1', 3072: 'SCHEMAP_SRC_ATTRIBUTE_GROUP_2', 3073: 'SCHEMAP_SRC_ATTRIBUTE_GROUP_3', 3074: 'SCHEMAP_MG_PROPS_CORRECT_1', 3075: 'SCHEMAP_MG_PROPS_CORRECT_2', 3076: 'SCHEMAP_SRC_CT_1', 3077: 'SCHEMAP_DERIVATION_OK_RESTRICTION_2_1_3', 3078: 'SCHEMAP_AU_PROPS_CORRECT_2', 3079: 'SCHEMAP_A_PROPS_CORRECT_2', 3080: 'SCHEMAP_C_PROPS_CORRECT', 3081: 'SCHEMAP_SRC_REDEFINE', 3082: 'SCHEMAP_SRC_IMPORT', 3083: 'SCHEMAP_WARN_SKIP_SCHEMA', 3084: 'SCHEMAP_WARN_UNLOCATED_SCHEMA', 3085: 'SCHEMAP_WARN_ATTR_REDECL_PROH', 3086: 'SCHEMAP_WARN_ATTR_POINTLESS_PROH', 3087: 'SCHEMAP_AG_PROPS_CORRECT', 3088: 'SCHEMAP_COS_CT_EXTENDS_1_2', 3089: 'SCHEMAP_AU_PROPS_CORRECT', 3090: 'SCHEMAP_A_PROPS_CORRECT_3', 3091: 'SCHEMAP_COS_ALL_LIMITED', 4000: 'SCHEMATRONV_ASSERT', 4001: 'SCHEMATRONV_REPORT', 4900: 'MODULE_OPEN', 4901: 'MODULE_CLOSE', 5000: 'CHECK_FOUND_ELEMENT', 5001: 'CHECK_FOUND_ATTRIBUTE', 5002: 'CHECK_FOUND_TEXT', 5003: 'CHECK_FOUND_CDATA', 5004: 'CHECK_FOUND_ENTITYREF', 5005: 'CHECK_FOUND_ENTITY', 5006: 'CHECK_FOUND_PI', 5007: 'CHECK_FOUND_COMMENT', 5008: 'CHECK_FOUND_DOCTYPE', 5009: 'CHECK_FOUND_FRAGMENT', 5010: 'CHECK_FOUND_NOTATION', 5011: 'CHECK_UNKNOWN_NODE', 5012: 'CHECK_ENTITY_TYPE', 5013: 'CHECK_NO_PARENT', 5014: 'CHECK_NO_DOC', 5015: 'CHECK_NO_NAME', 5016: 'CHECK_NO_ELEM', 5017: 'CHECK_WRONG_DOC', 5018: 'CHECK_NO_PREV', 5019: 
'CHECK_WRONG_PREV', 5020: 'CHECK_NO_NEXT', 5021: 'CHECK_WRONG_NEXT', 5022: 'CHECK_NOT_DTD', 5023: 'CHECK_NOT_ATTR', 5024: 'CHECK_NOT_ATTR_DECL', 5025: 'CHECK_NOT_ELEM_DECL', 5026: 'CHECK_NOT_ENTITY_DECL', 5027: 'CHECK_NOT_NS_DECL', 5028: 'CHECK_NO_HREF', 5029: 'CHECK_WRONG_PARENT', 5030: 'CHECK_NS_SCOPE', 5031: 'CHECK_NS_ANCESTOR', 5032: 'CHECK_NOT_UTF8', 5033: 'CHECK_NO_DICT', 5034: 'CHECK_NOT_NCNAME', 5035: 'CHECK_OUTSIDE_DICT', 5036: 'CHECK_WRONG_NAME', 5037: 'CHECK_NAME_NOT_NULL', 6000: 'I18N_NO_NAME', 6001: 'I18N_NO_HANDLER', 6002: 'I18N_EXCESS_HANDLER', 6003: 'I18N_CONV_FAILED', 6004: 'I18N_NO_OUTPUT', 7000: 'BUF_OVERFLOW'}
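As a minimal sketch of how these numeric codes are typically consumed, the following assumes the constants above are attributes of `lxml.etree.ErrorTypes` and that error log entries expose `type` and `type_name`. The helper name `collect_error_codes` and the sample input are illustrative only.

```python
from lxml import etree


def collect_error_codes(xml_text):
    """Parse xml_text and return (code, name) pairs from the parser's error log."""
    parser = etree.XMLParser()
    try:
        etree.fromstring(xml_text, parser)
    except etree.XMLSyntaxError:
        pass  # the details are recorded in parser.error_log
    return [(entry.type, entry.type_name) for entry in parser.error_log]


# Hypothetical malformed input: <b> is never closed.
codes = collect_error_codes("<a><b></a>")
for code, name in codes:
    print(code, name)

# The constants map symbolic names to the integer codes listed above.
print(etree.ErrorTypes.ERR_TAG_NAME_MISMATCH)  # 76
```

Comparing against the named constant rather than a literal number keeps error handling readable.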
- class lxml.etree.FallbackElementClassLookup(self, fallback=None)
Bases:
ElementClassLookup
Superclass of Element class lookups with additional fallback.
- set_fallback(self, lookup)
Sets the fallback scheme for this lookup method.
- fallback
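The fallback mechanism can be sketched with `CustomElementClassLookup`, one of the lookup classes derived from `FallbackElementClassLookup`. The element classes and the `TitleLookup` class below are hypothetical names chosen for illustration; returning `None` from `lookup()` hands the decision to the fallback.

```python
from lxml import etree


class TitleElement(etree.ElementBase):
    """Hypothetical element class used only for <title> tags."""


class PlainElement(etree.ElementBase):
    """Hypothetical element class used for everything else."""


class TitleLookup(etree.CustomElementClassLookup):
    """Picks TitleElement for <title>; returns None to defer to the fallback."""

    def lookup(self, node_type, document, namespace, name):
        if node_type == "element" and name == "title":
            return TitleElement
        return None


lookup = TitleLookup()
lookup.set_fallback(etree.ElementDefaultClassLookup(element=PlainElement))

parser = etree.XMLParser()
parser.set_element_class_lookup(lookup)
root = etree.fromstring("<doc><title>Intro</title><p>text</p></doc>", parser)
title, para = root[0], root[1]
print(type(title).__name__, type(para).__name__)
```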
- class lxml.etree.HTMLParser(self, encoding=None, remove_blank_text=False, remove_comments=False, remove_pis=False, strip_cdata=True, no_network=True, target=None, schema: XMLSchema = None, recover=True, compact=True, collect_ids=True, huge_tree=False)
Bases:
_FeedParser
The HTML parser.
This parser allows reading HTML into a normal XML tree. By default, it can read broken (non well-formed) HTML, depending on the capabilities of libxml2. Use the ‘recover’ option to switch this off.
Available boolean keyword arguments:
recover - try hard to parse through broken HTML (default: True)
no_network - prevent network access for related files (default: True)
remove_blank_text - discard empty text nodes that are ignorable (i.e. not actual text content)
remove_comments - discard comments
remove_pis - discard processing instructions
strip_cdata - replace CDATA sections by normal text content (default: True)
compact - save memory for short text content (default: True)
default_doctype - add a default doctype even if it is not found in the HTML (default: True)
collect_ids - use a hash table of XML IDs for fast access (default: True)
huge_tree - disable security restrictions and support very deep trees and very long text content (only affects libxml2 2.7+)
Other keyword arguments:
encoding - override the document encoding
target - a parser target object that will receive the parse events
schema - an XMLSchema to validate against
Note that you should avoid sharing parsers between threads for performance reasons.
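A short sketch of the options above, assuming a small hand-written HTML snippet with unclosed tags. It relies on the default `recover=True` and passes `remove_comments=True`; the input string and variable names are illustrative.

```python
from lxml import etree

# Hypothetical input: unclosed <p> tags and a comment.
broken_html = "<html><body><p>First<p>Second<!-- note --></body>"

parser = etree.HTMLParser(remove_comments=True)  # recover=True is the default
root = etree.fromstring(broken_html, parser)

paragraphs = [p.text for p in root.iter("p")]
comments = list(root.iter(etree.Comment))
print(root.tag, paragraphs, len(comments))
```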
- close(self)
Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.
This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.
- copy(self)
Create a new parser with the same configuration.
- feed(self, data)
Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.
This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().
The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
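As a minimal sketch of the feed interface described above, the following feeds an HTML document in three chunks and collects the root element from close(). The chunk boundaries and the variable names are illustrative assumptions, not part of the lxml API.

```python
from lxml import etree

# Illustrative chunks; real input would typically arrive from a socket or file.
chunks = [b"<html><body>", b"<p>Hello world</p>", b"</body></html>"]

parser = etree.HTMLParser()
for chunk in chunks:
    parser.feed(chunk)   # all chunks are bytes; str and bytes are not mixed
root = parser.close()    # processes remaining data and returns the root Element

paragraph_text = root.findtext(".//p")
```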
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with this parser.
- setElementClassLookup(lookup)
- Deprecated
Use parser.set_element_class_lookup(lookup) instead.
- set_element_class_lookup(self, lookup=None)
Set a lookup scheme for element classes generated from this parser.
Reset it by passing None or nothing.
- error_log
The error log of the last parser run.
- feed_error_log
The error log of the last (or current) run of the feed parser.
Note that this is local to the feed parser and thus is different from what the error_log property returns.
- resolvers
The custom resolver registry of this parser.
- target
- version
The version of the underlying XML parser.
- class lxml.etree.HTMLPullParser(self, events=None, *, tag=None, base_url=None, **kwargs)
Bases:
HTMLParser
HTML parser that collects parse events in an iterator.
The collected events are the same as for iterparse(), but the parser itself is non-blocking in the sense that it receives data chunks incrementally through its .feed() method, instead of reading them directly from a file(-like) object all by itself.
By default, it collects Element end events. To change that, pass any subset of the available events into the events argument: 'start', 'end', 'start-ns', 'end-ns', 'comment', 'pi'.
To support loading external dependencies relative to the input source, you can pass the base_url.
- close(self)
Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.
This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.
- copy(self)
Create a new parser with the same configuration.
- feed(self, data)
Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.
This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().
The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with this parser.
- read_events()
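A short sketch of how read_events() might be used with the HTML pull parser: events are drained after each feed() call and once more after close(), so nothing is missed regardless of when the parser reports them. The `seen` list and the chunk contents are assumptions made for illustration.

```python
from lxml import etree

# Only report start/end events for <p> elements.
parser = etree.HTMLPullParser(events=("start", "end"), tag="p")

seen = []
for chunk in (b"<html><body><p>one</p>", b"<p>two</p></body></html>"):
    parser.feed(chunk)
    for event, element in parser.read_events():
        seen.append((event, element.tag))

root = parser.close()
# Pick up any events that were only reported when the parser was closed.
seen.extend((event, element.tag) for event, element in parser.read_events())
```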
- setElementClassLookup(lookup)
- Deprecated
Use parser.set_element_class_lookup(lookup) instead.
- set_element_class_lookup(self, lookup=None)
Set a lookup scheme for element classes generated from this parser.
Reset it by passing None or nothing.
- error_log
The error log of the last parser run.
- feed_error_log
The error log of the last (or current) run of the feed parser.
Note that this is local to the feed parser and thus is different from what the error_log property returns.
- resolvers
The custom resolver registry of this parser.
- target
- version
The version of the underlying XML parser.
- class lxml.etree.PIBase
Bases:
_ProcessingInstruction
All custom Processing Instruction classes must inherit from this one.
To create an XML ProcessingInstruction instance, use the PI() factory.
Subclasses must not override __init__ or __new__ as it is absolutely undefined when these objects will be created or destroyed. All persistent state of PIs must be stored in the underlying XML. If you really need to initialize the object after creation, you can implement an _init(self) method that will be called after object creation.
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
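The root-level use of addprevious() and addnext() can be sketched as follows. The stylesheet name and comment text are made up for the example; the siblings of the root only appear when the whole document (the ElementTree) is serialised.

```python
from lxml import etree

root = etree.XML("<root/>")
root.addprevious(etree.ProcessingInstruction("xml-stylesheet", 'href="style.xsl"'))
root.addnext(etree.Comment("generated"))

# Serialise the document, not just the root element, to include the siblings.
document = etree.tostring(root.getroottree())
```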
- append(self, value)
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass keep_tail=True to leave the tail text untouched.
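A small sketch of the difference keep_tail makes, shown on an ordinary element (the sample document is an assumption for illustration):

```python
from lxml import etree

root = etree.XML('<r><a x="1">text<b/></a>tail</r>')
root[0].clear(keep_tail=True)   # children, attributes and text go; tail stays
kept = etree.tostring(root)

root = etree.XML('<r><a x="1">text<b/></a>tail</r>')
root[0].clear()                 # default: the tail text is reset as well
dropped = etree.tostring(root)
```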
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
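The three find methods can be illustrated together on an ordinary element tree. The prefix `x` and the URI `urn:x` are placeholders; the prefix only has meaning inside the mapping passed as namespaces.

```python
from lxml import etree

root = etree.XML('<r xmlns:x="urn:x"><x:a>1</x:a><x:a>2</x:a><b>3</b></r>')
ns = {"x": "urn:x"}  # prefix-to-namespace mapping used in the path expressions

first = root.find("x:a", namespaces=ns)        # first match or None
all_a = root.findall("x:a", namespaces=ns)     # list of all matches
b_text = root.findtext("b")                    # text of the first match
fallback = root.findtext("missing", default="n/a")
```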
- get(self, key, default=None)
Try to parse pseudo-attributes from the text content of the processing instruction, search for one with the given key as name and return its associated value.
Note that this is only a convenience method for the most common case that all text content is structured in attribute-like name-value pairs with properly quoted values. It is not guaranteed to work for all possible text content.
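For the common case of attribute-like content, a sketch of get() on a processing instruction could look like this (the stylesheet values are invented for the example):

```python
from lxml import etree

pi = etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"'
)

href = pi.get("href")            # parsed from the PI's text content
media = pi.get("media", "all")   # default is returned for missing names
pseudo_attributes = dict(pi.attrib)
```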
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, value)
- items(self)
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
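The tag matching rules above can be sketched on an ordinary element tree. The namespace URI `urn:a` and the tag names are placeholders chosen for the example.

```python
from lxml import etree

root = etree.XML(
    '<root xmlns:a="urn:a"><a:item/><item/><!-- note --><other/></root>'
)

plain = [el.tag for el in root.iter("item")]             # no namespace only
namespaced = [el.tag for el in root.iter("{urn:a}item")]
any_ns = [el.tag for el in root.iter("{*}item")]         # any or no namespace
several = [el.tag for el in root.iter("item", "other")]  # multiple tags
comments = [el.text for el in root.iter(etree.Comment)]  # factory as filter
```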
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
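A brief sketch of both iteration directions, using a made-up four-child document:

```python
from lxml import etree

root = etree.XML("<r><a/><b/><c/><d/></r>")
c = root[2]

following = [el.tag for el in c.itersiblings()]
# Preceding siblings come in reverse document order, nearest first.
preceding = [el.tag for el in c.itersiblings(preceding=True)]
```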
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
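The effect of with_tail can be seen in a small sketch on mixed content (the sample markup is an assumption for illustration):

```python
from lxml import etree

root = etree.XML("<p>Hello <b>bold</b> tail</p>")

with_tails = list(root.itertext())
without_tails = list(root.itertext(with_tail=False))
```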
- keys(self)
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
- set(self, key, value)
- values(self)
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an xpath expression using the element as context node.
- attrib
Returns a dict containing all pseudo-attributes that can be parsed from the text content of this processing instruction. Note that modifying the dict currently has no effect on the XML node, although this is not guaranteed to stay this way.
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
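A quick sketch of nsmap and prefix on a child element that inherits declarations from its parent. The URIs are placeholders.

```python
from lxml import etree

root = etree.XML('<r xmlns="urn:default" xmlns:p="urn:p"><p:child/></r>')
child = root[0]

mapping = child.nsmap        # includes declarations made on the parent
mapping["x"] = "urn:x"       # changes only the returned dict
still_clean = "x" not in child.nsmap
```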
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- target
- text
- class lxml.etree.ParserBasedElementClassLookup(self, fallback=None)
Bases:
FallbackElementClassLookup
Element class lookup based on the XML parser.
- set_fallback(self, lookup)
Sets the fallback scheme for this lookup method.
- fallback
- class lxml.etree.PyErrorLog(self, logger_name=None, logger=None)
Bases:
_BaseErrorLog
A global error log that connects to the Python stdlib logging package.
The constructor accepts an optional logger name or a readily instantiated logger instance.
If you want to change the mapping between libxml2’s ErrorLevels and Python logging levels, you can modify the level_map dictionary from a subclass.
The default mapping is:
    ErrorLevels.WARNING = logging.WARNING
    ErrorLevels.ERROR = logging.ERROR
    ErrorLevels.FATAL = logging.CRITICAL
You can also override the method receive() that takes a LogEntry object and calls self.log(log_entry, format_string, arg1, arg2, ...) with appropriate data.
- copy()
Dummy method that returns an empty error log.
- log(self, log_entry, message, *args)
Called by the .receive() method to log a _LogEntry instance to the Python logging system. This handles the error level mapping.
In the default implementation, the message argument receives a complete log line, and there are no further args. To change the message format, it is best to override the .receive() method instead of this one.
- receive(self, log_entry)
Receive a _LogEntry instance from the logging system. Calls the .log() method with appropriate parameters:
self.log(log_entry, repr(log_entry))
You can override this method to provide your own log output format.
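One way to override receive() might look like the sketch below. `CompactLog`, `ListHandler` and the logger name `lxml.example` are hypothetical names introduced for this example; the log entries are taken from the error log attached to a parse exception and passed to receive() by hand, purely to show the formatting.

```python
import logging
from lxml import etree

class ListHandler(logging.Handler):
    """Hypothetical helper that keeps emitted records in a list."""
    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)

class CompactLog(etree.PyErrorLog):
    def receive(self, log_entry):
        # Custom format instead of the default repr(log_entry).
        self.log(log_entry, "line %d: %s", log_entry.line, log_entry.message)

logger = logging.getLogger("lxml.example")
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
logger.addHandler(handler)

error_log = CompactLog(logger=logger)

try:
    etree.fromstring(b"<a><b></a>")      # mismatched tags
except etree.XMLSyntaxError as exc:
    for entry in exc.error_log:
        error_log.receive(entry)         # forwarded through level_map
```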
- last_error
- level_map
- class lxml.etree.PythonElementClassLookup(self, fallback=None)
Bases:
FallbackElementClassLookup
Element class lookup based on a subclass method.
This class lookup scheme allows access to the entire XML tree in read-only mode. To use it, re-implement the lookup(self, doc, root) method in a subclass:

    from lxml import etree, pyclasslookup

    class MyElementClass(etree.ElementBase):
        honkey = True

    class MyLookup(pyclasslookup.PythonElementClassLookup):
        def lookup(self, doc, root):
            if root.tag == "sometag":
                return MyElementClass
            else:
                for child in root:
                    if child.tag == "someothertag":
                        return MyElementClass
            # delegate to default
            return None

If you return None from this method, the fallback will be called.
The first argument is the opaque document instance that contains the Element. The second argument is a lightweight Element proxy implementation that is only valid during the lookup. Do not try to keep a reference to it. Once the lookup is done, the proxy will be invalid.
Also, you cannot wrap such a read-only Element in an ElementTree, and you must take care not to keep a reference to them outside of the lookup() method.
Note that the API of the Element objects is not complete. It is purely read-only and does not support all features of the normal lxml.etree API (such as XPath, extended slicing or some iteration methods).
See https://lxml.de/element_classes.html
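To show such a lookup in action, the sketch below installs a simplified version on a parser through set_element_class_lookup(). It subclasses etree.PythonElementClassLookup directly; the tag names are placeholders.

```python
from lxml import etree

class MyElementClass(etree.ElementBase):
    honkey = True

class MyLookup(etree.PythonElementClassLookup):
    def lookup(self, doc, root):
        if root.tag == "sometag":
            return MyElementClass
        return None  # defer to the fallback lookup

parser = etree.XMLParser()
parser.set_element_class_lookup(MyLookup())

root = etree.fromstring("<sometag><other/></sometag>", parser)
```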
- lookup(self, doc, element)
Override this method to implement your own lookup scheme.
- set_fallback(self, lookup)
Sets the fallback scheme for this lookup method.
- fallback
- class lxml.etree.QName(text_or_uri_or_element, tag=None)
Bases:
object
QName wrapper for qualified XML names.
Pass a tag name by itself or a namespace URI and a tag name to create a qualified name. Alternatively, pass an Element to extract its tag name.
None as first argument is ignored in order to allow for generic 2-argument usage.
The text property holds the qualified name in {namespace}tagname notation. The namespace and localname properties hold the respective parts of the tag name.
You can pass QName objects wherever a tag name is expected. Also, setting Element text from a QName will resolve the namespace prefix on assignment and set a qualified text value. This is helpful in XML languages like SOAP or XML-Schema that use prefixed tag names in their text content.
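The different ways of constructing a QName can be sketched as follows (the namespace URIs are placeholders):

```python
from lxml import etree

qname = etree.QName("http://example.com/ns", "item")  # URI plus tag name
element = etree.Element(qname)                         # QName used as a tag
from_element = etree.QName(element)                    # extracted from an Element
from_text = etree.QName("{urn:x}tag")                  # already qualified name
no_namespace = etree.QName(None, "plain")              # None is ignored
```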
- localname
- namespace
- text
- class lxml.etree.RelaxNG(self, etree=None, file=None)
Bases:
_Validator
Turn a document into a Relax NG validator.
Either pass a schema as Element or ElementTree, or pass a file or filename through the file keyword argument.
- _append_log_message(domain, type, level, line, message, filename)
- _clear_error_log()
- assertValid(self, etree)
Raises DocumentInvalid if the document does not comply with the schema.
- assert_(self, etree)
Raises AssertionError if the document does not comply with the schema.
- classmethod from_rnc_string(src, base_url)
Parse a RelaxNG schema in compact syntax from a text string.
Requires the rnc2rng package to be installed.
Passing the source URL or file path of the source as ‘base_url’ will enable resolving resource references relative to the source.
- validate(self, etree)
Validate the document using this schema.
Returns true if document is valid, false if not.
- error_log
The log of validation errors and warnings.
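A minimal sketch of validating with RelaxNG, assuming a tiny schema that allows an `a` element containing any number of `b` elements:

```python
from lxml import etree

relaxng_doc = etree.XML("""
<element name="a" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="b"><text/></element>
  </zeroOrMore>
</element>
""")
relaxng = etree.RelaxNG(relaxng_doc)

valid_doc = etree.XML("<a><b/></a>")
invalid_doc = etree.XML("<a><c/></a>")

is_valid = relaxng.validate(valid_doc)
is_invalid = relaxng.validate(invalid_doc)

try:
    relaxng.assertValid(invalid_doc)
    raised = False
except etree.DocumentInvalid:
    raised = True   # details are available in relaxng.error_log
```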
- class lxml.etree.RelaxNGErrorTypes
Bases:
object
Libxml2 RelaxNG error types
- _getName(default=None, /)
Return the value for key if key is in the dictionary, else default.
- RELAXNG_ERR_ATTREXTRANS = 20
- RELAXNG_ERR_ATTRNAME = 14
- RELAXNG_ERR_ATTRNONS = 16
- RELAXNG_ERR_ATTRVALID = 24
- RELAXNG_ERR_ATTRWRONGNS = 18
- RELAXNG_ERR_CONTENTVALID = 25
- RELAXNG_ERR_DATAELEM = 28
- RELAXNG_ERR_DATATYPE = 31
- RELAXNG_ERR_DUPID = 4
- RELAXNG_ERR_ELEMEXTRANS = 19
- RELAXNG_ERR_ELEMNAME = 13
- RELAXNG_ERR_ELEMNONS = 15
- RELAXNG_ERR_ELEMNOTEMPTY = 21
- RELAXNG_ERR_ELEMWRONG = 38
- RELAXNG_ERR_ELEMWRONGNS = 17
- RELAXNG_ERR_EXTRACONTENT = 26
- RELAXNG_ERR_EXTRADATA = 35
- RELAXNG_ERR_INTEREXTRA = 12
- RELAXNG_ERR_INTERNAL = 37
- RELAXNG_ERR_INTERNODATA = 10
- RELAXNG_ERR_INTERSEQ = 11
- RELAXNG_ERR_INVALIDATTR = 27
- RELAXNG_ERR_LACKDATA = 36
- RELAXNG_ERR_LIST = 33
- RELAXNG_ERR_LISTELEM = 30
- RELAXNG_ERR_LISTEMPTY = 9
- RELAXNG_ERR_LISTEXTRA = 8
- RELAXNG_ERR_MEMORY = 1
- RELAXNG_ERR_NODEFINE = 7
- RELAXNG_ERR_NOELEM = 22
- RELAXNG_ERR_NOGRAMMAR = 34
- RELAXNG_ERR_NOSTATE = 6
- RELAXNG_ERR_NOTELEM = 23
- RELAXNG_ERR_TEXTWRONG = 39
- RELAXNG_ERR_TYPE = 2
- RELAXNG_ERR_TYPECMP = 5
- RELAXNG_ERR_TYPEVAL = 3
- RELAXNG_ERR_VALELEM = 29
- RELAXNG_ERR_VALUE = 32
- RELAXNG_OK = 0
- _names = {0: 'RELAXNG_OK', 1: 'RELAXNG_ERR_MEMORY', 2: 'RELAXNG_ERR_TYPE', 3: 'RELAXNG_ERR_TYPEVAL', 4: 'RELAXNG_ERR_DUPID', 5: 'RELAXNG_ERR_TYPECMP', 6: 'RELAXNG_ERR_NOSTATE', 7: 'RELAXNG_ERR_NODEFINE', 8: 'RELAXNG_ERR_LISTEXTRA', 9: 'RELAXNG_ERR_LISTEMPTY', 10: 'RELAXNG_ERR_INTERNODATA', 11: 'RELAXNG_ERR_INTERSEQ', 12: 'RELAXNG_ERR_INTEREXTRA', 13: 'RELAXNG_ERR_ELEMNAME', 14: 'RELAXNG_ERR_ATTRNAME', 15: 'RELAXNG_ERR_ELEMNONS', 16: 'RELAXNG_ERR_ATTRNONS', 17: 'RELAXNG_ERR_ELEMWRONGNS', 18: 'RELAXNG_ERR_ATTRWRONGNS', 19: 'RELAXNG_ERR_ELEMEXTRANS', 20: 'RELAXNG_ERR_ATTREXTRANS', 21: 'RELAXNG_ERR_ELEMNOTEMPTY', 22: 'RELAXNG_ERR_NOELEM', 23: 'RELAXNG_ERR_NOTELEM', 24: 'RELAXNG_ERR_ATTRVALID', 25: 'RELAXNG_ERR_CONTENTVALID', 26: 'RELAXNG_ERR_EXTRACONTENT', 27: 'RELAXNG_ERR_INVALIDATTR', 28: 'RELAXNG_ERR_DATAELEM', 29: 'RELAXNG_ERR_VALELEM', 30: 'RELAXNG_ERR_LISTELEM', 31: 'RELAXNG_ERR_DATATYPE', 32: 'RELAXNG_ERR_VALUE', 33: 'RELAXNG_ERR_LIST', 34: 'RELAXNG_ERR_NOGRAMMAR', 35: 'RELAXNG_ERR_EXTRADATA', 36: 'RELAXNG_ERR_LACKDATA', 37: 'RELAXNG_ERR_INTERNAL', 38: 'RELAXNG_ERR_ELEMWRONG', 39: 'RELAXNG_ERR_TEXTWRONG'}
- class lxml.etree.Resolver
Bases:
object
This is the base class of all resolvers.
- resolve(self, system_url, public_id, context)
Override this method to resolve an external source by system_url and public_id. The third argument is an opaque context object.
Return the result of one of the resolve_*() methods.
- resolve_empty(self, context)
Return an empty input document.
Pass context as parameter.
- resolve_file(self, f, context, base_url=None, close=True)
Return an open file-like object as input document.
Pass open file and context as parameters. You can pass the base URL or filename of the file through the base_url keyword argument. If the close flag is True (the default), the file will be closed after reading.
Note that using .resolve_filename() is more efficient, especially in threaded environments.
- resolve_filename(self, filename, context)
Return the name of a parsable file as input document.
Pass filename and context as parameters. You can also pass a URL with an HTTP, FTP or file target.
- resolve_string(self, string, context, base_url=None)
Return a parsable string as input document.
Pass data string and context as parameters. You can pass the source URL or filename through the base_url keyword argument.
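A custom resolver might be sketched like this. `FixedDTDResolver`, the DTD name `MissingDTD.dtd` and the entity are invented for the example; the resolver answers every lookup with an in-memory DTD via resolve_string() and is registered through the parser's resolvers property.

```python
from io import StringIO
from lxml import etree

class FixedDTDResolver(etree.Resolver):
    def __init__(self):
        super().__init__()
        self.requested = []   # system URLs the parser asked for

    def resolve(self, system_url, public_id, context):
        self.requested.append(system_url)
        return self.resolve_string(
            '<!ENTITY myentity "resolved text">', context
        )

resolver = FixedDTDResolver()
parser = etree.XMLParser(load_dtd=True)
parser.resolvers.add(resolver)

xml = '<!DOCTYPE doc SYSTEM "MissingDTD.dtd"><doc>&myentity;</doc>'
tree = etree.parse(StringIO(xml), parser)
text = tree.getroot().text
```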
- class lxml.etree.Schematron(self, etree=None, file=None)
Bases:
_Validator
A Schematron validator.
Pass a root Element or an ElementTree to turn it into a validator. Alternatively, pass a filename as keyword argument ‘file’ to parse from the file system.
Schematron is a less well known, but very powerful schema language. The main idea is to use the capabilities of XPath to put restrictions on the structure and the content of XML documents. Here is a simple example:
    >>> schematron = Schematron(XML('''
    ... <schema xmlns="http://www.ascc.net/xml/schematron" >
    ...   <pattern name="id is the only permitted attribute name">
    ...     <rule context="*">
    ...       <report test="@*[not(name()='id')]">Attribute
    ...         <name path="@*[not(name()='id')]"/> is forbidden<name/>
    ...       </report>
    ...     </rule>
    ...   </pattern>
    ... </schema>
    ... '''))

    >>> xml = XML('''
    ... <AAA name="aaa">
    ...   <BBB id="bbb"/>
    ...   <CCC color="ccc"/>
    ... </AAA>
    ... ''')

    >>> schematron.validate(xml)
    0

    >>> xml = XML('''
    ... <AAA id="aaa">
    ...   <BBB id="bbb"/>
    ...   <CCC/>
    ... </AAA>
    ... ''')

    >>> schematron.validate(xml)
    1

Schematron was added to libxml2 in version 2.6.21. Before version 2.6.32, however, Schematron lacked support for error reporting other than to stderr. This version is therefore required to retrieve validation warnings and errors in lxml.
- _append_log_message(domain, type, level, line, message, filename)
- _clear_error_log()
- assertValid(self, etree)
Raises DocumentInvalid if the document does not comply with the schema.
- assert_(self, etree)
Raises AssertionError if the document does not comply with the schema.
- validate(self, etree)
Validate the document using this schema.
Returns true if document is valid, false if not.
- error_log
The log of validation errors and warnings.
- class lxml.etree.SiblingsIterator(self, node, tag=None, preceding=False)
Bases:
_ElementMatchIterator
Iterates over the siblings of an element.
You can pass the boolean keyword preceding to specify the direction.
- class lxml.etree.TreeBuilder
Bases:
_SaxParserTarget
TreeBuilder(self, element_factory=None, parser=None, comment_factory=None, pi_factory=None, insert_comments=True, insert_pis=True)
Parser target that builds a tree from parse event callbacks.
The factory arguments can be used to influence the creation of elements, comments and processing instructions.
By default, comments and processing instructions are inserted into the tree, but they can be ignored by passing the respective flags.
The final tree is returned by the close() method.
- close(self)
Flushes the builder buffers, and returns the toplevel document element. Raises XMLSyntaxError on inconsistencies.
- comment(self, comment)
Creates a comment using the factory, appends it (unless disabled) and returns it.
- data(self, data)
Adds text to the current element. The value should be either an 8-bit string containing ASCII text, or a Unicode string.
- end(self, tag)
Closes the current element.
- pi(self, target, data=None)
Creates a processing instruction using the factory, appends it (unless disabled) and returns it.
- start(self, tag, attrs, nsmap=None)
Opens a new element.
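Driving the builder by hand gives a compact picture of the callbacks. This is only a sketch with made-up tag names; in practice the builder is usually passed to a parser as its target.

```python
from lxml import etree

builder = etree.TreeBuilder()
builder.start("root", {"id": "1"})
builder.start("child", {})
builder.data("text")
builder.end("child")
builder.comment("note")     # inserted into the tree by default
builder.end("root")

root = builder.close()      # returns the toplevel element
serialized = etree.tostring(root)
```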
- class lxml.etree.XInclude(self)
Bases:
object
XInclude processor.
Create an instance and call it on an Element to run XInclude processing.
- error_log
- class lxml.etree.XMLParser(self, encoding=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, ns_clean=False, recover=False, schema: XMLSchema = None, huge_tree=False, remove_blank_text=False, resolve_entities=True, remove_comments=False, remove_pis=False, strip_cdata=True, collect_ids=True, target=None, compact=True)
Bases:
_FeedParser
The XML parser.
Parsers can be supplied as additional argument to various parse functions of the lxml API. A default parser is always available and can be replaced by a call to the global function ‘set_default_parser’. New parsers can be created at any time without a major run-time overhead.
The keyword arguments in the constructor are mainly based on the libxml2 parser configuration. A DTD will also be loaded if DTD validation or attribute default values are requested (unless you additionally provide an XMLSchema from which the default attributes can be read).
Available boolean keyword arguments:
attribute_defaults - inject default attributes from DTD or XMLSchema
dtd_validation - validate against a DTD referenced by the document
load_dtd - use DTD for parsing
no_network - prevent network access for related files (default: True)
ns_clean - clean up redundant namespace declarations
recover - try hard to parse through broken XML
remove_blank_text - discard blank text nodes that appear ignorable
remove_comments - discard comments
remove_pis - discard processing instructions
strip_cdata - replace CDATA sections by normal text content (default: True)
compact - save memory for short text content (default: True)
collect_ids - use a hash table of XML IDs for fast access (default: True, always True with DTD validation)
resolve_entities - replace entities by their text value (default: True)
huge_tree - disable security restrictions and support very deep trees and very long text content (only affects libxml2 2.7+)
Other keyword arguments:
encoding - override the document encoding
target - a parser target object that will receive the parse events
schema - an XMLSchema to validate against
Note that you should avoid sharing parsers between threads. While this is not harmful, it is more efficient to use separate parsers. This does not apply to the default parser.
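A few of the options above can be sketched side by side. The input strings are invented for the example; the exact tree produced in recovery mode depends on libxml2, so only the root tag is relied on here.

```python
from lxml import etree

source = "<root>\n  <a/>\n  <!-- note -->\n  <b/>\n</root>"
cleaning_parser = etree.XMLParser(remove_blank_text=True, remove_comments=True)
cleaned = etree.tostring(etree.fromstring(source, cleaning_parser))

broken = "<root><a></root>"
try:
    etree.fromstring(broken)          # default parser: recover=False
    strict_failed = False
except etree.XMLSyntaxError:
    strict_failed = True

# With recover=True the parser tries to build a tree anyway.
recovered = etree.fromstring(broken, etree.XMLParser(recover=True))
```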
- close(self)
Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.
This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.
- copy(self)
Create a new parser with the same configuration.
- feed(self, data)
Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.
This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().
The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with this parser.
- setElementClassLookup(lookup)
- Deprecated
Use parser.set_element_class_lookup(lookup) instead.
- set_element_class_lookup(self, lookup=None)
Set a lookup scheme for element classes generated from this parser.
Reset it by passing None or nothing.
- error_log
The error log of the last parser run.
- feed_error_log
The error log of the last (or current) run of the feed parser.
Note that this is local to the feed parser and thus is different from what the error_log property returns.
- resolvers
The custom resolver registry of this parser.
- target
- version
The version of the underlying XML parser.
- class lxml.etree.XMLPullParser(self, events=None, *, tag=None, **kwargs)
Bases:
XMLParser
XML parser that collects parse events in an iterator.
The collected events are the same as for iterparse(), but the parser itself is non-blocking in the sense that it receives data chunks incrementally through its .feed() method, instead of reading them directly from a file(-like) object all by itself.
By default, it collects Element end events. To change that, pass any subset of the available events into the events argument: 'start', 'end', 'start-ns', 'end-ns', 'comment', 'pi'.
To support loading external dependencies relative to the input source, you can pass the base_url.
- close(self)
Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.
This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface; all other usage is undefined.
- copy(self)
Create a new parser with the same configuration.
- feed(self, data)
Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.
This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().
The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with this parser.
- read_events()
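As a sketch of incremental use, the following feeds an XML document in two parts and collects start and end events along the way. The `events` list and the chunk boundaries are assumptions for illustration; events are read once more after close() so that none are missed.

```python
from lxml import etree

parser = etree.XMLPullParser(events=("start", "end"))
events = []

for chunk in ("<root><item>1</item>", "<item>2</item></root>"):
    parser.feed(chunk)
    events.extend((event, el.tag) for event, el in parser.read_events())

root = parser.close()
events.extend((event, el.tag) for event, el in parser.read_events())
```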
- setElementClassLookup(lookup)
- Deprecated
Use parser.set_element_class_lookup(lookup) instead.
- set_element_class_lookup(self, lookup=None)
Set a lookup scheme for element classes generated from this parser.
Reset it by passing None or nothing.
- error_log
The error log of the last parser run.
- feed_error_log
The error log of the last (or current) run of the feed parser.
Note that this is local to the feed parser and thus is different from what the error_log property returns.
- resolvers
The custom resolver registry of this parser.
- target
- version
The version of the underlying XML parser.
- class lxml.etree.XMLSchema(self, etree=None, file=None)
Bases:
_Validator
Turn a document into an XML Schema validator.
Either pass a schema as Element or ElementTree, or pass a file or filename through the file keyword argument.
Passing the attribute_defaults boolean option will make the schema insert default/fixed attributes into validated documents.
- _append_log_message(domain, type, level, line, message, filename)
- _clear_error_log()
- assertValid(self, etree)
Raises DocumentInvalid if the document does not comply with the schema.
- assert_(self, etree)
Raises AssertionError if the document does not comply with the schema.
- validate(self, etree)
Validate the document using this schema.
Returns true if document is valid, false if not.
- error_log
The log of validation errors and warnings.
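A minimal sketch of XML Schema validation, assuming a small schema in which `a` must contain exactly one `b` element:

```python
from lxml import etree

schema_doc = etree.XML("""
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a" type="AType"/>
  <xsd:complexType name="AType">
    <xsd:sequence>
      <xsd:element name="b" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
""")
schema = etree.XMLSchema(schema_doc)

is_valid = schema.validate(etree.XML("<a><b/></a>"))
is_invalid = schema.validate(etree.XML("<a><c/></a>"))
error_count = len(schema.error_log)   # filled by the failed validation
```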
- lxml.etree.XMLTreeBuilder
alias of
ETCompatXMLParser
- class lxml.etree.XPath(self, path, namespaces=None, extensions=None, regexp=True, smart_strings=True)
Bases:
_XPathEvaluatorBase
A compiled XPath expression that can be called on Elements and ElementTrees.
Besides the XPath expression, you can pass prefix-namespace mappings and extension functions to the constructor through the keyword arguments namespaces and extensions. EXSLT regular expression support can be disabled with the ‘regexp’ boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.
- evaluate(self, _eval_arg, **_variables)
Evaluate an XPath expression.
Instead of calling this method, you can also call the evaluator object itself.
Variables may be provided as keyword arguments. Note that namespaces are currently not supported for variables.
- Deprecated
call the object, not its method.
- error_log
- path
The literal XPath expression.
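A compiled expression can be reused across calls, with variables supplied as keyword arguments. The sketch below assumes a small sample document; the prefix `p` and the variable name `wanted` are chosen for the example only.

```python
from lxml import etree

root = etree.XML(
    b'<r xmlns:x="http://example.com/ns">'
    b'<x:item id="1">one</x:item><x:item id="2">two</x:item>'
    b'</r>'
)

# Compile once, call many times. The prefix "p" is local to the expression
# and does not have to match the prefix used in the document.
find_item = etree.XPath(
    "//p:item[@id = $wanted]/text()",
    namespaces={"p": "http://example.com/ns"},
)

texts = find_item(root, wanted="2")

# Smart strings remember the element they were found in.
parent_tag = texts[0].getparent().tag

expression = find_item.path
```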
- class lxml.etree.XPathDocumentEvaluator(self, etree, namespaces=None, extensions=None, regexp=True, smart_strings=True)
Bases:
XPathElementEvaluator
Create an XPath evaluator for an ElementTree.
Additional namespace declarations can be passed with the ‘namespaces’ keyword argument. EXSLT regular expression support can be disabled with the ‘regexp’ boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.
- evaluate(self, _eval_arg, **_variables)
Evaluate an XPath expression.
Instead of calling this method, you can also call the evaluator object itself.
Variables may be provided as keyword arguments. Note that namespaces are currently not supported for variables.
- Deprecated
call the object, not its method.
- register_namespace(prefix, uri)
Register a namespace with the XPath context.
- register_namespaces(namespaces)
Register a prefix -> uri dict.
- error_log
- class lxml.etree.XPathElementEvaluator(self, element, namespaces=None, extensions=None, regexp=True, smart_strings=True)
Bases:
_XPathEvaluatorBase
Create an XPath evaluator for an element.
Absolute XPath expressions (starting with ‘/’) will be evaluated against the ElementTree as returned by getroottree().
Additional namespace declarations can be passed with the ‘namespaces’ keyword argument. EXSLT regular expression support can be disabled with the ‘regexp’ boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.
- evaluate(self, _eval_arg, **_variables)
Evaluate an XPath expression.
Instead of calling this method, you can also call the evaluator object itself.
Variables may be provided as keyword arguments. Note that namespaces are currently not supported for variables.
- Deprecated
call the object, not its method.
- register_namespace(prefix, uri)
Register a namespace with the XPath context.
- register_namespaces(namespaces)
Register a prefix -> uri dict.
- error_log
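The difference between relative and absolute expressions in an element evaluator can be illustrated with a short sketch. The sample document and the prefix `n` are assumptions made for this example.

```python
from lxml import etree

root = etree.XML(b'<r xmlns="http://example.com/ns"><a><b/></a><b/></r>')

# Use the first child, <a>, as the context node.
evaluator = etree.XPathElementEvaluator(root[0])
evaluator.register_namespace("n", "http://example.com/ns")

# Relative paths start at the context element.
relative = evaluator("n:b")

# Absolute paths are evaluated against the whole document.
absolute = evaluator("//n:b")
```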
- class lxml.etree.XSLT(self, xslt_input, extensions=None, regexp=True, access_control=None)
Bases:
object
Turn an XSL document into an XSLT object.
Calling this object on a tree or Element will execute the XSLT:
transform = etree.XSLT(xsl_tree)
result = transform(xml_tree)
Keyword arguments of the constructor:
extensions: a dict mapping (namespace, name) pairs to extension functions or extension elements
regexp: enable EXSLT regular expression support in XPath (default: True)
access_control: access restrictions for network or file system (see XSLTAccessControl)
Keyword arguments of the XSLT call:
profile_run: enable XSLT profiling and make the profile available as XML document in
result.xslt_profile
(default: False)
Other keyword arguments of the call are passed to the stylesheet as parameters.
- apply(self, _input, profile_run=False, **kw)
- Deprecated
call the object, not this method.
- static set_global_max_depth(max_depth)
The maximum traversal depth that the stylesheet engine will allow. This does not only count the template recursion depth but also takes the number of variables/parameters into account. The required setting for a run depends on both the stylesheet and the input data.
Example:
XSLT.set_global_max_depth(5000)
Note that this is currently a global, module-wide setting because libxslt does not support it at a per-stylesheet level.
- static strparam(strval)
Mark an XSLT string parameter that requires quote escaping before passing it into the transformation. Use it like this:
result = transform(doc, some_strval = XSLT.strparam( '''it's "Monty Python's" ...'''))
Escaped string parameters can be reused without restriction.
- tostring(self, result_tree)
Save result doc to string based on stylesheet output method.
- Deprecated
use str(result_tree) instead.
- error_log
The log of errors and warnings of an XSLT execution.
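Putting the constructor, the call and strparam() together, a transformation might look like the sketch below. The stylesheet and the parameter name `greeting` are invented for the example.

```python
from lxml import etree

xsl_tree = etree.XML(b"""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="greeting"/>
  <xsl:template match="/">
    <out><xsl:value-of select="concat($greeting, ' ', /doc)"/></out>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(xsl_tree)
doc = etree.XML(b"<doc>world</doc>")

# Keyword arguments are passed to the stylesheet as parameters.
# strparam() takes care of quoting a raw string that contains quotes.
result = transform(doc, greeting=etree.XSLT.strparam('''it's "hello"'''))

text = result.getroot().text

# str() serialises according to the stylesheet's output method.
serialized = str(result)
```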
- class lxml.etree.XSLTAccessControl(self, read_file=True, write_file=True, create_dir=True, read_network=True, write_network=True)
Bases:
object
Access control for XSLT: reading/writing files, directories and network I/O. Access to a type of resource is granted or denied by passing any of the following boolean keyword arguments. All of them default to True to allow access.
read_file
write_file
create_dir
read_network
write_network
For convenience, there is also a class member DENY_ALL that provides an XSLTAccessControl instance that is readily configured to deny everything, and a DENY_WRITE member that denies all write access but allows read access.
See XSLT.
- DENY_ALL = XSLTAccessControl(create_dir=False, read_file=False, read_network=False, write_file=False, write_network=False)
- DENY_WRITE = XSLTAccessControl(create_dir=False, read_file=True, read_network=True, write_file=False, write_network=False)
- options
The access control configuration as a map of options.
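A short sketch of both ways to obtain an access control object: building one from keyword arguments, and using the preconfigured DENY_ALL member. The stylesheet is a trivial placeholder written for this example.

```python
from lxml import etree

# Allow reading local files, deny everything else.
read_only_local = etree.XSLTAccessControl(
    read_file=True,
    write_file=False,
    create_dir=False,
    read_network=False,
    write_network=False,
)
options = read_only_local.options

xsl_tree = etree.XML(b"""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><ok/></xsl:template>
</xsl:stylesheet>
""")

# The stylesheet itself needs no I/O, so it still runs under DENY_ALL.
transform = etree.XSLT(xsl_tree, access_control=etree.XSLTAccessControl.DENY_ALL)
result = transform(etree.XML(b"<doc/>"))
```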
- class lxml.etree.XSLTExtension
Bases:
object
Base class of an XSLT extension element.
- apply_templates(self, context, node, output_parent=None, elements_only=False, remove_blank_text=False)
Call this method to retrieve the result of applying templates to an element.
The return value is a list of elements or text strings that were generated by the XSLT processor. If you pass elements_only=True, strings will be discarded from the result list. The option remove_blank_text=True will only discard strings that consist entirely of whitespace (e.g. formatting). These options do not apply to Elements, only to bare string results.
If you pass an Element as output_parent parameter, the result will instead be appended to the element (including attributes etc.) and the return value will be None. This is a safe way to generate content into the output document directly, without having to take care of special values like text or attributes. Note that the string discarding options will be ignored in this case.
- execute(self, context, self_node, input_node, output_parent)
Execute this extension element.
Subclasses must override this method. They may append elements to the output_parent element here, or set its text content. To this end, the input_node provides read-only access to the current node in the input document, and the self_node points to the extension element in the stylesheet.
Note that the output_parent parameter may be None if there is no parent element in the current context (e.g. no content was added to the output tree yet).
- process_children(self, context, output_parent=None, elements_only=False, remove_blank_text=False)
Call this method to process the XSLT content of the extension element itself.
The return value is a list of elements or text strings that were generated by the XSLT processor. If you pass elements_only=True, strings will be discarded from the result list. The option remove_blank_text=True will only discard strings that consist entirely of whitespace (e.g. formatting). These options do not apply to Elements, only to bare string results.
If you pass an Element as output_parent parameter, the result will instead be appended to the element (including attributes etc.) and the return value will be None. This is a safe way to generate content into the output document directly, without having to take care of special values like text or attributes. Note that the string discarding options will be ignored in this case.
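A minimal extension element only needs to override execute(). The sketch below assumes a made-up namespace `testns` and element name `ext`; the class name `HelloExtension` is likewise hypothetical.

```python
from lxml import etree


class HelloExtension(etree.XSLTExtension):
    """Hypothetical extension element, used only for this sketch."""

    def execute(self, context, self_node, input_node, output_parent):
        # output_parent is the element currently being built in the result tree.
        output_parent.text = "I did it!"


xsl_tree = etree.XML(b"""
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="testns"
    extension-element-prefixes="my">
  <xsl:template match="/">
    <foo><my:ext/></foo>
  </xsl:template>
</xsl:stylesheet>
""")

# Map the (namespace, name) pair to an instance of the extension class.
transform = etree.XSLT(xsl_tree, extensions={("testns", "ext"): HelloExtension()})
result = transform(etree.XML(b"<dummy/>"))
out_root = result.getroot()
```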
- class lxml.etree._Attrib
Bases:
object
A dict-like proxy for the Element.attrib property.
- clear()
- get(key, default)
- has_key(key)
- items()
- iteritems()
- iterkeys()
- itervalues()
- keys()
- pop(key, *default)
- update(sequence_or_dict)
- values()
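Because the proxy writes through to the element, changes made via `attrib` are immediately visible on the element itself. A small sketch with made-up attribute names:

```python
from lxml import etree

el = etree.Element("img", src="a.png")
attrs = el.attrib  # a proxy onto the element, not a copy

attrs["alt"] = "logo"            # writes through to the element
attrs.update({"width": "10"})
removed = attrs.pop("src")
missing = attrs.get("height", "n/a")

# dict() produces an independent snapshot of the current attributes.
snapshot = dict(attrs)
```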
- class lxml.etree._Comment
Bases:
__ContentOnlyElement
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- append(self, value)
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass
keep_tail=True
to leave the tail text untouched.
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- get(self, key, default=None)
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use
list(element)
or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the
element.iter()
method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, value)
- items(self)
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to
el.iter()
, this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
- keys(self)
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
- set(self, key, value)
- values(self)
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an xpath expression using the element as context node.
- attrib
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- text
- class lxml.etree._Document
Bases:
object
Internal base class to reference a libxml document.
When instances of this class are garbage collected, the libxml document is cleaned up.
- class lxml.etree._DomainErrorLog
Bases:
_ErrorLog
- clear()
- copy()
Creates a shallow copy of this error log and the list of entries.
- filter_domains(domains)
Filter the errors by the given domains and return a new error log containing the matches.
- filter_from_errors(self)
Convenience method to get all error messages or worse.
- filter_from_fatals(self)
Convenience method to get all fatal error messages.
- filter_from_level(self, level)
Return a log with all messages of the requested level or worse.
- filter_from_warnings(self)
Convenience method to get all warnings or worse.
- filter_levels(self, levels)
Filter the errors by the given error levels and return a new error log containing the matches.
- filter_types(self, types)
Filter the errors by the given types and return a new error log containing the matches.
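The filter methods each return a new log, so they can be applied without disturbing the original. The sketch below uses a parser's error_log, which offers the same filter methods; the broken input document is an assumption made for the example.

```python
from lxml import etree

# recover=True lets the parser continue while still recording the errors.
parser = etree.XMLParser(recover=True)
root = etree.fromstring(b"<root><a></b></root>", parser)

log = parser.error_log

# Keep only entries at error level or worse.
errors = log.filter_from_errors()

# Keep only entries reported by the parser domain.
parser_only = log.filter_domains(etree.ErrorDomains.PARSER)

# Each entry carries position and message details.
summary = [(entry.line, entry.level_name, entry.message) for entry in errors]
```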
- receive(entry)
- last_error
- class lxml.etree._Element
Bases:
object
Element class.
References a document object and a libxml node.
By pointing to a Document instance, a reference is kept to _Document as long as there is some pointer to a node in it.
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
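As an illustration of the root-level use case, the following sketch places a comment before and a processing instruction after the root element. The comment text and PI target are invented for the example.

```python
from lxml import etree

root = etree.XML(b"<root/>")

# Siblings of the root element live at the document level.
root.addprevious(etree.Comment("generated file"))
root.addnext(etree.ProcessingInstruction("done", "ok"))

# Serialise the whole tree to include the root-level siblings.
serialized = etree.tostring(root.getroottree())
```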
- append(self, element)
Adds a subelement to the end of this element.
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass
keep_tail=True
to leave the tail text untouched.
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
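The three find methods share the same path and namespaces arguments. A brief sketch, assuming a sample document and a locally chosen prefix `p`:

```python
from lxml import etree

root = etree.XML(
    b'<r xmlns:x="http://example.com/x">'
    b'<x:item>1</x:item><x:item>2</x:item>'
    b'</r>'
)
ns = {"p": "http://example.com/x"}

first = root.find("p:item", namespaces=ns)          # first match or None
all_items = root.findall("p:item", namespaces=ns)   # list of all matches
fallback = root.findtext("p:missing", default="n/a", namespaces=ns)
```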
- get(self, key, default=None)
Gets an element attribute.
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use
list(element)
or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the
element.iter()
method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.
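For instance, starting from a nested element, the tree for the whole document can be recovered like this (the sample document is made up for the sketch):

```python
from lxml import etree

root = etree.XML(b"<r><a><b/></a></r>")
inner = root[0][0]

tree = inner.getroottree()

# The tree's root is the document's root element.
same_root = tree.getroot() is root

# The tree can then be used for document-level operations such as getpath().
path = tree.getpath(inner)
```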
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, element)
Inserts a subelement at the given position in this element.
- items(self)
Gets element attributes, as a sequence. The attributes are returned in an arbitrary order.
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
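The tag matching rules above can be tried out on a small document. This sketch assumes a sample tree with one namespaced and two plain `a` elements:

```python
from lxml import etree

root = etree.XML(
    b'<r xmlns:n="http://example.com/ns">'
    b'<a/><n:a/><!-- note --><b><a/></b>'
    b'</r>'
)

# "a" is equivalent to "{}a": only elements without a namespace match.
no_ns = [el.tag for el in root.iter("a")]

# "{*}a" matches the local name in any namespace, or none.
any_ns = [el.tag for el in root.iter("{*}a")]

# Factory functions select by node type.
comments = [c.text for c in root.iter(etree.Comment)]

# Several tags: elements matching any of them, in document order.
a_or_b = [el.tag for el in root.iter("a", "b")]
```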
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to
el.iter()
, this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
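The direction keywords of the sibling and child iterators can be compared side by side. A minimal sketch on a flat sample document:

```python
from lxml import etree

root = etree.XML(b"<r><a/><b/><c/><d/></r>")
c = root[2]

following = [el.tag for el in c.itersiblings()]
preceding = [el.tag for el in c.itersiblings(preceding=True)]
ancestors = [el.tag for el in c.iterancestors()]
backwards = [el.tag for el in root.iterchildren(reversed=True)]
```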
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
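The effect of with_tail is easiest to see on mixed content. The sample paragraph below is invented for this sketch:

```python
from lxml import etree

root = etree.XML(b"<p>Hello <b>big</b> world<i>!</i> tail</p>")

# Default: text and tail strings in document order.
full_text = "".join(root.itertext())

# with_tail=False: only the .text of each element, no tail strings.
text_only = list(root.itertext(with_tail=False))
```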
- keys(self)
Gets a list of attribute names. The names are returned in an arbitrary order (just like for an ordinary Python dictionary).
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
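Identity-based matching means that an equal-looking element that is not actually a child cannot be removed. A short sketch, assuming two identical `a` children:

```python
from lxml import etree

root = etree.XML(b"<r><a/><a/></r>")
first, second = root

# Only the exact object passed in is removed, not the first look-alike.
root.remove(second)
remaining_is_first = list(root) == [first]

# An element with the same tag that is not a child is rejected.
lookalike = etree.Element("a")
try:
    root.remove(lookalike)
    rejected = False
except ValueError:
    rejected = True
```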
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
- set(self, key, value)
Sets an element attribute. In HTML documents (not XML or XHTML), the value None is allowed and creates an attribute without value (just the attribute name).
- values(self)
Gets element attribute values as a sequence of strings. The attributes are returned in an arbitrary order.
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an xpath expression using the element as context node.
- attrib
Element attribute dictionary. Where possible, use get(), set(), keys(), values() and items() to access element attributes.
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
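A brief sketch of how nsmap reflects inherited declarations and why editing the returned dict has no effect. The namespace URIs are placeholders:

```python
from lxml import etree

root = etree.XML(
    b'<r xmlns="http://example.com/default" xmlns:x="http://example.com/x">'
    b'<x:child/>'
    b'</r>'
)
child = root[0]

# The child sees the declarations made on its parent.
inherited = child.nsmap

# The returned dict is a copy: changing it leaves the element untouched.
inherited["y"] = "http://example.com/y"
still_unknown = "y" not in child.nsmap
```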
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
Element tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- text
Text before the first subelement. This is either a string or the value None, if there was no text.
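The split between text and tail can be seen on a fragment with mixed content; the sample markup is made up for this sketch:

```python
from lxml import etree

root = etree.XML(b"<p>before<br/>after</p>")
br = root[0]

parent_text = root.text   # text before the first child
br_text = br.text         # <br/> has no content of its own
br_tail = br.tail         # text following <br/> belongs to its tail

# Serialising an element includes its tail unless with_tail=False is passed.
with_tail = etree.tostring(br)
without_tail = etree.tostring(br, with_tail=False)
```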
- class lxml.etree._ElementIterator
Bases:
_ElementTagMatcher
Dead but public. :)
- class lxml.etree._ElementMatchIterator
Bases:
object
- class lxml.etree._ElementStringResult
Bases:
bytes
- capitalize() -> copy of B
Return a copy of B with only its first character capitalized (ASCII) and the rest lower-cased.
- center(width, fillchar=b' ', /)
Return a centered string of length width.
Padding is done using the specified fill character.
- count(sub[, start[, end]]) -> int
Return the number of non-overlapping occurrences of subsection sub in bytes B[start:end]. Optional arguments start and end are interpreted as in slice notation.
- decode(encoding='utf-8', errors='strict')
Decode the bytes using the codec registered for encoding.
- encoding
The encoding with which to decode the bytes.
- errors
The error handling scheme to use for the handling of decoding errors. The default is ‘strict’ meaning that decoding errors raise a UnicodeDecodeError. Other possible values are ‘ignore’ and ‘replace’ as well as any other name registered with codecs.register_error that can handle UnicodeDecodeErrors.
- endswith(suffix[, start[, end]]) -> bool
Return True if B ends with the specified suffix, False otherwise. With optional start, test B beginning at that position. With optional end, stop comparing B at that position. suffix can also be a tuple of bytes to try.
- expandtabs(tabsize=8)
Return a copy where all tab characters are expanded using spaces.
If tabsize is not given, a tab size of 8 characters is assumed.
- find(sub[, start[, end]]) -> int
Return the lowest index in B where subsection sub is found, such that sub is contained within B[start:end]. Optional arguments start and end are interpreted as in slice notation.
Return -1 on failure.
- fromhex()
Create a bytes object from a string of hexadecimal numbers.
Spaces between two numbers are accepted. Example: bytes.fromhex('B9 01EF') -> b'\xb9\x01\xef'.
- getparent()
- hex()
Create a string of hexadecimal numbers from a bytes object.
- sep
An optional single character or byte to separate hex bytes.
- bytes_per_sep
How many bytes between separators. Positive values count from the right, negative values count from the left.
Example:
>>> value = b'\xb9\x01\xef'
>>> value.hex()
'b901ef'
>>> value.hex(':')
'b9:01:ef'
>>> value.hex(':', 2)
'b9:01ef'
>>> value.hex(':', -2)
'b901:ef'
- index(sub[, start[, end]]) -> int
Return the lowest index in B where subsection sub is found, such that sub is contained within B[start:end]. Optional arguments start and end are interpreted as in slice notation.
Raises ValueError when the subsection is not found.
- isalnum() -> bool
Return True if all characters in B are alphanumeric and there is at least one character in B, False otherwise.
- isalpha() -> bool
Return True if all characters in B are alphabetic and there is at least one character in B, False otherwise.
- isascii() -> bool
Return True if B is empty or all characters in B are ASCII, False otherwise.
- isdigit() -> bool
Return True if all characters in B are digits and there is at least one character in B, False otherwise.
- islower() -> bool
Return True if all cased characters in B are lowercase and there is at least one cased character in B, False otherwise.
- isspace() -> bool
Return True if all characters in B are whitespace and there is at least one character in B, False otherwise.
- istitle() -> bool
Return True if B is a titlecased string and there is at least one character in B, i.e. uppercase characters may only follow uncased characters and lowercase characters only cased ones. Return False otherwise.
- isupper() -> bool
Return True if all cased characters in B are uppercase and there is at least one cased character in B, False otherwise.
- join(iterable_of_bytes, /)
Concatenate any number of bytes objects.
The bytes whose method is called is inserted in between each pair.
The result is returned as a new bytes object.
Example: b'.'.join([b'ab', b'pq', b'rs']) -> b'ab.pq.rs'.
- ljust(width, fillchar=b' ', /)
Return a left-justified string of length width.
Padding is done using the specified fill character.
- lower() -> copy of B
Return a copy of B with all ASCII characters converted to lowercase.
- lstrip(bytes=None, /)
Strip leading bytes contained in the argument.
If the argument is omitted or None, strip leading ASCII whitespace.
- static maketrans(frm, to, /)
Return a translation table useable for the bytes or bytearray translate method.
The returned table will be one where each byte in frm is mapped to the byte at the same position in to.
The bytes objects frm and to must be of the same length.
- partition(sep, /)
Partition the bytes into three parts using the given separator.
This will search for the separator sep in the bytes. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.
If the separator is not found, returns a 3-tuple containing the original bytes object and two empty bytes objects.
- removeprefix(prefix, /)
Return a bytes object with the given prefix string removed if present.
If the bytes starts with the prefix string, return bytes[len(prefix):]. Otherwise, return a copy of the original bytes.
- removesuffix(suffix, /)
Return a bytes object with the given suffix string removed if present.
If the bytes ends with the suffix string and that suffix is not empty, return bytes[:-len(suffix)]. Otherwise, return a copy of the original bytes.
- replace(old, new, count=-1, /)
Return a copy with all occurrences of substring old replaced by new.
- count
Maximum number of occurrences to replace. -1 (the default value) means replace all occurrences.
If the optional argument count is given, only the first count occurrences are replaced.
- rfind(sub[, start[, end]]) -> int
Return the highest index in B where subsection sub is found, such that sub is contained within B[start:end]. Optional arguments start and end are interpreted as in slice notation.
Return -1 on failure.
- rindex(sub[, start[, end]]) -> int
Return the highest index in B where subsection sub is found, such that sub is contained within B[start:end]. Optional arguments start and end are interpreted as in slice notation.
Raise ValueError when the subsection is not found.
- rjust(width, fillchar=b' ', /)
Return a right-justified string of length width.
Padding is done using the specified fill character.
- rpartition(sep, /)
Partition the bytes into three parts using the given separator.
This will search for the separator sep in the bytes, starting at the end. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.
If the separator is not found, returns a 3-tuple containing two empty bytes objects and the original bytes object.
- rsplit(sep=None, maxsplit=-1)
Return a list of the sections in the bytes, using sep as the delimiter.
- sep
The delimiter according to which to split the bytes. None (the default value) means split on ASCII whitespace characters (space, tab, return, newline, formfeed, vertical tab).
- maxsplit
Maximum number of splits to do. -1 (the default value) means no limit.
Splitting is done starting at the end of the bytes and working to the front.
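The difference between split() and rsplit() only shows up when maxsplit is limited. A minimal sketch on a plain bytes value (the log line is made up for illustration):

```python
# Hypothetical sample record; any whitespace-delimited bytes value works.
record = b"2024-05-17 12:30:01 INFO parser started"

# split() counts its splits from the left: date, time, level, remainder.
from_left = record.split(None, 3)

# rsplit() counts its splits from the right: remainder, then the last word.
from_right = record.rsplit(None, 1)

print(from_left)
print(from_right)
```

With sep=None, runs of ASCII whitespace act as a single delimiter in both methods.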
- rstrip(bytes=None, /)
Strip trailing bytes contained in the argument.
If the argument is omitted or None, strip trailing ASCII whitespace.
- split(sep=None, maxsplit=-1)
Return a list of the sections in the bytes, using sep as the delimiter.
- sep
The delimiter according to which to split the bytes. None (the default value) means split on ASCII whitespace characters (space, tab, return, newline, formfeed, vertical tab).
- maxsplit
Maximum number of splits to do. -1 (the default value) means no limit.
- splitlines(keepends=False)
Return a list of the lines in the bytes, breaking at line boundaries.
Line breaks are not included in the resulting list unless keepends is given and true.
- startswith(prefix[, start[, end]]) -> bool
Return True if B starts with the specified prefix, False otherwise. With optional start, test B beginning at that position. With optional end, stop comparing B at that position. prefix can also be a tuple of bytes to try.
- strip(bytes=None, /)
Strip leading and trailing bytes contained in the argument.
If the argument is omitted or None, strip leading and trailing ASCII whitespace.
- swapcase() -> copy of B
Return a copy of B with uppercase ASCII characters converted to lowercase ASCII and vice versa.
- title() -> copy of B
Return a titlecased version of B, i.e. ASCII words start with uppercase characters, all remaining cased characters have lowercase.
- translate(table, /, delete=b'')
Return a copy with each character mapped by the given translation table.
- table
Translation table, which must be a bytes object of length 256.
All characters occurring in the optional argument delete are removed. The remaining characters are mapped through the given translation table.
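As a rough illustration of how the table and delete arguments combine, the sketch below builds a 256-byte table with bytes.maketrans() and removes line endings in the same pass. The header line is an invented sample value.

```python
# bytes.maketrans() returns a 256-byte table that maps b"-" to b"_"
# and leaves every other byte value unchanged.
table = bytes.maketrans(b"-", b"_")

raw = b"content-type: text/xml\r\n"  # hypothetical input

# Bytes listed in `delete` are removed first, then the rest is mapped.
cleaned = raw.translate(table, delete=b"\r\n")

print(cleaned)
```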
- upper() -> copy of B
Return a copy of B with all ASCII characters converted to uppercase.
- zfill(width, /)
Pad a numeric string with zeros on the left, to fill a field of the given width.
The original string is never truncated.
- class lxml.etree._ElementTagMatcher
Bases:
object
Dead but public. :)
- class lxml.etree._ElementTree
Bases:
object
- _setroot(self, root)
Relocate the ElementTree to a new root node.
- find(self, path, namespaces=None)
Finds the first toplevel element with given tag. Same as tree.getroot().find(path).
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
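A short sketch of the namespaces mapping in use. The Atom snippet and the "atom" prefix are illustrative choices; the prefix only has to match a key in the mapping, not the prefix used in the document.

```python
from io import BytesIO
from lxml import etree

xml = (b'<feed xmlns="http://www.w3.org/2005/Atom">'
       b'<entry><title>First</title></entry></feed>')
tree = etree.parse(BytesIO(xml))

# Prefix chosen for this lookup only; the document itself uses a default namespace.
ns = {"atom": "http://www.w3.org/2005/Atom"}

entry = tree.find("atom:entry", namespaces=ns)
title = tree.findtext("atom:entry/atom:title", namespaces=ns)

# Without a prefix the path refers to elements in no namespace, so nothing matches.
missing = tree.find("entry")
```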
- findall(self, path, namespaces=None)
Finds all elements matching the ElementPath expression. Same as getroot().findall(path).
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds the text for the first element matching the ElementPath expression. Same as getroot().findtext(path).
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- getelementpath(self, element)
Returns a structural, absolute ElementPath expression to find the element. This path can be used in the .find() method to look up the element, provided that the elements along the path and their list of immediate children were not modified in between.
ElementPath has the advantage over an XPath expression (as returned by the .getpath() method) that it does not require additional prefix declarations. It is always self-contained.
- getiterator(self, *tags, tag=None)
Returns a sequence or iterator of all elements in document order (depth first pre-order), starting with the root element.
Can be restricted to find only elements with specific tags, see _Element.iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the tree.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getpath(self, element)
Returns a structural, absolute XPath expression to find the element.
For namespaced elements, the expression uses prefixes from the document, which therefore need to be provided in order to make any use of the expression in XPath.
Also see the method getelementpath(self, element), which returns a self-contained ElementPath expression.
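The two path methods can be compared side by side. This is a minimal sketch on an invented document without namespaces; it checks only that each path resolves back to the same element.

```python
from lxml import etree

root = etree.fromstring(b"<root><a><b/><b/></a></root>")
tree = root.getroottree()
second_b = root[0][1]

# XPath expression, to be evaluated with .xpath()
xpath_expr = tree.getpath(second_b)

# ElementPath expression, to be evaluated with .find()
elementpath_expr = tree.getelementpath(second_b)

print(xpath_expr)
print(elementpath_expr)

found_by_xpath = tree.xpath(xpath_expr)[0]
found_by_elementpath = tree.find(elementpath_expr)
```

For namespaced documents, the XPath form additionally needs a namespaces mapping when it is evaluated, as described above.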
- getroot(self)
Gets the root element for this tree.
- iter(self, tag=None, *tags)
Creates an iterator for the root element. The iterator loops over all elements in this tree, in document order. Note that siblings of the root element (comments or processing instructions) are not returned by the iterator.
Can be restricted to find only elements with specific tags, see _Element.iter.
- iterfind(self, path, namespaces=None)
Iterates over all elements matching the ElementPath expression. Same as getroot().iterfind(path).
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- parse(self, source, parser=None, base_url=None)
Updates self with the content of source and returns its root.
- relaxng(self, relaxng)
Validate this document using another document.
The relaxng argument is a tree that should contain a Relax NG schema.
Returns True or False, depending on whether validation succeeded.
Note: if you are going to apply the same Relax NG schema against multiple documents, it is more efficient to use the RelaxNG class directly.
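A sketch of both styles: the one-off tree.relaxng() call and a reusable RelaxNG object. The schema and the two sample documents are made up for illustration.

```python
from lxml import etree

# Illustrative schema: an <a> element holding any number of <b> text elements.
schema_root = etree.fromstring(b"""
<element name="a" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="b"><text/></element>
  </zeroOrMore>
</element>
""")

good = etree.fromstring(b"<a><b>text</b></a>")
bad = etree.fromstring(b"<a><c/></a>")

# One-off validation through the tree method.
one_off = good.getroottree().relaxng(schema_root)

# Compile the schema once and reuse it for several documents.
relaxng = etree.RelaxNG(schema_root)
results = [relaxng.validate(doc) for doc in (good, bad)]
```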
- write(self, file, encoding=None, method="xml", pretty_print=False, xml_declaration=None, with_tail=True, standalone=None, doctype=None, compression=0, exclusive=False, inclusive_ns_prefixes=None, with_comments=True, strip_text=False)
Write the tree to a filename, file or file-like object.
Defaults to ASCII encoding and writing a declaration as needed.
The keyword argument ‘method’ selects the output method: ‘xml’, ‘html’, ‘text’, ‘c14n’ or ‘c14n2’. Default is ‘xml’.
With method="c14n" (C14N version 1), the options exclusive, with_comments and inclusive_ns_prefixes request exclusive C14N, include comments, and list the inclusive prefixes respectively.
With method="c14n2" (C14N version 2), the with_comments and strip_text options control the output of comments and text space according to C14N 2.0.
Passing a boolean value to the standalone option will output an XML declaration with the corresponding standalone flag.
The doctype option allows passing in a plain string that will be serialised before the XML tree. Note that passing in non well-formed content here will make the XML output non well-formed. Also, an existing doctype in the document tree will not be removed when serialising an ElementTree instance.
The compression option enables GZip compression level 1-9.
The inclusive_ns_prefixes option should be a list of namespace prefix strings (e.g. ['xs', 'xsi']) that will be promoted to the top-level element during exclusive C14N serialisation. This parameter is ignored if exclusive=False.
If exclusive=True and no list is provided, a namespace will only be rendered if it is used by the immediate parent or one of its attributes and its prefix and values have not already been rendered by an ancestor of the namespace node’s parent element.
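A small sketch of two common write() calls, using in-memory buffers instead of files. The document content is invented; a filename could be passed in place of the BytesIO objects.

```python
from io import BytesIO
from lxml import etree

tree = etree.ElementTree(etree.fromstring(b"<root><a/><b>text</b></root>"))

# Regular XML output with an explicit encoding and XML declaration.
xml_buf = BytesIO()
tree.write(xml_buf, encoding="UTF-8", xml_declaration=True, pretty_print=True)

# Canonical XML (C14N version 1): empty elements become start/end tag pairs.
c14n_buf = BytesIO()
tree.write(c14n_buf, method="c14n")

print(xml_buf.getvalue().decode("utf-8"))
print(c14n_buf.getvalue())
```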
- write_c14n(self, file, exclusive=False, with_comments=True, compression=0, inclusive_ns_prefixes=None)
C14N write of document. Always writes UTF-8.
The compression option enables GZip compression level 1-9.
The inclusive_ns_prefixes option should be a list of namespace prefix strings (e.g. ['xs', 'xsi']) that will be promoted to the top-level element during exclusive C14N serialisation. This parameter is ignored if exclusive=False.
If exclusive=True and no list is provided, a namespace will only be rendered if it is used by the immediate parent or one of its attributes and its prefix and values have not already been rendered by an ancestor of the namespace node’s parent element.
NOTE: This method is deprecated as of lxml 4.4 and will be removed in a future release. Use .write(f, method="c14n") instead.
- xinclude(self)
Process the XInclude nodes in this document and include the referenced XML fragments.
There is support for loading files through the file system, HTTP and FTP.
Note that XInclude does not support custom resolvers in Python space due to restrictions of libxml2 <= 2.6.29.
- xmlschema(self, xmlschema)
Validate this document using another document.
The xmlschema argument is a tree that should contain an XML Schema.
Returns True or False, depending on whether validation succeeded.
Note: If you are going to apply the same XML Schema against multiple documents, it is more efficient to use the XMLSchema class directly.
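A sketch of reusing one XMLSchema object, including the assertValid() variant that raises DocumentInvalid instead of returning False. The schema and sample documents are illustrative only.

```python
from lxml import etree

# Illustrative schema: <a> must contain exactly one <b> string element.
schema_root = etree.fromstring(b"""
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a" type="AType"/>
  <xsd:complexType name="AType">
    <xsd:sequence>
      <xsd:element name="b" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
""")
schema = etree.XMLSchema(schema_root)

good = etree.fromstring(b"<a><b>text</b></a>")
bad = etree.fromstring(b"<a><c/></a>")

good_ok = schema.validate(good)   # boolean result
bad_ok = schema.validate(bad)

# assertValid() raises instead of returning a boolean.
try:
    schema.assertValid(bad)
    bad_error = None
except etree.DocumentInvalid as exc:
    bad_error = exc
```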
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
XPath evaluate in context of document.
namespaces is an optional dictionary with prefix to namespace URI mappings, used by XPath. extensions defines additional extension functions.
Returns a list (nodeset), or bool, float or string.
In case of a list result, return Element for element nodes, string for text and attribute values.
Note: if you are going to apply multiple XPath expressions against the same document, it is more efficient to use XPathEvaluator directly.
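The different return types and the XPathEvaluator alternative could look roughly like this. The document, the "e" prefix and the variable name wanted are assumptions made for the example.

```python
from lxml import etree

root = etree.fromstring(
    b'<root xmlns:x="urn:example">'
    b'<x:item id="1">one</x:item><x:item id="2">two</x:item></root>'
)
tree = root.getroottree()
ns = {"e": "urn:example"}  # prefix is local to the XPath expressions

items = tree.xpath("//e:item", namespaces=ns)            # list of elements
count = tree.xpath("count(//e:item)", namespaces=ns)     # float
texts = tree.xpath("//e:item/text()", namespaces=ns)     # list of strings

# Keyword arguments become XPath variables ($wanted).
has_two = tree.xpath("boolean(//e:item[@id = $wanted])",
                     namespaces=ns, wanted="2")          # bool

# For many expressions against one document, set up an evaluator once.
evaluate = etree.XPathEvaluator(tree, namespaces=ns)
second = evaluate("//e:item[@id = $wanted]/text()", wanted="2")
```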
- xslt(self, _xslt, extensions=None, access_control=None, **_kw)
Transform this document using another document.
xslt is a tree that should be XSLT. Keyword parameters are XSLT transformation parameters.
Returns the transformed tree.
Note: if you are going to apply the same XSLT stylesheet against multiple documents, it is more efficient to use the XSLT class directly.
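A minimal sketch of both the reusable XSLT class and the one-off tree.xslt() call. The stylesheet, the greeting parameter and the input document are invented for the example. Transformation parameters are XPath expressions, so plain strings need quoting, or XSLT.strparam() can do it.

```python
from lxml import etree

stylesheet = etree.fromstring(b"""
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="greeting"/>
  <xsl:template match="/">
    <out><xsl:value-of select="concat($greeting, ' ', /a/b)"/></out>
  </xsl:template>
</xsl:stylesheet>
""")
doc = etree.fromstring(b"<a><b>World</b></a>")

# Compile once, apply to as many documents as needed.
transform = etree.XSLT(stylesheet)
result = transform(doc, greeting=etree.XSLT.strparam("Hello"))

# One-off form on the tree; the parameter is passed as a quoted XPath string.
one_off = doc.getroottree().xslt(stylesheet, greeting="'Hi'")

print(str(result))
```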
- docinfo
Information about the document provided by parser and DTD.
- parser
The parser that was used to parse the document in this ElementTree.
- class lxml.etree._ElementUnicodeResult
Bases:
str
- capitalize()
Return a capitalized version of the string.
More specifically, make the first character have upper case and the rest lower case.
- casefold()
Return a version of the string suitable for caseless comparisons.
- center(width, fillchar=' ', /)
Return a centered string of length width.
Padding is done using the specified fill character (default is a space).
- count(sub[, start[, end]]) -> int
Return the number of non-overlapping occurrences of substring sub in string S[start:end]. Optional arguments start and end are interpreted as in slice notation.
- encode(encoding='utf-8', errors='strict')
Encode the string using the codec registered for encoding.
- encoding
The encoding in which to encode the string.
- errors
The error handling scheme to use for encoding errors. The default is ‘strict’ meaning that encoding errors raise a UnicodeEncodeError. Other possible values are ‘ignore’, ‘replace’ and ‘xmlcharrefreplace’ as well as any other name registered with codecs.register_error that can handle UnicodeEncodeErrors.
- endswith(suffix[, start[, end]]) -> bool
Return True if S ends with the specified suffix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. suffix can also be a tuple of strings to try.
- expandtabs(tabsize=8)
Return a copy where all tab characters are expanded using spaces.
If tabsize is not given, a tab size of 8 characters is assumed.
- find(sub[, start[, end]]) -> int
Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Return -1 on failure.
- format(*args, **kwargs) -> str
Return a formatted version of S, using substitutions from args and kwargs. The substitutions are identified by braces (‘{’ and ‘}’).
- format_map(mapping) -> str
Return a formatted version of S, using substitutions from mapping. The substitutions are identified by braces (‘{’ and ‘}’).
- getparent()
- index(sub[, start[, end]]) -> int
Return the lowest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Raises ValueError when the substring is not found.
- isalnum()
Return True if the string is an alpha-numeric string, False otherwise.
A string is alpha-numeric if all characters in the string are alpha-numeric and there is at least one character in the string.
- isalpha()
Return True if the string is an alphabetic string, False otherwise.
A string is alphabetic if all characters in the string are alphabetic and there is at least one character in the string.
- isascii()
Return True if all characters in the string are ASCII, False otherwise.
ASCII characters have code points in the range U+0000-U+007F. Empty string is ASCII too.
- isdecimal()
Return True if the string is a decimal string, False otherwise.
A string is a decimal string if all characters in the string are decimal and there is at least one character in the string.
- isdigit()
Return True if the string is a digit string, False otherwise.
A string is a digit string if all characters in the string are digits and there is at least one character in the string.
- isidentifier()
Return True if the string is a valid Python identifier, False otherwise.
Call keyword.iskeyword(s) to test whether string s is a reserved identifier, such as “def” or “class”.
- islower()
Return True if the string is a lowercase string, False otherwise.
A string is lowercase if all cased characters in the string are lowercase and there is at least one cased character in the string.
- isnumeric()
Return True if the string is a numeric string, False otherwise.
A string is numeric if all characters in the string are numeric and there is at least one character in the string.
- isprintable()
Return True if the string is printable, False otherwise.
A string is printable if all of its characters are considered printable in repr() or if it is empty.
- isspace()
Return True if the string is a whitespace string, False otherwise.
A string is whitespace if all characters in the string are whitespace and there is at least one character in the string.
- istitle()
Return True if the string is a title-cased string, False otherwise.
In a title-cased string, upper- and title-case characters may only follow uncased characters and lowercase characters only cased ones.
- isupper()
Return True if the string is an uppercase string, False otherwise.
A string is uppercase if all cased characters in the string are uppercase and there is at least one cased character in the string.
- join(iterable, /)
Concatenate any number of strings.
The string whose method is called is inserted in between each given string. The result is returned as a new string.
Example: ‘.’.join([‘ab’, ‘pq’, ‘rs’]) -> ‘ab.pq.rs’
- ljust(width, fillchar=' ', /)
Return a left-justified string of length width.
Padding is done using the specified fill character (default is a space).
- lower()
Return a copy of the string converted to lowercase.
- lstrip(chars=None, /)
Return a copy of the string with leading whitespace removed.
If chars is given and not None, remove characters in chars instead.
- static maketrans()
Return a translation table usable for str.translate().
If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers) or characters to Unicode ordinals, strings or None. Character keys will be then converted to ordinals. If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y. If there is a third argument, it must be a string, whose characters will be mapped to None in the result.
- partition(sep, /)
Partition the string into three parts using the given separator.
This will search for the separator in the string. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.
If the separator is not found, returns a 3-tuple containing the original string and two empty strings.
- removeprefix(prefix, /)
Return a str with the given prefix string removed if present.
If the string starts with the prefix string, return string[len(prefix):]. Otherwise, return a copy of the original string.
- removesuffix(suffix, /)
Return a str with the given suffix string removed if present.
If the string ends with the suffix string and that suffix is not empty, return string[:-len(suffix)]. Otherwise, return a copy of the original string.
- replace(old, new, count=-1, /)
Return a copy with all occurrences of substring old replaced by new.
- count
Maximum number of occurrences to replace. -1 (the default value) means replace all occurrences.
If the optional argument count is given, only the first count occurrences are replaced.
- rfind(sub[, start[, end]]) -> int
Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Return -1 on failure.
- rindex(sub[, start[, end]]) -> int
Return the highest index in S where substring sub is found, such that sub is contained within S[start:end]. Optional arguments start and end are interpreted as in slice notation.
Raises ValueError when the substring is not found.
- rjust(width, fillchar=' ', /)
Return a right-justified string of length width.
Padding is done using the specified fill character (default is a space).
- rpartition(sep, /)
Partition the string into three parts using the given separator.
This will search for the separator in the string, starting at the end. If the separator is found, returns a 3-tuple containing the part before the separator, the separator itself, and the part after it.
If the separator is not found, returns a 3-tuple containing two empty strings and the original string.
- rsplit(sep=None, maxsplit=-1)
Return a list of the substrings in the string, using sep as the separator string.
- sep
The separator used to split the string.
When set to None (the default value), will split on any whitespace character (including \n \r \t \f and spaces) and will discard empty strings from the result.
- maxsplit
Maximum number of splits (starting from the right). -1 (the default value) means no limit.
Splitting starts at the end of the string and works to the front.
- rstrip(chars=None, /)
Return a copy of the string with trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
- split(sep=None, maxsplit=-1)
Return a list of the substrings in the string, using sep as the separator string.
- sep
The separator used to split the string.
When set to None (the default value), will split on any whitespace character (including \n \r \t \f and spaces) and will discard empty strings from the result.
- maxsplit
Maximum number of splits (starting from the left). -1 (the default value) means no limit.
Note, str.split() is mainly useful for data that has been intentionally delimited. With natural text that includes punctuation, consider using the regular expression module.
- splitlines(keepends=False)
Return a list of the lines in the string, breaking at line boundaries.
Line breaks are not included in the resulting list unless keepends is given and true.
- startswith(prefix[, start[, end]]) -> bool
Return True if S starts with the specified prefix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. prefix can also be a tuple of strings to try.
- strip(chars=None, /)
Return a copy of the string with leading and trailing whitespace removed.
If chars is given and not None, remove characters in chars instead.
- swapcase()
Convert uppercase characters to lowercase and lowercase characters to uppercase.
- title()
Return a version of the string where each word is titlecased.
More specifically, words start with uppercased characters and all remaining cased characters have lower case.
- translate(table, /)
Replace each character in the string using the given translation table.
- table
Translation table, which must be a mapping of Unicode ordinals to Unicode ordinals, strings, or None.
The table must implement lookup/indexing via __getitem__, for instance a dictionary or list. If this operation raises LookupError, the character is left untouched. Characters mapped to None are deleted.
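A brief sketch of two ways to build a table for translate() with the static maketrans() method described earlier. The sample strings are arbitrary.

```python
# One-argument form: a dict mapping characters to replacement strings.
escape_table = str.maketrans({"&": "&amp;", "<": "&lt;", ">": "&gt;"})
escaped = "a < b & c".translate(escape_table)

# Three-argument form: map "-" to "_" and delete every "!".
clean_table = str.maketrans("-", "_", "!")
cleaned = "well-known!".translate(clean_table)

print(escaped)
print(cleaned)
```

Characters without an entry in the table are left untouched.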
- upper()
Return a copy of the string converted to uppercase.
- zfill(width, /)
Pad a numeric string with zeros on the left, to fill a field of the given width.
The string is never truncated.
- attrname
- is_attribute
- is_tail
- is_text
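The extra attributes of this string subclass become visible on XPath string results. A minimal sketch on an invented document:

```python
from lxml import etree

root = etree.fromstring(b'<root id="r1">head<b/>tail</root>')

# Text results remember where they came from.
text, tail = root.xpath("//text()")
attr = root.xpath("/root/@id")[0]

print(text, text.is_text, text.getparent().tag)
print(tail, tail.is_tail, tail.getparent().tag)
print(attr, attr.is_attribute, attr.attrname)

# With smart_strings=False, plain strings without getparent() are returned.
plain = root.xpath("//text()", smart_strings=False)
```

Note that a tail string reports the element it follows as its parent, not the enclosing element.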
- class lxml.etree._Entity
Bases:
__ContentOnlyElement
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
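The sibling methods are easiest to see on ordinary elements, which share them. The sketch below places a processing instruction before the root and a comment after it; the stylesheet reference is a made-up value.

```python
from lxml import etree

root = etree.Element("root")
first = etree.SubElement(root, "first")

# Inside the tree: insert a new element right after an existing child.
first.addnext(etree.Element("second"))

# At the root level: add document-level siblings around the root element.
root.addprevious(etree.ProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"'))
root.addnext(etree.Comment("end of document"))

# Serialising the tree (not just the root element) includes the siblings.
serialized = etree.tostring(root.getroottree())
print(serialized)
```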
- append(self, value)
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass keep_tail=True to leave the tail text untouched.
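A short sketch of the effect of keep_tail, shown on a regular element from an invented document:

```python
from lxml import etree

root = etree.fromstring(b'<r><a x="1">text<b/></a>tail</r>')
a = root[0]

# Children, attributes and text are removed; the tail survives.
a.clear(keep_tail=True)
state_after_keep = (len(a), dict(a.attrib), a.text, a.tail)

# The default also resets the tail.
a.clear()
tail_after_default = a.tail
```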
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- get(self, key, default=None)
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use list(element) or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the element.iter() method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, value)
- items(self)
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
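The tag filter rules above can be tried out on a small mixed document. The document and the urn:x namespace are invented for the sketch.

```python
from lxml import etree

root = etree.fromstring(
    b'<root xmlns:x="urn:x"><a/><x:a/><!-- note --><b><a/></b></root>'
)

# "a" means "{}a": only elements named a in no namespace.
plain_a = [el.tag for el in root.iter("a")]

# "{*}a": elements named a in any namespace, or in none.
any_ns_a = [el.tag for el in root.iter("{*}a")]

# Several tags at once: matches come back in document order.
a_or_b = [el.tag for el in root.iter("a", "b")]

# A factory function selects by node type instead of by name.
comments = [c.text for c in root.iter(etree.Comment)]
```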
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to el.iter(), this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional namespaces argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
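The axis iterators described above differ mainly in direction and starting point. A compact sketch on an invented four-element document:

```python
from lxml import etree

root = etree.fromstring(b"<root><a/><b><c/></b><d/></root>")
b = root[1]
c = b[0]

children_reversed = [el.tag for el in root.iterchildren(reversed=True)]
following = [el.tag for el in b.itersiblings()]
preceding = [el.tag for el in b.itersiblings(preceding=True)]
ancestors = [el.tag for el in c.iterancestors()]      # nearest ancestor first
descendants = [el.tag for el in root.iterdescendants()]  # excludes root itself
```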
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
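A minimal sketch of the with_tail switch on an invented paragraph:

```python
from lxml import etree

p = etree.fromstring(b"<p>Hello <b>bold</b> world</p>")

# Default: element text plus the tail text of elements inside the subtree.
all_text = list(p.itertext())

# with_tail=False: only the .text of each element, no tails.
text_only = list(p.itertext(with_tail=False))

print("".join(all_text))
```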
- keys(self)
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
- set(self, key, value)
- values(self)
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an xpath expression using the element as context node.
- attrib
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- name
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
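A short sketch of how nsmap accumulates declarations from ancestors, and that editing the returned dict leaves the element alone. The namespace URIs are placeholders.

```python
from lxml import etree

root = etree.fromstring(
    b'<root xmlns="urn:default" xmlns:a="urn:a">'
    b'<a:child xmlns:b="urn:b"/></root>'
)
child = root[0]

# The default namespace is stored under the key None.
child_map = child.nsmap

# Changing the dict does not declare anything on the element.
child_map["z"] = "urn:z"
still_unknown = "z" not in child.nsmap
```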
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- text
- class lxml.etree._ErrorLog
Bases:
_ListErrorLog
- clear()
- copy()
Creates a shallow copy of this error log and the list of entries.
- filter_domains(domains)
Filter the errors by the given domains and return a new error log containing the matches.
- filter_from_errors(self)
Convenience method to get all error messages or worse.
- filter_from_fatals(self)
Convenience method to get all fatal error messages.
- filter_from_level(self, level)
Return a log with all messages of the requested level or worse.
- filter_from_warnings(self)
Convenience method to get all warnings or worse.
- filter_levels(self, levels)
Filter the errors by the given error levels and return a new error log containing the matches.
- filter_types(self, types)
Filter the errors by the given types and return a new error log containing the matches.
- receive(entry)
- last_error
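The filter methods are typically applied to a parser's error log after a failed parse. This sketch uses a deliberately broken input; the exact messages depend on the libxml2 version, so none are shown.

```python
from lxml import etree

parser = etree.XMLParser()
try:
    etree.fromstring(b"<root>\n</b>", parser)  # mismatched end tag
except etree.XMLSyntaxError:
    pass

log = parser.error_log

# Each filter returns a new log object holding only the matching entries.
errors_or_worse = log.filter_from_errors()
fatals = log.filter_from_level(etree.ErrorLevels.FATAL)
from_parser = log.filter_domains(etree.ErrorDomains.PARSER)

for entry in errors_or_worse:
    print(entry.line, entry.level_name, entry.message)
```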
- class lxml.etree._FeedParser
Bases:
_BaseParser
- close(self)
Terminates feeding data to this parser. This tells the parser to process any remaining data in the feed buffer, and then returns the root Element of the tree that was parsed.
This method must be called after passing the last chunk of data into the feed() method. It should only be called when using the feed parser interface, all other usage is undefined.
- copy(self)
Create a new parser with the same configuration.
- feed(self, data)
Feeds data to the parser. The argument should be an 8-bit string buffer containing encoded data, although Unicode is supported as long as both string types are not mixed.
This is the main entry point to the consumer interface of a parser. The parser will parse as much of the XML stream as it can on each call. To finish parsing or to reset the parser, call the close() method. Both methods may raise ParseError if errors occur in the input data. If an error is raised, there is no longer a need to call close().
The feed parser interface is independent of the normal parser usage. You can use the same parser as a feed parser and in the parse() function concurrently.
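The feed()/close() cycle might be used like this. The chunk boundaries are arbitrary and only simulate data arriving in pieces, for example from a socket.

```python
from lxml import etree

# Hypothetical chunks; they may split tags and text at any byte position.
chunks = [b"<root><a>te", b"xt</a><b", b"/></root>"]

parser = etree.XMLParser()
for chunk in chunks:
    parser.feed(chunk)

# close() finishes parsing and returns the root element.
root = parser.close()

print(etree.tostring(root))
```

After close(), the same parser object can be fed a new document.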
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with this parser.
- setElementClassLookup(lookup)
- Deprecated
Use parser.set_element_class_lookup(lookup) instead.
- set_element_class_lookup(self, lookup=None)
Set a lookup scheme for element classes generated from this parser.
Reset it by passing None or nothing.
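One possible lookup scheme is ElementDefaultClassLookup, which makes a parser create a custom element class for every element. The class name CountingElement and its property are hypothetical names for this sketch.

```python
from lxml import etree

class CountingElement(etree.ElementBase):
    """Hypothetical element class with one convenience property."""

    @property
    def child_count(self):
        return len(self)

parser = etree.XMLParser()
parser.set_element_class_lookup(
    etree.ElementDefaultClassLookup(element=CountingElement)
)

root = etree.fromstring(b"<root><a/><b/></root>", parser)
print(type(root).__name__, root.child_count)
```

Only elements created through this parser are affected; other parsers keep their own lookup.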
- error_log
The error log of the last parser run.
- feed_error_log
The error log of the last (or current) run of the feed parser.
Note that this is local to the feed parser and thus is different from what the error_log property returns.
- resolvers
The custom resolver registry of this parser.
- target
- version
The version of the underlying XML parser.
- class lxml.etree._IDDict
Bases:
object
IDDict(self, etree) A dictionary-like proxy class that maps ID attributes to elements.
The dictionary must be instantiated with the root element of a parsed XML document, otherwise the behaviour is undefined. Elements and XML trees that were created or modified ‘by hand’ are not supported.
- copy()
- get(id_name)
- has_key(id_name)
- items()
- iteritems()
- iterkeys()
- itervalues()
- keys()
- values()
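One way to obtain such an ID dictionary is the parseid() function. The sketch below assumes a small document whose internal DTD subset declares an ID attribute; the document content and the attribute name myid are invented for the example:

```python
import io

from lxml import etree

# Hypothetical document; the DTD declares "myid" as an ID attribute.
xml = b"""<!DOCTYPE doc [
<!ELEMENT doc (sec)*>
<!ELEMENT sec (#PCDATA)>
<!ATTLIST sec myid ID #REQUIRED>
]>
<doc><sec myid="intro">Intro</sec><sec myid="usage">Usage</sec></doc>"""

# parseid() returns the parsed tree together with the ID dictionary.
tree, ids = etree.parseid(io.BytesIO(xml))

intro = ids["intro"]          # look up an element by its ID value
known_ids = sorted(ids.keys())
print(intro.text, known_ids, "usage" in ids)
```

The tree should not be modified while the dictionary is in use, since hand-modified trees are not supported.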
- class lxml.etree._ListErrorLog
Bases:
_BaseErrorLog
Immutable base version of a list based error log.
- copy()
Creates a shallow copy of this error log. Reuses the list of entries.
- filter_domains(domains)
Filter the errors by the given domains and return a new error log containing the matches.
- filter_from_errors(self)
Convenience method to get all error messages or worse.
- filter_from_fatals(self)
Convenience method to get all fatal error messages.
- filter_from_level(self, level)
Return a log with all messages of the requested level or worse.
- filter_from_warnings(self)
Convenience method to get all warnings or worse.
- filter_levels(self, levels)
Filter the errors by the given error levels and return a new error log containing the matches.
- filter_types(self, types)
Filter the errors by the given types and return a new error log containing the matches.
- receive(entry)
- last_error
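The filter methods can be tried out on the error log of a parser after a failed parse. This is only a sketch; the broken input string is made up, and the exact entries reported depend on the underlying libxml2 version:

```python
from lxml import etree

parser = etree.XMLParser()
try:
    # Deliberately malformed input: <a> is never closed.
    etree.fromstring("<root><a></root>", parser)
except etree.XMLSyntaxError:
    pass

log = parser.error_log

# Each filter returns a new error log containing only the matches.
errors = log.filter_from_errors()
warnings_or_worse = log.filter_from_level(etree.ErrorLevels.WARNING)
parser_errors = log.filter_domains([etree.ErrorDomains.PARSER])

print(len(log), len(errors), len(warnings_or_worse), len(parser_errors))
```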
- class lxml.etree._LogEntry
Bases:
object
A log message entry from an error log.
Attributes:
message: the message text
domain: the domain ID (see lxml.etree.ErrorDomains)
type: the message type ID (see lxml.etree.ErrorTypes)
level: the log level ID (see lxml.etree.ErrorLevels)
line: the line at which the message originated (if applicable)
column: the character column at which the message originated (if applicable)
filename: the name of the file in which the message originated (if applicable)
path: the location in which the error was found (if available)
- column
- domain
- domain_name
The name of the error domain. See lxml.etree.ErrorDomains
- filename
The file path where the report originated, if any.
- level
- level_name
The name of the error level. See lxml.etree.ErrorLevels
- line
- message
The log message string.
- path
The XPath for the node where the error was detected.
- type
- type_name
The name of the error type. See lxml.etree.ErrorTypes
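To show the attributes listed above on a concrete entry, the sketch below takes the first entry from a parser error log. The malformed input is invented, and the message text itself comes from libxml2, so it is printed rather than assumed:

```python
from lxml import etree

parser = etree.XMLParser()
try:
    etree.fromstring("<root><a></root>", parser)  # malformed on purpose
except etree.XMLSyntaxError:
    pass

entry = parser.error_log[0]

# Numeric IDs and their symbolic names are both available.
print(entry.line, entry.column)
print(entry.level, entry.level_name)
print(entry.domain, entry.domain_name)
print(entry.type, entry.type_name)
print(entry.message)
```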
- class lxml.etree._ProcessingInstruction
Bases:
__ContentOnlyElement
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- append(self, value)
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass
keep_tail=True
to leave the tail text untouched.
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- get(self, key, default=None)
Try to parse pseudo-attributes from the text content of the processing instruction, search for one with the given key as name and return its associated value.
Note that this is only a convenience method for the most common case that all text content is structured in attribute-like name-value pairs with properly quoted values. It is not guaranteed to work for all possible text content.
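A small sketch of the pseudo-attribute parsing, assuming a processing instruction whose text is made of properly quoted name-value pairs. The target my-app and its pseudo-attributes are hypothetical:

```python
from lxml import etree

root = etree.fromstring('<?my-app mode="fast" level="3"?><root/>')

# The processing instruction is the preceding sibling of the root.
pi = root.getprevious()

mode = pi.get("mode")                 # value of one pseudo-attribute
missing = pi.get("missing", "n/a")    # default for an unknown key
pseudo_attrs = dict(pi.attrib)        # all parsed pseudo-attributes

print(pi.target, mode, missing, pseudo_attrs)
```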
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use
list(element)
or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the
element.iter()
method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, value)
- items(self)
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
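The tag matching rules above can be illustrated on a small tree. This is a sketch on an ordinary element; the document and the namespace URI urn:x are invented for the example:

```python
from lxml import etree

root = etree.fromstring(
    '<root xmlns:x="urn:x"><a/><x:a/><b><a/></b><!-- c --></root>'
)

# A plain name matches only elements without a namespace.
plain = [el.tag for el in root.iter("a")]

# "{*}a" matches "a" in any namespace or in none.
any_ns = [el.tag for el in root.iter("{*}a")]

# "{urn:x}*" matches every element in that namespace.
in_ns = [el.tag for el in root.iter("{urn:x}*")]

# Several tags: elements matching any of them, in document order.
several = [el.tag for el in root.iter("b", "{urn:x}a")]

# A factory function selects a node type instead of a tag.
comments = [c.text for c in root.iter(etree.Comment)]

print(plain, any_ns, in_ns, several, comments)
```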
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to
el.iter()
, this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
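The direction keywords of the child and sibling iterators can be sketched like this, using a made-up four-child document:

```python
from lxml import etree

root = etree.fromstring("<r><a/><b/><c/><d/></r>")
c = root[2]

# Default: following siblings in document order.
following = [el.tag for el in c.itersiblings()]

# preceding=True: start right before the element and walk backwards.
preceding = [el.tag for el in c.itersiblings(preceding=True)]

# iterchildren() can reverse the order of the children.
backwards = [el.tag for el in root.iterchildren(reversed=True)]

print(following, preceding, backwards)
```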
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
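A short sketch of the effect of with_tail, using an invented paragraph with mixed content:

```python
from lxml import etree

root = etree.fromstring("<p>Hello <b>big</b> world</p>")

# By default, tail text of the descendants is included.
with_tails = list(root.itertext())

# with_tail=False yields only the .text of each element.
without_tails = list(root.itertext(with_tail=False))

print(with_tails, without_tails, "".join(with_tails))
```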
- keys(self)
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
- set(self, key, value)
- values(self)
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an xpath expression using the element as context node.
- attrib
Returns a dict containing all pseudo-attributes that can be parsed from the text content of this processing instruction. Note that modifying the dict currently has no effect on the XML node, although this is not guaranteed to stay this way.
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- target
- text
- class lxml.etree._RotatingErrorLog
Bases:
_ErrorLog
- clear()
- copy()
Creates a shallow copy of this error log and the list of entries.
- filter_domains(domains)
Filter the errors by the given domains and return a new error log containing the matches.
- filter_from_errors(self)
Convenience method to get all error messages or worse.
- filter_from_fatals(self)
Convenience method to get all fatal error messages.
- filter_from_level(self, level)
Return a log with all messages of the requested level or worse.
- filter_from_warnings(self)
Convenience method to get all warnings or worse.
- filter_levels(self, levels)
Filter the errors by the given error levels and return a new error log containing the matches.
- filter_types(self, types)
Filter the errors by the given types and return a new error log containing the matches.
- receive(entry)
- last_error
- class lxml.etree._SaxParserTarget
Bases:
object
- class lxml.etree._Validator
Bases:
object
Base class for XML validators.
- _append_log_message(domain, type, level, line, message, filename)
- _clear_error_log()
- assertValid(self, etree)
Raises DocumentInvalid if the document does not comply with the schema.
- assert_(self, etree)
Raises AssertionError if the document does not comply with the schema.
- validate(self, etree)
Validate the document using this schema.
Returns true if document is valid, false if not.
- error_log
The log of validation errors and warnings.
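The three validation entry points can be compared with any concrete validator. The sketch below assumes RelaxNG as the validator and uses a tiny invented schema and two invented documents:

```python
from lxml import etree

# Hypothetical schema: <root> must contain exactly one <child>.
schema = etree.RelaxNG(etree.fromstring("""
<element name="root" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="child"><text/></element>
</element>
"""))

good = etree.fromstring("<root><child>x</child></root>")
bad = etree.fromstring("<root><other/></root>")

# validate() returns a boolean and never raises for invalid input.
good_ok = schema.validate(good)
bad_ok = schema.validate(bad)

# assertValid() raises DocumentInvalid instead.
raised = False
try:
    schema.assertValid(bad)
except etree.DocumentInvalid as exc:
    raised = True
    print("invalid:", exc)

# The details of the last validation run are kept in error_log.
for entry in schema.error_log:
    print(entry.line, entry.message)
```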
- class lxml.etree._XPathEvaluatorBase
Bases:
object
- evaluate(self, _eval_arg, **_variables)
Evaluate an XPath expression.
Instead of calling this method, you can also call the evaluator object itself.
Variables may be provided as keyword arguments. Note that namespaces are currently not supported for variables.
- Deprecated
Call the object, not its method.
- error_log
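Calling the evaluator object directly, with XPath variables passed as keyword arguments, might look like the sketch below. It assumes an evaluator created with XPathEvaluator; the document and the variable name min are invented:

```python
from lxml import etree

root = etree.fromstring("<r><i n='1'/><i n='2'/><i n='3'/></r>")

# XPathEvaluator() returns an evaluator bound to the element.
evaluate = etree.XPathEvaluator(root)

# Call the object itself rather than its evaluate() method.
count = evaluate("count(i)")

# Keyword arguments become XPath variables ($min here).
selected = evaluate("i[@n > $min]/@n", min=1)

print(count, selected)
```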
- class lxml.etree._XSLTProcessingInstruction
Bases:
PIBase
- _init(self)
Called after object initialisation. Custom subclasses may override this if they recursively call _init() in the superclasses.
- addnext(self, element)
Adds the element as a following sibling directly after this element.
This is normally used to set a processing instruction or comment after the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- addprevious(self, element)
Adds the element as a preceding sibling directly before this element.
This is normally used to set a processing instruction or comment before the root node of a document. Note that tail text is automatically discarded when adding at the root level.
- append(self, value)
- clear(self, keep_tail=False)
Resets an element. This function removes all subelements, clears all attributes and sets the text and tail properties to None.
Pass
keep_tail=True
to leave the tail text untouched.
- cssselect(expr, *, translator)
Run the CSS expression on this element and its children, returning a list of the results.
Equivalent to lxml.cssselect.CSSSelect(expr)(self) – note that pre-compiling the expression can provide a substantial speedup.
- extend(self, elements)
Extends the current children by the elements in the iterable.
- find(self, path, namespaces=None)
Finds the first matching subelement, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all matching subelements, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds text for the first matching subelement, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- get(self, key, default=None)
Try to parse pseudo-attributes from the text content of the processing instruction, search for one with the given key as name and return its associated value.
Note that this is only a convenience method for the most common case that all text content is structured in attribute-like name-value pairs with properly quoted values. It is not guaranteed to work for all possible text content.
- getchildren(self)
Returns all direct children. The elements are returned in document order.
- Deprecated
Note that this method has been deprecated as of ElementTree 1.3 and lxml 2.0. New code should use
list(element)
or simply iterate over elements.
- getiterator(self, tag=None, *tags)
Returns a sequence or iterator of all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags, see iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the
element.iter()
method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getnext(self)
Returns the following sibling of this element or None.
- getparent(self)
Returns the parent of this element or None for the root element.
- getprevious(self)
Returns the preceding sibling of this element or None.
- getroottree(self)
Return an ElementTree for the root node of the document that contains this element.
This is the same as following element.getparent() up the tree until it returns None (for the root element) and then building an ElementTree for the last parent that was returned.
- index(self, child, start=None, stop=None)
Find the position of the child within the parent.
This method is not part of the original ElementTree API.
- insert(self, index, value)
- items(self)
- iter(self, tag=None, *tags)
Iterate over all elements in the subtree in document order (depth first pre-order), starting with this element.
Can be restricted to find only elements with specific tags: pass "{ns}localname" as tag. Either or both of ns and localname can be * for a wildcard; ns can be empty for no namespace. "localname" is equivalent to "{}localname" (i.e. no namespace) but "*" is "{*}*" (any or no namespace), not "{}*".
You can also pass the Element, Comment, ProcessingInstruction and Entity factory functions to look only for the specific element type.
Passing multiple tags (or a sequence of tags) instead of a single tag will let the iterator return all elements matching any of these tags, in document order.
- iterancestors(self, tag=None, *tags)
Iterate over the ancestors of this element (from parent to parent).
Can be restricted to find only elements with specific tags, see iter.
- iterchildren(self, tag=None, *tags, reversed=False)
Iterate over the children of this element.
As opposed to using normal iteration on this element, the returned elements can be reversed with the ‘reversed’ keyword and restricted to find only elements with specific tags, see iter.
- iterdescendants(self, tag=None, *tags)
Iterate over the descendants of this element in document order.
As opposed to
el.iter()
, this iterator does not yield the element itself. The returned elements can be restricted to find only elements with specific tags, see iter.
- iterfind(self, path, namespaces=None)
Iterates over all matching subelements, by tag name or path.
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- itersiblings(self, tag=None, *tags, preceding=False)
Iterate over the following or preceding siblings of this element.
The direction is determined by the ‘preceding’ keyword which defaults to False, i.e. forward iteration over the following siblings. When True, the iterator yields the preceding siblings in reverse document order, i.e. starting right before the current element and going backwards.
Can be restricted to find only elements with specific tags, see iter.
- itertext(self, tag=None, *tags, with_tail=True)
Iterates over the text content of a subtree.
You can pass tag names to restrict text content to specific elements, see iter.
You can set the with_tail keyword argument to False to skip over tail text.
- keys(self)
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with the same document.
- parseXSL(self, parser=None)
Try to parse the stylesheet referenced by this PI and return an ElementTree for it. If the stylesheet is embedded in the same document (referenced via xml:id), find and return an ElementTree for the stylesheet Element.
The optional
parser
keyword argument can be passed to specify the parser used to read from external stylesheet URLs.
- remove(self, element)
Removes a matching subelement. Unlike the find methods, this method compares elements based on identity, not on tag value or contents.
- replace(self, old_element, new_element)
Replaces a subelement with the element passed as second argument.
- set(self, key, value)
Supports setting the ‘href’ pseudo-attribute in the text of the processing instruction.
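Reading and rewriting the href pseudo-attribute could look like the following sketch. The stylesheet names style.xsl and other.xsl are placeholders and are never loaded here:

```python
from lxml import etree

root = etree.fromstring(
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?><root/>'
)

# The xml-stylesheet PI precedes the root element.
pi = root.getprevious()

before = pi.get("href")

# Only the 'href' pseudo-attribute can be set on this PI type.
pi.set("href", "other.xsl")
after = pi.get("href")

print(type(pi).__name__, before, after)
```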
- values(self)
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
Evaluate an xpath expression using the element as context node.
- attrib
Returns a dict containing all pseudo-attributes that can be parsed from the text content of this processing instruction. Note that modifying the dict currently has no effect on the XML node, although this is not guaranteed to stay this way.
- base
The base URI of the Element (xml:base or HTML base URL). None if the base URI is unknown.
Note that the value depends on the URL of the document that holds the Element if there is no xml:base attribute on the Element or its ancestors.
Setting this property will set an xml:base attribute on the Element, regardless of the document type (XML or HTML).
- nsmap
Namespace prefix->URI mapping known in the context of this Element. This includes all namespace declarations of the parents.
Note that changing the returned dict has no effect on the Element.
- prefix
Namespace prefix or None.
- sourceline
Original line number as found by the parser or None if unknown.
- tag
- tail
Text after this element’s end tag, but before the next sibling element’s start tag. This is either a string or the value None, if there was no text.
- target
- text
- class lxml.etree._XSLTResultTree
Bases:
_ElementTree
The result of an XSLT evaluation.
Use str() or bytes() (or unicode() in Python 2.x) to serialise to a string, and the .write_output() method to serialise to a file.
- _setroot(self, root)
Relocate the ElementTree to a new root node.
- find(self, path, namespaces=None)
Finds the first toplevel element with given tag. Same as tree.getroot().find(path).
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findall(self, path, namespaces=None)
Finds all elements matching the ElementPath expression. Same as getroot().findall(path).
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- findtext(self, path, default=None, namespaces=None)
Finds the text for the first element matching the ElementPath expression. Same as getroot().findtext(path).
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- getelementpath(self, element)
Returns a structural, absolute ElementPath expression to find the element. This path can be used in the .find() method to look up the element, provided that the elements along the path and their list of immediate children were not modified in between.
ElementPath has the advantage over an XPath expression (as returned by the .getpath() method) that it does not require additional prefix declarations. It is always self-contained.
- getiterator(self, *tags, tag=None)
Returns a sequence or iterator of all elements in document order (depth first pre-order), starting with the root element.
Can be restricted to find only elements with specific tags, see _Element.iter.
- Deprecated
Note that this method is deprecated as of ElementTree 1.3 and lxml 2.0. It returns an iterator in lxml, which diverges from the original ElementTree behaviour. If you want an efficient iterator, use the
tree.iter()
method instead. You should only use this method in new code if you require backwards compatibility with older versions of lxml or ElementTree.
- getpath(self, element)
Returns a structural, absolute XPath expression to find the element.
For namespaced elements, the expression uses prefixes from the document, which therefore need to be provided in order to make any use of the expression in XPath.
Also see the method getelementpath(self, element), which returns a self-contained ElementPath expression.
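The difference between the two path flavours shows up with namespaced elements. The sketch below uses an ordinary ElementTree and an invented document with the namespace URI urn:x; the generated paths are printed rather than assumed:

```python
from lxml import etree

root = etree.fromstring('<r xmlns:x="urn:x"><x:a><b/></x:a></r>')
tree = root.getroottree()
target = root[0][0]  # the <b> element

xpath_expr = tree.getpath(target)
element_path = tree.getelementpath(target)
print(xpath_expr, element_path)

# The XPath expression needs the document's prefixes to be declared.
via_xpath = tree.xpath(xpath_expr, namespaces={"x": "urn:x"})[0]

# The ElementPath expression is self-contained.
via_find = tree.find(element_path)
```

Both lookups should lead back to the same element as long as the tree was not modified in between.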
- getroot(self)
Gets the root element for this tree.
- iter(self, tag=None, *tags)
Creates an iterator for the root element. The iterator loops over all elements in this tree, in document order. Note that siblings of the root element (comments or processing instructions) are not returned by the iterator.
Can be restricted to find only elements with specific tags, see _Element.iter.
- iterfind(self, path, namespaces=None)
Iterates over all elements matching the ElementPath expression. Same as getroot().iterfind(path).
The optional
namespaces
argument accepts a prefix-to-namespace mapping that allows the usage of XPath prefixes in the path expression.
- parse(self, source, parser=None, base_url=None)
Updates self with the content of source and returns its root.
- relaxng(self, relaxng)
Validate this document using another document.
The relaxng argument is a tree that should contain a Relax NG schema.
Returns True or False, depending on whether validation succeeded.
Note: if you are going to apply the same Relax NG schema against multiple documents, it is more efficient to use the RelaxNG class directly.
- write(self, file, encoding=None, method="xml", pretty_print=False, xml_declaration=None, with_tail=True, standalone=None, doctype=None, compression=0, exclusive=False, inclusive_ns_prefixes=None, with_comments=True, strip_text=False)
Write the tree to a filename, file or file-like object.
Defaults to ASCII encoding and writing a declaration as needed.
The keyword argument ‘method’ selects the output method: ‘xml’, ‘html’, ‘text’, ‘c14n’ or ‘c14n2’. Default is ‘xml’.
With method="c14n" (C14N version 1), the options exclusive, with_comments and inclusive_ns_prefixes request exclusive C14N, include comments, and list the inclusive prefixes respectively.
With method="c14n2" (C14N version 2), the with_comments and strip_text options control the output of comments and text space according to C14N 2.0.
Passing a boolean value to the standalone option will output an XML declaration with the corresponding standalone flag.
The doctype option allows passing in a plain string that will be serialised before the XML tree. Note that passing in non well-formed content here will make the XML output non well-formed. Also, an existing doctype in the document tree will not be removed when serialising an ElementTree instance.
The compression option enables GZip compression level 1-9.
The inclusive_ns_prefixes should be a list of namespace strings (e.g. ['xs', 'xsi']) that will be promoted to the top-level element during exclusive C14N serialisation. This parameter is ignored if exclusive mode=False.
If exclusive=True and no list is provided, a namespace will only be rendered if it is used by the immediate parent or one of its attributes and its prefix and values have not already been rendered by an ancestor of the namespace node’s parent element.
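A few of these options can be tried on an in-memory buffer. The sketch below assumes a plain ElementTree and io.BytesIO as the file-like target; the document is invented:

```python
import io

from lxml import etree

tree = etree.ElementTree(
    etree.fromstring('<root b="2" a="1"><child/></root>')
)

# XML output with an explicit declaration and standalone flag.
xml_buf = io.BytesIO()
tree.write(xml_buf, encoding="UTF-8", xml_declaration=True, standalone=True)
xml_bytes = xml_buf.getvalue()

# C14N output: attributes are sorted, empty elements are expanded.
c14n_buf = io.BytesIO()
tree.write(c14n_buf, method="c14n")
c14n_bytes = c14n_buf.getvalue()

print(xml_bytes)
print(c14n_bytes)
```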
- write_c14n(self, file, exclusive=False, with_comments=True, compression=0, inclusive_ns_prefixes=None)
C14N write of document. Always writes UTF-8.
The compression option enables GZip compression level 1-9.
The inclusive_ns_prefixes should be a list of namespace strings (e.g. ['xs', 'xsi']) that will be promoted to the top-level element during exclusive C14N serialisation. This parameter is ignored if exclusive mode=False.
If exclusive=True and no list is provided, a namespace will only be rendered if it is used by the immediate parent or one of its attributes and its prefix and values have not already been rendered by an ancestor of the namespace node’s parent element.
NOTE: This method is deprecated as of lxml 4.4 and will be removed in a future release. Use .write(f, method="c14n") instead.
- write_output(self, file, *, compression=0)
Serialise the XSLT output to a file or file-like object.
As opposed to the generic .write() method, .write_output() serialises the result as defined by the <xsl:output> tag.
- xinclude(self)
Process the XInclude nodes in this document and include the referenced XML fragments.
There is support for loading files through the file system, HTTP and FTP.
Note that XInclude does not support custom resolvers in Python space due to restrictions of libxml2 <= 2.6.29.
- xmlschema(self, xmlschema)
Validate this document using another document.
The xmlschema argument is a tree that should contain an XML Schema.
Returns True or False, depending on whether validation succeeded.
Note: If you are going to apply the same XML Schema against multiple documents, it is more efficient to use the XMLSchema class directly.
- xpath(self, _path, namespaces=None, extensions=None, smart_strings=True, **_variables)
XPath evaluate in context of document.
namespaces is an optional dictionary with prefix to namespace URI mappings, used by XPath. extensions defines additional extension functions.
Returns a list (nodeset), or bool, float or string.
In case of a list result, return Element for element nodes, string for text and attribute values.
Note: if you are going to apply multiple XPath expressions against the same document, it is more efficient to use XPathEvaluator directly.
- xslt(self, _xslt, extensions=None, access_control=None, **_kw)
Transform this document using another document.
xslt is a tree that should be XSLT. Keyword parameters are XSLT transformation parameters.
Returns the transformed tree.
Note: if you are going to apply the same XSLT stylesheet against multiple documents, it is more efficient to use the XSLT class directly.
- docinfo
Information about the document provided by parser and DTD.
- parser
The parser that was used to parse the document in this ElementTree.
- xslt_profile
Return an ElementTree with profiling data for the stylesheet run.
- class lxml.etree.htmlfile(self, output_file, encoding=None, compression=None, close=False, buffered=True)
Bases:
xmlfile
A simple mechanism for incremental HTML serialisation. Works the same as xmlfile.
- class lxml.etree.iterparse(self, source, events=('end',), tag=None, attribute_defaults=False, dtd_validation=False, load_dtd=False, no_network=True, remove_blank_text=False, remove_comments=False, remove_pis=False, encoding=None, html=False, recover=None, huge_tree=False, schema=None)
Bases:
object
Incremental parser.
Parses XML into a tree and generates tuples (event, element) in a SAX-like fashion. event is any of ‘start’, ‘end’, ‘start-ns’, ‘end-ns’.
For ‘start’ and ‘end’, element is the Element that the parser just found opening or closing. For ‘start-ns’, it is a tuple (prefix, URI) of a new namespace declaration. For ‘end-ns’, it is simply None. Note that all start and end events are guaranteed to be properly nested.
The keyword argument events specifies a sequence of event type names that should be generated. By default, only ‘end’ events will be generated.
The additional tag argument restricts the ‘start’ and ‘end’ events to those elements that match the given tag. The tag argument can also be a sequence of tags to allow matching more than one tag. By default, events are generated for all elements. Note that the ‘start-ns’ and ‘end-ns’ events are not impacted by this restriction.
The other keyword arguments in the constructor are mainly based on the libxml2 parser configuration. A DTD will also be loaded if validation or attribute default values are requested.
- Available boolean keyword arguments:
attribute_defaults: read default attributes from DTD
dtd_validation: validate (if DTD is available)
load_dtd: use DTD for parsing
no_network: prevent network access for related files
remove_blank_text: discard blank text nodes
remove_comments: discard comments
remove_pis: discard processing instructions
strip_cdata: replace CDATA sections by normal text content (default: True)
compact: save memory for short text content (default: True)
resolve_entities: replace entities by their text value (default: True)
huge_tree: disable security restrictions and support very deep trees and very long text content (only affects libxml2 2.7+)
html: parse input as HTML (default: XML)
recover: try hard to parse through broken input (default: True for HTML, False otherwise)
- Other keyword arguments:
encoding: override the document encoding
schema: an XMLSchema to validate against
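The events and tag arguments can be sketched on in-memory data. The documents below are invented, and io.BytesIO stands in for a real file:

```python
import io

from lxml import etree

xml = b"<root><item id='1'>a</item><skip/><item id='2'>b</item></root>"

# Only 'end' events for <item> elements are generated.
seen = []
for event, elem in etree.iterparse(io.BytesIO(xml), events=("end",), tag="item"):
    seen.append((elem.get("id"), elem.text))
    elem.clear()  # free the content once it has been processed

# Namespace events carry a (prefix, URI) tuple or None.
ns_xml = b'<r xmlns:x="urn:x"><x:a/></r>'
ns_events = [
    (event, obj)
    for event, obj in etree.iterparse(
        io.BytesIO(ns_xml), events=("start-ns", "end-ns")
    )
]

print(seen, ns_events)
```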
- makeelement(self, _tag, attrib=None, nsmap=None, **_extra)
Creates a new element associated with this parser.
- set_element_class_lookup(self, lookup=None)
Set a lookup scheme for element classes generated from this parser.
Reset it by passing None or nothing.
- error_log
The error log of the last (or current) parser run.
- resolvers
The custom resolver registry of the last (or current) parser run.
- root
- version
The version of the underlying XML parser.
- class lxml.etree.iterwalk(self, element_or_tree, events=('end',), tag=None)
Bases:
object
A tree walker that generates events from an existing tree as if it was parsing XML data with iterparse().
Just as for iterparse(), the tag argument can be a single tag or a sequence of tags.
After receiving a ‘start’ or ‘start-ns’ event, the children and descendants of the current element can be excluded from iteration by calling the skip_subtree() method.
- skip_subtree()
Prevent descending into the current subtree. Instead, the next returned event will be the ‘end’ event of the current element (if included), ignoring any children or descendants.
This has no effect right after an ‘end’ or ‘end-ns’ event.
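A short sketch of how skip_subtree() might be used while walking a tree. The document and the tag names are invented for illustration.

```python
from lxml import etree

root = etree.XML("<root><a><b/></a><c/></root>")

walker = etree.iterwalk(root, events=("start", "end"))
seen = []
for event, element in walker:
    seen.append((event, element.tag))
    if event == "start" and element.tag == "a":
        # Do not descend into <a>; its 'end' event is still reported.
        walker.skip_subtree()

print(seen)
```

The element b never shows up in the collected events because the walker jumps from the ‘start’ event of a directly to its ‘end’ event.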
- class lxml.etree.xmlfile(self, output_file, encoding=None, compression=None, close=False, buffered=True)
Bases:
object
A simple mechanism for incremental XML serialisation.
Usage example:
with xmlfile("somefile.xml", encoding='utf-8') as xf:
    xf.write_declaration(standalone=True)
    xf.write_doctype('<!DOCTYPE root SYSTEM "some.dtd">')

    # generate an element (the root element)
    with xf.element('root'):
        # write a complete Element into the open root element
        xf.write(etree.Element('test'))

        # generate and write more Elements, e.g. through iterparse
        for element in generate_some_elements():
            # serialise generated elements into the XML file
            xf.write(element)

        # or write multiple Elements or strings at once
        xf.write(etree.Element('start'), "text", etree.Element('end'))
If ‘output_file’ is a file(-like) object, passing close=True will close it when exiting the context manager. By default, it is left to the owner to do that. When a file path is used, lxml will take care of opening and closing the file itself. Also, when a compression level is set, lxml will deliberately close the file to make sure all data gets compressed and written.
Setting buffered=False will flush the output after each operation, such as opening or closing an xf.element() block or calling xf.write(). Alternatively, calling xf.flush() can be used to explicitly flush any pending output when buffering is enabled.
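The following sketch writes into an in-memory file-like object instead of a file path, which makes the ownership rule visible: with the default close=False, the buffer stays open after the context manager exits. The element names are placeholders.

```python
import io

from lxml import etree

buffer = io.BytesIO()  # file-like object owned by the caller

with etree.xmlfile(buffer, encoding="utf-8") as xf:
    with xf.element("root"):
        xf.write(etree.Element("item"))
        xf.flush()  # push pending output to the buffer explicitly

# close=False is the default, so the buffer is still open and readable here.
result = buffer.getvalue()
print(result)
```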
- lxml.etree.Comment(text=None)
Comment element factory. This factory function creates a special element that will be serialized as an XML comment.
- lxml.etree.Element(_tag, attrib=None, nsmap=None, **_extra)
Element factory. This function returns an object implementing the Element interface.
Also look at the _Element.makeelement() and _BaseParser.makeelement() methods, which provide a faster way to create an Element within a specific document or parser context.
- lxml.etree.ElementTree(element=None, file=None, parser=None)
ElementTree wrapper class.
- lxml.etree.Entity(name)
Entity factory. This factory function creates a special element that will be serialized as an XML entity reference or character reference. Note, however, that entities will not be automatically declared in the document. A document that uses entity references requires a DTD to define the entities.
- lxml.etree.Extension(module, function_mapping=None, ns=None)
Build a dictionary of extension functions from the functions defined in a module or the methods of an object.
As second argument, you can pass an additional mapping of attribute names to XPath function names, or a list of function names that should be taken.
The ns keyword argument accepts a namespace URI for the XPath functions.
- lxml.etree.FunctionNamespace(ns_uri)
Retrieve the function namespace object associated with the given URI.
Creates a new one if it does not yet exist. A function namespace can only be used to register extension functions.
Usage:
>>> ns_functions = FunctionNamespace("http://schema.org/Movie")
>>> @ns_functions  # uses function name
... def add2(x):
...     return x + 2
>>> @ns_functions("add3")  # uses explicit name
... def add_three(x):
...     return x + 3
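As a hedged end-to-end sketch, a registered function can be called from an XPath expression once the namespace has a prefix. The example uses dictionary-style registration instead of the decorator form shown above, and the namespace URI and prefix are invented. Note that lxml passes an evaluation context as the first argument to extension functions when they are called from XPath.

```python
from lxml import etree

# Hypothetical namespace URI and prefix, chosen only for this example.
ns = etree.FunctionNamespace("http://example.com/functions")
ns.prefix = "f"

def add_three(context, x):
    # 'context' is the XPath evaluation context supplied by lxml.
    return x + 3

ns["add3"] = add_three  # dictionary-style registration under an explicit name

root = etree.XML("<root/>")
result = root.xpath("f:add3(4)")
print(result)
```

XPath numbers are floating point values, so the function receives 4.0 and the expression evaluates to 7.0.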
- lxml.etree.HTML(text, parser=None, base_url=None)
Parses an HTML document from a string constant. Returns the root node (or the result returned by a parser target). This function can be used to embed “HTML literals” in Python code.
To override the parser with a different HTMLParser you can pass it to the parser keyword argument.
The base_url keyword argument allows setting the original base URL of the document to support relative paths when looking up external entities (DTD, XInclude, …).
- lxml.etree.PI(target, text=None)
ProcessingInstruction(target, text=None)
ProcessingInstruction element factory. This factory function creates a special element that will be serialized as an XML processing instruction.
- lxml.etree.ProcessingInstruction(target, text=None)
ProcessingInstruction element factory. This factory function creates a special element that will be serialized as an XML processing instruction.
- lxml.etree.SubElement(_parent, _tag, attrib=None, nsmap=None, **_extra)
Subelement factory. This function creates an element instance, and appends it to an existing element.
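The factory functions above can be combined to build a small tree. This is a minimal sketch with made-up tag and attribute names.

```python
from lxml import etree

root = etree.Element("root", attrib={"version": "1"})

# SubElement creates the child and appends it to its parent in one step.
child = etree.SubElement(root, "child", name="first")
child.text = "content"

# Special nodes are created by their own factories and appended like elements.
root.append(etree.Comment("a comment"))
root.append(etree.ProcessingInstruction("target", "data"))

serialized = etree.tostring(root)
print(serialized)
```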
- lxml.etree.XML(text, parser=None, base_url=None)
Parses an XML document or fragment from a string constant. Returns the root node (or the result returned by a parser target). This function can be used to embed “XML literals” in Python code, like in
>>> root = XML("<root><test/></root>")
>>> print(root.tag)
root
To override the parser with a different XMLParser you can pass it to the parser keyword argument.
The base_url keyword argument allows setting the original base URL of the document to support relative paths when looking up external entities (DTD, XInclude, …).
- lxml.etree.XMLDTDID(text, parser=None, base_url=None)
Parse the text and return a tuple (root node, ID dictionary). The root node is the same as returned by the XML() function. The dictionary contains string-element pairs. The dictionary keys are the values of ID attributes as defined by the DTD. The elements referenced by the ID are stored as dictionary values.
Note that you must not modify the XML tree if you use the ID dictionary. The results are undefined.
- lxml.etree.XMLID(text, parser=None, base_url=None)
Parse the text and return a tuple (root node, ID dictionary). The root node is the same as returned by the XML() function. The dictionary contains string-element pairs. The dictionary keys are the values of ‘id’ attributes. The elements referenced by the ID are stored as dictionary values.
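A small sketch of the (root node, ID dictionary) result of XMLID(), using an invented document.

```python
from lxml import etree

root, id_map = etree.XMLID('<root><a id="first"/><b id="second"/></root>')

# The dictionary maps each 'id' attribute value to its element.
first = id_map["first"]
print(root.tag, first.tag, sorted(id_map))
```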
- lxml.etree.XPathEvaluator(etree_or_element, namespaces=None, extensions=None, regexp=True, smart_strings=True)
Creates an XPath evaluator for an ElementTree or an Element.
The resulting object can be called with an XPath expression as argument and XPath variables provided as keyword arguments.
Additional namespace declarations can be passed with the ‘namespaces’ keyword argument. EXSLT regular expression support can be disabled with the ‘regexp’ boolean keyword (defaults to True). Smart strings will be returned for string results unless you pass smart_strings=False.
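The evaluator can be sketched as follows, with a namespace prefix mapping and an XPath variable passed as a keyword argument. The namespace URI and element names are assumptions made for the example.

```python
from lxml import etree

root = etree.XML(
    '<root xmlns:x="http://example.com/ns">'
    "<x:item>1</x:item><x:item>2</x:item>"
    "</root>"
)

evaluate = etree.XPathEvaluator(root, namespaces={"x": "http://example.com/ns"})

# The evaluator is called with an expression; XPath variables are keyword arguments.
count = evaluate("count(//x:item)")
texts = evaluate("//x:item[. = $value]/text()", value="2")
print(count, texts)
```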
- lxml.etree.adopt_external_document(capsule, parser=None)
Unpack a libxml2 document pointer from a PyCapsule and wrap it in an lxml ElementTree object.
This allows external libraries to build XML/HTML trees using libxml2 and then pass them efficiently into lxml for further processing.
If a parser is provided, it will be used for configuring the lxml document. No parsing will be done.
The capsule must have the name "libxml2:xmlDoc" and its pointer value must reference a correct libxml2 document of type xmlDoc*. The creator of the capsule must take care to correctly clean up the document using an appropriate capsule destructor. By default, the libxml2 document will be copied to let lxml safely own the memory of the internal tree that it uses.
If the capsule context is non-NULL, it must point to a C string that can be compared using strcmp(). If the context string equals "destructor:xmlFreeDoc", the libxml2 document will not be copied but the capsule invalidated instead by clearing its destructor and name. That way, lxml takes ownership of the libxml2 document in memory without creating a copy first, and the capsule destructor will not be called. The document will then eventually be cleaned up by lxml using the libxml2 API function xmlFreeDoc() once it is no longer used.
If no copy is made, later modifications of the tree outside of lxml should not be attempted after transferring the ownership.
- lxml.etree.canonicalize(xml_data=None, *, out=None, from_file=None, **options)
Convert XML to its C14N 2.0 serialised form.
If out is provided, it must be a file or file-like object that receives the serialised canonical XML output (text, not bytes) through its .write() method. To write to a file, open it in text mode with encoding “utf-8”. If out is not provided, this function returns the output as text string.
Either xml_data (an XML string, tree or Element) or from_file (a file path or file-like object) must be provided as input.
The configuration options are the same as for the C14NWriterTarget.
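A minimal sketch of both output modes, using a made-up input string: without out the canonical form is returned as text, and with out it is written to the given text stream.

```python
import io

from lxml import etree

xml_data = "<root b='2' a='1'><child></child></root>"

# Without 'out', the canonical form is returned as a text string.
canonical = etree.canonicalize(xml_data)

# With 'out', the text is written to the given file-like object instead.
out = io.StringIO()
etree.canonicalize(xml_data, out=out)

print(canonical)
```

In the canonical form, attributes are sorted and quoted with double quotes.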
- lxml.etree.cleanup_namespaces(tree_or_element, top_nsmap=None, keep_ns_prefixes=None)
Remove all namespace declarations from a subtree that are not used by any of the elements or attributes in that tree.
If a ‘top_nsmap’ is provided, it must be a mapping from prefixes to namespace URIs. These namespaces will be declared on the top element of the subtree before running the cleanup, which allows moving namespace declarations to the top of the tree.
If a ‘keep_ns_prefixes’ is provided, it must be a list of prefixes. These prefixes will not be removed as part of the cleanup.
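The cleanup can be sketched on a small document that declares one namespace it never uses. The prefixes and URIs are invented.

```python
from lxml import etree

root = etree.XML(
    '<root xmlns:unused="http://example.com/unused" '
    'xmlns:used="http://example.com/used">'
    "<used:child/>"
    "</root>"
)

# Drops the declaration that no element or attribute in the tree refers to.
etree.cleanup_namespaces(root)

serialized = etree.tostring(root)
print(serialized)
```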
- lxml.etree.clear_error_log()
Clear the global error log. Note that this log is already bound to a fixed size.
Note: since lxml 2.2, the global error log is local to a thread and this function will only clear the global error log of the current thread.
- lxml.etree.dump(elem, pretty_print=True, with_tail=True)
Writes an element tree or element structure to sys.stdout. This function should be used for debugging only.
- lxml.etree.fromstring(text, parser=None, base_url=None)
Parses an XML document or fragment from a string. Returns the root node (or the result returned by a parser target).
To override the default parser with a different parser you can pass it to the parser keyword argument.
The base_url keyword argument allows setting the original base URL of the document to support relative paths when looking up external entities (DTD, XInclude, …).
- lxml.etree.fromstringlist(strings, parser=None)
Parses an XML document from a sequence of strings. Returns the root node (or the result returned by a parser target).
To override the default parser with a different parser you can pass it to the parser keyword argument.
- lxml.etree.get_default_parser()
- lxml.etree.indent(tree, space='  ', level=0)
Indent an XML document by inserting newlines and indentation space after elements.
tree is the ElementTree or Element to modify. The (root) element itself will not be changed, but the tail text of all elements in its subtree will be adapted.
space is the whitespace to insert for each indentation level, two space characters by default.
level is the initial indentation level. Setting this to a higher value than 0 can be used for indenting subtrees that are more deeply nested inside of a document.
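A brief sketch of in-place indentation with a custom space argument, applied to an invented document.

```python
from lxml import etree

root = etree.XML("<root><a><b/></a></root>")

# Modifies the tree in place; four spaces per level instead of the default two.
etree.indent(root, space="    ")

pretty = etree.tostring(root, encoding="unicode")
print(pretty)
```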
- lxml.etree.iselement(element)
Checks if an object appears to be a valid element object.
- lxml.etree.parse(source, parser=None, base_url=None)
Return an ElementTree object loaded with source elements. If no parser is provided as second argument, the default parser is used.
The source can be any of the following:
a file name/path
a file object
a file-like object
a URL using the HTTP or FTP protocol
To parse from a string, use the fromstring() function instead.
Note that it is generally faster to parse from a file path or URL than from an open file object or file-like object. Transparent decompression from gzip compressed sources is supported (unless explicitly disabled in libxml2).
The base_url keyword allows setting a URL for the document when parsing from a file-like object. This is needed when looking up external entities (DTD, XInclude, …) with relative paths.
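A minimal sketch of parsing from a file-like object while supplying a base_url. The URL is a placeholder and is not accessed by this example.

```python
import io

from lxml import etree

source = io.BytesIO(b"<root><child/></root>")

# The file-like object has no location of its own, so one is provided explicitly.
tree = etree.parse(source, base_url="http://example.com/doc.xml")

root = tree.getroot()
print(root.tag, tree.docinfo.URL)
```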
- lxml.etree.parseid(source, parser=None)
Parses the source into a tuple containing an ElementTree object and an ID dictionary. If no parser is provided as second argument, the default parser is used.
Note that you must not modify the XML tree if you use the ID dictionary. The results are undefined.
- lxml.etree.register_namespace(prefix, uri)
Registers a namespace prefix that newly created Elements in that namespace will use. The registry is global, and any existing mapping for either the given prefix or the namespace URI will be removed.
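The effect on newly created elements can be sketched as follows. The prefix and URI are made up, and because the registry is global, the mapping stays in place for the rest of the process.

```python
from lxml import etree

# Hypothetical prefix and namespace URI.
etree.register_namespace("ex", "http://example.com/ns")

element = etree.Element("{http://example.com/ns}item")
serialized = etree.tostring(element)
print(serialized)
```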
- lxml.etree.set_default_parser(parser=None)
Set a default parser for the current thread. This parser is used globally whenever no parser is supplied to the various parse functions of the lxml API. If this function is called without a parser (or if it is None), the default parser is reset to the original configuration.
Note that the pre-installed default parser is not thread-safe. Avoid the default parser in multi-threaded environments. You can create a separate parser for each thread explicitly or use a parser pool.
- lxml.etree.set_element_class_lookup(lookup=None)
Set the global element class lookup method.
This defines the main entry point for looking up element implementations. The standard implementation uses the ParserBasedElementClassLookup to delegate to different lookup schemes for each parser.
Warning
This should only be changed by applications, not by library packages. In most cases, parser specific lookups should be preferred, which can be configured via XMLParser.set_element_class_lookup() (and the same for HTML parsers).
Globally replacing the element class lookup by something other than a ParserBasedElementClassLookup will prevent parser specific lookup schemes from working. Several tools rely on parser specific lookups, including lxml.html and lxml.objectify.
- lxml.etree.strip_attributes(tree_or_element, *attribute_names)
Delete all attributes with the provided attribute names from an Element (or ElementTree) and its descendants.
Attribute names can contain wildcards as in _Element.iter.
Example usage:
strip_attributes(root_element, 'simpleattr', '{http://some/ns}attrname', '{http://other/ns}*')
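A small runnable sketch with invented attribute names, showing that the attributes are removed from the element itself and from its descendants.

```python
from lxml import etree

root = etree.XML('<root keep="1" drop="2"><child drop="3"/></root>')

# Removes every 'drop' attribute in the tree, including the one on the root.
etree.strip_attributes(root, "drop")

serialized = etree.tostring(root)
print(serialized)
```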
- lxml.etree.strip_elements(tree_or_element, *tag_names, with_tail=True)
Delete all elements with the provided tag names from a tree or subtree. This will remove the elements and their entire subtree, including all their attributes, text content and descendants. It will also remove the tail text of the element unless you explicitly set the with_tail keyword argument option to False.
Tag names can contain wildcards as in _Element.iter.
Note that this will not delete the element (or ElementTree root element) that you passed even if it matches. It will only treat its descendants. If you want to include the root element, check its tag name directly before even calling this function.
Example usage:
strip_elements(some_element,
    'simpletagname',             # non-namespaced tag
    '{http://some/ns}tagname',   # namespaced tag
    '{http://some/other/ns}*',   # any tag from a namespace
    lxml.etree.Comment           # comments
    )
- lxml.etree.strip_tags(tree_or_element, *tag_names)
Delete all elements with the provided tag names from a tree or subtree. This will remove the elements and their attributes, but not their text/tail content or descendants. Instead, it will merge the text content and children of the element into its parent.
Tag names can contain wildcards as in _Element.iter.
Note that this will not delete the element (or ElementTree root element) that you passed even if it matches. It will only treat its descendants.
Example usage:
strip_tags(some_element,
    'simpletagname',             # non-namespaced tag
    '{http://some/ns}tagname',   # namespaced tag
    '{http://some/other/ns}*',   # any tag from a namespace
    Comment                      # comments (including their text!)
    )
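To contrast strip_tags() with strip_elements(), the sketch below applies both to copies of the same invented document.

```python
import copy

from lxml import etree

root = etree.XML("<root><p>Hello <b>bold</b> world</p></root>")

# strip_tags() removes the <b> tag but merges its text into the parent.
tags_stripped = copy.deepcopy(root)
etree.strip_tags(tags_stripped, "b")
tags_result = etree.tostring(tags_stripped)

# strip_elements() removes <b> together with its content;
# with_tail=False keeps the tail text that followed it.
elements_stripped = copy.deepcopy(root)
etree.strip_elements(elements_stripped, "b", with_tail=False)
elements_result = etree.tostring(elements_stripped)

print(tags_result)
print(elements_result)
```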
- lxml.etree.tostring(element_or_tree, *, encoding=None, method='xml', xml_declaration=None, pretty_print=False, with_tail=True, standalone=None, doctype=None, exclusive=False, inclusive_ns_prefixes=None, with_comments=True, strip_text=False)
Serialize an element to an encoded string representation of its XML tree.
Defaults to ASCII encoding without XML declaration. This behaviour can be configured with the keyword arguments ‘encoding’ (string) and ‘xml_declaration’ (bool). Note that changing the encoding to a non UTF-8 compatible encoding will enable a declaration by default.
You can also serialise to a Unicode string without declaration by passing the name 'unicode' as encoding (or the str function in Py3 or unicode in Py2). This changes the return value from a byte string to an unencoded unicode string.
The keyword argument ‘pretty_print’ (bool) enables formatted XML.
The keyword argument ‘method’ selects the output method: ‘xml’, ‘html’, plain ‘text’ (text content without tags), ‘c14n’ or ‘c14n2’. Default is ‘xml’.
With method="c14n" (C14N version 1), the options exclusive, with_comments and inclusive_ns_prefixes request exclusive C14N, include comments, and list the inclusive prefixes respectively.
With method="c14n2" (C14N version 2), the with_comments and strip_text options control the output of comments and text space according to C14N 2.0.
Passing a boolean value to the standalone option will output an XML declaration with the corresponding standalone flag.
The doctype option allows passing in a plain string that will be serialised before the XML tree. Note that passing in non well-formed content here will make the XML output non well-formed. Also, an existing doctype in the document tree will not be removed when serialising an ElementTree instance.
You can prevent the tail text of the element from being serialised by passing the boolean with_tail option. This has no impact on the tail text of children, which will always be serialised.
- lxml.etree.tostringlist(element_or_tree, *args, **kwargs)
Serialize an element to an encoded string representation of its XML tree, stored in a list of partial strings.
This is purely for ElementTree 1.3 compatibility. The result is a single string wrapped in a list.
- lxml.etree.tounicode(element_or_tree, *, method='xml', pretty_print=False, with_tail=True, doctype=None)
Serialize an element to the Python unicode representation of its XML tree.
- Deprecated
use tostring(el, encoding='unicode') instead.
Note that the result does not carry an XML encoding declaration and is therefore not necessarily suited for serialization to byte streams without further treatment.
The boolean keyword argument ‘pretty_print’ enables formatted XML.
The keyword argument ‘method’ selects the output method: ‘xml’, ‘html’ or plain ‘text’.
You can prevent the tail text of the element from being serialised by passing the boolean with_tail option. This has no impact on the tail text of children, which will always be serialised.
- lxml.etree.use_global_python_log(log)
Replace the global error log by an etree.PyErrorLog that uses the standard Python logging package.
Note that this disables access to the global error log from exceptions. Parsers, XSLT etc. will continue to provide their normal local error log.
Note: prior to lxml 2.2, this changed the error log globally. Since lxml 2.2, the global error log is local to a thread and this function will only set the global error log of the current thread.