GMLAS - Geography Markup Language (GML) driven by application schemas

Available in GDAL >= 2.2

This driver can read and write XML files of arbitrary structure, including those containing so-called Complex Features, provided that they are accompanied by one or several XML schemas that describe the structure of their content. While this driver is generic to any XML schema, the main target is to be able to read and write documents that reference, directly or indirectly, the GML namespace.

The driver requires Xerces-C >= 3.1.

The driver can deal with files of arbitrary size with very modest RAM usage, because it works in streaming mode.

Opening syntax

The connection string is GMLAS:/path/to/the.gml. Note the GMLAS: prefix. If this prefix is omitted, then the GML driver is likely to be used.

It is also possible to use only "GMLAS:" as the connection string, but in that case the schemas must be explicitly provided with the XSD open option.
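The two opening forms above can be sketched in Python, assuming the GDAL Python bindings (the osgeo package) are installed. The helper names gmlas_open_args and open_gmlas are hypothetical and exist only for this illustration; gdal.OpenEx() and the XSD open option are the only parts that come from GDAL itself.

```python
def gmlas_open_args(path=None, xsd=None):
    """Build the connection string and open options for the GMLAS driver.

    Hypothetical helper: a document path, an XSD path, or both must be given.
    """
    if not path and not xsd:
        raise ValueError("a document path or an XSD schema is required")
    # The GMLAS: prefix selects this driver instead of the GML driver.
    connection = "GMLAS:" + (path or "")
    # With a bare "GMLAS:" string, the schema must come from the XSD option.
    open_options = ["XSD=" + xsd] if xsd else []
    return connection, open_options


def open_gmlas(path=None, xsd=None):
    """Open a dataset with the GMLAS driver (requires the osgeo package)."""
    from osgeo import gdal  # imported lazily so the helper above needs no GDAL

    connection, open_options = gmlas_open_args(path, xsd)
    return gdal.OpenEx(connection, gdal.OF_VECTOR, open_options=open_options)
```

For instance, open_gmlas("/path/to/the.gml") relies on the schemas referenced by the document, while open_gmlas(xsd="/path/to/the.xsd") opens the schema alone.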

Mapping of XML structure to OGR layers and fields

The driver scans the XML schemas referenced by the XML/GML to build the OGR layers and fields. It is strictly required that the schemas, directly or indirectly used, are fully valid. The content of the XML/GML file itself is marginally used, mostly to determine the SRS of geometry columns.

XML elements declared at the top level of a schema will generally be exposed as OGR layers. Their attributes and sub-elements of simple XML types (string, integer, real, ...) will be exposed as OGR fields. For sub-elements of complex type, different cases can happen. If the cardinality of the sub-element is at most one and it is not referenced by other elements, then it is "flattened" into its enclosing element. Otherwise it will be exposed as an OGR layer, with either a link to its "parent" layer if the sub-element is specific to its parent element, or through a junction table if the sub-element is shared by several parents.
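The rule for sub-elements of complex type can be summarized as a small decision function. This is only a simplified reading of the paragraph above, not the driver's actual implementation; the function name, its parameters, and the returned labels are hypothetical.

```python
def classify_complex_subelement(max_occurs, parent_count):
    """Return how a complex-type sub-element would be exposed.

    max_occurs: maximum cardinality of the sub-element (None means unbounded).
    parent_count: number of elements that reference this sub-element.
    """
    at_most_one = max_occurs is not None and max_occurs <= 1
    if at_most_one and parent_count <= 1:
        # Single occurrence, single parent: folded into the parent's fields.
        return "flattened"
    if parent_count <= 1:
        # Repeated but specific to one parent: own layer linked to the parent.
        return "child_layer"
    # Shared by several parents: own layer reached through a junction table.
    return "junction_table"
```

With these assumptions, a repeated sub-element used by a single parent yields "child_layer", and any sub-element shared by several parents yields "junction_table".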

By default the driver is robust to documents that do not strictly conform to the schemas. Unexpected content in the document will be silently ignored, as will content required by the schema but absent from the document.

Consult the GMLAS mapping examples page for more details.

By default in the configuration, swe:DataRecord and swe:DataArray elements from the Sensor Web Enablement (SWE) Common Data Model namespace receive special processing, so that they are mapped more naturally to OGR concepts. The swe:field elements will be mapped as OGR fields, and the swe:values element of a swe:DataArray will be parsed into OGR features in a dedicated layer for each swe:DataArray. Note that this convenience mapping is read-only. When using the write side of the driver, only the content of the general mapping mechanisms will be used.

Metadata layers

Four special layers "_ogr_fields_metadata", "_ogr_layers_metadata", "_ogr_layer_relationships" and "_ogr_other_metadata" add extra information to the basic information you can get from the OGR data model on OGR layers and fields.

Those layers are exposed if the EXPOSE_METADATA_LAYERS open option is set to YES (or if enabled in the configuration). They can also be individually retrieved by specifying their name in calls to GetLayerByName(), or as layer names with the ogrinfo and ogr2ogr utilities.

Consult the GMLAS metadata layers page for more details.

Configuration file

A default configuration file, gmlasconf.xml, is provided in the data directory of the GDAL installation. Its structure and content are documented in the gmlasconf.xsd schema.

This configuration file enables the user to modify the following settings:

This file can be adapted and modified versions can be provided to the driver with the CONFIG_FILE open option. None of the elements of the configuration file are required. When they are absent, the default value indicated in the schema documentation is used.

Configuration can also be provided through other open options. Note that some open options have the same names as settings present in the configuration file. When such an open option is provided, its value overrides the one from the configuration file (either the default one, or the one provided through the CONFIG_FILE open option).

Geometry support

XML schemas only indicate the geometry type but do not constrain the spatial reference system (SRS), so it is theoretically possible to have object instances of the same class having different SRS for the same geometry field. This is not practical to deal with, so when geometry fields are detected, an initial scan of the document is done to find the first geometry of each geometry field that has an explicit srsName set. This SRS will be used for the whole geometry field. If other geometries of the same field have a different SRS, they will be reprojected.

By default, only the OGR geometry built from the GML geometry is exposed in the OGR feature. It is possible to change the IncludeGeometryXML setting of the configuration file to true so as to expose an OGR string field with the XML definition of the GML geometry.
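The SRS selection described above can be illustrated with a short pure-Python sketch. It models only the selection logic on (field name, srsName) pairs in document order; the function names and the input representation are hypothetical and do not reflect the driver's internal code.

```python
def pick_field_srs(geometries):
    """Map each geometry field to the first explicit srsName found for it.

    geometries: iterable of (field_name, srs_name) pairs in document order,
    where srs_name is None when the geometry carries no srsName.
    """
    chosen = {}
    for field_name, srs_name in geometries:
        if srs_name and field_name not in chosen:
            chosen[field_name] = srs_name
    return chosen


def needs_reprojection(field_name, srs_name, chosen):
    """True if a geometry declares an SRS other than the one kept for its field."""
    return bool(srs_name) and srs_name != chosen.get(field_name)
```

Given geometries of one field declaring EPSG:4326 first and EPSG:3857 later, the field keeps EPSG:4326 and the later geometry is flagged for reprojection.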

Performance issues with large multi-layer GML files

Traditionally, to read an OGR datasource, one iterates over layers with GDALDataset::GetLayer(), and for each layer one iterates over features with OGRLayer::GetNextFeature(). While this approach still works for the GMLAS driver, it may result in very poor performance on big documents or documents using complex schemas that are translated into many OGR layers.

It is thus recommended to use GDALDataset::GetNextFeature() to iterate over features as soon as they appear in the .gml/.xml file. This may return features from non-sequential layers, when the features include nested elements.
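A minimal Python sketch of dataset-level iteration follows. It assumes the GDAL Python bindings, where Dataset.GetNextFeature() returns a (feature, layer) pair by default and (None, None) once the document is exhausted. The function name count_features_per_layer is hypothetical.

```python
from collections import Counter


def count_features_per_layer(ds):
    """Count features per layer in a single pass over the dataset.

    ds is expected to be an opened GDAL dataset, for example the result of
    gdal.OpenEx("GMLAS:my.gml", gdal.OF_VECTOR).
    """
    counts = Counter()
    ds.ResetReading()
    while True:
        # Features arrive in document order, possibly interleaving layers.
        feature, layer = ds.GetNextFeature()
        if feature is None:
            break
        counts[layer.GetName()] += 1
    return counts
```

Because features come back in document order, consecutive features may belong to different layers, so any per-layer state has to be keyed by layer name as done here.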

Open options

Creation support

The GMLAS driver can write XML documents in a schema-driven way by converting a source dataset (contrary to most other drivers with write support, which implement the CreateLayer() and CreateFeature() interfaces). The typical workflow is to use the read side of the GMLAS driver to produce a SQLite/Spatialite/PostGIS database, potentially modify the imported features, and re-export this database as a new XML document.

The driver will identify "top-level" layers in the source dataset, and in those layers will find which features are not referenced by other top-level layers. As the creation of the output XML is schema-driven, the schemas need to be available. There are two possible ways:

By default, the driver will "wrap" the features inside a WFS 2.0 wfs:FeatureCollection / wfs:member element. It is also possible to ask the driver to create instead a custom wrapping .xsd file that declares the ogr_gmlas:FeatureCollection / ogr_gmlas:featureMember XML elements.

Note that while the file resulting from the export should be well-formed XML, there is no strong guarantee that it will validate against the additional constraints expressed in the XML schema(s). This will depend on the content of the features (for example, if converting from a GML file that does not conform to the schemas, the output of the driver will generally not validate).

If the input layers have geometries stored as GML content in a _xml suffixed field, then the driver will compare the OGR geometry built from that XML content with the OGR geometry stored in the dedicated geometry field of the feature. If both match, then the GML content stored in the _xml suffixed field will be used, so as to preserve particularities of the initial GML content. Otherwise GML will be exported from the OGR geometry.

To increase export performance on very large databases, creating attribute indexes on the fields pointed by the 'layer_pkid_name' attribute in '_ogr_layers_metadata' might help.
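As an illustration of the indexing hint above, the following sketch uses Python's standard sqlite3 module on a SQLite database that contains the "_ogr_layers_metadata" table. It assumes that this table has a layer_name column next to layer_pkid_name, and that the table and column names in the database match the metadata values (they may differ if names were altered during import). The function name create_pkid_indexes and the index naming scheme are hypothetical.

```python
import sqlite3


def create_pkid_indexes(conn):
    """Create one index per layer on the column named by layer_pkid_name."""
    rows = conn.execute(
        "SELECT layer_name, layer_pkid_name FROM _ogr_layers_metadata "
        "WHERE layer_pkid_name IS NOT NULL"
    ).fetchall()
    created = []
    for layer_name, pkid_name in rows:
        index_name = "idx_{}_{}".format(layer_name, pkid_name)
        # Identifiers are double-quoted; names that themselves contain
        # double quotes are not handled by this sketch.
        conn.execute(
            'CREATE INDEX IF NOT EXISTS "{}" ON "{}" ("{}")'.format(
                index_name, layer_name, pkid_name
            )
        )
        created.append(index_name)
    conn.commit()
    return created
```

A typical call would be create_pkid_indexes(sqlite3.connect("tmp.sqlite")) before running the export.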

ogr2ogr behaviour

When using ogr2ogr / GDALVectorTranslate() to convert to XML/GML from a source database, there are restrictions to the options that can be used. Only the following options of ogr2ogr are supported:

Spatial and attribute filtering only applies to top-level layers. Sub-features selected through joins will not be affected by those filters.

Dataset creation options

The supported dataset creation options are:

Examples

Listing content of a data file:

ogrinfo -ro GMLAS:my.gml

Converting to PostGIS:

ogr2ogr -f PostgreSQL PG:'host=myserver dbname=warmerda' GMLAS:my.gml -nlt CONVERT_TO_LINEAR

Converting to Spatialite and back to GML:

ogr2ogr -f SQLite tmp.sqlite GMLAS:in.gml -dsco SPATIALITE=YES -nlt CONVERT_TO_LINEAR -oo EXPOSE_METADATA_LAYERS=YES
ogr2ogr -f GMLAS out.gml tmp.sqlite

See Also

Credits

Initial implementation has been funded by the European Union's Earth observation programme Copernicus, as part of the tasks delegated to the European Environment Agency.

Development of special processing of some Sensor Web Enablement (SWE) Common Data Model swe:DataRecord and swe:DataArray constructs has been funded by Bureau des Recherches Géologiques et Minières (BRGM).