NetCDF: Network Common Data Form - Vector

Starting with GDAL 2.1, the netCDF driver supports reading and writing vector datasets (creation from scratch and append operations). The raster side of the driver is documented separately.

NetCDF is an interface for array-oriented data access and is used for representing scientific data.

The driver handles the "point" and "profile" feature types of the CF 1.6 convention. It also supports a more custom approach for non-point geometries.

Mapping of concepts

Field types

On creation of netCDF files, the mapping between OGR field types and netCDF types is as follows:

OGR field type      netCDF type
String(1)           char
String              char (bi-dimensional), or string for NC4
Integer             int
Integer(Boolean)    byte
Integer(Int16)      short
Integer64           int64 for NC4, or double for NC3 as a fallback
Real                double
Real(Float32)       float
Date                int (with units="days since 1970-1-1")
DateTime            double (with units="seconds since 1970-1-1 0:0:0")
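As an illustration only, the write-side table can be restated as a small Python lookup. The names NC3_TYPES, NC4_OVERRIDES and netcdf_type_for are hypothetical and are not part of the GDAL API; the sketch only encodes the rows listed above.

```python
# Hypothetical helper restating the write-side mapping table.
# Keys are the OGR field types (with subtype or width) as written in the table.
NC3_TYPES = {
    "String(1)": "char",
    "String": "char (bi-dimensional)",
    "Integer": "int",
    "Integer(Boolean)": "byte",
    "Integer(Int16)": "short",
    "Integer64": "double",  # NC3 fallback, since NC3 has no 64-bit integer
    "Real": "double",
    "Real(Float32)": "float",
    "Date": "int",          # units="days since 1970-1-1"
    "DateTime": "double",   # units="seconds since 1970-1-1 0:0:0"
}

# Rows of the table whose netCDF type differs for NC4 output.
NC4_OVERRIDES = {
    "String": "string",
    "Integer64": "int64",
}


def netcdf_type_for(ogr_type, nc4=False):
    """Return the netCDF type listed in the table for an OGR field type."""
    if nc4 and ogr_type in NC4_OVERRIDES:
        return NC4_OVERRIDES[ogr_type]
    return NC3_TYPES[ogr_type]
```

For instance, netcdf_type_for("Integer64") gives "double", while netcdf_type_for("Integer64", nc4=True) gives "int64".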

The driver also writes a set of descriptive attributes for each OGR field / netCDF variable.

These attributes are written by default (unless the WRITE_GDAL_TAGS dataset creation option is set to NO). They are not required for reading, but may help to better identify field characteristics.

On reading, the mapping is as follows:

netCDF type                                           OGR field type
byte                                                  Integer
ubyte (NC4 only)                                      Integer
char (mono dimensional)                               String(1)
char (bi dimensional)                                 String
string (NC4 only)                                     String
short                                                 Integer(Int16)
ushort (NC4 only)                                     Integer
int                                                   Integer
int or double (with units="days since 1970-1-1")      Date
uint (NC4 only)                                       Integer64
int64 (NC4 only)                                      Integer64
uint64 (NC4 only)                                     Real
float                                                 Real(Float32)
double                                                Real
double (with units="seconds since 1970-1-1 0:0:0")    DateTime
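The Date and DateTime rows rely on epoch-based units. A minimal sketch of the arithmetic those units imply is shown below. The function names are hypothetical, and the use of UTC for the DateTime epoch is an assumption of this sketch rather than a statement about how the driver handles time zones.

```python
from datetime import date, datetime, timedelta, timezone

# Epochs implied by the "units" attributes in the tables.
EPOCH_DAY = date(1970, 1, 1)
# Assumption: the DateTime epoch is interpreted as UTC.
EPOCH_INSTANT = datetime(1970, 1, 1, tzinfo=timezone.utc)


def date_to_days(d):
    """Encode a date as an integer count of days since 1970-1-1."""
    return (d - EPOCH_DAY).days


def days_to_date(days):
    """Decode a count of days since 1970-1-1 back to a date."""
    return EPOCH_DAY + timedelta(days=days)


def datetime_to_seconds(dt):
    """Encode a timezone-aware datetime as seconds since 1970-1-1 0:0:0."""
    return (dt - EPOCH_INSTANT).total_seconds()


def seconds_to_datetime(seconds):
    """Decode seconds since 1970-1-1 0:0:0 back to a UTC datetime."""
    return EPOCH_INSTANT + timedelta(seconds=seconds)
```

With these helpers, date_to_days(date(2016, 3, 1)) evaluates to 16861, and decoding that value returns the same date.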

Layers

Generally a single netCDF file is viewed as a single OGR layer, provided that it contains only mono-dimensional variables, indexed by the same dimension (or bi-dimensional variables of type char). For netCDF v4 files with multiple groups, each group may be seen as a separate OGR layer.

On writing, the MULTIPLE_LAYERS dataset creation option can be used to control whether multiple layers are disabled, or whether they should go in separate files or in separate groups.

Strings

Variable-length strings are not natively supported in the netCDF v3 format. To work around that, OGR uses bi-dimensional char variables, whose first dimension is the record dimension and whose second dimension is the maximum width of the string. By default, OGR implements an "auto-grow" mode when writing, where the maximum width of the variable used to store an OGR string field is extended when needed. Note that this leads to a full rewrite of already written records: this is transparent to the user, but can slow down the creation process in non-linear ways. A similar mechanism is used for layers with geometry types other than point, to store the ISO WKT representation of the geometries.
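To make the cost of the "auto-grow" mode more concrete, the following toy model stores strings in fixed-width rows and widens every existing row when a longer value arrives. It is only a sketch of the idea: the class name, the default width of 10 and the NUL padding are assumptions made for illustration, not the driver's actual implementation.

```python
class AutoGrowCharColumn:
    """Toy fixed-width string column that widens itself on demand."""

    def __init__(self, width=10):
        # The initial width is an arbitrary value chosen for this sketch.
        self.width = width
        self.rows = []           # one fixed-width bytes object per record
        self.rewritten_rows = 0  # counts how many rows had to be rewritten

    def append(self, value):
        data = value.encode("utf-8")
        if len(data) > self.width:
            # Growing the second dimension forces a rewrite of all
            # previously written records at the new width.
            self.width = len(data)
            self.rows = [row.ljust(self.width, b"\0") for row in self.rows]
            self.rewritten_rows += len(self.rows)
        self.rows.append(data.ljust(self.width, b"\0"))

    def get(self, index):
        """Return the string stored at a record index, without padding."""
        return self.rows[index].rstrip(b"\0").decode("utf-8")
```

Each widening touches every record written so far, which is why the slowdown grows with both the number of records and the number of times the width has to be extended.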

When using a netCDF v4 output format (NC4), strings are written by default as netCDF v4 variable-length strings.

Geometry

Layers with a geometry type of Point or Point25D will cause the implicit creation of x,y(,z) variables for projected coordinate systems, or lon,lat(,z) variables for geographic coordinate systems. For other geometry types, a variable "ogc_wkt" (bi-dimensional char for NC3 output, or string for NC4 output) is created and used to store the geometry as an ISO WKT string.
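The rule above can be summarized with a short lookup. The function name geometry_variables and its parameters are hypothetical; the sketch only restates which variable names the paragraph describes.

```python
def geometry_variables(geom_type, geographic):
    """Return the names of the netCDF variables that hold the geometry.

    geom_type:  OGR geometry type name, e.g. "Point", "Point25D", "Polygon".
    geographic: True for a geographic coordinate system, False for projected.
    """
    if geom_type in ("Point", "Point25D"):
        names = ["lon", "lat"] if geographic else ["x", "y"]
        if geom_type == "Point25D":
            names.append("z")
        return names
    # Any other geometry type is stored as ISO WKT in a single variable.
    return ["ogc_wkt"]
```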

"Profile" feature type

The driver can handle the "profile" feature type, i.e. phenomena that are observed at a few positions along a vertical line at a fixed horizontal position. In that representation, some variables are indexed by the profile, and others by the observation.

More precisely, the driver supports reading and writing profiles organized according to the "Indexed ragged array representation" of profiles.

On reading, the driver will collect values of variables indexed by the profile dimension and expose them along with the variables indexed by the observation dimension, based on a variable such as "parentIndex" with an attribute "instance_dimension" pointing to the profile dimension.

On writing, the FEATURE_TYPE=PROFILE layer creation option must be set, and the driver must be told which OGR fields are indexed by the profile dimension and which by the observation dimension. The list of fields indexed by the profile can be specified with the PROFILE_VARIABLES layer creation option (other fields are assumed to be indexed by the observation dimension). Fields indexed by the profile are the horizontal geolocation (created implicitly) and other user attributes such as the location name. Care should be taken in selecting which variables are indexed by the profile dimension: two OGR features that have different values for any of those variables are considered to belong to different profiles.

In the example below, the station_name and time variables may be indexed by the profile dimension (the geometry is assumed to be also indexed by the profile dimension), since all records that have the same value for one of those variables also have the same values for the others, whereas temperature and Z should be indexed by the default dimension.

station_name    time                    geometry              temperature    Z
Paris           2016-03-01T00:00:00Z    POINT (2 49)          25             100
Vancouver       2016-04-01T12:00:00Z    POINT (-123 49.25)    5              100
Paris           2016-03-01T00:00:00Z    POINT (2 49)          3              500
Vancouver       2016-04-01T12:00:00Z    POINT (-123 49.25)    -15            500

An integer field with the name of the profile dimension (which defaults to "profile" and can be changed with the PROFILE_DIM_NAME layer creation option) is used to store the automatically computed id of profile sites (unless an integer OGR field with the same name exists).
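The grouping rule can be sketched in a few lines of Python using the four example records. This is a simplified model and not the driver's code: the function name split_profiles is hypothetical, and numbering profile ids from 0 in order of first appearance is an assumption of the sketch.

```python
# The four example records, with the geometry kept as WKT text.
RECORDS = [
    {"station_name": "Paris", "time": "2016-03-01T00:00:00Z",
     "geometry": "POINT (2 49)", "temperature": 25, "Z": 100},
    {"station_name": "Vancouver", "time": "2016-04-01T12:00:00Z",
     "geometry": "POINT (-123 49.25)", "temperature": 5, "Z": 100},
    {"station_name": "Paris", "time": "2016-03-01T00:00:00Z",
     "geometry": "POINT (2 49)", "temperature": 3, "Z": 500},
    {"station_name": "Vancouver", "time": "2016-04-01T12:00:00Z",
     "geometry": "POINT (-123 49.25)", "temperature": -15, "Z": 500},
]

# Fields indexed by the profile dimension (geometry is implicit in the driver).
PROFILE_FIELDS = ("station_name", "time", "geometry")


def split_profiles(records, profile_fields):
    """Group records into profiles and compute a parent index per record."""
    ids = {}           # tuple of profile-field values -> profile id
    profiles = []      # one dict of profile-field values per profile
    parent_index = []  # for each record, the id of the profile it belongs to
    for record in records:
        key = tuple(record[name] for name in profile_fields)
        if key not in ids:
            # Any difference in a profile-indexed field starts a new profile.
            ids[key] = len(profiles)
            profiles.append(dict(zip(profile_fields, key)))
        parent_index.append(ids[key])
    return profiles, parent_index


profiles, parent_index = split_profiles(RECORDS, PROFILE_FIELDS)
```

Here the four observations collapse into two profiles (Paris and Vancouver), and parent_index is [0, 1, 0, 1]. Adding temperature to PROFILE_FIELDS would instead produce four distinct profiles, which illustrates why the choice of profile-indexed fields matters.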

The size of the profile dimension defaults to 100 for non-NC4 output format, and is extended automatically in case of additional profiles (with similar performance issues as growing strings). For NC4 output format, the profile dimension is of unlimited size by default.
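As a hedged illustration of how the profile-related layer creation options might be passed on the command line, the helper below assembles an ogr2ogr argument list. The function name and the file names are hypothetical, and passing PROFILE_VARIABLES as a comma-separated list of field names is an assumption that should be checked against the layer creation option reference.

```python
def profile_ogr2ogr_args(src, dst, profile_fields, profile_dim_name=None):
    """Build an ogr2ogr command line that writes a "profile" netCDF layer."""
    args = [
        "ogr2ogr", "-f", "netCDF", dst, src,
        "-lco", "FEATURE_TYPE=PROFILE",
        # Assumption: field names are given as a comma-separated list.
        "-lco", "PROFILE_VARIABLES=" + ",".join(profile_fields),
    ]
    if profile_dim_name is not None:
        # Optional: rename the profile dimension (default name is "profile").
        args += ["-lco", "PROFILE_DIM_NAME=" + profile_dim_name]
    return args


# Hypothetical input and output file names.
args = profile_ogr2ogr_args("soundings.shp", "soundings.nc",
                            ["station_name", "time"])
```

The resulting list can be handed to subprocess.run on a system where ogr2ogr is installed.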

Dataset creation options

Layer creation options

XML configuration file

An XML configuration file conforming to the following schema can be used for very precise control over the output format, in particular to set all needed attributes (such as units) to conform to the NetCDF CF-1.6 convention.

It has been designed in particular, but not exclusively, to be usable in use cases involving the MapServer OGR output.

Such a file can be used to :

The scope of effect is either global, when elements are defined as direct children of the root <Configuration> node, or specific to a given layer, when they are defined as children of a <Layer> node.

The filename is specified with the CONFIG_FILE dataset creation option. Alternatively, the content of the file can be specified inline as the value of the option (it must then begin strictly with the "<Configuration" characters).
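For the inline form, the configuration text can be generated programmatically. The sketch below uses Python's standard xml.etree.ElementTree module; the helper name build_inline_config is hypothetical, and the option names and values are taken from the configuration sample in this section.

```python
import xml.etree.ElementTree as ET


def build_inline_config(dataset_options, layer_options):
    """Serialize creation options as an inline <Configuration> document."""
    root = ET.Element("Configuration")
    for name, value in dataset_options.items():
        ET.SubElement(root, "DatasetCreationOption",
                      {"name": name, "value": value})
    for name, value in layer_options.items():
        ET.SubElement(root, "LayerCreationOption",
                      {"name": name, "value": value})
    # encoding="unicode" returns a str without an XML declaration, so the
    # text begins directly with "<Configuration".
    return ET.tostring(root, encoding="unicode")


inline_config = build_inline_config(
    {"FORMAT": "NC4", "MULTIPLE_LAYERS": "SEPARATE_GROUPS"},
    {"RECORD_DIM_NAME": "observation"},
)
# Value to pass as a dataset creation option.
config_option = "CONFIG_FILE=" + inline_config
```

Serializing without an XML declaration matters here, because a leading "<?xml ...?>" line would break the requirement that the value begin with "<Configuration".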

The following example shows all possibilities and precedence rules:

<Configuration>
    <DatasetCreationOption name="FORMAT" value="NC4"/>
    <DatasetCreationOption name="MULTIPLE_LAYERS" value="SEPARATE_GROUPS"/>
    <LayerCreationOption name="RECORD_DIM_NAME" value="observation"/>
    <!-- applies to all layers -->
    <Attribute name="copyright" value="Copyright(C) 2016 Example"/>
    <Field name="weight">  <!-- edit user field/variable -->
        <Attribute name="units" value="kg"/>
        <Attribute name="maximum" value="10" type="double"/>
    </Field>
    <Field netcdf_name="z"> <!-- edit predefined variable -->
        <Attribute name="long_name" value="Elevation"/>
    </Field>
    <!-- start of layer specific definitions -->
    <Layer name="1st_layer" netcdf_name="firstlayer"> <!-- OGR layer "1st_layer" is renamed as "firstlayer" netCDF group -->
        <LayerCreationOption name="FEATURE_TYPE" value="POINT"/>
        <Attribute name="copyright" value="Public domain"/> <!-- override global one -->
        <Attribute name="description" value="This is my first layer"/> <!-- additional attribute -->
        <Field name="1st_field" netcdf_name="firstfield"/> <!-- rename OGR field "1st_field" as the "firstfield" netCDF variable -->
        <Field name="weight"/> <!-- cancel above global customization -->
        <Field netcdf_name="lat"> <!-- edit predefined variable -->
            <Attribute name="long_name" value=""/> <!-- remove predefined attribute -->
        </Field>
    </Layer>
    <Layer name="sounding">
        <LayerCreationOption name="FEATURE_TYPE" value="PROFILE"/>
        <Field name="station_name" main_dim="profile"/> <!-- the corresponding netCDF variable will be indexed against the profile dimension, instead of the observation dimension -->
        <Field name="time" main_dim="profile"/> <!-- the corresponding netCDF variable will be indexed against the profile dimension, instead of the observation dimension -->
    </Layer>
</Configuration>

The effect on the output can be checked by running the ncdump utility.
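The precedence between global and layer-specific attributes can also be reasoned about offline. The following sketch parses a trimmed-down version of the configuration sample and computes the attributes that would apply to a given layer, with layer-level values overriding global ones. It is a simplified model of the rule stated in this section, not the driver's parser, and the function name effective_attributes is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed-down version of the configuration sample.
CONFIG = """<Configuration>
    <Attribute name="copyright" value="Copyright(C) 2016 Example"/>
    <Layer name="1st_layer" netcdf_name="firstlayer">
        <Attribute name="copyright" value="Public domain"/>
        <Attribute name="description" value="This is my first layer"/>
    </Layer>
    <Layer name="sounding"/>
</Configuration>"""


def effective_attributes(config_xml, layer_name):
    """Return the attributes applying to a layer, layer values winning."""
    root = ET.fromstring(config_xml)
    # Direct children of <Configuration> apply globally.
    attrs = {a.get("name"): a.get("value") for a in root.findall("Attribute")}
    for layer in root.findall("Layer"):
        if layer.get("name") == layer_name:
            # Children of the matching <Layer> override the global values.
            for a in layer.findall("Attribute"):
                attrs[a.get("name")] = a.get("value")
    return attrs
```

For "1st_layer" the copyright resolves to "Public domain", while "sounding" keeps the global copyright value.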

See Also:

Credits

Development of the read/write vector capabilities for netCDF was funded by the Meteorological Service of Canada and the World Ozone and Ultraviolet Radiation Data Centre.