User Manual
Quick start
PyEPR provides Python bindings for the ENVISAT Product Reader C API (EPR API) for reading satellite data from the ENVISAT ESA (European Space Agency) mission.
PyEPR, as well as the EPR API for C, supports ENVISAT MERIS, AATSR Level 1B and Level 2 and also ASAR data products. It provides access to the data either on a geophysical (decoded, ready-to-use pixel samples) or on a raw data layer. The raw data access makes it possible to read any data field contained in a product file.
Full access to the Python EPR API is provided by the epr module, which has to be imported by the client program, e.g. as follows:
import epr
The following snippet opens an ASAR product and dumps the “Main Processing Parameters” record to the standard output:
import epr
product = epr.Product(
'ASA_IMP_1PNUPA20060202_062233_000000152044_00435_20529_3110.N1')
dataset = product.get_dataset('MAIN_PROCESSING_PARAMS_ADS')
record = dataset.read_record(0)
print(record)
product.close()
Since version 0.9 PyEPR also includes update features that are not available in the EPR C API.
The user can open a product in update mode (‘rb+’) and call the epr.Field.set_elem() and epr.Field.set_elems() methods of the epr.Field class to update its elements and write changes to disk.
See also
Update support and Update Field elements tutorial for details.
Requirements
In order to use PyEPR, the following software must be correctly installed and configured:
numpy >= 1.7.0
EPR API >= 2.2 (optional, since PyEPR 0.7 the source tar-ball comes with a copy of the EPR C API sources)
a reasonably updated C compiler (build only)
pytest (optional and only needed for testing)
Download
Official source tar-balls can be downloaded from PyPI.
The source code of the development versions is available on the GitHub project page.
To clone the git repository the following command can be used:
$ git clone https://github.com/avalentino/pyepr.git
To also get the EPR C API source code, the following commands are necessary:
$ cd pyepr
$ git submodule init
Submodule 'extern/epr-api' (https://github.com/avalentino/epr-api.git) registered for path 'extern/epr-api'
$ git submodule update
Cloning into '/Users/antonio valentino/projects/av/pyepr/extern/epr-api'...
Submodule path 'extern/epr-api': checked out '93c1f1efce26c64d508fe882d5c72a898a068f29'
Installation
The easiest way to install PyEPR is to use a tool like pip:
$ python3 -m pip install pyepr
For a user-specific installation please use:
$ python3 -m pip install --user pyepr
To install PyEPR in a non-standard path:
$ python3 -m pip install --install-option="--prefix=<TARGET_PATH>" pyepr
Just make sure that <TARGET_PATH>/lib/pythonX.Y/site-packages is in the PYTHONPATH.
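As a rough aid, the following sketch builds the site-packages directory for a given prefix and checks whether it appears on the module search path. The helper names and the /opt/pyepr prefix are purely illustrative, and the lib/pythonX.Y layout is the one assumed in the text above.

```python
import sys
from pathlib import PurePosixPath


def site_packages_for(prefix, version_info=sys.version_info):
    """Return <prefix>/lib/pythonX.Y/site-packages as a string."""
    python_dir = "python{}.{}".format(version_info[0], version_info[1])
    return str(PurePosixPath(prefix) / "lib" / python_dir / "site-packages")


def is_on_search_path(directory, search_path=None):
    """Return True if `directory` is one of the search path entries."""
    entries = sys.path if search_path is None else search_path
    return directory in entries


if __name__ == "__main__":
    target = site_packages_for("/opt/pyepr")  # hypothetical TARGET_PATH
    print(target, "on sys.path:", is_on_search_path(target))
```

Entries listed in PYTHONPATH are added to sys.path when the interpreter starts, so checking sys.path covers that case as well.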
PyEPR can be installed from sources using the following command:
$ python3 -m pip install .
The setup.py script by default checks for the availability of the EPR C API source code in the <package-root>/epr-api-src directory and tries to build PyEPR in standalone mode, i.e. without linking an external dynamic library of EPR-API.
If no EPR C API sources are found then the setup.py script automatically tries to link the EPR-API dynamic library. This can happen, for example, if the user is using a copy of the PyEPR sources cloned from a git repository. In this case it is assumed that the EPR API C library is properly installed in the system (see the Requirements section).
It is possible to control which EPR API C sources to use by means of the --epr-api-src option of the setup.py script:
$ python3 setup.py install --epr-api-src=../epr-api/src
It is also possible to switch off the standalone mode and force linking against the system EPR API C library:
$ python3 setup.py install --epr-api-src=""
Please note that if the setup.py script is invoked directly, then the user must make sure that the setup requirements are properly installed:
$ python3 -m pip install cython numpy
Testing
The PyEPR package comes with a complete test suite. The test suite can be run from the package root directory using pytest:
$ python3 -m pytest tests
or by running the test_all.py script directly:
$ python3 test_all.py
In the second case please make sure that the epr extension module is in the Python search path (see also PYTHONPATH).
The test script automatically downloads and decompresses the ENVISAT sample product necessary for testing, MER_LRC_2PTGMV20000620_104318_00000104X000_00000_00000_0001.N1, if it is not already available in the tests directory.
Note
Please note that, unless the user already has a copy of the specified sample product correctly installed, an internet connection is necessary the first time the test suite is run.
After the first run the sample product remains in the tests directory, so internet access is no longer necessary.
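The "download only if missing" behaviour described above can be sketched as follows. This is not the code used by the PyEPR test script: the function name is hypothetical, and the download step is left to a caller-supplied callable because the actual retrieval and decompression logic is not shown in this manual.

```python
import os


def ensure_sample_product(directory, filename, download):
    """Return the path of the sample product, fetching it only if missing.

    `download` is a caller-supplied callable that receives the destination
    path; it stands in for whatever retrieves and decompresses the file.
    """
    path = os.path.join(directory, filename)
    if not os.path.exists(path):
        download(path)  # network access is needed only on this branch
    return path
```

Calling the function a second time with the same arguments returns immediately without invoking the callable, which is why the network is only needed on the first run.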
Python vs C API
The Python EPR API is fully object oriented. The main structures of the C API have been implemented as objects, while C functions have been logically grouped and mapped onto object methods.
The entire process of defining an object oriented API for Python has been quite easy and straightforward thanks to the good design of the C API.
Of course there are also some differences that are illustrated in the following sections.
Memory management
Since Python is a very high-level language, users never have to worry about memory allocation and de-allocation. They simply instantiate objects:
product = epr.Product('filename.N1')
and use them freely.
Objects are automatically destroyed when there are no more references to them and memory is de-allocated automatically.
Even better, each object holds a reference to the other objects it depends on, so the user never has to worry about the validity of identifiers or about the order in which structures have to be freed.
For example: the C EPR_DatasetId structure has a field (product_id) that points to the product descriptor EPR_ProductId to which it belongs.
The reference to the parent product is used, for example, when one wants to read a record using the epr_read_record function:
EPR_SRecord* epr_read_record(EPR_SDatasetId* dataset_id, ...);
The function takes an EPR_SDatasetId as a parameter and assumes all fields (including dataset_id->product_id) are valid. It is the responsibility of the programmer to keep all structures valid and free them at the right moment and in the correct order.
This is the standard way to go in C but not in Python.
In Python everything is far simpler, and the user can get a dataset object instance:
dataset = product.get_dataset('MAIN_PROCESSING_PARAMS_ADS')
and then forget about the product instance it depends on. Even if the product variable goes out of scope and is no longer directly accessible in the program, the dataset object remains valid since it holds an internal reference to the product instance it depends on.
When the dataset object is destroyed, the parent epr.Product object is automatically destroyed as well (assuming there are no other references to it).
The entire machinery is completely automatic and transparent to the user.
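The lifetime rule described above can be illustrated with plain Python objects. The classes below are minimal stand-ins, not the real epr.Product and epr.Dataset; they only assume that the child object stores a strong reference to its parent, and a weak reference is used to observe when the parent is actually released.

```python
import gc
import weakref


class Product:
    """Stand-in for a product handle (not the real epr.Product)."""

    def __init__(self, filename):
        self.filename = filename

    def get_dataset(self, name):
        return Dataset(self, name)


class Dataset:
    """Stand-in for a dataset that keeps its parent product alive."""

    def __init__(self, product, name):
        self._product = product  # strong reference to the parent
        self.name = name


product = Product("filename.N1")
product_ref = weakref.ref(product)  # observes the product without owning it
dataset = product.get_dataset("MAIN_PROCESSING_PARAMS_ADS")

del product  # the name is gone, but the dataset still holds the object
gc.collect()
still_alive_with_dataset = product_ref() is not None

del dataset  # the last reference to the parent disappears with the dataset
gc.collect()
alive_after_dataset_gone = product_ref() is not None

print(still_alive_with_dataset, alive_after_dataset_gone)
```

The explicit gc.collect() calls are only there to make the demonstration independent of when a particular interpreter reclaims unreferenced objects.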
Note
Of course, when a product object is explicitly closed using epr.Product.close(), any I/O operation on it and on other objects (bands, datasets, etc.) associated with it is no longer possible.
Arrays
PyEPR uses numpy to efficiently manage the potentially large amount of data contained in ENVISAT products.
epr.Field.get_elems() returns a 1D array containing the elements of the field
the Raster.data property is a 2D array that exposes the data contained in the epr.Raster object in the form of a numpy.ndarray
Note
epr.Raster.data directly exposes the epr.Raster contents, i.e. it shares the same memory buffer with the epr.Raster object:
>>> raster.get_pixel(i, j)
5
>>> raster.data[i, j]
5
>>> raster.data[i, j] = 3
>>> raster.get_pixel(i, j)
3
epr.Band.read_as_array() is an additional method provided by the Python EPR API (there is no corresponding function in the C API). It is mainly a convenience method that allows users to access band data without creating an intermediate epr.Raster object. It reads a slice of data from the epr.Band and returns it as a 2D numpy.ndarray.
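The idea of a data attribute that shares its memory buffer with the owning object can be sketched with the standard library alone. TinyRaster below is a hypothetical stand-in, not epr.Raster, and it uses a flat memoryview rather than a 2D numpy array; the point is only that writing through the exposed view changes what the object itself returns.

```python
class TinyRaster:
    """Stand-in raster (not epr.Raster): pixels live in one flat buffer."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self._buffer = bytearray(width * height)

    @property
    def data(self):
        # A memoryview exposes the buffer itself; no pixels are copied.
        return memoryview(self._buffer)

    def get_pixel(self, row, col):
        return self._buffer[row * self.width + col]


raster = TinyRaster(width=4, height=3)
view = raster.data
view[1 * raster.width + 2] = 7  # write through the exposed view
pixel = raster.get_pixel(1, 2)  # the raster sees the new value
print(pixel)
```

A practical consequence of this kind of sharing is that in-place edits need no copy back into the owning object, while callers who want an independent snapshot have to copy explicitly.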
Enumerators
Python does not have enumerations at the language level (at least this is true for Python < 3.4). C enumerators are simply mapped onto module constants that have the same name as the C enumerator but are spelled in all capital letters.
For example:
C | Python |
---|---|
e_tid_double | E_TID_DOUBLE |
e_smod_1OF1 | E_SMOD_1OF1 |
e_smid_log | E_SMID_LOG |
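The naming rule shown in the table amounts to upper-casing the C enumerator name. A minimal sketch, with a hypothetical helper name:

```python
def python_constant_name(c_enumerator):
    """Map a C enumerator name to the all-capitals module constant name."""
    return c_enumerator.upper()


names = [python_constant_name(name)
         for name in ("e_tid_double", "e_smod_1OF1", "e_smid_log")]
print(names)
```

With PyEPR installed, a name obtained this way could then be looked up on the module, e.g. getattr(epr, 'E_TID_DOUBLE').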
Error handling and logging
Currently the error handling and logging functions of the EPR C API are not exposed to Python.
Internal library logging is completely silenced and errors are converted to Python exceptions.
Where appropriate, standard Python exception types are used; in other cases custom exception types (e.g. epr.EPRError, epr.EPRValueError) are used.
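The mix of standard and custom exception types can be handled as in the sketch below. The two exception classes are local stand-ins that merely reuse the names mentioned above; the assumption that the value error derives from both the library base class and ValueError, and the toy read_record function, are illustrative only.

```python
class EPRError(Exception):
    """Stand-in for a library-specific base exception."""


class EPRValueError(EPRError, ValueError):
    """Stand-in for a library error that is also a ValueError (assumed)."""


def read_record(index, num_records=1):
    """Toy reader that reports problems by raising exceptions."""
    if not isinstance(index, int):
        raise TypeError("index must be an integer")  # standard type
    if not 0 <= index < num_records:
        raise EPRValueError("invalid record index: %d" % index)
    return {"index": index}


def describe(index):
    """Return a short message instead of propagating the exception."""
    try:
        read_record(index)
    except EPRError as exc:
        return "library error: %s" % exc
    except TypeError as exc:
        return "usage error: %s" % exc
    return "ok"


print(describe(0), describe(5), describe("a"), sep="\n")
```

Catching the library base class first lets client code treat all library-specific failures uniformly while still handling ordinary Python errors separately.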
Library initialization
Unlike in the C API, library initialization is not needed: it is performed internally the first time the epr module is imported in Python.
High level API
PyEPR provides some utility methods that have no counterpart in the C API.
Example:
for dataset in product.datasets():
    for record in dataset.records():
        print(record)
        print()
Another example:
if 'proc_data' in product.get_band_names():
    band = product.get_band('proc_data')
    print(band)
Special methods
The Python EPR API also implements some special methods in order to make EPR programming even handier and, in short, “pythonic”.
The __repr__ methods have been overridden to provide a little more information than the standard implementation. In some cases the __str__ method has been overridden to output a verbose string representation of the objects and their contents.
If the EPR object has a print_ method (e.g. epr.Record.print_() and epr.Field.print_()) then the string representation of the object will have the same format used by the print_ method.
So writing:
fd.write(str(record))
gives the same result as:
record.print_(fd)
Of course the epr.Record.print_() method is more efficient for writing to file.
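The equivalence between str(record) and record.print_(fd) can be illustrated with a stand-in class. ToyRecord is not epr.Record; its field names and output format are made up, and building __str__ on top of print_ is just one way to keep the two representations identical.

```python
import io
import sys


class ToyRecord:
    """Stand-in for a record (not epr.Record) with a print_ method."""

    def __init__(self, fields):
        self._fields = dict(fields)

    def print_(self, ostream=None):
        # Stream the fields one at a time instead of building a big string.
        ostream = sys.stdout if ostream is None else ostream
        for name, value in self._fields.items():
            ostream.write("%s = %s\n" % (name, value))

    def __str__(self):
        # Reuse print_ so that both representations always agree.
        buffer = io.StringIO()
        self.print_(buffer)
        return buffer.getvalue()


record = ToyRecord({"field_a": 1, "field_b": 2})

via_str = io.StringIO()
via_str.write(str(record))  # builds the whole string first, then writes it

via_print = io.StringIO()
record.print_(via_print)  # writes piece by piece to the stream
```

In this sketch the str() route materialises the full text in memory before writing, which is the kind of overhead a direct print_ call avoids.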
Also, the epr.Dataset and epr.Record classes implement the __iter__ special method for iterating over records and fields respectively.
So it is possible to write code like the following:
for record in dataset:
    for index, field in enumerate(record):
        print(index, field)
The epr.DSD and epr.Field classes implement the __eq__ and __ne__ methods for object comparison:
if field1 == field2:
    print('field1 and field2 are equal')
    print(field1)
else:
    print('field1:', field1)
    print('field2:', field2)
epr.Field objects also implement the __len__ special method, which returns the number of elements in the field:
if field.get_type() != epr.E_TID_STRING:
    assert field.get_num_elems() == len(field)
else:
    assert len(field) == len(field.get_elem())
Note
Unlike the epr.Field.get_num_elems() method, len(field) returns the number of elements only if the field type is not epr.E_TID_STRING.
If the field contains a string then the string length is returned.
Finally, the epr.Product class acts as a context manager (i.e. it implements the __enter__ and __exit__ methods).
This allows the user to write code like the following:
with epr.open('ASA_IMS_ ... _4650.N1') as product:
    print(product)
which ensures that the product is closed as soon as the program exits the with block.
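How the context manager protocol guarantees the close can be seen with a small stand-in class. ToyProduct is not epr.Product; it only assumes that __exit__ calls close() and does not suppress exceptions raised inside the block.

```python
class ToyProduct:
    """Stand-in for a product handle (not epr.Product)."""

    def __init__(self, filename):
        self.filename = filename
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # do not swallow exceptions raised in the block


with ToyProduct("example.N1") as product:
    open_inside_block = not product.closed
closed_after_block = product.closed

try:
    with ToyProduct("example.N1") as failing:
        raise RuntimeError("processing failed")
except RuntimeError:
    closed_after_error = failing.closed  # closed even though the block failed
```

The second block shows the main benefit over calling close() by hand: the handle is released even when processing raises.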
Update support
It is not possible to create new ENVISAT products from scratch with the EPR API. Indeed EPR means “ENVISAT Product Reader”.
Anyway, since version 0.9, PyEPR also includes basic update features.
This means that, while it is still not possible to create new Products, the user can update existing ones by changing the contents of any Field in any record, with the only exception of MPH and SPH Fields.
The user can open a product in update mode (‘rb+’):
product = epr.open('ASA_IMS_ ... _4650.N1', 'rb+')
and update the epr.Field element at a specific index:
field.set_elem(new_value, index)
or update all elements of the epr.Field in one shot:
field.set_elems(new_values)
Note
Unfortunately there are some limitations to the update support.
Many of the internal structures of the EPR C API are loaded when the Product is opened and are not automatically updated when the epr.Field.set_elem() and epr.Field.set_elems() methods are called.
In particular, the contents of epr.Band objects may depend on several epr.Field values, e.g. those in the Scaling_Factor_GADS epr.Dataset.
For this reason the user may need to close and re-open the epr.Product in order to have all changes effectively applied.
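Putting the pieces together, an update followed by a close and re-open might look like the sketch below. The function name and its parameters are hypothetical; it relies only on calls used elsewhere in this manual plus epr.Record.get_field() and epr.Field.get_elem(), and the epr_module argument exists solely so the logic can be exercised without a real product file.

```python
def update_field_elem(filename, dataset_name, field_name, new_value,
                      record_index=0, elem_index=0, epr_module=None):
    """Change one field element on disk, then re-open and read it back."""
    if epr_module is None:
        import epr as epr_module  # requires PyEPR >= 0.9

    # Open in update mode, modify the element and close the product.
    product = epr_module.open(filename, 'rb+')
    try:
        record = product.get_dataset(dataset_name).read_record(record_index)
        record.get_field(field_name).set_elem(new_value, elem_index)
    finally:
        product.close()

    # Re-open so that structures loaded at open time see the new value.
    product = epr_module.open(filename)
    try:
        record = product.get_dataset(dataset_name).read_record(record_index)
        return record.get_field(field_name).get_elem(elem_index)
    finally:
        product.close()
```

Since updates modify the file in place, it is prudent to run code like this on a copy of the product.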