hdf5storage¶

This is the hdf5storage package, a Python package to read and write Python data types, beyond just Numpy types, to and from HDF5 (Hierarchical Data Format) files.

Version 0.1.19

`write`(data[, path, filename, ...])	Writes one piece of data into an HDF5 file (high level).
`writes`(mdict[, filename, truncate_existing, ...])	Writes data into an HDF5 file (high level).
`read`([path, filename, options])	Reads one piece of data from an HDF5 file (high level).
`reads`(paths[, filename, options])	Reads data from an HDF5 file (high level).
`savemat`(file_name, mdict[, appendmat, ...])	Save a dictionary of python types to a MATLAB MAT file.
`loadmat`(file_name[, mdict, appendmat, ...])	Loads data from a MATLAB MAT file.
`Options`([store_python_metadata, ...])	Set of options governing how data is read/written to/from disk.
`MarshallerCollection`([marshallers])	Represents, maintains, and retrieves a set of marshallers.

write¶

hdf5storage.write(data, path='/', filename='data.h5', truncate_existing=False, truncate_invalid_matlab=False, options=None, **keywords)[source]¶

Writes one piece of data into an HDF5 file (high level).

A wrapper around writes to write a single piece of data, data, to a single location, path.

High level function to store a Python type (data) to a specified path (path) in an HDF5 file. The path is specified as a POSIX style path where the directory name is the Group to put it in and the basename is the name to write it to.
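The Group and name split described above matches what Python's standard posixpath module produces. The sketch below only illustrates the convention; the helper name split_target is hypothetical, and whether the package uses posixpath internally is an assumption.

```python
import posixpath


def split_target(path):
    """Split a POSIX style target path into (group, name)."""
    group, name = posixpath.split(path)
    return group, name


# The directory name is the Group, the basename is the stored name.
print(split_target('/experiment/run1/temperature'))
# An item directly under the root lives in the Group '/'.
print(split_target('/temperature'))
```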

There are various options that can be used to influence how the data is written. They can be passed as an already constructed Options into options or as additional keywords that will be used to make one by options = Options(**keywords).

Two very important options are store_python_metadata and matlab_compatible, both bool. The first writes enough metadata (HDF5 Attributes) that data can be read back accurately; without it, the data (or its contents, if it is a container type) can come back as a different type, transposed in the case of numpy arrays, etc. The second writes the metadata MATLAB needs, converts string, bool, and complex types appropriately, and transposes numpy arrays so that MATLAB can import the data correctly (the HDF5 header is also set so that MATLAB will recognize the file).

Parameters:

data: any: The data to write.
path: str, optional: The path to write data to. Must be a POSIX style path where the directory name is the Group to put it in and the basename is the name to write it to.
filename: str, optional: The name of the HDF5 file to write data to.
truncate_existing: bool, optional: Whether to truncate the file if it already exists before writing to it.
truncate_invalid_matlab: bool, optional: Whether to truncate a file if MATLAB compatibility is being done and the file doesn’t have the proper header (userblock in HDF5 terms) set up for MATLAB metadata to be placed.
options: Options, optional: The options to use when writing. Mutually exclusive with any additional keyword arguments given (set to None or don’t provide it to use them).
**keywords: If options was not provided or was None, these are used as arguments to make an Options.

Raises:

NotImplementedError: If writing data is not supported.
TypeNotMatlabCompatibleError: If writing a type not compatible with MATLAB and options.action_for_matlab_incompatible is set to 'error'.

See also

writes: Writes more than one piece of data at once
reads
read
Options
lowlevel.write_data: Low level version

writes¶

hdf5storage.writes(mdict, filename='data.h5', truncate_existing=False, truncate_invalid_matlab=False, options=None, **keywords)[source]¶

Writes data into an HDF5 file (high level).

High level function to store one or more Python types (data) to specified paths in an HDF5 file. The paths are specified as POSIX style paths where the directory name is the Group to put it in and the basename is the name to write it to.

There are various options that can be used to influence how the data is written. They can be passed as an already constructed Options into options or as additional keywords that will be used to make one by options = Options(**keywords).

Two very important options are store_python_metadata and matlab_compatible, both bool. The first writes enough metadata (HDF5 Attributes) that data can be read back accurately; without it, the data (or its contents, if it is a container type) can come back as a different type, transposed in the case of numpy arrays, etc. The second writes the metadata MATLAB needs, converts string, bool, and complex types appropriately, and transposes numpy arrays so that MATLAB can import the data correctly (the HDF5 header is also set so that MATLAB will recognize the file).

Parameters:

mdict: dict, dict like: The dict or other dictionary type object of paths and data to write to the file. The keys are the paths, which must be POSIX style paths where the directory name is the Group to put the data in and the basename is the name to write it to. The values are the data to write.
filename: str, optional: The name of the HDF5 file to write data to.
truncate_existing: bool, optional: Whether to truncate the file if it already exists before writing to it.
truncate_invalid_matlab: bool, optional: Whether to truncate a file if MATLAB compatibility is being done and the file doesn’t have the proper header (userblock in HDF5 terms) set up for MATLAB metadata to be placed.
options: Options, optional: The options to use when writing. Mutually exclusive with any additional keyword arguments given (set to None or don’t provide it to use them).
**keywords: If options was not provided or was None, these are used as arguments to make an Options.

Raises:

NotImplementedError: If writing data is not supported.
TypeNotMatlabCompatibleError: If writing a type not compatible with MATLAB and options.action_for_matlab_incompatible is set to 'error'.

See also

write: Writes just a single piece of data
reads
read
Options
lowlevel.write_data: Low level version

read¶

hdf5storage.read(path='/', filename='data.h5', options=None, **keywords)[source]¶

Reads one piece of data from an HDF5 file (high level).

A wrapper around reads to read a single piece of data at the single location path.

High level function to read data from an HDF5 file located at path into Python types. The path is specified as a POSIX style path where the data to read is located.

There are various options that can be used to influence how the data is read. They can be passed as an already constructed Options into options or as additional keywords that will be used to make one by options = Options(**keywords).

Parameters:

path: str, optional: The path to read data from. Must be a POSIX style path where the directory name is the Group the data is in and the basename is the name it is stored under.
filename: str, optional: The name of the HDF5 file to read data from.
options: Options, optional: The options to use when reading. Mutually exclusive with any additional keyword arguments given (set to None or don’t provide it to use them).
**keywords: If options was not provided or was None, these are used as arguments to make an Options.

Returns:

data: The piece of data at path.

Raises:

CantReadError: If reading the data can’t be done.

See also

reads: Reads more than one piece of data at once
writes
write
Options
lowlevel.read_data: Low level version.

reads¶

hdf5storage.reads(paths, filename='data.h5', options=None, **keywords)[source]¶

Reads data from an HDF5 file (high level).

High level function to read one or more pieces of data from an HDF5 file located at the paths specified in paths into Python types. Each path is specified as a POSIX style path where the data to read is located.

There are various options that can be used to influence how the data is read. They can be passed as an already constructed Options into options or as additional keywords that will be used to make one by options = Options(**keywords).

Parameters:

paths: iterable of str: An iterable of paths to read data from. Each must be a POSIX style path where the directory name is the Group the data is in and the basename is the name it is stored under.
filename: str, optional: The name of the HDF5 file to read data from.
options: Options, optional: The options to use when reading. Mutually exclusive with any additional keyword arguments given (set to None or don’t provide it to use them).
**keywords: If options was not provided or was None, these are used as arguments to make an Options.

Returns:

datas: iterable: An iterable holding the piece of data for each path in paths, in the same order.

Raises:

CantReadError: If reading the data can’t be done.

See also

read: Reads just a single piece of data
writes
write
Options
lowlevel.read_data: Low level version.

savemat¶

hdf5storage.savemat(file_name, mdict, appendmat=True, format='7.3', oned_as='row', store_python_metadata=True, action_for_matlab_incompatible='error', marshaller_collection=None, truncate_existing=False, truncate_invalid_matlab=False, **keywords)[source]¶

Save a dictionary of python types to a MATLAB MAT file.

Saves the data provided in the dictionary mdict to a MATLAB MAT file. format determines which kind/version of file to use. The ‘7.3’ version, which is HDF5 based, is handled by this package and all types that this package can write are supported. Versions 4 and 5 are not HDF5 based, so everything is dispatched to the SciPy package’s scipy.io.savemat function, which this function is modelled after (arguments not specific to this package have the same names, etc.).

Parameters:

file_name: str or file-like object: Name of the MAT file to store in. If appendmat is True, the ‘.mat’ extension is added automatically when it is not already present. An open file-like object can be passed if the writing is being dispatched to SciPy (format < 7.3).
mdict: dict: The dictionary of variables and their contents to store in the file.
appendmat: bool, optional: Whether to append the ‘.mat’ extension to file_name if it doesn’t already end in it.
format: {‘4’, ‘5’, ‘7.3’}, optional: The MATLAB mat file format to use. The ‘7.3’ format is handled by this package while the ‘4’ and ‘5’ formats are dispatched to SciPy.
oned_as: {‘row’, ‘column’}, optional: Whether 1D arrays should be turned into row or column vectors.
store_python_metadata: bool, optional: Whether or not to store Python type information. Doing so allows most types to be read back perfectly. Only applicable if not dispatching to SciPy (format >= 7.3).
action_for_matlab_incompatible: str, optional: The action to perform when writing data that is not MATLAB compatible. The actions are to write the data anyway (‘ignore’), to not write the incompatible data (‘discard’), or to throw a TypeNotMatlabCompatibleError exception (‘error’).
marshaller_collection: MarshallerCollection, optional: Collection of marshallers to use for writing to disk. Only applicable if not dispatching to SciPy (format >= 7.3).
truncate_existing: bool, optional: Whether to truncate the file if it already exists before writing to it.
truncate_invalid_matlab: bool, optional: Whether to truncate a file if the file doesn’t have the proper header (userblock in HDF5 terms) set up for MATLAB metadata to be placed.
**keywords: Additional keyword arguments to pass on to scipy.io.savemat if dispatching to SciPy (format < 7.3).

Raises:

ImportError: If format < 7.3 and the scipy module can’t be found.
NotImplementedError: If writing a variable in mdict is not supported.
TypeNotMatlabCompatibleError: If writing a type not compatible with MATLAB and action_for_matlab_incompatible is set to 'error'.

See also

loadmat: Equivalent function to do reading.
scipy.io.savemat: SciPy function this one models after and dispatches to.
Options
writes: Function used to do the actual writing.

Notes

Writing the same data and then reading it back from disk using the HDF5 based version 7.3 format (the functions in this package) or the older format (SciPy functions) can lead to very different results. Each package supports a different set of data types and converts them to and from the same MATLAB types differently.

loadmat¶

hdf5storage.loadmat(file_name, mdict=None, appendmat=True, variable_names=None, marshaller_collection=None, **keywords)[source]¶

Loads data from a MATLAB MAT file.

Reads data from the specified variables (or all) in a MATLAB MAT file. There are many different formats of MAT files. This package can only handle the HDF5 based ones (the version 7.3 and later). As SciPy’s scipy.io.loadmat function can handle the earlier formats, if this function cannot read the file, it will dispatch it onto the scipy function with all the calling arguments it uses passed on. This function is modelled after the SciPy one (arguments not specific to this package have the same names, etc.).

Parameters:

file_name: str: Name of the MAT file to read from. If appendmat is True, the ‘.mat’ extension is added automatically when it is not already present.
mdict: dict, optional: The dictionary to insert read variables into.
appendmat: bool, optional: Whether to append the ‘.mat’ extension to file_name if it doesn’t already end in it.
variable_names: None or sequence, optional: The variable names to read from the file. None selects all.
marshaller_collection: MarshallerCollection, optional: Collection of marshallers to use for reading from disk. Only applicable if not dispatching to SciPy (version 7.3 and newer files).
**keywords: Additional keyword arguments to pass on to scipy.io.loadmat when dispatching to SciPy (the file is not in a version 7.3 or later format).

Returns:

dict: Dictionary of all the variables read from the MAT file (name as the key, and content as the value). If a variable was missing from the file, it will not be present here.

Raises:

ImportError: If it is not a version 7.3 .mat file and the scipy module can’t be found when dispatching to SciPy.
CantReadError: If reading the data can’t be done.

See also

savemat: Equivalent function to do writing.
scipy.io.loadmat: SciPy function this one models after and dispatches to.
Options
reads: Function used to do the actual reading.

Notes

Writing the same data and then reading it back from disk using the HDF5 based version 7.3 format (the functions in this package) or the older format (SciPy functions) can lead to very different results. Each package supports a different set of data types and converts them to and from the same MATLAB types differently.

Options¶

class hdf5storage.Options(store_python_metadata=True, matlab_compatible=True, action_for_matlab_incompatible='error', delete_unused_variables=False, structured_numpy_ndarray_as_struct=False, make_atleast_2d=False, convert_numpy_bytes_to_utf16=False, convert_numpy_str_to_utf16=False, convert_bools_to_uint8=False, reverse_dimension_order=False, store_shape_for_empty=False, complex_names=('r', 'i'), group_for_references='/#refs#', oned_as='row', compress=True, compress_size_threshold=16384, compression_algorithm='gzip', gzip_compression_level=7, shuffle_filter=True, compressed_fletcher32_filter=True, uncompressed_fletcher32_filter=False, marshaller_collection=None, **keywords)[source]¶

Bases: object

Set of options governing how data is read/written to/from disk.

There are many ways that data can be transformed as it is read or written from a file, and many attributes can be used to describe the data depending on its format. The option with the most effect is the matlab_compatible option. It makes sure that the file is compatible with MATLAB’s HDF5 based version 7.3 mat file format. It overrides several options to the values in the following table.

attribute	value
delete_unused_variables	`True`
structured_numpy_ndarray_as_struct	`True`
make_atleast_2d	`True`
convert_numpy_bytes_to_utf16	`True`
convert_numpy_str_to_utf16	`True`
convert_bools_to_uint8	`True`
reverse_dimension_order	`True`
store_shape_for_empty	`True`
complex_names	`('real', 'imag')`
group_for_references	`'/#refs#'`
compression_algorithm	`'gzip'`

In addition to setting these options, a specially formatted block of bytes is put at the front of the file so that MATLAB can recognize its format.

Parameters:

store_python_metadata: bool, optional: See Attributes.
matlab_compatible: bool, optional: See Attributes.
action_for_matlab_incompatible: str, optional: See Attributes. Only valid values are ‘ignore’, ‘discard’, and ‘error’.
delete_unused_variables: bool, optional: See Attributes.
structured_numpy_ndarray_as_struct: bool, optional: See Attributes.
make_atleast_2d: bool, optional: See Attributes.
convert_numpy_bytes_to_utf16: bool, optional: See Attributes.
convert_numpy_str_to_utf16: bool, optional: See Attributes.
convert_bools_to_uint8: bool, optional: See Attributes.
reverse_dimension_order: bool, optional: See Attributes.
store_shape_for_empty: bool, optional: See Attributes.
complex_names: tuple of two str, optional: See Attributes.
group_for_references: str, optional: See Attributes.
oned_as: str, optional: See Attributes.
compress: bool, optional: See Attributes.
compress_size_threshold: int, optional: See Attributes.
compression_algorithm: str, optional: See Attributes.
gzip_compression_level: int, optional: See Attributes.
shuffle_filter: bool, optional: See Attributes.
compressed_fletcher32_filter: bool, optional: See Attributes.
uncompressed_fletcher32_filter: bool, optional: See Attributes.
marshaller_collection: MarshallerCollection, optional: See Attributes.
**keywords: Additional keyword arguments. They are ignored. They are accepted for compatibility with future versions of this package, where more options will be added.

Attributes:

store_python_metadata: bool: Whether or not to store Python metadata.
matlab_compatible: bool: Whether or not to make the file compatible with MATLAB.
action_for_matlab_incompatible: str: The action to do when writing non-MATLAB compatible data.
delete_unused_variables: bool: Whether or not to delete file variables not written to.
structured_numpy_ndarray_as_struct: bool: Whether or not to convert structured ndarrays to structs.
make_atleast_2d: bool: Whether or not to convert scalar types to 2D arrays.
convert_numpy_bytes_to_utf16: bool: Whether or not to convert numpy.bytes_ to UTF-16.
convert_numpy_str_to_utf16: bool: Whether or not to convert numpy.str_ to UTF-16.
convert_bools_to_uint8: bool: Whether or not to convert bools to numpy.uint8.
reverse_dimension_order: bool: Whether or not to reverse the order of array dimensions.
store_shape_for_empty: bool: Whether to write the shape if an object has no elements.
complex_names: tuple of two str: Names to use for the real and imaginary fields.
group_for_references: str: Path for where to put objects pointed at by references.
oned_as: {‘row’, ‘column’}: Vector that 1D arrays become when making everything >= 2D.
compress: bool: Whether to compress large python objects (datasets).
compress_size_threshold: int: Minimum size of a python object before it is compressed.
compression_algorithm: {‘gzip’, ‘lzf’, ‘szip’}: Algorithm to use for compression.
gzip_compression_level: int: The compression level to use when doing the gzip algorithm.
shuffle_filter: bool: Whether to use the shuffle filter on compressed python objects.
compressed_fletcher32_filter: bool: Whether to use the fletcher32 filter on compressed python objects.
uncompressed_fletcher32_filter: bool: Whether to use the fletcher32 filter on uncompressed non-scalar python objects.
scalar_options: dict: h5py.Group.create_dataset options for writing scalars.
array_options: dict: h5py.Group.create_dataset options for writing arrays.
marshaller_collection: MarshallerCollection: Collection of marshallers to disk.

property action_for_matlab_incompatible¶

The action to do when writing non-MATLAB compatible data.

{‘ignore’, ‘discard’, ‘error’}

The action to perform when doing MATLAB compatibility but a type being written is not MATLAB compatible. The actions are to write the data anyway (‘ignore’), to not write the incompatible data (‘discard’), or to throw a TypeNotMatlabCompatibleError exception (‘error’). The default is ‘error’.

See also

matlab_compatible
hdf5storage.lowlevel.TypeNotMatlabCompatibleError

property complex_names¶

Names to use for the real and imaginary fields.

tuple of two str

(r, i) where r and i are two str. When reading and writing complex numbers, the real part gets the name in r and the imaginary part gets the name in i. h5py uses ('r', 'i') by default, unless MATLAB compatibility is being done in which case its default is ('real', 'imag').

Must be ('real', 'imag') if doing MATLAB compatibility.

property compress¶

Whether to compress large python objects (datasets).

bool

If True, python objects (datasets) larger than compress_size_threshold will be compressed.

See also

compress_size_threshold
compression_algorithm
shuffle_filter
compressed_fletcher32_filter

property compress_size_threshold¶

Minimum size of a python object before it is compressed.

int

Minimum size in bytes a python object must be for it to be compressed if compress is set. Must be non-negative.

See also

compress

property compressed_fletcher32_filter¶

Whether to use the fletcher32 filter on compressed python objects.

bool

If True, python objects (datasets) that are compressed are run through the fletcher32 filter, which stores a checksum with each chunk so that data corruption can be more easily detected.

See also

compress
shuffle_filter
uncompressed_fletcher32_filter
h5py.Group.create_dataset

property compression_algorithm¶

Algorithm to use for compression.

{‘gzip’, ‘lzf’, ‘szip’}

Compression algorithm to use when the compress option is set and a python object is larger than compress_size_threshold. 'gzip' is the only MATLAB compatible option.

'gzip' is also known as the Deflate algorithm, which is the default compression algorithm of ZIP files and is a common compression algorithm used on tarballs. It is the most compatible option. It has good compression and is reasonably fast. Its compression level is set with the gzip_compression_level option, which is an integer between 0 and 9 inclusive.

'lzf' is a very fast but low to moderate compression algorithm. It is less commonly used than gzip/Deflate, but doesn’t have any patent or license issues.

'szip' is a compression algorithm that has some patents and license restrictions. It is not always available.

See also

compress
compress_size_threshold
h5py.Group.create_dataset

property convert_bools_to_uint8¶

Whether or not to convert bools to numpy.uint8.

bool

If True (defaults to False unless MATLAB compatibility is being done), bool types are converted to numpy.uint8 before being written to file.

Must be True if doing MATLAB compatibility. MATLAB doesn’t use the enums that h5py wants to use by default and also uses uint8 instead of int8.

property convert_numpy_bytes_to_utf16¶

Whether or not to convert numpy.bytes_ to UTF-16.

bool

If True (defaults to False unless MATLAB compatibility is being done), numpy.bytes_ and anything that is converted to them (bytes, and bytearray) are converted to UTF-16 before being written to file as numpy.uint16.

Must be True if doing MATLAB compatibility. MATLAB uses UTF-16 for its strings.

See also

numpy.bytes_
convert_numpy_str_to_utf16

property convert_numpy_str_to_utf16¶

Whether or not to convert numpy.str_ to UTF-16.

bool

If True (defaults to False unless MATLAB compatibility is being done), numpy.str_ and anything that is converted to it (str) will be converted to UTF-16 if possible before being written to file as numpy.uint16. If doing so would lead to a loss of data (a character can’t be translated to UTF-16) or would change the shape of an array of numpy.str_ because a character is converted into a pair of 2-byte code units, the conversion will not be made and the string will be stored in UTF-32 form as numpy.uint32.

Must be True if doing MATLAB compatibility. MATLAB uses UTF-16 for its strings.
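The shape concern above comes from characters outside the Basic Multilingual Plane, which need two 16-bit code units in UTF-16. The standard-library sketch below shows how to detect that case; the helper names are hypothetical and this is not the package's own check.

```python
def utf16_units(text):
    """Number of 16-bit code units needed to encode text as UTF-16."""
    return len(text.encode('utf-16-le')) // 2


def fits_without_resizing(text):
    """True if every character maps to exactly one UTF-16 code unit."""
    return utf16_units(text) == len(text)


for sample in ('abc', 'a\U0001F600c'):
    print(len(sample), utf16_units(sample), fits_without_resizing(sample))
```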

See also

numpy.str_
convert_numpy_bytes_to_utf16

property delete_unused_variables¶

Whether or not to delete file variables not written to.

bool

If True (defaults to False unless MATLAB compatibility is being done), variables in the file below where writing starts that are not written to are deleted.

Must be True if doing MATLAB compatibility.

property group_for_references¶

Path for where to put objects pointed at by references.

str

The absolute POSIX path for the Group to place all data that is pointed to by another piece of data (needed for numpy.object_ and similar types). This path is automatically excluded from its parent group when reading back a dict.

Must be '/#refs#' if doing MATLAB compatibility.

property gzip_compression_level¶

The compression level to use when doing the gzip algorithm.

int

Compression level to use when data is being compressed with the 'gzip' algorithm. Must be an integer between 0 and 9 inclusive. Lower values are faster while higher values give better compression.
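Since 'gzip' here is the Deflate algorithm, Python's standard zlib module can illustrate the level trade-off without touching HDF5 at all. The payload below is arbitrary, and the sizes depend on the data being compressed.

```python
import zlib

payload = b'hdf5storage ' * 2000

# Level 0 stores the data uncompressed; level 9 compresses the hardest.
sizes = {level: len(zlib.compress(payload, level)) for level in (0, 1, 9)}
for level, size in sorted(sizes.items()):
    print(level, size)

# Compression is lossless at every level.
restored = zlib.decompress(zlib.compress(payload, 9))
```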

See also

compress
compression_algorithm

property make_atleast_2d¶

Whether or not to convert scalar types to 2D arrays.

bool

If True (defaults to False unless MATLAB compatibility is being done), all scalar types are converted to 2D arrays when written to file. oned_as determines whether 1D arrays are turned into row or column vectors.

Must be True if doing MATLAB compatibility. MATLAB can only import 2D and higher dimensional arrays.

See also

oned_as

marshaller_collection¶

Collection of marshallers to disk.

MarshallerCollection

See also

MarshallerCollection

property matlab_compatible¶

Whether or not to make the file compatible with MATLAB.

bool

If True (default), data is written to file in such a way that it is compatible with MATLAB’s version 7.3 mat file format, which is HDF5 based. Setting it to True forces other options to hold the specific values in the table below.

attribute	value
delete_unused_variables	`True`
structured_numpy_ndarray_as_struct	`True`
make_atleast_2d	`True`
convert_numpy_bytes_to_utf16	`True`
convert_numpy_str_to_utf16	`True`
convert_bools_to_uint8	`True`
reverse_dimension_order	`True`
store_shape_for_empty	`True`
complex_names	`('real', 'imag')`
group_for_references	`'/#refs#'`
compression_algorithm	`'gzip'`

In addition to setting these options, a specially formatted block of bytes is put at the front of the file so that MATLAB can recognize its format.

property oned_as¶

Vector that 1D arrays become when making everything >= 2D.

{‘row’, ‘column’}

When the make_atleast_2d option is set (set implicitly by doing MATLAB compatibility), this option controls whether 1D arrays become row vectors or column vectors.

See also

make_atleast_2d

property reverse_dimension_order¶

Whether or not to reverse the order of array dimensions.

bool

If True (defaults to False unless MATLAB compatibility is being done), the dimension order of numpy.ndarray and numpy.matrix are reversed. This switches them from C ordering to Fortran ordering. The switch of ordering is essentially a transpose.

Must be True if doing MATLAB compatibility. MATLAB uses Fortran ordering.
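The relationship between reversing the dimension order and switching between C and Fortran ordering can be seen with numpy alone. This is an illustration of the idea, not the package's code: the C-ordered bytes of the reversed array equal the Fortran-ordered bytes of the original.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

# Transposing with no arguments reverses the order of all dimensions.
reversed_dims = a.transpose()
print(a.shape, reversed_dims.shape)

# Same memory layout seen two ways: Fortran order of the original versus
# C order of the dimension-reversed array.
same_layout = a.tobytes(order='F') == reversed_dims.tobytes(order='C')
print(same_layout)
```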

property shuffle_filter¶

Whether to use the shuffle filter on compressed python objects.

bool

If True, python objects (datasets) that are compressed are run through the shuffle filter, which reversibly rearranges the data to improve compression.
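A simplified pure-Python illustration of byte shuffling follows. It is not HDF5's implementation; the function names are hypothetical. The idea is to store byte 0 of every element together, then byte 1, and so on, which tends to create long runs that Deflate compresses well.

```python
import struct
import zlib


def shuffle(raw, itemsize):
    """Group byte 0 of every element, then byte 1, and so on."""
    return b''.join(raw[i::itemsize] for i in range(itemsize))


def unshuffle(shuffled, itemsize):
    """Invert shuffle() by interleaving the byte planes again."""
    count = len(shuffled) // itemsize
    planes = [shuffled[i * count:(i + 1) * count] for i in range(itemsize)]
    return bytes(b for element in zip(*planes) for b in element)


# 1000 slowly increasing 4-byte little-endian unsigned integers.
values = struct.pack('<1000I', *range(1000))
shuffled = shuffle(values, 4)

plain_size = len(zlib.compress(values))
shuffled_size = len(zlib.compress(shuffled))
print(plain_size, shuffled_size)
```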

See also

compress
h5py.Group.create_dataset

property store_python_metadata¶

Whether or not to store Python metadata.

bool

If True (default), information on the Python type for each object written to disk is put in its attributes so that it can be read back into Python as the same type.

property store_shape_for_empty¶

Whether to write the shape if an object has no elements.

bool

If True (defaults to False unless MATLAB compatibility is being done), objects that have no elements (e.g. a 0x0x2 array) will have their shape (an array of the number of elements along each axis) written to disk in place of nothing, which would otherwise be written.

Must be True if doing MATLAB compatibility. For empty arrays, MATLAB requires that the shape array be written in its place along with the attribute ‘MATLAB_empty’ set to 1 to flag it.

property structured_numpy_ndarray_as_struct¶

Whether or not to convert structured ndarrays to structs.

bool

If True (defaults to False unless MATLAB compatibility is being done), all ``numpy.ndarray``s with fields (compound dtypes) are written as HDF5 Groups with the fields as Datasets (correspond to struct arrays in MATLAB).

Must be True if doing MATLAB compatibility. MATLAB cannot handle the compound types made by writing these types.

property uncompressed_fletcher32_filter¶

Whether to use the fletcher32 filter on uncompressed non-scalar python objects.

bool

If True, python objects (datasets) that are NOT compressed and are not scalars (when converted to a Numpy type, their shape is not an empty tuple) are run through the fletcher32 filter, which stores a checksum with each chunk so that data corruption can be more easily detected. This forces all uncompressed data to be chunked regardless of how small and can increase file sizes.

See also

compress
shuffle_filter
compressed_fletcher32_filter
h5py.Group.create_dataset

MarshallerCollection¶

class hdf5storage.MarshallerCollection(marshallers=[])[source]¶

Bases: object

Represents, maintains, and retrieves a set of marshallers.

Maintains a list of marshallers used to marshal data types to and from HDF5 files. It includes the builtin marshallers from the hdf5storage.Marshallers module as well as any user supplied or added marshallers. While the builtin list cannot be changed, user ones can be added or removed. It also has functions to get the marshaller appropriate for the type or type_string of a python data type.

User marshallers must provide the same interface as hdf5storage.Marshallers.TypeMarshaller, which is probably most easily done by inheriting from it.

Parameters:

marshallers: marshaller or list of marshallers, optional: The user marshaller/s to add to the collection. Could also be a tuple, set, or frozenset of marshallers.

See also

hdf5storage.Marshallers
hdf5storage.Marshallers.TypeMarshaller

Methods

`add_marshaller`(marshallers)	Add a marshaller/s to the user provided list.
`clear_marshallers`()	Clears the list of user provided marshallers.
`get_marshaller_for_matlab_class`(matlab_class)	Gets the appropriate marshaller for a MATLAB class string.
`get_marshaller_for_type`(tp)	Gets the appropriate marshaller for a type.
`get_marshaller_for_type_string`(type_string)	Gets the appropriate marshaller for a type string.
`remove_marshaller`(marshallers)	Removes a marshaller/s from the user provided list.

add_marshaller(marshallers)[source]¶

Add a marshaller/s to the user provided list.

Adds a marshaller or a list of them to the user provided set of marshallers.

Parameters:

marshallers: marshaller or list of marshallers: The user marshaller/s to add to the user provided collection. Could also be a tuple, set, or frozenset of marshallers.

clear_marshallers()[source]¶

Clears the list of user provided marshallers.

Removes all user provided marshallers, but not the builtin ones from the hdf5storage.Marshallers module, from the list of marshallers used.

get_marshaller_for_matlab_class(matlab_class)[source]¶

Gets the appropriate marshaller for a MATLAB class string.

Retrieves the marshaller, if any, that can be used to read/write a Python object associated with the given MATLAB class string.

Parameters:

matlab_class: str: MATLAB class string for a Python object.

Returns:

marshaller: The marshaller that can read/write the type to file. None if no appropriate marshaller is found.

See also

hdf5storage.Marshallers.TypeMarshaller.python_type_strings

get_marshaller_for_type(tp)[source]¶

Gets the appropriate marshaller for a type.

Retrieves the marshaller, if any, that can be used to read/write a Python object with type ‘tp’.

Parameters:

tp: type: Python object type.

Returns:

marshaller: The marshaller that can read/write the type to file. None if no appropriate marshaller is found.

See also

hdf5storage.Marshallers.TypeMarshaller.types

get_marshaller_for_type_string(type_string)[source]¶

Gets the appropriate marshaller for a type string.

Retrieves the marshaller, if any, that can be used to read/write a Python object with the given type string.

Parameters:

type_string: str: Type string for a Python object.

Returns:

marshaller: The marshaller that can read/write the type to file. None if no appropriate marshaller is found.

See also

hdf5storage.Marshallers.TypeMarshaller.python_type_strings

remove_marshaller(marshallers)[source]¶

Removes a marshaller/s from the user provided list.

Removes a marshaller or a list of them from the user provided set of marshallers.

Parameters:

marshallers: marshaller or list of marshallers: The user marshaller/s to remove from the user provided collection. Could also be a tuple, set, or frozenset of marshallers.
