hdf5storage.utilities
Module of functions to set and delete HDF5 attributes.
numpy_to_bytes | Get the raw bytes of a numpy object's data.
does_dtype_have_a_zero_shape | Determine whether a dtype (or its fields) have zero shape.
next_unused_name_in_group | Gives a name that isn't used in a Group.
convert_numpy_str_to_uint16 | Converts a numpy.unicode_ to UTF-16 in numpy.uint16 form.
convert_numpy_str_to_uint32 | Converts a numpy.str_ to its numpy.uint32 representation.
convert_to_str | Decodes data to the Python 3.x str (Python 2.x unicode) type.
convert_to_numpy_str | Decodes data to Numpy unicode string (str_).
convert_to_numpy_bytes | Decodes data to Numpy UTF-8 encoded string (bytes_).
decode_complex | Decodes possibly complex data read from an HDF5 file.
encode_complex | Encodes complex data to having arbitrary complex field names.
get_attribute | Gets an attribute from a Dataset or Group.
get_attribute_string | Gets a string attribute from a Dataset or Group.
get_attribute_string_array | Gets a string array Attribute from a Dataset or Group.
read_all_attributes_into | Reads all Attributes into a MutableMapping (dict-like).
read_matlab_fields_attribute | Reads the MATLAB_fields Attribute.
set_attribute | Sets an attribute on a Dataset or Group.
set_attribute_string | Sets an attribute to a string on a Dataset or Group.
set_attribute_string_array | Sets an attribute to an array of string on a Dataset or Group.
del_attribute | Deletes an attribute on a Dataset or Group.
numpy_to_bytes
- hdf5storage.utilities.numpy_to_bytes(obj)
Get the raw bytes of a numpy object’s data.
Calls the tobytes method on obj for new versions of numpy where the method exists, and tostring for old versions of numpy where it does not.
- Parameters:
- obj : numpy.generic or numpy.ndarray
Numpy scalar or array.
- Returns:
- data : bytes
The raw data.
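The fallback described above can be sketched roughly as follows. This is only an illustration of the idea, not the library's source; the name numpy_to_bytes_sketch is hypothetical.

```python
import numpy as np

def numpy_to_bytes_sketch(obj):
    """Hypothetical stand-in: prefer tobytes, fall back to tostring."""
    if hasattr(obj, "tobytes"):
        # Available on current numpy scalars and arrays.
        return obj.tobytes()
    # Only reached on very old numpy versions that lack tobytes.
    return obj.tostring()

array_bytes = numpy_to_bytes_sketch(np.arange(3, dtype=np.uint8))
scalar_bytes = numpy_to_bytes_sketch(np.uint16(258))
```

Both numpy scalars and arrays are accepted, because each exposes the same method.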
does_dtype_have_a_zero_shape
- hdf5storage.utilities.does_dtype_have_a_zero_shape(dt)
Determine whether a dtype (or any of its fields) has a zero shape.
Determines whether the given numpy.dtype has a shape with a zero element, or if one of its fields does, or if one of its fields’ fields does, and so on recursively. The following dtypes do not have zero shape:
'uint8'
[('a', 'int32'), ('blah', 'float16', (3, 3))]
[('a', [('b', 'complex64')], (2, 1, 3))]
But the following do:
('uint8', (1, 0))
[('a', 'int32'), ('blah', 'float16', (3, 0))]
[('a', [('b', 'complex64')], (2, 0, 3))]
- Parameters:
- dt : numpy.dtype
The dtype to check.
- Returns:
- yesno : bool
Whether dt or one of its fields has a shape with at least one element that is zero.
- Raises:
- TypeError
If dt is not a numpy.dtype.
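A minimal sketch of such a recursive check, assuming only the public numpy.dtype attributes shape, subdtype, names and fields. The function name has_zero_shape_sketch is hypothetical, and the library's own implementation may differ.

```python
import numpy as np

def has_zero_shape_sketch(dt):
    """Hypothetical recursive check for a zero in any (sub)shape."""
    if not isinstance(dt, np.dtype):
        raise TypeError("dt must be a numpy.dtype.")
    # A subarray dtype such as ('uint8', (1, 0)) carries its own shape.
    if 0 in dt.shape:
        return True
    # Look through a subarray dtype to its base dtype.
    if dt.subdtype is not None:
        return has_zero_shape_sketch(dt.subdtype[0])
    # Recurse into the fields of a structured dtype.
    if dt.names is not None:
        return any(has_zero_shape_sketch(dt.fields[n][0]) for n in dt.names)
    return False

no_zero = has_zero_shape_sketch(
    np.dtype([('a', [('b', 'complex64')], (2, 1, 3))]))
with_zero = has_zero_shape_sketch(
    np.dtype([('a', [('b', 'complex64')], (2, 0, 3))]))
```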
next_unused_name_in_group
- hdf5storage.utilities.next_unused_name_in_group(grp, length)
Gives a name that isn’t used in a Group.
Generates a name of the desired length that is not a Dataset or Group in the given group. Note that if length is too small and grp is full enough, there may be no available names, in which case this function will hang.
- Parameters:
- grp : h5py.Group or h5py.File
The HDF5 Group (or File if at ‘/’) to generate an unused name in.
- length : int
Number of characters the name should be.
- Returns:
- str
A name that isn’t already an existing Dataset or Group in grp.
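One way to picture the behaviour, including why it can hang, is a retry loop over random names. The sketch below is an assumption about the general approach, not the library's code: the alphabet and the name next_unused_name_sketch are hypothetical, and a plain set stands in for the Group (an h5py Group also supports the in operator).

```python
import random
import string

def next_unused_name_sketch(existing, length,
                            alphabet=string.ascii_lowercase + string.digits):
    """Hypothetical retry loop: draw random names until one is unused."""
    while True:
        name = "".join(random.choice(alphabet) for _ in range(length))
        # If every name of this length is taken, this loop never exits.
        if name not in existing:
            return name

taken = {"a", "b", "c"}
name = next_unused_name_sketch(taken, 1)
```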
convert_numpy_str_to_uint16
- hdf5storage.utilities.convert_numpy_str_to_uint16(data)
Converts a numpy.unicode_ to UTF-16 in numpy.uint16 form.
Convert a numpy.unicode_ or an array of them (they are UTF-32 strings) to UTF-16 in the equivalent array of numpy.uint16. The conversion will throw an exception if any characters cannot be converted to UTF-16. Strings are expanded along rows (across columns), so a 2x3x4 array of 10 element strings will get turned into a 2x30x4 array of uint16’s if every UTF-32 character converts easily to a UTF-16 singlet, as opposed to a UTF-16 doublet.
- Parameters:
- data : numpy.unicode_ or numpy.ndarray of numpy.unicode_
The string or array of them to convert.
- Returns:
- array : numpy.ndarray of numpy.uint16
The result of the conversion.
- Raises:
- UnicodeEncodeError
If a UTF-32 character has no UTF-16 representation.
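The singlet versus doublet distinction can be seen with a small sketch that encodes a single string with Python's UTF-16LE codec and reads the result as uint16 code units. The helper str_to_utf16_units is hypothetical and handles one string only; it does not reproduce the array handling described above.

```python
import numpy as np

def str_to_utf16_units(s):
    """Hypothetical helper: one string to its UTF-16 code units."""
    raw = str(s).encode("UTF-16LE")
    return np.frombuffer(raw, dtype="<u2")

# Every character here fits in a single UTF-16 code unit (singlet).
singlets = str_to_utf16_units(np.str_("h\u00e9llo"))
# U+1F600 lies outside the Basic Multilingual Plane, so it needs a
# surrogate pair (doublet).
doublet = str_to_utf16_units("\U0001F600")
```

A lone surrogate such as "\ud800" has no UTF-16 representation, so encoding it raises UnicodeEncodeError.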
convert_numpy_str_to_uint32
- hdf5storage.utilities.convert_numpy_str_to_uint32(data)
Converts a numpy.str_ to its numpy.uint32 representation.
Convert a numpy.str_ or an array of them (they are UTF-32 strings) into the equivalent array of numpy.uint32 that is byte for byte identical. Strings are expanded along rows (across columns), so a 2x3x4 array of 10 element strings will get turned into a 2x30x4 array of uint32’s.
- Parameters:
- data : numpy.str_ or numpy.ndarray of numpy.str_
The string or array of them to convert.
- Returns:
- numpy.ndarray of numpy.uint32
The result of the conversion.
See also
convert_numpy_str_to_uint16
decode_to_numpy_str
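Because numpy stores str_ data as UTF-32, a byte-identical uint32 representation can be obtained by reinterpreting the same memory. The sketch below shows this for a 1d array using ndarray.view; it is an illustration only and does not reproduce the row-wise layout the function uses for higher dimensional arrays.

```python
import numpy as np

strings = np.array(["abc", "xyz"])   # two 3-character UTF-32 strings
# Reinterpret the same bytes as uint32; the 1d array grows from 2 to 6.
codes = strings.view(np.uint32)
```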
convert_to_str
- hdf5storage.utilities.convert_to_str(data)
Decodes data to the Python 3.x str (Python 2.x unicode) type.
Decodes data to a Python 3.x str (Python 2.x unicode). If it can’t be decoded, it is returned as is. Unsigned integers, Python bytes, and Numpy strings (numpy.str_ and numpy.bytes_) are supported. Python 3.x bytes, Python 2.x str, and numpy.bytes_ are assumed to be encoded in UTF-8.
- Parameters:
- data : some type
Data to decode into a str string.
- Returns:
- str or data
If data can be decoded into a str, the decoded version is returned. Otherwise, data is returned unchanged.
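The "decode if possible, otherwise return unchanged" contract can be sketched as below for Python 3. The function to_str_sketch is hypothetical and deliberately omits the unsigned integer case that the real function supports.

```python
import numpy as np

def to_str_sketch(data):
    """Hypothetical: decode str-like or bytes-like data, else pass through."""
    if isinstance(data, str):        # also matches numpy.str_
        return str(data)
    if isinstance(data, bytes):      # also matches numpy.bytes_
        try:
            return data.decode("UTF-8")
        except UnicodeDecodeError:
            return data              # undecodable: return as is
    return data                      # unsupported type: return as is

decoded = to_str_sketch(np.bytes_(b"caf\xc3\xa9"))
untouched = to_str_sketch(3.5)
```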
convert_to_numpy_str
- hdf5storage.utilities.convert_to_numpy_str(data, length=None)
Decodes data to Numpy unicode string (str_).
Decodes data to Numpy unicode string (UTF-32), which is numpy.str_, or an array of them. If it can’t be decoded, it is returned as is. Unsigned integers, Python string types (str, bytes), and numpy.bytes_ are supported. If it is an array of numpy.bytes_, an array of those all converted to numpy.str_ is returned. Python 3.x bytes, Python 2.x str, and numpy.bytes_ are assumed to be encoded in UTF-8.
For an array of unsigned integers, it may be desirable to make an array with strings of some specified length as opposed to an array of the same size with each element being a one element string. This naturally arises when converting strings to unsigned integer types in the first place, so it needs to be reversible. The length parameter specifies how many to group together into a string (desired string length). For 1d arrays, this is along its only dimension. For higher dimensional arrays, it is done along each row (across columns). So, for a 3x10x5 input array of uints and a length of 5, the output array would be a 3x2x5 of 5 element strings.
- Parameters:
- data : some type
Data to decode into a Numpy unicode string.
- length : int or None, optional
The number of consecutive elements (in the case of unsigned integer data) to compose each string in the output array from. None indicates the full amount for a 1d array or the number of columns (full length of row) for a higher dimension array.
- Returns:
- numpy.str_ or numpy.ndarray of numpy.str_ or data
If data can be decoded into a numpy.str_ or a numpy.ndarray of them, the decoded version is returned. Otherwise, data is returned unchanged.
See also
convert_to_str
convert_to_numpy_bytes
numpy.str_
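To illustrate what length means for a 1d array of unsigned integers, the sketch below groups ten uint32 code points into strings of five characters by reinterpreting the memory as a fixed-width unicode dtype. It illustrates only the grouping, under the assumption that the data is 1d and contiguous; it is not the library's implementation.

```python
import numpy as np

codes = np.array([ord(c) for c in "helloworld"], dtype=np.uint32)

length = 5
# Each group of `length` consecutive uint32 values becomes one string.
grouped = codes.view(np.dtype("U" + str(length)))

# Using the full extent of the 1d array gives a single string, which
# corresponds to the documented meaning of length=None.
whole = codes.view(np.dtype("U" + str(codes.size)))
```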
convert_to_numpy_bytes
- hdf5storage.utilities.convert_to_numpy_bytes(data, length=None)
Decodes data to Numpy UTF-8 encoded string (bytes_).
Decodes data to a Numpy UTF-8 encoded string, which is numpy.bytes_, or an array of them, in which case it will be ASCII encoded instead. If it can’t be decoded, it is returned as is. Unsigned integers, Python string types (str, bytes), and numpy.str_ (UTF-32) are supported.
For an array of unsigned integers, it may be desirable to make an array with strings of some specified length as opposed to an array of the same size with each element being a one element string. This naturally arises when converting strings to unsigned integer types in the first place, so it needs to be reversible. The length parameter specifies how many to group together into a string (desired string length). For 1d arrays, this is along its only dimension. For higher dimensional arrays, it is done along each row (across columns). So, for a 3x10x5 input array of uints and a length of 5, the output array would be a 3x2x5 of 5 element strings.
- Parameters:
- data : some type
Data to decode into a Numpy UTF-8 encoded string or strings.
- length : int or None, optional
The number of consecutive elements (in the case of unsigned integer data) to compose each string in the output array from. None indicates the full amount for a 1d array or the number of columns (full length of row) for a higher dimension array.
- Returns:
- numpy.bytes_ or numpy.ndarray of numpy.bytes_ or data
If data can be decoded into a numpy.bytes_ or a numpy.ndarray of them, the decoded version is returned. Otherwise, data is returned unchanged.
See also
convert_to_str
convert_to_numpy_str
numpy.bytes_
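The difference between the scalar case (UTF-8) and the array case (ASCII) can be pictured with plain numpy calls. This is a sketch of the two encodings only, using str.encode and numpy.char.encode; it is not how the function itself is implemented.

```python
import numpy as np

# Scalar: a numpy.str_ encoded as UTF-8 and wrapped in numpy.bytes_.
scalar = np.bytes_(np.str_("caf\u00e9").encode("UTF-8"))

# Array: each element encoded as ASCII, giving an array of numpy.bytes_.
array = np.char.encode(np.array(["abc", "de"]), "ASCII")
```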
decode_complex
- hdf5storage.utilities.decode_complex(data, complex_names=(None, None))
Decodes possibly complex data read from an HDF5 file.
Decodes possibly complex datasets read from an HDF5 file. HDF5 doesn’t have a native complex type, so they are stored as H5T_COMPOUND types with fields such as ‘r’ and ‘i’ for the real and imaginary parts. As there is no standardization for field names, the field names have to be given explicitly, or the field names in data have to be analyzed to figure out which ones hold the real and imaginary parts. A variety of reasonably expected combinations of field names are checked and used if available to decode. If decoding is not possible, data is returned as is.
- Parameters:
- data : arraylike
The data read from an HDF5 file, that might be complex, to decode into the proper Numpy complex type.
- complex_names : tuple of 2 str and/or Nones, optional
tuple of the names to use (in order) for the real and imaginary fields. A None indicates that various common field names should be tried.
- Returns:
- decoded data or data
If data can be decoded into a complex type, the decoded complex version is returned. Otherwise, data is returned unchanged.
Notes
Currently looks for real field names of ('r', 're', 'real') and imaginary field names of ('i', 'im', 'imag', 'imaginary'), ignoring case.
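A rough sketch of the field name matching in the Notes, assuming a two-field structured array. The function decode_complex_sketch is hypothetical and ignores the complex_names argument; the real function handles more cases.

```python
import numpy as np

REAL_NAMES = ("r", "re", "real")
IMAG_NAMES = ("i", "im", "imag", "imaginary")

def decode_complex_sketch(data):
    """Hypothetical: combine a two-field compound array into complex."""
    names = getattr(getattr(data, "dtype", None), "names", None)
    if names is None or len(names) != 2:
        return data                      # not a two-field compound type
    real = [n for n in names if n.lower() in REAL_NAMES]
    imag = [n for n in names if n.lower() in IMAG_NAMES]
    if len(real) != 1 or len(imag) != 1:
        return data                      # field names not recognised
    return data[real[0]] + 1j * data[imag[0]]

compound = np.array([(1.0, 2.0), (3.0, -4.0)],
                    dtype=[("Re", "f8"), ("Im", "f8")])
decoded = decode_complex_sketch(compound)
plain = np.arange(3)
```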
encode_complex
- hdf5storage.utilities.encode_complex(data, complex_names)
Encodes complex data to having arbitrary complex field names.
Encodes complex data to have the real and imaginary field names given in complex_names. This is needed because the field names have to be set so that the data can be written to an HDF5 file with the right field names (HDF5 doesn’t have a native complex type, so H5T_COMPOUND types have to be used).
- Parameters:
- data : arraylike
The data to encode as a complex type with the desired real and imaginary part field names.
- complex_names : tuple of 2 str
tuple of the names to use (in order) for the real and imaginary fields.
- Returns:
- encoded data
data encoded into having the specified field names for the real and imaginary parts.
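Since a numpy complex value is stored as a real part followed by an imaginary part, one way to attach field names is to view the same memory as a two-field structured dtype. The sketch below assumes that approach; encode_complex_sketch is a hypothetical name and the library may do this differently.

```python
import numpy as np

def encode_complex_sketch(data, complex_names):
    """Hypothetical: view complex data as a two-field structured array."""
    data = np.ascontiguousarray(data)
    part = data.real.dtype               # float32 or float64 component type
    dt = np.dtype([(complex_names[0], part), (complex_names[1], part)])
    # Same bytes, now addressed through the chosen field names.
    return data.view(dt)

values = np.array([1 + 2j, 3 - 4j])
encoded = encode_complex_sketch(values, ("real", "imag"))
```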
get_attribute
- hdf5storage.utilities.get_attribute(target, name)
Gets an attribute from a Dataset or Group.
Gets the value of an Attribute if it is present (get None if not).
- Parameters:
- target : Dataset or Group
Dataset or Group to get the attribute of.
- name : str
Name of the attribute to get.
- Returns:
The value of the attribute if it is present, or None if it isn’t.
get_attribute_string
- hdf5storage.utilities.get_attribute_string(target, name)
Gets a string attribute from a Dataset or Group.
Gets the value of an Attribute that is a string if it is present (get None if it is not present or isn’t a string type).
- Parameters:
- target : Dataset or Group
Dataset or Group to get the string attribute of.
- name : str
Name of the attribute to get.
- Returns:
- str or None
The str value of the attribute if it is present, or None if it isn’t or isn’t a type that can be converted to str.
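The "value or None" contract of the two getters can be sketched against any dict-like attribute container. Below, a plain dict stands in for the attributes of a Dataset or Group so the example runs without an HDF5 file; the function names and the attribute names "kind" and "count" are hypothetical.

```python
def get_attribute_sketch(attrs, name):
    """Hypothetical: the value if present, otherwise None."""
    return attrs[name] if name in attrs else None

def get_attribute_string_sketch(attrs, name):
    """Hypothetical: a str if present and string-like, otherwise None."""
    value = get_attribute_sketch(attrs, name)
    if isinstance(value, bytes):
        try:
            return value.decode("UTF-8")
        except UnicodeDecodeError:
            return None
    if isinstance(value, str):
        return str(value)
    return None          # missing, or not a string type

attrs = {"kind": b"matrix", "count": 3}
kind = get_attribute_string_sketch(attrs, "kind")
```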
get_attribute_string_array
- hdf5storage.utilities.get_attribute_string_array(target, name)
Gets a string array Attribute from a Dataset or Group.
Gets the value of an Attribute that is a string array if it is present (get None if not).
- Parameters:
- target : Dataset or Group
Dataset or Group to get the attribute of.
- name : str
Name of the string array Attribute to get.
- Returns:
- list of str or None
The string array value of the Attribute if it is present, or None if it isn’t.
read_all_attributes_into
- hdf5storage.utilities.read_all_attributes_into(attrs, out)
Reads all Attributes into a MutableMapping (dict-like).
Reads all Attributes into the MutableMapping (dict-like) out, including the special handling of the MATLAB_fields Attribute on versions of h5py where it cannot be read in the standard fashion.
- Parameters:
- attrs : h5py.AttributeManager
The Attribute manager to read from.
- out : MutableMapping
The MutableMapping (dict-like) to write the Attributes into.
- Raises:
- TypeError
If an argument has the wrong type.
read_matlab_fields_attribute
- hdf5storage.utilities.read_matlab_fields_attribute(attrs)
Reads the MATLAB_fields Attribute.
On some versions of h5py, the MATLAB_fields Attribute cannot be read in the standard way and must instead be read in a more manual fashion. This function reads the Attribute by the proper method.
- Parameters:
- attrs : h5py.AttributeManager
The Attribute manager to read from.
- Returns:
- value : numpy.ndarray or None
The value of the MATLAB_fields Attribute, or None if it isn’t available or its format is invalid.
- Raises:
- TypeError
If an argument has the wrong type.
set_attribute
- hdf5storage.utilities.set_attribute(target, name, value)
Sets an attribute on a Dataset or Group.
If the attribute name doesn’t exist yet, it is created. If it already exists, it is overwritten if it differs from value.
- Parameters:
- target : Dataset or Group
Dataset or Group to set the attribute of.
- name : str
Name of the attribute to set.
- value : numpy type other than numpy.str_
Value to set the attribute to.
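The "create, or overwrite only if different" rule can be sketched against any dict-like attribute container. A plain dict stands in for the attributes of a Dataset or Group here; set_attribute_sketch, its boolean return value, and the use of numpy.array_equal as the comparison are all assumptions made for illustration.

```python
import numpy as np

def set_attribute_sketch(attrs, name, value):
    """Hypothetical: write only when missing or different."""
    if name in attrs and np.array_equal(attrs[name], value):
        return False             # already holds this value: no write
    attrs[name] = value          # create, or overwrite a different value
    return True

attrs = {}
created = set_attribute_sketch(attrs, "n", np.int64(3))
repeated = set_attribute_sketch(attrs, "n", np.int64(3))
changed = set_attribute_sketch(attrs, "n", np.int64(5))
```

Skipping the write when nothing changed avoids needlessly modifying the file.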
set_attribute_string
- hdf5storage.utilities.set_attribute_string(target, name, value)
Sets an attribute to a string on a Dataset or Group.
If the attribute name doesn’t exist yet, it is created. If it already exists, it is overwritten if it differs from value.
- Parameters:
- target : Dataset or Group
Dataset or Group to set the string attribute of.
- name : str
Name of the attribute to set.
- value : string
Value to set the attribute to. Can be any sort of string type that will convert to a numpy.bytes_.
set_attribute_string_array
- hdf5storage.utilities.set_attribute_string_array(target, name, string_list)
Sets an attribute to an array of strings on a Dataset or Group.
If the attribute name doesn’t exist yet, it is created. If it already exists, it is overwritten with the list of strings string_list (they will be vlen strings).
- Parameters:
- target : Dataset or Group
Dataset or Group to set the string array attribute of.
- name : str
Name of the attribute to set.
- string_list : list of str
List of strings to set the attribute to. Strings must be str.