What’s new in h5py 2.10

New features

  • HDF5 8-bit bitfield data can now be read either as uint8 or booleans (GH821). PyTables stores booleans as this type. For now, you must pick which type to use explicitly:

    with dset.astype(numpy.uint8):   # or numpy.bool_
        arr = dset[:]
    
  • NumPy arrays of integers can now be used for fancy indexing, where previously a Python list was required (GH963).

  • Fancy indexing now allows an empty list or array (GH1174).

  • IPython can now tab-complete names in h5py groups and attributes without any special user action (GH1228). This simple completion only matches the first level of keys in a group, not subkeys. You can still call h5py.enable_ipython_completer() for more complete results.

  • The libver parameter for File now accepts 'v108' and 'v110' to specify compatibility with HDF5 1.8 or 1.10 (GH1155). See Version bounding for details.

  • New functions and constants for getting and identifying special data types: string_dtype(), vlen_dtype(), enum_dtype(), ref_dtype and regionref_dtype replace special_dtype(). For identifying string, vlen and enum dtypes, check_string_dtype(), check_vlen_dtype() and check_enum_dtype() replace check_dtype() (GH1132).

  • A new method make_scale() to conveniently make a dataset into a dimension scale (GH830, GH1212).

  • A new method AttributeManager.get_id() to get a low-level AttrID object referring to an attribute (GH1278).

  • Several examples were updated to run on Python 3 (GH1149).
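
Several of the additions listed above can be tried together. The following is a minimal sketch, not taken from the h5py documentation: the file name, dataset names and sample values are made up for illustration, and it assumes h5py 2.10 or later with NumPy installed. It uses an in-memory file so nothing is written to disk.

```python
import numpy as np
import h5py

# In-memory file: the 'core' driver with backing_store=False never touches disk.
# "whatsnew_sketch.h5" is a placeholder name, not a real file.
with h5py.File("whatsnew_sketch.h5", "w", driver="core", backing_store=False) as f:
    data = f.create_dataset("data", data=np.arange(10) * 10)

    # Fancy indexing with a NumPy integer array (indices in increasing order).
    picked = data[np.array([1, 3, 5])]
    # An empty list is now accepted and selects nothing.
    nothing = data[[]]

    # New helpers for creating special dtypes ...
    str_dt = h5py.string_dtype(encoding="utf-8")
    vlen_dt = h5py.vlen_dtype(np.int32)
    enum_dt = h5py.enum_dtype({"RED": 0, "GREEN": 1}, basetype="i")
    # ... and for identifying them again.
    str_info = h5py.check_string_dtype(str_dt)
    vlen_base = h5py.check_vlen_dtype(vlen_dt)
    enum_map = h5py.check_enum_dtype(enum_dt)

    # Turn a dataset into a dimension scale and attach it to axis 0 of 'data'.
    x = f.create_dataset("x", data=np.linspace(0.0, 1.0, 10))
    x.make_scale("x")
    data.dims[0].attach_scale(x)
    scale_names = list(data.dims[0].keys())

    # Low-level AttrID handle for an existing attribute.
    data.attrs["units"] = "m"
    attr_id = data.attrs.get_id("units")
    attr_dtype = attr_id.dtype  # inspected while the file is still open
```

The check_*_dtype() helpers return None when the dtype is not of the kind being checked, so they can double as simple type tests.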

Deprecations

  • The default behaviour of h5py.File with no specified mode is deprecated (GH1143). It currently tries to create a file or open it for read/write access, silently falling back to read-only depending on permissions. From h5py 3.0, the default will be read-only.

    Ideally, code should pass an explicit mode each time a file is opened:

    h5py.File("example.h5", "r")
    

    The possible modes are described in Opening & creating files. If you want to suppress the deprecation warnings from code you can’t modify, you can either:

    • set h5py.get_config().default_file_mode = 'r' (or another available mode)

    • or set the environment variable H5PY_DEFAULT_READONLY to any non-empty string, to adopt the future default.

  • This is expected to be the last h5py release to support Python 2.7 and 3.4.
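
As an illustration of the advice on explicit file modes, the sketch below opens the same file three times with a mode spelled out each time. The path and dataset names are invented for the example, and a temporary directory is used so that no existing file is touched.

```python
import os
import tempfile

import numpy as np
import h5py

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.h5")  # placeholder file name

    # "w": create the file, truncating it if it already exists.
    with h5py.File(path, "w") as f:
        f["values"] = np.arange(5)

    # "r": read-only, which becomes the default from h5py 3.0.
    with h5py.File(path, "r") as f:
        mode_seen = f.mode
        values = f["values"][:]

    # "a": read/write if the file exists, create it otherwise.
    with h5py.File(path, "a") as f:
        f["more"] = np.zeros(3)
        names = sorted(f.keys())
```

Spelling out the mode keeps the behaviour the same before and after the default changes.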

Exposing HDF5 functions

  • H5Zunregister exposed as h5z.unregister_filter() (GH746, GH1224).

  • The new module h5py.h5pl exposes various H5PL functions to inspect and modify the search path for plugins (GH1166, GH1256).

  • H5Dread_chunk exposed as h5d.read_direct_chunk() (GH1190).
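
To give a feel for the direct chunk read mentioned above, here is a small sketch under a few assumptions: the method is reached through the low-level dataset identifier (dset.id), the underlying HDF5 library is recent enough to provide H5Dread_chunk, and the dataset is chunked with no filters, so the raw bytes are simply the chunk's values in memory order. The file and dataset names are placeholders.

```python
import numpy as np
import h5py

data = np.arange(16, dtype="int32").reshape(4, 4)

# In-memory file; "chunk_sketch.h5" is a placeholder name.
with h5py.File("chunk_sketch.h5", "w", driver="core", backing_store=False) as f:
    dset = f.create_dataset("data", data=data, chunks=(2, 2))

    # Offsets are the element coordinates of the chunk's first element.
    filter_mask, raw = dset.id.read_direct_chunk((0, 2))

    # No filters were applied, so the bytes can be viewed directly as int32.
    chunk = np.frombuffer(raw, dtype=dset.dtype).reshape(dset.chunks)
```

For compressed datasets the returned bytes are still encoded, and decoding them is up to the caller.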

Bugfixes

  • Fix crash with empty variable-length data (GH1248, GH1253).

  • Fixed random selection of data type when reading 64-bit floats on Windows where Python uses random dictionary order (GH1051, GH1134).

  • Pickling h5py objects now fails explicitly. It previously failed on unpickling, and we can’t reliably serialise and restore handles to HDF5 objects anyway (GH531, GH1194). If you need to use these objects in other processes, you could explicitly serialise the filename and the name of the object inside the file. Or consider h5pickle, which does the same implicitly.

  • Creating a dataset with external storage can no longer mutate the external list parameter passed in (GH1205). It also has improved error messages (GH1204).

  • Certain deprecation warnings will now show the relevant line of code which uses the deprecated feature (GH1146).

  • Skipped a failing test for complex floating point numbers on 32-bit x86 systems (GH1235).

  • Disabled the longdouble type on the ppc64le architecture, as it was causing segfaults with more commonly used float types (GH1243).

  • Documented that nested compound types are not currently supported (GH1236).

  • Fixed attribute create method to be consistent with __setattr__ (GH1265).
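
The workaround suggested above for pickling, passing the filename and the object's name instead of the h5py object itself, might look like the following sketch. The helper names describe() and read_dataset() are hypothetical and not part of h5py, and the file layout is invented for the example.

```python
import os
import pickle
import tempfile

import numpy as np
import h5py


def describe(obj):
    """Return a picklable (filename, object name) pair for an h5py object."""
    return (obj.file.filename, obj.name)


def read_dataset(ref):
    """Reopen the file read-only and load the referenced dataset into memory."""
    filename, name = ref
    with h5py.File(filename, "r") as f:
        return f[name][()]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "shared.h5")  # placeholder file name
    with h5py.File(path, "w") as f:
        dset = f.create_dataset("group/data", data=np.arange(4))
        # This tuple, unlike dset itself, can be pickled and sent elsewhere.
        payload = pickle.dumps(describe(dset))

    # In another process, this would run after receiving the payload.
    restored = read_dataset(pickle.loads(payload))
```

Each process opens its own handle to the file, so no HDF5 handles cross a process boundary.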

Building h5py

  • The version of HDF5 can now be automatically detected on Windows (GH1123).

  • Fixed autodetecting the version from libhdf5 in default locations on Windows and Mac (GH1240).

  • The build now fails if the HDF5 version cannot be detected from libhdf5, rather than assuming 1.8.4 as a default (GH1241).

  • Building h5py from source on Unix platforms now requires either pkg-config or an explicitly specified path to HDF5 (GH1231). Previously it had a hardcoded default path, but when this was wrong, the failures were unnecessarily confusing.

  • The Cython ‘language level’ is now explicitly set to 2, to prepare h5py for changing defaults in Cython (GH1171).

  • Avoid using setup_requires when pip calls setup.py egg_info (GH1259).

Development

  • h5py’s tests are now run by pytest (GH1003), and coverage reports are automatically generated on Codecov.