volumeutils

Utility functions for analyze-like formats

DtypeMapper()

Specialized mapper for numpy dtypes

Recoder(codes[, fields, map_maker])

class to return canonical code(s) from code or aliases

apply_read_scaling(arr[, slope, inter])

Apply scaling in slope and inter to array arr

array_from_file(shape, in_dtype, infile[, ...])

Get array from file with specified shape, dtype and file offset

array_to_file(data, fileobj[, out_dtype, ...])

Helper function for writing arrays to file objects

best_write_scale_ftype(arr[, slope, inter, ...])

Smallest float type to contain range of arr after scaling

better_float_of(first, second[, default])

Return more capable float type of first and second

finite_range(arr[, check_nan])

Get range (min, max) or range and flag (min, max, has_nan) from arr

fname_ext_ul_case(fname)

fname with ext changed to upper / lower case if file exists

int_scinter_ftype(ifmt[, slope, inter, default])

float type containing int type ifmt * slope + inter

make_dt_codes(codes_seqs)

Create full dt codes Recoder instance from datatype codes

pretty_mapping(mapping[, getterfunc])

Make pretty string from mapping

rec2dict(rec)

Convert recarray to dictionary

seek_tell(fileobj, offset[, write0])

Seek in fileobj or check we're in the right place already

shape_zoom_affine(shape, zooms[, x_flip])

Get affine implied by given shape and zooms

working_type(in_type[, slope, inter])

Return array type from applying slope, inter to array of in_type

write_zeros(fileobj, count[, block_size])

Write count zero bytes to fileobj

DtypeMapper

class nibabel.volumeutils.DtypeMapper

Bases: object

Specialized mapper for numpy dtypes

We pass this mapper into the Recoder class to deal with numpy dtype hashing.

The hashing problem is that dtypes that compare equal may not have the same hash. This is true for numpy versions up to the current version at time of writing (1.6.0). For numpy 1.2.1 at least, even dtypes that look exactly the same in terms of fields don't always have the same hash. This makes dtypes difficult to use as keys in a dictionary.

This class wraps a dictionary in order to implement a __getitem__ to deal with dtype hashing. If the key doesn’t appear to be in the mapping, and it is a dtype, we compare (using ==) all known dtype keys to the input key, and return any matching values for the matching key.

__init__()
keys()
values()
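
A short usage sketch (hedged: it assumes DtypeMapper supports item assignment, consistent with the map_maker requirements described for Recoder below):

>>> import numpy as np
>>> from nibabel.volumeutils import DtypeMapper
>>> dm = DtypeMapper()
>>> key = np.dtype([('f1', 'i2'), ('f2', 'i4')])
>>> dm[key] = 'found'
>>> dm[np.dtype([('f1', 'i2'), ('f2', 'i4')])]   # an equal dtype matches via ==, even if hashes differ
'found'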

Recoder

class nibabel.volumeutils.Recoder(codes, fields=('code', ), map_maker=<class 'collections.OrderedDict'>)

Bases: object

class to return canonical code(s) from code or aliases

The concept is a lot easier to read in the implementation and tests than it is to explain, so…

>>> # If you have some codes, and several aliases, like this:
>>> code1 = 1; aliases1=['one', 'first']
>>> code2 = 2; aliases2=['two', 'second']
>>> # You might want to do this:
>>> codes = [[code1]+aliases1,[code2]+aliases2]
>>> recodes = Recoder(codes)
>>> recodes.code['one']
1
>>> recodes.code['second']
2
>>> recodes.code[2]
2
>>> # Or maybe you have a code, a label and some aliases
>>> codes=((1,'label1','one', 'first'),(2,'label2','two'))
>>> # you might want to get back the code or the label
>>> recodes = Recoder(codes, fields=('code','label'))
>>> recodes.code['first']
1
>>> recodes.code['label1']
1
>>> recodes.label[2]
'label2'
>>> # For convenience, you can get the first entered name by
>>> # indexing the object directly
>>> recodes[2]
2

Create recoder object

codes gives a sequence of (code, aliases...) sequences; fields gives the names by which the entries in these sequences can be accessed.

By default fields gives the first column the name “code”. The first column is the vector of first entries in each of the sequences found in codes. Thence you can get the equivalent first column value with ob.code[value], where value can be a first column value, or a value in any of the other columns in that sequence.

You can give other columns names too, and access them in the same way - see the examples in the class docstring.

Parameters:
codes : sequence of sequences

Each sequence defines values (codes) that are equivalent

fields : {('code',) string sequence}, optional

names by which elements in sequences can be accessed

map_maker : callable, optional

constructor for dict-like objects used to store key-value pairs. Default is collections.OrderedDict. map_maker() generates an empty mapping. The mapping need only implement __getitem__, __setitem__, keys, values.
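
For instance, a non-default map_maker such as DtypeMapper (above) lets numpy dtypes act as aliases; a hedged sketch with made-up codes:

>>> import numpy as np
>>> from nibabel.volumeutils import DtypeMapper, Recoder
>>> codes = ((2, 'uint8', np.dtype(np.uint8)), (4, 'int16', np.dtype(np.int16)))
>>> rc = Recoder(codes, fields=('code', 'label'), map_maker=DtypeMapper)
>>> rc.code[np.dtype(np.int16)]
4
>>> rc.label[np.dtype(np.uint8)]
'uint8'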

__init__(codes, fields=('code', ), map_maker=<class 'collections.OrderedDict'>)

Create recoder object

codes gives a sequence of (code, aliases...) sequences; fields gives the names by which the entries in these sequences can be accessed.

By default fields gives the first column the name “code”. The first column is the vector of first entries in each of the sequences found in codes. Thence you can get the equivalent first column value with ob.code[value], where value can be a first column value, or a value in any of the other columns in that sequence.

You can give other columns names too, and access them in the same way - see the examples in the class docstring.

Parameters:
codes : sequence of sequences

Each sequence defines values (codes) that are equivalent

fields : {('code',) string sequence}, optional

names by which elements in sequences can be accessed

map_maker : callable, optional

constructor for dict-like objects used to store key-value pairs. Default is collections.OrderedDict. map_maker() generates an empty mapping. The mapping need only implement __getitem__, __setitem__, keys, values.

add_codes(code_syn_seqs)

Add codes to object

Parameters:
code_syn_seqs : sequence

sequence of sequences, where each sequence S = code_syn_seqs[n] for n in 0..len(code_syn_seqs) gives values in the same order as self.fields. Each S should be at least as long as self.fields. After this call, if self.fields == ['field1', 'field2'], then self.field1[S[n]] == S[0] for all n in 0..len(S) and self.field2[S[n]] == S[1] for all n in 0..len(S).

Examples

>>> code_syn_seqs = ((2, 'two'), (1, 'one'))
>>> rc = Recoder(code_syn_seqs)
>>> rc.value_set() == set((1,2))
True
>>> rc.add_codes(((3, 'three'), (1, 'first')))
>>> rc.value_set() == set((1,2,3))
True
>>> print(rc.value_set())  # set is actually ordered
OrderedSet([2, 1, 3])
keys()

Return all available code and alias values

Returns same value as obj.field1.keys() and, with the default initializing fields argument of fields=('code',), this will return the same as obj.code.keys().

>>> codes = ((1, 'one'), (2, 'two'), (1, 'repeat value'))
>>> k = Recoder(codes).keys()
>>> set(k) == set([1, 2, 'one', 'repeat value', 'two'])
True
value_set(name=None)

Return OrderedSet of possible returned values for column

By default, the column is the first column.

Returns same values as set(obj.field1.values()) and, with the default initializing fields argument of fields=('code',), this will return the same as set(obj.code.values()).

Parameters:
name : {None, string}

The default of None gives the result for the first column

>>> codes = ((1, 'one'), (2, 'two'), (1, 'repeat value'))
>>> vs = Recoder(codes).value_set()
>>> vs == set([1, 2]) # Sets are not ordered, hence this test
True
>>> rc = Recoder(codes, fields=('code', 'label'))
>>> rc.value_set('label') == set(('one', 'two', 'repeat value'))
True

apply_read_scaling

nibabel.volumeutils.apply_read_scaling(arr, slope=None, inter=None)

Apply scaling in slope and inter to array arr

This is for loading the array from a file (as opposed to the reverse scaling when saving an array to file)

Return data will be arr * slope + inter. The trick is that we have to find a good precision to use for applying the scaling. The heuristic is that the data is always upcast to the higher of the types from arr, slope, inter if slope and/or inter are not default values. If the dtype of arr is an integer, then we assume the data more or less fills the integer range, and upcast to a type such that the min, max of arr.dtype * slope + inter will be finite.

Parameters:
arr : array-like
slope : None or float, optional

slope value to apply to arr (arr * slope + inter). None corresponds to a value of 1.0

inter : None or float, optional

intercept value to apply to arr (arr * slope + inter). None corresponds to a value of 0.0

Returns:
ret : array

array with scaling applied. May be upcast in order to give room for the scaling. If scaling is default (1, 0), then ret may be arr itself (ret is arr).
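
A hedged sketch of the usual case, reading integer data with a slope and intercept (the exact output dtype follows the upcasting heuristic described above):

>>> import numpy as np
>>> from nibabel.volumeutils import apply_read_scaling
>>> arr = np.array([0, 1, 2], dtype=np.int16)
>>> ret = apply_read_scaling(arr, 2.0, 10.0)   # arr * slope + inter
>>> np.all(ret == [10.0, 12.0, 14.0])
True
>>> ret.dtype.kind == 'f'  # upcast from int16 to a float type wide enough for the scaling
True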

array_from_file

nibabel.volumeutils.array_from_file(shape, in_dtype, infile, offset=0, order='F', mmap=True)

Get array from file with specified shape, dtype and file offset

Parameters:
shape : sequence

sequence specifying output array shape

in_dtype : numpy dtype

fully specified numpy dtype, including correct endianness

infile : file-like

open file-like object implementing at least read() and seek()

offset : int, optional

offset in bytes into infile at which to start reading array data. Default is 0

order : {'F', 'C'} string

memory layout in which the array data was written to the file. Default is 'F' (Fortran order).

mmap : {True, False, 'c', 'r', 'r+'}

mmap controls the use of numpy memory mapping for reading data. If False, do not try numpy memmap for data array. If one of {'c', 'r', 'r+'}, try numpy memmap with mode=mmap. A mmap value of True gives the same behavior as mmap='c'. If infile cannot be memory-mapped, ignore the mmap value and read the array from file.

Returns:
arr : array-like

array-like object that can be sliced, containing data

Examples

>>> from io import BytesIO
>>> bio = BytesIO()
>>> arr = np.arange(6).reshape(1,2,3)
>>> _ = bio.write(arr.tobytes('F'))  # outputs int
>>> arr2 = array_from_file((1,2,3), arr.dtype, bio)
>>> np.all(arr == arr2)
True
>>> bio = BytesIO()
>>> _ = bio.write(b' ' * 10)
>>> _ = bio.write(arr.tobytes('F'))
>>> arr2 = array_from_file((1,2,3), arr.dtype, bio, 10)
>>> np.all(arr == arr2)
True

array_to_file

nibabel.volumeutils.array_to_file(data, fileobj, out_dtype=None, offset=0, intercept=0.0, divslope=1.0, mn=None, mx=None, order='F', nan2zero=True)

Helper function for writing arrays to file objects

Writes arrays as scaled by intercept and divslope, and clipped at (prescaling) mn minimum, and mx maximum.

  • Clip data array at min mn, max mx where these are not None -> clipped (this is pre scale clipping)

  • Scale clipped with clipped_scaled = (clipped - intercept) / divslope

  • Clip clipped_scaled to fit into range of out_dtype (post scale clipping) -> clipped_scaled_clipped

  • If converting to integer out_dtype and nan2zero is True, set NaN values in clipped_scaled_clipped to 0 -> clipped_scaled_clipped_n2z

  • Write clipped_scaled_clipped_n2z to fileobj, starting at byte offset, in memory layout order

Parameters:
data : array-like

array or array-like to write.

fileobj : file-like

file-like object implementing write method.

out_dtype : None or dtype, optional

dtype to write array as. Data array will be coerced to this dtype before writing. If None (default) then use input data type.

offset : None or int, optional

offset into fileobj at which to start writing data. Default is 0. None means start at current file position.

intercept : scalar, optional

scalar to subtract from data, before dividing by divslope. Default is 0.0

divslope : None or scalar, optional

scale factor to divide data by before writing. Default is 1.0. If None, there is no valid data and we write zeros.

mn : scalar, optional

minimum threshold in (unscaled) data, such that all data below this value are set to this value. Default is None (no threshold). The typical use is to set -np.inf in the data to have this value (which might be the minimum finite value in the data).

mx : scalar, optional

maximum threshold in (unscaled) data, such that all data above this value are set to this value. Default is None (no threshold). The typical use is to set np.inf in the data to have this value (which might be the maximum finite value in the data).

order : {'F', 'C'}, optional

memory order to write array. Default is 'F'

nan2zero : {True, False}, optional

Whether to set NaN values to 0 when writing integer output. Defaults to True. If False, NaNs will be represented as numpy does when casting; this depends on the underlying C library and is undefined. In practice nan2zero == False might be a good choice when you are completely sure there will be no NaNs in the data. This value is ignored for float output types. NaNs are treated as zero before applying intercept and divslope, so an array [np.nan] with an intercept of 10 becomes [-10] after conversion to integer out_dtype with nan2zero set. That is because you will likely apply divslope and intercept in reverse order when reading the data back, returning the zero you probably expected from the input NaN.

Examples

>>> from io import BytesIO
>>> sio = BytesIO()
>>> data = np.arange(10, dtype=np.float64)
>>> array_to_file(data, sio, np.float64)
>>> sio.getvalue() == data.tobytes('F')
True
>>> _ = sio.truncate(0); _ = sio.seek(0)  # outputs 0
>>> array_to_file(data, sio, np.int16)
>>> sio.getvalue() == data.astype(np.int16).tobytes()
True
>>> _ = sio.truncate(0); _ = sio.seek(0)
>>> array_to_file(data.byteswap(), sio, np.float64)
>>> sio.getvalue() == data.byteswap().tobytes('F')
True
>>> _ = sio.truncate(0); _ = sio.seek(0)
>>> array_to_file(data, sio, np.float64, order='C')
>>> sio.getvalue() == data.tobytes('C')
True
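
The scaling parameters round-trip with array_from_file and apply_read_scaling; a hedged sketch (values chosen so the scaling is exact):

>>> from io import BytesIO
>>> import numpy as np
>>> from nibabel.volumeutils import array_to_file, array_from_file, apply_read_scaling
>>> sio = BytesIO()
>>> data = np.array([10., 20., 30.])
>>> array_to_file(data, sio, np.int16, intercept=10.0, divslope=2.0)  # writes (data - 10) / 2
>>> back = array_from_file((3,), np.dtype(np.int16), sio)
>>> np.all(back == [0, 5, 10])
True
>>> np.all(apply_read_scaling(back, 2.0, 10.0) == data)
True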

best_write_scale_ftype

nibabel.volumeutils.best_write_scale_ftype(arr, slope=1.0, inter=0.0, default=<class 'numpy.float32'>)

Smallest float type to contain range of arr after scaling

Scaling that will be applied to arr is (arr - inter) / slope.

Note that slope and inter get promoted to 1D arrays for this purpose to avoid the numpy scalar casting rules, which prevent scalars upcasting the array.

Parameters:
arr : array-like

array that will be scaled

slope : array-like, optional

scalar such that output array will be (arr - inter) / slope.

inter : array-like, optional

scalar such that output array will be (arr - inter) / slope

default : numpy type, optional

minimum float type to return

Returns:
ftype : numpy type

Best floating point type for scaling. If no floating point type prevents overflow, return the top floating point type. If the input array arr already contains inf values, return the greater of the input type and the default type.

Examples

>>> arr = np.array([0, 1, 2], dtype=np.int16)
>>> best_write_scale_ftype(arr, 1, 0) is np.float32
True

Specify higher default return value

>>> best_write_scale_ftype(arr, 1, 0, default=np.float64) is np.float64
True

Even large values that don’t overflow don’t change output

>>> arr = np.array([0, np.finfo(np.float32).max], dtype=np.float32)
>>> best_write_scale_ftype(arr, 1, 0) is np.float32
True

Scaling > 1 reduces output values, so no upcast needed

>>> best_write_scale_ftype(arr, np.float32(2), 0) is np.float32
True

Scaling < 1 increases values, so upcast may be needed (and is here)

>>> best_write_scale_ftype(arr, np.float32(0.5), 0) is np.float64
True

better_float_of

nibabel.volumeutils.better_float_of(first, second, default=<class 'numpy.float32'>)

Return more capable float type of first and second

Return default if neither first nor second is a float

Parameters:
first : numpy type specifier

Any valid input to np.dtype()

second : numpy type specifier

Any valid input to np.dtype()

default : numpy type specifier, optional

Any valid input to np.dtype()

Returns:
better_type : numpy type

The more capable of first or second if both are floats; if only one is a float, return that; otherwise return default.

Examples

>>> better_float_of(np.float32, np.float64) is np.float64
True
>>> better_float_of(np.float32, 'i4') is np.float32
True
>>> better_float_of('i2', 'u4') is np.float32
True
>>> better_float_of('i2', 'u4', np.float64) is np.float64
True

finite_range

nibabel.volumeutils.finite_range(arr, check_nan=False)

Get range (min, max) or range and flag (min, max, has_nan) from arr

Parameters:
arr : array-like
check_nan : {False, True}, optional

Whether to return a third output, a bool signaling whether there are NaN values in arr

Returns:
mn : scalar

minimum of values in (flattened) array

mx : scalar

maximum of values in (flattened) array

has_nan : bool

Returned if check_nan is True. has_nan is True if there are one or more NaN values in arr

Examples

>>> a = np.array([[-1, 0, 1],[np.inf, np.nan, -np.inf]])
>>> finite_range(a)
(-1.0, 1.0)
>>> a = np.array([[-1, 0, 1],[np.inf, np.nan, -np.inf]])
>>> finite_range(a, check_nan=True)
(-1.0, 1.0, True)
>>> a = np.array([[np.nan],[np.nan]])
>>> finite_range(a) == (np.inf, -np.inf)
True
>>> a = np.array([[-3, 0, 1],[2,-1,4]], dtype=int)
>>> finite_range(a)
(-3, 4)
>>> a = np.array([[1, 0, 1],[2,3,4]], dtype=np.uint)
>>> finite_range(a)
(0, 4)
>>> a = a + 1j
>>> finite_range(a)
(1j, (4+1j))
>>> a = np.zeros((2,), dtype=[('f1', 'i2')])
>>> finite_range(a)
Traceback (most recent call last):
   ...
TypeError: Can only handle numeric types

fname_ext_ul_case

nibabel.volumeutils.fname_ext_ul_case(fname)

fname with ext changed to upper / lower case if file exists

Check for existence of fname. If it does exist, return it unmodified. If it doesn't, check for existence of fname with the extension case changed from lower to upper, or upper to lower. Return this modified fname if it exists. Otherwise return fname unmodified.

Parameters:
fname : str

filename.

Returns:
mod_fname : str

filename, maybe with extension of opposite case
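
A hedged sketch, assuming a case-sensitive filesystem (the temporary file names are illustrative only):

>>> import os, tempfile
>>> from nibabel.volumeutils import fname_ext_ul_case
>>> tmpdir = tempfile.mkdtemp()
>>> upper = os.path.join(tmpdir, 'image.HDR')
>>> open(upper, 'w').close()                 # create only the upper-case variant
>>> fname_ext_ul_case(os.path.join(tmpdir, 'image.hdr')) == upper
True
>>> fname_ext_ul_case(upper) == upper        # existing names come back unmodified
True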

int_scinter_ftype

nibabel.volumeutils.int_scinter_ftype(ifmt, slope=1.0, inter=0.0, default=<class 'numpy.float32'>)

float type containing int type ifmt * slope + inter

Return float type that can represent the max and the min of the ifmt type after multiplication with slope and addition of inter with something like np.array([imin, imax], dtype=ifmt) * slope + inter.

Note that slope and inter get promoted to 1D arrays for this purpose to avoid the numpy scalar casting rules, which prevent scalars upcasting the array.

Parameters:
ifmt : object

numpy integer type (e.g. np.int32)

slope : float, optional

slope, default 1.0

inter : float, optional

intercept, default 0.0

default : object, optional

numpy floating point type, default is np.float32

Returns:
ftype : object

numpy floating point type

Notes

It is difficult to make floats overflow with just addition because the deltas are so large at the extremes of floating point. For example:

>>> arr = np.array([np.finfo(np.float32).max], dtype=np.float32)
>>> res = arr + np.iinfo(np.int16).max
>>> arr == res
array([ True])

Examples

>>> int_scinter_ftype(np.int8, 1.0, 0.0) == np.float32
True
>>> int_scinter_ftype(np.int8, 1e38, 0.0) == np.float64
True

make_dt_codes

nibabel.volumeutils.make_dt_codes(codes_seqs)

Create full dt codes Recoder instance from datatype codes

Include created numpy dtype (from numpy type) and opposite endian numpy dtype

Parameters:
codes_seqs : sequence of sequences

contained sequences may be length 3 or 4, but must all be the same length. Elements are data type code, data type name, and numpy type (such as np.float32). The optional fourth element is the nifti string representation of the code (e.g. "NIFTI_TYPE_FLOAT32")

Returns:
rec : Recoder instance

Recoder that, by default, returns code when indexed with any of the corresponding code, name, type, dtype, or swapped dtype. You can also index with niistring values if codes_seqs had sequences of length 4 instead of 3.
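
A hedged sketch with minimal, made-up 3-element code sequences; the lookups shown follow the return description above:

>>> import numpy as np
>>> from nibabel.volumeutils import make_dt_codes
>>> dt_codes = make_dt_codes(((2, 'uint8', np.uint8), (4, 'int16', np.int16)))
>>> dt_codes.code['int16']
4
>>> dt_codes.code[np.dtype(np.int16)]                  # a numpy dtype also works as a key
4
>>> dt_codes.code[np.dtype(np.int16).newbyteorder()]   # as does the swapped-endian dtype
4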

pretty_mapping

nibabel.volumeutils.pretty_mapping(mapping, getterfunc=None)

Make pretty string from mapping

Adjusts the text column to print values on the basis of the longest key. Probably only sensible if keys are mainly strings.

You can pass in a callable that does clever things to get the values out of the mapping, given the names. By default, we just use __getitem__

Parameters:
mapping : mapping

implementing iterator returning keys and .items()

getterfunc : None or callable

callable taking two arguments, obj and key where obj is the passed mapping. If None, just use lambda obj, key: obj[key]

Returns:
str : string

Examples

>>> d = {'a key': 'a value'}
>>> print(pretty_mapping(d))
a key  : a value
>>> class C: # to control ordering, show get_ method
...     def __iter__(self):
...         return iter(('short_field','longer_field'))
...     def __getitem__(self, key):
...         if key == 'short_field':
...             return 0
...         if key == 'longer_field':
...             return 'str'
...     def get_longer_field(self):
...         return 'method string'
>>> def getter(obj, key):
...     # Look for any 'get_<name>' methods
...     try:
...         return obj.__getattribute__('get_' + key)()
...     except AttributeError:
...         return obj[key]
>>> print(pretty_mapping(C(), getter))
short_field   : 0
longer_field  : method string

rec2dict

nibabel.volumeutils.rec2dict(rec)

Convert recarray to dictionary

Also converts scalar values to scalars

Parameters:
rec : ndarray

structured ndarray

Returns:
dct : dict

dict with key, value pairs as for rec

Examples

>>> r = np.zeros((), dtype = [('x', 'i4'), ('s', 'S10')])
>>> d = rec2dict(r)
>>> d == {'x': 0, 's': b''}
True

seek_tell

nibabel.volumeutils.seek_tell(fileobj, offset, write0=False)

Seek in fileobj or check we’re in the right place already

Parameters:
fileobj : file-like

object implementing seek and (if seek raises an OSError) tell

offset : int

position in file to which to seek

write0 : {False, True}, optional

If True, and standard seek fails, try to write zeros to the file to reach offset. This can be useful when writing bz2 files, which cannot do write seeks.
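
A hedged sketch with an in-memory file (which supports ordinary seeks, so write0 is not needed here):

>>> from io import BytesIO
>>> from nibabel.volumeutils import seek_tell
>>> fobj = BytesIO()
>>> seek_tell(fobj, 16)
>>> fobj.tell()
16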

shape_zoom_affine

nibabel.volumeutils.shape_zoom_affine(shape, zooms, x_flip=True)

Get affine implied by given shape and zooms

We get the translations from the center of the image (implied by shape).

Parameters:
shape : (N,) array-like

shape of image data. N is the number of dimensions

zooms : (N,) array-like

zooms (voxel sizes) of the image

x_flip : {True, False}

whether to flip the X row of the affine. Corresponds to radiological storage on disk.

Returns:
aff : (4,4) array

affine giving correspondence of voxel coordinates to mm coordinates, taking the center of the image as origin

Examples

>>> shape = (3, 5, 7)
>>> zooms = (3, 2, 1)
>>> shape_zoom_affine((3, 5, 7), (3, 2, 1))
array([[-3.,  0.,  0.,  3.],
       [ 0.,  2.,  0., -4.],
       [ 0.,  0.,  1., -3.],
       [ 0.,  0.,  0.,  1.]])
>>> shape_zoom_affine((3, 5, 7), (3, 2, 1), False)
array([[ 3.,  0.,  0., -3.],
       [ 0.,  2.,  0., -4.],
       [ 0.,  0.,  1., -3.],
       [ 0.,  0.,  0.,  1.]])

working_type

nibabel.volumeutils.working_type(in_type, slope=1.0, inter=0.0)

Return array type from applying slope, inter to array of in_type

Numpy type that results from an array of type in_type being combined with slope and inter. It returns something like the dtype type of ((np.zeros((2,), dtype=in_type) - inter) / slope), but ignoring the actual values of slope and inter.

Note that you would not necessarily get the same type by applying slope and inter the other way round. Also, you’ll see that the order in which slope and inter are applied is the opposite of the order in which they are passed.

Parameters:
in_type : numpy type specifier

Numpy type of input array. Any valid input for np.dtype()

slope : scalar, optional

slope to apply to array. If 1.0 (default), ignore this value and its type.

inter : scalar, optional

intercept to apply to array. If 0.0 (default), ignore this value and its type.

Returns:
wtype : numpy type

Numpy type resulting from applying inter and slope to array of type in_type.
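
A hedged sketch (the exact result types follow numpy's casting rules, as described above):

>>> import numpy as np
>>> from nibabel.volumeutils import working_type
>>> np.dtype(working_type(np.int16)) == np.dtype(np.int16)    # default slope, inter leave the type alone
True
>>> np.dtype(working_type(np.int16, slope=2.0)).kind == 'f'   # a float slope forces a float result
True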

write_zeros

nibabel.volumeutils.write_zeros(fileobj, count, block_size=8194)

Write count zero bytes to fileobj

Parameters:
fileobj : file-like object

with write method

count : int

number of bytes to write

block_size : int, optional

largest continuous block to write.
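
A short sketch with an in-memory file:

>>> from io import BytesIO
>>> from nibabel.volumeutils import write_zeros
>>> fobj = BytesIO()
>>> write_zeros(fobj, 20000)          # larger than the default block_size, so several writes
>>> fobj.getvalue() == b'\x00' * 20000
True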