`fileslice`

Utilities for getting array slices out of file-like objects

`calc_slicedefs`(sliceobj, in_shape, itemsize, ...)	Return parameters for slicing array with sliceobj given memory layout
`canonical_slicers`(sliceobj, shape[, check_inds])	Return canonical version of sliceobj for array shape shape
`fileslice`(fileobj, sliceobj, shape, dtype[, ...])	Slice array in fileobj using sliceobj slicer and array definitions
`fill_slicer`(slicer, in_len)	Return slice object with Nones filled out to match in_len
`is_fancy`(sliceobj)	Returns True if sliceobj is attempting fancy indexing
`optimize_read_slicers`(sliceobj, in_shape, ...)	Calculates slices to read from disk, and apply after reading
`optimize_slicer`(slicer, dim_len, all_full, ...)	Return maybe modified slice and post-slice slicing for slicer
`predict_shape`(sliceobj, in_shape)	Predict shape of array from slicing array of shape in_shape with sliceobj
`read_segments`(fileobj, segments, n_bytes[, lock])	Read n_bytes byte data implied by segments from fileobj
`slice2len`(slicer, in_len)	Output length after slicing original length in_len with slicer
`slice2outax`(ndim, sliceobj)	Matching output axes for input array ndim ndim and slice sliceobj
`slicers2segments`(read_slicers, in_shape, ...)	Get segments from read_slicers given in_shape and memory steps
`strided_scalar`(shape[, scalar])	Return array shape shape where all entries point to value scalar
`threshold_heuristic`(slicer, dim_len, stride)	Whether to force full axis read or contiguous read of stepped slice

calc_slicedefs

nibabel.fileslice.calc_slicedefs(sliceobj, in_shape, itemsize, offset, order, heuristic=<function threshold_heuristic>)

Return parameters for slicing array with sliceobj given memory layout

Calculate the best combination of skips / (read + discard) to use for reading the data from disk / memory, then generate corresponding segments, the disk offsets and read lengths to read the memory. If we have chosen some (read + discard) optimization, then we need to discard the surplus values from the read array using post_slicers, a slicing tuple that takes the array as read from a file-like object, and returns the array we want.

Parameters:

sliceobj (object): something that can be used to slice an array as in arr[sliceobj]
in_shape (sequence): shape of underlying array to be sliced
itemsize (int): element size in array (in bytes)
offset (int): offset of array data in underlying file or memory buffer
order ({'C', 'F'}): memory layout of underlying array
heuristic (callable, optional): function taking slice object, dim_len, stride length as arguments, returning one of 'full', 'contiguous', None. See optimize_slicer() and threshold_heuristic()

Returns:

segments (list): list of 2-element lists [offset, length], giving absolute memory offset in bytes and number of bytes to read
read_shape (tuple): shape with which to interpret memory as read from segments. Interpreting the memory read from segments with this shape, and a dtype, gives an intermediate array; call this R
post_slicers (tuple): any new slicing to be applied to the array R after reading via segments and reshaping via read_shape. Slices are in terms of read_shape. If empty, there is no new slicing to apply

canonical_slicers

nibabel.fileslice.canonical_slicers(sliceobj, shape, check_inds=True)

Return canonical version of sliceobj for array shape shape

sliceobj is a slicer for an array A implied by shape.

Expand sliceobj with slice(None) to add any missing (implied) axes in sliceobj
Find any slicers in sliceobj that do a full axis slice and replace by slice(None)
Replace any floating point values for slicing with integers
Replace negative integer slice values with equivalent positive integers.

Does not handle fancy indexing (indexing with arrays or array-like indices)

Parameters:

sliceobj (object): something that can be used to slice an array as in arr[sliceobj]
shape (sequence): shape of array that will be indexed by sliceobj
check_inds ({True, False}, optional): whether to check if integer indices are out of bounds

Returns:

can_slicers (tuple): version of sliceobj for which Ellipses have been expanded, missing (implied) dimensions have been appended, slice objects equivalent to slice(None) have been replaced by slice(None), integer axes have been checked, and negative indices have been set to their positive equivalent
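The normalization steps listed above can be illustrated with the standard library alone. This is a minimal sketch, not nibabel's implementation: the name `canonical_sketch` is hypothetical, and it assumes the slicer contains only integers, integer-valued floats, and slice objects (no Ellipsis, no newaxis).

```python
def canonical_sketch(sliceobj, shape):
    """Normalize ints and slices against shape (hypothetical helper)."""
    if not isinstance(sliceobj, tuple):
        sliceobj = (sliceobj,)
    # Append slice(None) for any missing (implied) trailing axes.
    sliceobj = sliceobj + (slice(None),) * (len(shape) - len(sliceobj))
    out = []
    for index, dim_len in zip(sliceobj, shape):
        if isinstance(index, slice):
            # A slice selecting every element in order is a full slice.
            if range(dim_len)[index] == range(dim_len):
                index = slice(None)
        else:
            index = int(index)  # replaces floats such as 2.0
            if index < 0:
                index += dim_len  # negative -> positive equivalent
            if not 0 <= index < dim_len:
                raise ValueError("index out of bounds")
        out.append(index)
    return tuple(out)


print(canonical_sketch((slice(0, 4), -1), (4, 5, 6)))
# (slice(None, None, None), 4, slice(None, None, None))
```

Comparing `range` objects keeps the full-slice test short: two ranges compare equal when they describe the same sequence of indices.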

fileslice

nibabel.fileslice.fileslice(fileobj, sliceobj, shape, dtype, offset=0, order='C', heuristic=<function threshold_heuristic>, lock=None)

Slice array in fileobj using sliceobj slicer and array definitions

fileobj contains the contiguous binary data for an array A with shape shape, dtype dtype, and memory layout order, with the binary data starting at file offset offset.

Our job is to return the sliced array A[sliceobj] in the most efficient way in terms of memory and time.

Sometimes it will be quicker to read memory that we will later throw away, to save time we might lose doing short seeks on fileobj. Call these alternatives: (read + discard); and skip. This routine guesses when to (read+discard) or skip using the callable heuristic, with a default using a hard threshold for the memory gap large enough to prefer a skip.

Parameters:

fileobj (file-like object): file-like object, opened for reading in binary mode. Implements read and seek.
sliceobj (object): something that can be used to slice an array as in arr[sliceobj].
shape (sequence): shape of full array inside fileobj.
dtype (dtype specifier): dtype of array inside fileobj, or input to numpy.dtype to specify array dtype.
offset (int, optional): offset of array data within fileobj
order ({'C', 'F'}, optional): memory layout of array in fileobj.
heuristic (callable, optional): function taking slice object, axis length, stride length as arguments, returning one of 'full', 'contiguous', None. See optimize_slicer(), and see threshold_heuristic() for an example.
lock ({None, threading.Lock, lock-like}, optional): If provided, used to ensure that paired calls to seek and read cannot be interrupted by another thread accessing the same fileobj. Each thread which accesses the same file via read_segments must share a lock in order to ensure that the file access is thread-safe. A lock does not need to be provided for single-threaded access. The default value (None) results in a lock-like object (a _NullLock) which does not do anything.

Returns:

sliced_arr (array): Array in fileobj as sliced with sliceobj
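The two alternatives described above, skip and (read + discard), can be compared on a small in-memory file. This is only an illustrative sketch using `io.BytesIO` and `struct`; the helper names and the 4-byte header are made up for the example and are not part of nibabel.

```python
import io
import struct

# Twelve little-endian int16 values 0..11 after a 4-byte header (offset=4).
data = b"HEAD" + struct.pack("<12h", *range(12))
fileobj = io.BytesIO(data)
offset, itemsize = 4, 2


def read_with_skips(fobj, start, stop, step):
    """SKIP: seek to each wanted element and read it on its own."""
    out = []
    for i in range(start, stop, step):
        fobj.seek(offset + i * itemsize)
        out.append(struct.unpack("<h", fobj.read(itemsize))[0])
    return out


def read_then_discard(fobj, start, stop, step):
    """READ + DISCARD: one contiguous read, then slice in memory."""
    fobj.seek(offset + start * itemsize)
    n = stop - start
    block = struct.unpack(f"<{n}h", fobj.read(n * itemsize))
    return list(block[::step])  # the in-memory slice plays the post_slicers role


print(read_with_skips(fileobj, 2, 11, 3))    # [2, 5, 8]
print(read_then_discard(fileobj, 2, 11, 3))  # [2, 5, 8]
```

Both strategies return the same values; they differ in the number of seek/read calls (three pairs versus one) and in how many bytes are read and thrown away.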

fill_slicer

nibabel.fileslice.fill_slicer(slicer, in_len)

Return slice object with Nones filled out to match in_len

Also fixes too large stop / start values according to slice() slicing rules.

The returned slicer can have a None as slicer.stop if slicer.step is negative and the input slicer.stop is None. This is because we can’t represent the stop as an integer, because -1 has a different meaning.

Parameters:

slicer (slice object)
in_len (int): length of axis on which slicer will be applied

Returns:

can_slicer (slice object): slice with start, stop, step set to explicit values, with the exception of stop for negative step, which is None for the case of slicing down through the first element
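The behavior described above, including the None stop for a negative step, can be approximated with Python's built-in `slice.indices`. This is a sketch under that assumption (the name `fill_slicer_sketch` is hypothetical) and may differ in detail from nibabel's own function.

```python
def fill_slicer_sketch(slicer, in_len):
    """Fill in None values of slicer for an axis of length in_len."""
    start, stop, step = slicer.indices(in_len)
    if step < 0 and stop < 0:
        # slice.indices reports -1 for "run down through element 0", but
        # slice(start, -1, step) would mean "stop before the last element",
        # so the stop has to stay None.
        stop = None
    return slice(start, stop, step)


print(fill_slicer_sketch(slice(None), 10))              # slice(0, 10, 1)
print(fill_slicer_sketch(slice(2, 100), 10))            # slice(2, 10, 1)
print(fill_slicer_sketch(slice(None, None, -1), 10))    # slice(9, None, -1)
```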

is_fancy

nibabel.fileslice.is_fancy(sliceobj)

Returns True if sliceobj is attempting fancy indexing

Parameters:

sliceobj (object): something that can be used to slice an array as in arr[sliceobj]

Returns:

tf (bool): True if sliceobj represents fancy indexing, False for basic indexing
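One way to picture the distinction is to treat integers, slices, None (newaxis), and Ellipsis as basic, and anything else (lists, arrays) as fancy. The check below is a rough sketch with a hypothetical name, not nibabel's code; note that it also counts booleans as basic because `bool` is an integer subclass in Python.

```python
import numbers


def looks_fancy(sliceobj):
    """Return True if any element of sliceobj is not a basic index."""
    if not isinstance(sliceobj, tuple):
        sliceobj = (sliceobj,)
    basic = (numbers.Integral, slice, type(None), type(Ellipsis))
    return any(not isinstance(index, basic) for index in sliceobj)


print(looks_fancy((slice(None), 1)))       # False
print(looks_fancy(([0, 2], slice(None))))  # True
```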

optimize_read_slicers

nibabel.fileslice.optimize_read_slicers(sliceobj, in_shape, itemsize, heuristic)

Calculates slices to read from disk, and apply after reading

Parameters:

sliceobj (object): something that can be used to slice an array as in arr[sliceobj]. Can be assumed to be canonical in the sense of canonical_slicers
in_shape (sequence): shape of underlying array to be sliced. Array for in_shape assumed to be already in 'F' order. Reorder shape / sliceobj for slicing a 'C' array before passing to this function.
itemsize (int): element size in array (bytes)
heuristic (callable): function taking slice object, axis length, and stride length as arguments, returning one of 'full', 'contiguous', None. See optimize_slicer(); see threshold_heuristic() for an example.

Returns:

read_slicers (tuple): sliceobj maybe rephrased to fill out dimensions that are better read from disk and later trimmed to their original size with post_slicers. read_slicers implies a block of memory to be read from disk. The actual disk positions come from slicers2segments run over read_slicers. Includes any newaxis dimensions in sliceobj
post_slicers (tuple): Any new slicing to be applied to the read array after reading. The post_slicers discard any memory that we read to save time, but that we don't need for the slice. Includes any newaxis dimension added by sliceobj

optimize_slicer

nibabel.fileslice.optimize_slicer(slicer, dim_len, all_full, is_slowest, stride, heuristic=<function threshold_heuristic>)

Return maybe modified slice and post-slice slicing for slicer

Parameters:

slicer (slice object or int)
dim_len (int): length of axis along which to slice
all_full (bool): Whether dimensions up until now have been full (all elements)
is_slowest (bool): Whether this dimension is the slowest changing in memory / on disk
stride (int): size of one step along this axis
heuristic (callable, optional): function taking slice object, dim_len, stride length as arguments, returning one of 'full', 'contiguous', None. See threshold_heuristic() for an example.

Returns:

to_read (slice object or int): maybe modified slice based on slicer expressing what data should be read from an underlying file or buffer. to_read must always have positive step (because we don't want to go backwards in the buffer / file)
post_slice (slice object): slice to be applied after array has been read. Applies any transformations in slicer that have not been applied in to_read. If the axis will be dropped by to_read slicing, so that no slicing would make sense, return the string 'dropped'

Notes

This is the heart of the algorithm for making segments from slice objects.

A contiguous slice is a slice with slice.step in (1, -1)

A full slice is a contiguous slice returning all elements.

The main question we have to ask is whether we should transform to_read, post_slice to prefer a full read and partial slice. We only do this in the case of all_full==True. In this case we might benefit from reading a contiguous chunk of data even if the slice is not contiguous, or reading all the data even if the slice is not full. Apply the callable heuristic to decide whether to do this, and adapt to_read and post_slice accordingly.

Otherwise (apart from constraint to be positive) return to_read unaltered and post_slice as slice(None)
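The positive-step constraint on to_read can be illustrated in isolation: a negative-step slice is read forwards and then reversed by the post-slice. The sketch below covers only that one aspect (no heuristic, no integer indices), and the name `positive_read` is hypothetical.

```python
def positive_read(slicer, dim_len):
    """Split slicer into (to_read, post_slice) with to_read.step > 0."""
    start, stop, step = slicer.indices(dim_len)
    if step > 0:
        return slice(start, stop, step), slice(None)
    indices = range(start, stop, step)
    if len(indices) == 0:
        return slice(0, 0, 1), slice(None)
    # Visit the same elements in ascending order, then reverse afterwards.
    to_read = slice(indices[-1], indices[0] + 1, -step)
    return to_read, slice(None, None, -1)


data = list(range(10))
to_read, post_slice = positive_read(slice(8, 1, -3), len(data))
print(to_read, post_slice)         # slice(2, 9, 3) slice(None, None, -1)
print(data[to_read][post_slice])   # [8, 5, 2]
```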

predict_shape

nibabel.fileslice.predict_shape(sliceobj, in_shape)

Predict shape of array from slicing array of shape in_shape with sliceobj

Parameters:

sliceobj (object): something that can be used to slice an array as in arr[sliceobj]
in_shape (sequence): shape of array that could be sliced by sliceobj

Returns:

out_shape (tuple): predicted shape arising from slicing array shape in_shape with sliceobj
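The shape can be predicted without touching any data by slicing `range` objects. A minimal sketch follows, assuming the slicer holds only integers, slices, and None (newaxis); Ellipsis is not handled and the function name is hypothetical.

```python
def predict_shape_sketch(sliceobj, in_shape):
    """Predict the output shape of indexing an array of in_shape."""
    if not isinstance(sliceobj, tuple):
        sliceobj = (sliceobj,)
    out_shape = []
    axis = 0
    for index in sliceobj:
        if index is None:
            out_shape.append(1)  # newaxis adds a length-1 axis
            continue
        dim_len = in_shape[axis]
        axis += 1
        if isinstance(index, slice):
            out_shape.append(len(range(dim_len)[index]))
        # An integer index drops its axis, so nothing is appended.
    out_shape.extend(in_shape[axis:])  # untouched trailing axes
    return tuple(out_shape)


print(predict_shape_sketch((slice(None), 1), (4, 5, 6)))                # (4, 6)
print(predict_shape_sketch((slice(1, None, 2), None, slice(None)), (5, 3)))  # (2, 1, 3)
```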

read_segments

nibabel.fileslice.read_segments(fileobj, segments, n_bytes, lock=None)

Read n_bytes byte data implied by segments from fileobj

Parameters:

fileobj (file-like object): Implements seek and read
segments (sequence): sequence of 2-element sequences (offset, length), giving absolute file offset in bytes and number of bytes to read
n_bytes (int): total number of bytes that will be read
lock ({None, threading.Lock, lock-like}, optional): If provided, used to ensure that paired calls to seek and read cannot be interrupted by another thread accessing the same fileobj. Each thread which accesses the same file via read_segments must share a lock in order to ensure that the file access is thread-safe. A lock does not need to be provided for single-threaded access. The default value (None) results in a lock-like object (a _NullLock) which does not do anything.

Returns:

buffer (buffer object): object implementing buffer protocol, such as byte string or ndarray or mmap or ctypes c_char_array
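The seek/read pairing and the role of the lock can be sketched as follows. This is an illustration only: `read_segments_sketch` is a hypothetical name, and creating a private `threading.Lock` when none is supplied is a stand-in for the do-nothing lock mentioned above.

```python
import io
import threading


def read_segments_sketch(fileobj, segments, n_bytes, lock=None):
    """Read the bytes named by (offset, length) segments into one buffer."""
    if lock is None:
        lock = threading.Lock()  # private lock, harmless for one thread
    buf = bytearray(n_bytes)
    pos = 0
    for offset, length in segments:
        # seek and read must not be separated by another thread's seek.
        with lock:
            fileobj.seek(offset)
            chunk = fileobj.read(length)
        if len(chunk) != length:
            raise ValueError("could not read the full segment")
        buf[pos:pos + length] = chunk
        pos += length
    if pos != n_bytes:
        raise ValueError("segments do not add up to n_bytes")
    return bytes(buf)


fobj = io.BytesIO(b"abcdefghij")
print(read_segments_sketch(fobj, [(1, 2), (6, 3)], 5))  # b'bcghi'
```

Threads sharing one file object would pass the same lock instance to every call, so that one thread's seek cannot land between another thread's seek and read.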

slice2len

nibabel.fileslice.slice2len(slicer, in_len)

Output length after slicing original length in_len with slicer

Parameters:

slicer (slice object)
in_len (int)

Returns:

out_len (int): Length after slicing

Notes

Returns same as len(np.arange(in_len)[slicer])
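The same length can be computed arithmetically, without building an array. The sketch below (hypothetical name, standard library only) uses `slice.indices` and a ceiling division, and checks itself against slicing a `range`.

```python
def slice2len_sketch(slicer, in_len):
    """Number of elements selected by slicer on an axis of length in_len."""
    start, stop, step = slicer.indices(in_len)
    # Distance covered in the direction of travel.
    span = stop - start if step > 0 else start - stop
    if span <= 0:
        return 0
    return (span + abs(step) - 1) // abs(step)  # ceil(span / |step|)


for s in [slice(None), slice(1, 8, 3), slice(None, None, -3), slice(5, 2)]:
    assert slice2len_sketch(s, 10) == len(range(10)[s])
print(slice2len_sketch(slice(1, 8, 3), 10))  # 3
```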

slice2outax

nibabel.fileslice.slice2outax(ndim, sliceobj)

Matching output axes for input array ndim ndim and slice sliceobj

Parameters:

ndim (int): number of axes in input array
sliceobj (object): something that can be used to slice an array as in arr[sliceobj]

Returns:

out_ax_inds (tuple): Say A is a (pretend) input array of ndim dimensions. Say B = A[sliceobj]. out_ax_inds has one value per axis in A giving corresponding axis in B.
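The axis bookkeeping can be sketched as a single pass over the slicer. Marking dropped axes with None is an assumption of this sketch, as is the restriction to integers, slices, and None (newaxis) in the slicer; the function name is hypothetical.

```python
def slice2outax_sketch(ndim, sliceobj):
    """Map each input axis to its output axis (None if dropped)."""
    if not isinstance(sliceobj, tuple):
        sliceobj = (sliceobj,)
    out_ax_inds = []
    out_ax = 0
    for index in sliceobj:
        if index is None:
            out_ax += 1  # newaxis takes up an output position
        elif isinstance(index, slice):
            out_ax_inds.append(out_ax)  # axis survives
            out_ax += 1
        else:
            out_ax_inds.append(None)  # integer index drops the axis
    # Axes not mentioned in sliceobj pass through unchanged.
    while len(out_ax_inds) < ndim:
        out_ax_inds.append(out_ax)
        out_ax += 1
    return tuple(out_ax_inds)


# For A of shape (4, 5, 6), A[:, None, 0] has shape (4, 1, 6).
print(slice2outax_sketch(3, (slice(None), None, 0)))  # (0, None, 2)
```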

slicers2segments

nibabel.fileslice.slicers2segments(read_slicers, in_shape, offset, itemsize)

Get segments from read_slicers given in_shape and memory steps

Parameters:

read_slicers (object): something that can be used to slice an array as in arr[sliceobj]. Slice objects can be assumed canonical as in canonical_slicers, and positive as in _positive_slice
in_shape (sequence): shape of underlying array on disk before reading
offset (int): offset of array data in underlying file or memory buffer
itemsize (int): element size in array (in bytes)

Returns:

segments (list): list of 2-element lists [offset, length], giving absolute memory offset in bytes and number of bytes to read
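To make the idea of segments concrete, here is a deliberately restricted sketch for a 2D Fortran-ordered array (first axis fastest in memory). It assumes both slicers are slice objects with explicit start, stop, and positive step; the function name is hypothetical and the general N-dimensional case is not attempted.

```python
def segments_2d_fortran(read_slicers, in_shape, offset, itemsize):
    """Return [offset, length] byte segments for a 2D 'F' order array."""
    rows, cols = read_slicers
    n_rows = in_shape[0]
    segments = []
    for col in range(cols.start, cols.stop, cols.step):
        col_start = offset + col * n_rows * itemsize
        if rows.step == 1:
            # One contiguous run inside this column.
            seg = [col_start + rows.start * itemsize,
                   (rows.stop - rows.start) * itemsize]
            if segments and segments[-1][0] + segments[-1][1] == seg[0]:
                segments[-1][1] += seg[1]  # merge with adjoining run
            else:
                segments.append(seg)
        else:
            # Stepped rows: one segment per element.
            for row in range(rows.start, rows.stop, rows.step):
                segments.append([col_start + row * itemsize, itemsize])
    return segments


# Shape (4, 3), 2-byte items, data starting at byte 10.
print(segments_2d_fortran((slice(0, 4, 1), slice(0, 3, 1)), (4, 3), 10, 2))
# [[10, 24]]
print(segments_2d_fortran((slice(1, 3, 1), slice(0, 3, 2)), (4, 3), 10, 2))
# [[12, 4], [28, 4]]
```

A full read of every axis collapses to a single segment, which is why filling out the fastest-changing axes can reduce the number of seeks.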

strided_scalar

nibabel.fileslice.strided_scalar(shape, scalar=0.0)

Return array shape shape where all entries point to value scalar

Parameters:

shape (sequence): Shape of output array.
scalar (scalar): Scalar value with which to fill array.

Returns:

strided_arr (array): Array of shape shape for which all values == scalar, built by setting all strides of strided_arr to 0, so the scalar is broadcast out to the full array shape. strided_arr is flagged as not writeable.

The array is set read-only to avoid a numpy error when broadcasting - see https://github.com/numpy/numpy/issues/6491

threshold_heuristic

nibabel.fileslice.threshold_heuristic(slicer, dim_len, stride, skip_thresh=256)

Whether to force full axis read or contiguous read of stepped slice

Allows fileslice() to sometimes read memory that it will throw away in order to get maximum speed. In other words, trade memory for fewer disk reads.

Parameters:

slicer (slice object, or int): If slice, can be assumed to be full as in fill_slicer
dim_len (int): length of axis being sliced
stride (int): memory distance between elements on this axis
skip_thresh (int, optional): Memory gap threshold in bytes above which to prefer skipping memory rather than reading it and later discarding.

Returns:

action ({'full', 'contiguous', None}): Gives the suggested optimization for reading the data:

'full' - read whole axis
'contiguous' - read all elements between start and stop
None - read only memory needed for output

Notes

Let’s say we are in the middle of reading a file at the start of some memory length $B$ bytes. We don’t need the memory, and we are considering whether to read it anyway (then throw it away) (READ) or stop reading, skip $B$ bytes and restart reading from there (SKIP).

After trying some more fancy algorithms, a hard threshold (skip_thresh) for the maximum skip distance seemed to work well, as measured by times on nibabel.benchmarks.bench_fileslice
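A hard-threshold rule of this kind might look like the sketch below. It is a simplified illustration, not the library's code: it assumes a slice with explicit start and stop and a positive step, and the exact comparisons against skip_thresh are assumptions.

```python
def threshold_sketch(slicer, dim_len, stride, skip_thresh=256):
    """Suggest 'full', 'contiguous', or None for a filled, positive slice."""
    # Bytes between consecutive wanted elements along this axis.
    if slicer.step * stride > skip_thresh:
        return None  # gaps are large: SKIP rather than READ
    # Gaps are small, so read at least start..stop in one go. If the part
    # of the axis outside start..stop is also small, read the whole axis.
    unread = (dim_len - (slicer.stop - slicer.start)) * stride
    return "full" if unread <= skip_thresh else "contiguous"


print(threshold_sketch(slice(0, 100, 2), 100, 4))     # full
print(threshold_sketch(slice(0, 10, 2), 1000, 4))     # contiguous
print(threshold_sketch(slice(0, 100, 2), 100, 1024))  # None
```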
