fileslice¶
Utilities for getting array slices out of file-like objects
| calc_slicedefs | Return parameters for slicing array with sliceobj given memory layout |
| canonical_slicers | Return canonical version of sliceobj for array shape shape |
| fileslice | Slice array in fileobj using sliceobj slicer and array definitions |
| fill_slicer | Return slice object with Nones filled out to match in_len |
| is_fancy | Returns True if sliceobj is attempting fancy indexing |
| optimize_read_slicers | Calculates slices to read from disk, and apply after reading |
| optimize_slicer | Return maybe modified slice and post-slice slicing for slicer |
| predict_shape | Predict shape of array from slicing array shape shape with sliceobj |
| read_segments | Read n_bytes byte data implied by segments from fileobj |
| slice2len | Output length after slicing original length in_len with slicer |
| slice2outax | Matching output axes for input array ndim ndim and slice sliceobj |
| slicers2segments | Get segments from read_slicers given in_shape and memory steps |
| strided_scalar | Return array shape shape where all entries point to value scalar |
| threshold_heuristic | Whether to force full axis read or contiguous read of stepped slice |
calc_slicedefs¶
- nibabel.fileslice.calc_slicedefs(sliceobj, in_shape, itemsize, offset, order, heuristic=<function threshold_heuristic>)¶
Return parameters for slicing array with sliceobj given memory layout
Calculate the best combination of skips / (read + discard) to use for reading the data from disk / memory, then generate corresponding segments, the disk offsets and read lengths to read the memory. If we have chosen some (read + discard) optimization, then we need to discard the surplus values from the read array using post_slicers, a slicing tuple that takes the array as read from a file-like object, and returns the array we want.
- Parameters:
- sliceobj : object
something that can be used to slice an array as in arr[sliceobj]
- in_shape : sequence
shape of underlying array to be sliced
- itemsize : int
element size in array (in bytes)
- offset : int
offset of array data in underlying file or memory buffer
- order : {‘C’, ‘F’}
memory layout of underlying array
- heuristic : callable, optional
function taking slice object, dim_len, stride length as arguments, returning one of ‘full’, ‘contiguous’, None. See optimize_slicer() and threshold_heuristic()
- Returns:
- segments : list
list of 2 element lists where lists are (offset, length), giving absolute memory offset in bytes and number of bytes to read
- read_shape : tuple
shape with which to interpret memory as read from segments. Interpreting the memory read from segments with this shape, and a dtype, gives an intermediate array - call this R
- post_slicers : tuple
Any new slicing to be applied to the array R after reading via segments and reshaping via read_shape. Slices are in terms of read_shape. If empty, no new slicing to apply
canonical_slicers¶
- nibabel.fileslice.canonical_slicers(sliceobj, shape, check_inds=True)¶
Return canonical version of sliceobj for array shape shape
sliceobj is a slicer for an array A implied by shape.
- Expand sliceobj with slice(None) to add any missing (implied) axes in sliceobj
- Find any slicers in sliceobj that do a full axis slice and replace by slice(None)
- Replace any floating point values for slicing with integers
- Replace negative integer slice values with equivalent positive integers.
Does not handle fancy indexing (indexing with arrays or array-like indices)
- Parameters:
- sliceobj : object
something that can be used to slice an array as in arr[sliceobj]
- shape : sequence
shape of array that will be indexed by sliceobj
- check_inds : {True, False}, optional
Whether to check if integer indices are out of bounds
- Returns:
- can_slicers : tuple
version of sliceobj for which Ellipses have been expanded, missing (implied) dimensions have been appended, slice objects equivalent to slice(None) have been replaced by slice(None), integer axes have been checked, and negative indices set to positive equivalent
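The canonicalisation steps listed above can be illustrated with a small standard-library sketch. `canonical_sketch` is a hypothetical helper written for this page, not nibabel's implementation; it assumes sliceobj holds only integers, slice objects and at most one Ellipsis (no newaxis, floats or fancy indices).

```python
def canonical_sketch(sliceobj, shape):
    """Simplified, hypothetical stand-in for canonical_slicers."""
    if not isinstance(sliceobj, tuple):
        sliceobj = (sliceobj,)
    # Expand a single Ellipsis into as many full slices as needed.
    n_given = sum(1 for s in sliceobj if s is not Ellipsis)
    expanded = []
    for s in sliceobj:
        if s is Ellipsis:
            expanded.extend([slice(None)] * (len(shape) - n_given))
        else:
            expanded.append(s)
    # Append full slices for any missing (implied) trailing axes.
    expanded.extend([slice(None)] * (len(shape) - len(expanded)))
    result = []
    for s, dim_len in zip(expanded, shape):
        if isinstance(s, slice):
            # A slice covering the whole axis becomes slice(None).
            if s.indices(dim_len) == (0, dim_len, 1):
                s = slice(None)
        else:
            s = int(s)
            if s < 0:
                s += dim_len  # negative index to positive equivalent
            if not 0 <= s < dim_len:
                raise IndexError("index out of bounds")
        result.append(s)
    return tuple(result)


print(canonical_sketch((Ellipsis, -1), (2, 3, 4)))
print(canonical_sketch(slice(0, 10), (2, 3)))
```

The first call expands the Ellipsis and rewrites -1 as 3 for the last axis; the second recognises slice(0, 10) as a full slice of a length-2 axis.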
fileslice¶
- nibabel.fileslice.fileslice(fileobj, sliceobj, shape, dtype, offset=0, order='C', heuristic=<function threshold_heuristic>, lock=None)¶
Slice array in fileobj using sliceobj slicer and array definitions
fileobj contains the contiguous binary data for an array A of shape, dtype, memory layout shape, dtype, order, with the binary data starting at file offset offset.
Our job is to return the sliced array A[sliceobj] in the most efficient way in terms of memory and time.
Sometimes it will be quicker to read memory that we will later throw away, to save time we might lose doing short seeks on fileobj. Call these alternatives: (read + discard); and skip. This routine guesses when to (read + discard) or skip using the callable heuristic, with a default using a hard threshold for the memory gap large enough to prefer a skip.
- Parameters:
- fileobj : file-like object
file-like object, opened for reading in binary mode. Implements read and seek.
- sliceobj : object
something that can be used to slice an array as in arr[sliceobj].
- shape : sequence
shape of full array inside fileobj.
- dtype : dtype specifier
dtype of array inside fileobj, or input to numpy.dtype to specify array dtype.
- offset : int, optional
offset of array data within fileobj
- order : {‘C’, ‘F’}, optional
memory layout of array in fileobj.
- heuristic : callable, optional
function taking slice object, axis length, stride length as arguments, returning one of ‘full’, ‘contiguous’, None. See optimize_slicer() and see threshold_heuristic() for an example.
- lock : {None, threading.Lock, lock-like}, optional
If provided, used to ensure that paired calls to seek and read cannot be interrupted by another thread accessing the same fileobj. Each thread which accesses the same file via read_segments must share a lock in order to ensure that the file access is thread-safe. A lock does not need to be provided for single-threaded access. The default value (None) results in a lock-like object (a _NullLock) which does not do anything.
- Returns:
- sliced_arr : array
Array in fileobj as sliced with sliceobj
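The two alternatives described above, skip and (read + discard), can be compared by hand on a tiny file. This is only a standard-library illustration of the trade-off, not a call into nibabel: the 3 x 4 array, the 4-byte header and the variable names are all assumptions made for the example.

```python
import io
from array import array

shape = (3, 4)                      # C-ordered array, 3 rows by 4 columns
data = array("i", range(12))        # values 0..11
itemsize = data.itemsize
offset = 4                          # pretend header length in bytes
fileobj = io.BytesIO(b"hdr!" + data.tobytes())

col = 2                             # we want the equivalent of A[:, 2]

# skip: one seek and one short read per wanted element.
skipped = array("i")
for row in range(shape[0]):
    fileobj.seek(offset + (row * shape[1] + col) * itemsize)
    skipped.frombytes(fileobj.read(itemsize))

# read + discard: one seek, one long read, then throw away the surplus.
fileobj.seek(offset)
everything = array("i")
everything.frombytes(fileobj.read(len(data) * itemsize))
discarded = everything[col::shape[1]]

print(list(skipped), list(discarded))
```

Both strategies return the same values; they differ only in how many seeks are made and how much surplus memory is read, which is the choice the heuristic callable makes.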
fill_slicer¶
- nibabel.fileslice.fill_slicer(slicer, in_len)¶
Return slice object with Nones filled out to match in_len
Also fixes too large stop / start values according to slice() slicing rules.
The returned slicer can have a None as slicer.stop if slicer.step is negative and the input slicer.stop is None. This is because we can’t represent the stop as an integer, because -1 has a different meaning.
- Parameters:
- slicer : slice object
- in_len : int
length of axis on which slicer will be applied
- Returns:
- can_slicer : slice object
slice with start, stop, step set to explicit values, with the exception of stop for negative step, which is None for the case of slicing down through the first element
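The special case for a negative step is easier to see in code. The sketch below uses the built-in `slice.indices` method; `fill_slicer_sketch` is a hypothetical name for this illustration and is not claimed to match nibabel's implementation line for line.

```python
def fill_slicer_sketch(slicer, in_len):
    """Fill in None values of a slice for an axis of length in_len."""
    start, stop, step = slicer.indices(in_len)
    if step < 0 and stop < 0:
        # slice.indices reports -1 for "run down through element 0", but
        # as a slice stop -1 means "the last element", so keep None.
        stop = None
    return slice(start, stop, step)


axis = list(range(5))
filled = fill_slicer_sketch(slice(None, None, -1), 5)
print(filled)                        # explicit start and step, stop stays None
print(axis[filled])                  # the whole axis, reversed
print(axis[slice(4, -1, -1)])        # an integer stop of -1 selects nothing
```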
is_fancy¶
- nibabel.fileslice.is_fancy(sliceobj)¶
Returns True if sliceobj is attempting fancy indexing
- Parameters:
- sliceobj : object
something that can be used to slice an array as in arr[sliceobj]
- Returns:
- tf : bool
True if sliceobj represents fancy indexing, False for basic indexing
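As a rough illustration of the distinction, the sketch below treats slices, Ellipsis, None and anything convertible to a single integer as basic indexing, and everything else as fancy. `is_fancy_sketch` is a hypothetical standard-library stand-in; it does not handle NumPy arrays the way the real function must.

```python
def is_fancy_sketch(sliceobj):
    """Return True if any element of sliceobj is a sequence-like index."""
    if not isinstance(sliceobj, tuple):
        sliceobj = (sliceobj,)
    for item in sliceobj:
        if item is None or item is Ellipsis or isinstance(item, slice):
            continue                 # always basic indexing
        try:
            int(item)                # scalars are basic indexing
        except TypeError:
            return True              # lists and similar are fancy
    return False


print(is_fancy_sketch((slice(None), 1)))
print(is_fancy_sketch(([0, 2], slice(None))))
```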
optimize_read_slicers¶
- nibabel.fileslice.optimize_read_slicers(sliceobj, in_shape, itemsize, heuristic)¶
Calculates slices to read from disk, and apply after reading
- Parameters:
- sliceobj : object
something that can be used to slice an array as in arr[sliceobj]. Can be assumed to be canonical in the sense of canonical_slicers
- in_shape : sequence
shape of underlying array to be sliced. Array for in_shape assumed to be already in ‘F’ order. Reorder shape / sliceobj for slicing a ‘C’ array before passing to this function.
- itemsize : int
element size in array (bytes)
- heuristic : callable
function taking slice object, axis length, and stride length as arguments, returning one of ‘full’, ‘contiguous’, None. See optimize_slicer(); see threshold_heuristic() for an example.
- Returns:
- read_slicers : tuple
sliceobj maybe rephrased to fill out dimensions that are better read from disk and later trimmed to their original size with post_slicers. read_slicers implies a block of memory to be read from disk. The actual disk positions come from slicers2segments run over read_slicers. Includes any newaxis dimensions in sliceobj
- post_slicers : tuple
Any new slicing to be applied to the read array after reading. The post_slicers discard any memory that we read to save time, but that we don’t need for the slice. Include any newaxis dimension added by sliceobj
optimize_slicer¶
- nibabel.fileslice.optimize_slicer(slicer, dim_len, all_full, is_slowest, stride, heuristic=<function threshold_heuristic>)¶
Return maybe modified slice and post-slice slicing for slicer
- Parameters:
- slicer : slice object or int
- dim_len : int
length of axis along which to slice
- all_full : bool
Whether dimensions up until now have been full (all elements)
- is_slowest : bool
Whether this dimension is the slowest changing in memory / on disk
- stride : int
size of one step along this axis
- heuristic : callable, optional
function taking slice object, dim_len, stride length as arguments, returning one of ‘full’, ‘contiguous’, None. See threshold_heuristic() for an example.
- Returns:
- to_read : slice object or int
maybe modified slice based on slicer expressing what data should be read from an underlying file or buffer. to_read must always have positive step (because we don’t want to go backwards in the buffer / file)
- post_slice : slice object
slice to be applied after array has been read. Applies any transformations in slicer that have not been applied in to_read. If the axis will be dropped by to_read slicing, so that no further slicing would make sense, return the string ‘dropped’
Notes
This is the heart of the algorithm for making segments from slice objects.
A contiguous slice is a slice with slice.step in (1, -1)
A full slice is a contiguous slice returning all elements.
The main question we have to ask is whether we should transform to_read, post_slice to prefer a full read and partial slice. We only do this in the case of all_full==True. In this case we might benefit from reading a contiguous chunk of data even if the slice is not contiguous, or reading all the data even if the slice is not full. Apply the heuristic callable to decide whether to do this, and adapt to_read and post_slice accordingly.
Otherwise (apart from constraint to be positive) return to_read unaltered and post_slice as slice(None)
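The relationship between slicer, to_read and post_slice can be checked on a plain Python list standing in for one axis. The particular pairs below are hand-picked illustrations of the 'contiguous', 'full' and negative-step cases; they are not output captured from optimize_slicer.

```python
axis = list(range(10))              # stand-in for one axis of length 10

wanted = slice(1, 8, 3)             # selects elements 1, 4, 7

# 'contiguous': read every element between start and stop, then step.
contiguous = axis[slice(1, 8, 1)][slice(None, None, 3)]

# 'full': read the whole axis, then apply the original slice.
full = axis[slice(None)][wanted]

# Negative step: read forwards with a positive step, then reverse.
wanted_reversed = slice(7, None, -2)    # selects elements 7, 5, 3, 1
forwards_then_flip = axis[slice(1, 8, 2)][slice(None, None, -1)]

print(contiguous, full, forwards_then_flip)
```

In each case applying post_slice to the data selected by to_read gives the same elements as applying the original slicer directly.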
predict_shape¶
- nibabel.fileslice.predict_shape(sliceobj, in_shape)¶
Predict shape of array from slicing array shape shape with sliceobj
- Parameters:
- sliceobj : object
something that can be used to slice an array as in arr[sliceobj]
- in_shape : sequence
shape of array that could be sliced by sliceobj
- Returns:
- out_shape : tuple
predicted shape arising from slicing array shape in_shape with sliceobj
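The shape rules involved can be sketched without building an array: an integer drops its axis, a slice keeps it with a new length, None adds an axis of length 1, and unmentioned trailing axes are kept whole. `predict_shape_sketch` is a hypothetical helper that assumes sliceobj contains no Ellipsis and no fancy indices.

```python
def predict_shape_sketch(sliceobj, in_shape):
    """Predict the output shape for basic indexing (no Ellipsis)."""
    if not isinstance(sliceobj, tuple):
        sliceobj = (sliceobj,)
    out_shape = []
    axis = 0                                   # next input axis to consume
    for item in sliceobj:
        if item is None:                       # newaxis adds a length-1 axis
            out_shape.append(1)
            continue
        dim_len = in_shape[axis]
        axis += 1
        if isinstance(item, slice):
            out_shape.append(len(range(*item.indices(dim_len))))
        # an integer index drops the axis, so nothing is appended
    out_shape.extend(in_shape[axis:])          # untouched trailing axes
    return tuple(out_shape)


print(predict_shape_sketch((slice(None, None, 2), 1, None), (5, 3, 4)))
```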
read_segments¶
- nibabel.fileslice.read_segments(fileobj, segments, n_bytes, lock=None)¶
Read n_bytes byte data implied by segments from fileobj
- Parameters:
- fileobj : file-like object
Implements seek and read
- segments : sequence
list of 2-element sequences where sequences are (offset, length), giving absolute file offset in bytes and number of bytes to read
- n_bytes : int
total number of bytes that will be read
- lock : {None, threading.Lock, lock-like}, optional
If provided, used to ensure that paired calls to seek and read cannot be interrupted by another thread accessing the same fileobj. Each thread which accesses the same file via read_segments must share a lock in order to ensure that the file access is thread-safe. A lock does not need to be provided for single-threaded access. The default value (None) results in a lock-like object (a _NullLock) which does not do anything.
- Returns:
- buffer : buffer object
object implementing buffer protocol, such as byte string or ndarray or mmap or ctypes c_char_array
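A minimal sketch of the seek-then-read loop, including the role of the lock, might look like the following. `read_segments_sketch` is a hypothetical name; using `contextlib.nullcontext` as the do-nothing lock and raising ValueError on short reads are assumptions of this example, not documented behaviour of read_segments.

```python
import contextlib
import io
import threading


def read_segments_sketch(fileobj, segments, n_bytes, lock=None):
    """Read (offset, length) segments from fileobj into one buffer."""
    if lock is None:
        lock = contextlib.nullcontext()        # stand-in for a null lock
    buffer = bytearray(n_bytes)
    pos = 0
    for offset, length in segments:
        # seek and read must not be separated by another thread's seek.
        with lock:
            fileobj.seek(offset)
            chunk = fileobj.read(length)
        if len(chunk) != length:
            raise ValueError("file too short for requested segment")
        buffer[pos:pos + length] = chunk
        pos += length
    if pos != n_bytes:
        raise ValueError("segment lengths do not add up to n_bytes")
    return buffer


fileobj = io.BytesIO(bytes(range(20)))
shared_lock = threading.Lock()                 # share between threads
result = read_segments_sketch(fileobj, [(2, 3), (10, 2)], 5, shared_lock)
print(list(result))
```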
slice2len¶
- nibabel.fileslice.slice2len(slicer, in_len)¶
Output length after slicing original length in_len with slicer
- Parameters:
- slicer : slice object
- in_len : int
- Returns:
- out_len : int
Length after slicing
Notes
Returns same as len(np.arange(in_len)[slicer])
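The equivalence in the note can be reproduced with the built-in range type, which supports the same slicing rules without allocating an array. `slice2len_sketch` is a hypothetical one-line stand-in used only to show the expected results.

```python
def slice2len_sketch(slicer, in_len):
    """Length of an axis of length in_len after applying slicer."""
    return len(range(in_len)[slicer])


for slicer in (slice(None), slice(1, 8, 3), slice(None, None, -1), slice(20, 30)):
    print(slicer, slice2len_sketch(slicer, 10))
```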
slice2outax¶
- nibabel.fileslice.slice2outax(ndim, sliceobj)¶
Matching output axes for input array ndim ndim and slice sliceobj
- Parameters:
- ndim : int
number of axes in input array
- sliceobj : object
something that can be used to slice an array as in arr[sliceobj]
- Returns:
- out_ax_inds : tuple
Say A is a (pretend) input array of ndim dimensions. Say B = A[sliceobj]. out_ax_inds has one value per axis in A giving corresponding axis in B.
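The axis bookkeeping can be sketched as follows. `slice2outax_sketch` is a hypothetical helper that assumes sliceobj is already a tuple with one integer or slice per input axis, plus optional None (newaxis) entries; marking an axis dropped by an integer index with None is an assumption of this sketch, since the text above does not say how dropped axes are reported.

```python
def slice2outax_sketch(ndim, sliceobj):
    """Map each input axis to its output axis (None if dropped)."""
    out_axes = []
    next_out = 0                     # index of the next output axis
    for item in sliceobj:
        if item is None:
            next_out += 1            # newaxis: output axis with no input axis
        elif isinstance(item, int):
            out_axes.append(None)    # integer index drops this input axis
        else:
            out_axes.append(next_out)
            next_out += 1
    if len(out_axes) != ndim:
        raise ValueError("sliceobj must cover every input axis")
    return tuple(out_axes)


# B = A[:, 1, None, :] keeps A's axis 0 and axis 2; axis 1 disappears.
print(slice2outax_sketch(3, (slice(None), 1, None, slice(None))))
```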
slicers2segments¶
- nibabel.fileslice.slicers2segments(read_slicers, in_shape, offset, itemsize)¶
Get segments from read_slicers given in_shape and memory steps
- Parameters:
- read_slicers : object
something that can be used to slice an array as in arr[sliceobj]. Slice objects can be assumed canonical as in canonical_slicers, and positive as in _positive_slice
- in_shape : sequence
shape of underlying array on disk before reading
- offset : int
offset of array data in underlying file or memory buffer
- itemsize : int
element size in array (in bytes)
- Returns:
- segments : list
list of 2 element lists where lists are [offset, length], giving absolute memory offset in bytes and number of bytes to read
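To make the meaning of the segments concrete, here is a deliberately naive sketch that visits every selected element of an ‘F’ ordered array (first axis fastest in memory), computes its byte position, and merges neighbours into [offset, length] pairs. `segments_sketch` is a hypothetical helper; a real implementation would work on whole runs rather than single elements, and this version assumes integers or positive-step slices only.

```python
from itertools import product


def segments_sketch(read_slicers, in_shape, offset, itemsize):
    """Naive [offset, length] segments for an F-ordered array."""
    # Byte stride of each axis; the first axis changes fastest.
    strides = []
    stride = itemsize
    for dim_len in in_shape:
        strides.append(stride)
        stride *= dim_len
    # Indices selected along each axis.
    index_lists = [
        [s] if isinstance(s, int) else list(range(n)[s])
        for s, n in zip(read_slicers, in_shape)
    ]
    segments = []
    # product varies its last input fastest, so pass the axes reversed.
    for idx in product(*reversed(index_lists)):
        pos = offset + sum(i * st for i, st in zip(reversed(idx), strides))
        if segments and segments[-1][0] + segments[-1][1] == pos:
            segments[-1][1] += itemsize      # extend the current run
        else:
            segments.append([pos, itemsize])
    return segments


# 4 x 3 array of 2-byte items stored after a 10-byte header.
print(segments_sketch((slice(None), 1), (4, 3), 10, 2))
print(segments_sketch((slice(0, 2), slice(None)), (4, 3), 10, 2))
```

The first call selects one whole column, which is a single contiguous run; the second selects the first two rows, which gives one short run per column.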
strided_scalar¶
- nibabel.fileslice.strided_scalar(shape, scalar=0.0)¶
Return array shape shape where all entries point to value scalar
- Parameters:
- shape : sequence
Shape of output array.
- scalar : scalar
Scalar value with which to fill array.
- Returns:
- strided_arr : array
Array of shape shape for which all values == scalar, built by setting all strides of strided_arr to 0, so the scalar is broadcast out to the full array shape. strided_arr is flagged as not writeable.
The array is set read-only to avoid a numpy error when broadcasting - see https://github.com/numpy/numpy/issues/6491
threshold_heuristic¶
- nibabel.fileslice.threshold_heuristic(slicer, dim_len, stride, skip_thresh=256)¶
Whether to force full axis read or contiguous read of stepped slice
Allows fileslice() to sometimes read memory that it will throw away in order to get maximum speed. In other words, trade memory for fewer disk reads.
- Parameters:
- slicer : slice object, or int
If slice, can be assumed to be full as in fill_slicer
- dim_len : int
length of axis being sliced
- stride : int
memory distance between elements on this axis
- skip_thresh : int, optional
Memory gap threshold in bytes above which to prefer skipping memory rather than reading it and later discarding.
- Returns:
- action : {‘full’, ‘contiguous’, None}
Gives the suggested optimization for reading the data
‘full’ - read whole axis
‘contiguous’ - read all elements between start and stop
None - read only memory needed for output
Notes
Let’s say we are in the middle of reading a file at the start of some memory length $B$ bytes. We don’t need the memory, and we are considering whether to read it anyway (then throw it away) (READ) or stop reading, skip $B$ bytes and restart reading from there (SKIP).
After trying some more fancy algorithms, a hard threshold (skip_thresh) for the maximum skip distance seemed to work well, as measured by times on nibabel.benchmarks.bench_fileslice
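One plausible reading of the hard-threshold rule is sketched below. `threshold_sketch` is a hypothetical function written for illustration and may differ in detail from threshold_heuristic; it assumes a slice argument has explicit integer start and stop and a positive step.

```python
def threshold_sketch(slicer, dim_len, stride, skip_thresh=256):
    """Suggest 'full', 'contiguous' or None for one axis."""
    if isinstance(slicer, int):
        # Reading the whole axis for one element wastes dim_len - 1 elements.
        gap = (dim_len - 1) * stride
        return "full" if gap <= skip_thresh else None
    # Gap between consecutive wanted elements, in bytes.
    if slicer.step * stride > skip_thresh:
        return None                  # gaps are large: prefer to skip
    # Gaps are small, so read start..stop in one go; maybe the whole axis.
    unread = dim_len - (slicer.stop - slicer.start)
    return "full" if unread * stride <= skip_thresh else "contiguous"


print(threshold_sketch(slice(0, 10, 2), 10, 4))
print(threshold_sketch(slice(0, 10, 2), 1000, 4))
print(threshold_sketch(slice(0, 10, 2), 10, 1024))
```

With 4-byte elements the step gap is only 8 bytes, so the first two calls choose to read through it; they differ because the second axis has far more unread elements outside start..stop. With a 1024-byte stride the gap exceeds the threshold and skipping is preferred.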