bioimage_py.morphology.distance_transform
Block-wise (halo-based) distance transform and point-to-object mapping.
distance_transform computes the euclidean distance transform (and/or the feature/index transform)
of 2D/3D data via bioimage_cpp.distance.distance_transform. Like the seeded watershed it is a
single-stage, halo-based operation (so it supports block_ids / resume_from): each block is
computed over a halo-padded region and the halo-free inner block is written back. The result is exact
only up to the block boundary plus halo, so choose a halo covering the maximum distance of interest;
for a fixed (block_shape, halo) the output is bit-identical across backends.
map_points_to_objects assigns each point to the nearest object in a segmentation and reports the
distance, by computing the index transform of the background within each (halo-padded) block.
1"""Block-wise (halo-based) distance transform and point-to-object mapping. 2 3``distance_transform`` computes the euclidean distance transform (and/or the feature/index transform) 4of 2D/3D data via ``bioimage_cpp.distance.distance_transform``. Like the seeded watershed it is a 5single-stage, halo-based operation (so it supports ``block_ids`` / ``resume_from``): each block is 6computed over a halo-padded region and the halo-free inner block is written back. The result is exact 7only up to the block boundary plus halo, so choose a halo covering the maximum distance of interest; 8for a fixed ``(block_shape, halo)`` the output is bit-identical across backends. 9 10``map_points_to_objects`` assigns each point to the nearest object in a segmentation and reports the 11distance, by computing the index transform of the background within each (halo-padded) block. 12""" 13from __future__ import annotations 14 15from typing import List, Optional, Sequence, Tuple, Union 16 17import bioimage_cpp as bic 18import numpy as np 19 20from ..runner import get_runner 21from ..runner.config import RunnerConfig 22from ..sources import Source, SourceLike, as_source 23from ..util import (BlockDescriptor, ComputeFn, check_rerun_args, full_roi, is_direct, 24 normalize_halo, to_roi) 25 26__all__ = ["distance_transform", "map_points_to_objects"] 27 28Sampling = Optional[Union[float, Sequence[float]]] 29 30 31def _make_distance_block(sampling: Sampling, return_distances: bool, return_indices: bool, 32 ndim: int) -> ComputeFn: 33 """Build the per-block distance-transform function (captures only picklable values).""" 34 35 def _compute(block: BlockDescriptor, inputs: Sequence[Source], outputs: Sequence[Source], 36 mask: Optional[Source]) -> None: 37 outer = to_roi(block.outer_block) 38 inner = to_roi(block.inner_block) 39 inner_local = to_roi(block.inner_block_local) 40 ret = bic.distance.distance_transform( 41 inputs[0][outer], sampling=sampling, return_distances=return_distances, 42 return_indices=return_indices, number_of_threads=1, 43 ) 44 if return_distances and return_indices: 45 dist, ind = ret 46 elif return_distances: 47 dist, ind = ret, None 48 else: 49 dist, ind = None, ret 50 51 out_i = 0 52 if return_distances: 53 outputs[out_i][inner] = dist[inner_local] 54 out_i += 1 55 if return_indices: 56 # The index transform is local to the (outer) block; shift it to global coordinates. 57 offset = np.array([int(b) for b in block.outer_block.begin], dtype="int32") 58 offset = offset[(slice(None),) + (np.newaxis,) * ndim] 59 ind_inner = ind[(slice(None),) + inner_local] + offset 60 outputs[out_i][(slice(None),) + inner] = ind_inner 61 return None 62 63 return _compute 64 65 66def _allocate_output(array: Optional[SourceLike], shape: Tuple[int, ...], dtype: str, job_type: str, 67 name: str) -> SourceLike: 68 """Return ``array`` if given, else allocate a numpy output for local execution (else raise).""" 69 if array is not None: 70 return array 71 if job_type != "local": 72 raise ValueError( 73 f"'{name}' is required for distributed execution (job_type={job_type!r}); " 74 "pass a file-backed (zarr/n5) output array." 75 ) 76 return np.zeros(shape, dtype=dtype) 77 78 79def distance_transform( 80 input: SourceLike, 81 halo: Sequence[int], 82 *, 83 sampling: Sampling = None, 84 return_distances: bool = True, 85 return_indices: bool = False, 86 distances: Optional[SourceLike] = None, 87 indices: Optional[SourceLike] = None, 88 block_shape: Optional[Tuple[int, ...]] = None, 89 job_type: str = "local", 90 job_config: Optional[RunnerConfig] = None, 91 num_workers: int = 1, 92 block_ids: Optional[Sequence[int]] = None, 93 resume_from: Optional[str] = None, 94) -> Union[SourceLike, Tuple[SourceLike, SourceLike]]: 95 """Compute the (halo-based) distance transform of 2D/3D data, block-wise. 96 97 Each block is computed over a halo-padded region via ``bioimage_cpp.distance.distance_transform`` 98 and the halo-free inner block is written back, so the result is exact only up to the block 99 boundary plus ``halo``. Single-stage, so it supports ``block_ids`` / ``resume_from``. 100 101 Args: 102 input: The input data (a numpy/zarr/n5 array or a `Source`); 2D or 3D. 103 halo: Per-axis halo enlarging each block; **required** for the block-wise path (choose it to 104 cover the maximum distance of interest). Ignored by the direct (single-block) path. 105 sampling: The per-axis voxel spacing passed to the distance transform (isotropic ``1`` by 106 default). 107 return_distances: Whether to compute the distance map. 108 return_indices: Whether to compute the index (feature) transform, an ``(ndim, *shape)`` array 109 holding, per voxel, the global coordinates of the nearest background voxel. 110 distances: Output array for the distances (``float32``, shape ``input.shape``). Optional for 111 local execution -- allocated and returned if omitted; **required** for distributed 112 execution. 113 indices: Output array for the indices (``int32``, shape ``(ndim, *input.shape)``). Optional 114 for local execution; **required** for distributed execution if ``return_indices``. 115 block_shape: Shape of the processing blocks. Defaults to the input chunk shape; required 116 for unchunked data. 117 job_type: Execution backend: one of ``"local"``, ``"subprocess"`` or ``"slurm"``. 118 job_config: Backend configuration (a `RunnerConfig` / `SlurmConfig`). 119 num_workers: Number of parallel workers (threads for ``local``, tasks for distributed 120 backends). 121 block_ids: Restrict processing to these block ids (e.g. to re-run previously failed blocks). 122 Mutually exclusive with ``resume_from``. 123 resume_from: Distributed only; the preserved temp folder of a failed run to resume (see 124 ``runner.run``). Mutually exclusive with ``block_ids``. 125 126 Returns: 127 The distances array if only ``return_distances``; the indices array if only 128 ``return_indices``; a ``(distances, indices)`` tuple if both. 129 """ 130 check_rerun_args(job_type, resume_from, block_ids) 131 src = as_source(input) 132 ndim = src.ndim 133 if ndim not in (2, 3): 134 raise ValueError(f"distance_transform supports 2D or 3D data, got {ndim}D.") 135 if not (return_distances or return_indices): 136 raise ValueError("At least one of 'return_distances' or 'return_indices' must be True.") 137 138 direct = (is_direct(job_type, num_workers, block_shape) 139 and block_ids is None and resume_from is None) 140 141 dist_out = (_allocate_output(distances, tuple(src.shape), "float32", job_type, "distances") 142 if return_distances else None) 143 idx_out = (_allocate_output(indices, (ndim,) + tuple(src.shape), "int32", job_type, "indices") 144 if return_indices else None) 145 outputs: List[SourceLike] = [] 146 if return_distances: 147 outputs.append(dist_out) 148 if return_indices: 149 outputs.append(idx_out) 150 151 if direct: 152 ret = bic.distance.distance_transform( 153 src[full_roi(ndim)], sampling=sampling, return_distances=return_distances, 154 return_indices=return_indices, number_of_threads=1, 155 ) 156 if return_distances and return_indices: 157 dist, ind = ret 158 elif return_distances: 159 dist, ind = ret, None 160 else: 161 dist, ind = None, ret 162 if return_distances: 163 as_source(dist_out)[full_roi(ndim)] = dist 164 if return_indices: 165 as_source(idx_out)[full_roi(ndim + 1)] = ind 166 else: 167 runner = get_runner(job_type, job_config) 168 runner.run(_make_distance_block(sampling, return_distances, return_indices, ndim), 169 [input], outputs=outputs, halo=normalize_halo(halo, ndim), block_shape=block_shape, 170 num_workers=num_workers, block_ids=block_ids, resume_from=resume_from, 171 name="distance_transform") 172 173 if return_distances and return_indices: 174 return dist_out, idx_out 175 return dist_out if return_distances else idx_out 176 177 178def _make_map_points(points: np.ndarray, sampling: Sampling, has_halo: bool, 179 seg_dtype: np.dtype) -> ComputeFn: 180 """Build the per-block point-to-object mapping function (captures the picklable points).""" 181 182 def _compute(block: BlockDescriptor, inputs: Sequence[Source], outputs: Sequence[Source], 183 mask: Optional[Source]) -> Optional[Tuple[np.ndarray, np.ndarray, np.ndarray]]: 184 seg_src = inputs[0] 185 blk = block.outer_block if has_halo else block 186 bb_min = np.array([int(b) for b in blk.begin]) 187 bb_max = np.array([int(e) for e in blk.end]) 188 ndim = bb_min.shape[0] 189 190 in_block = np.logical_and.reduce( 191 [points[:, i] >= bb_min[i] for i in range(ndim)] 192 + [points[:, i] < bb_max[i] for i in range(ndim)] 193 ) 194 if not in_block.any(): 195 return None 196 point_ids = np.where(in_block)[0] 197 block_points = (points[in_block] - bb_min[None, :]).astype("int64") 198 199 block_seg = seg_src[tuple(slice(int(b), int(e)) for b, e in zip(bb_min, bb_max))] 200 if not block_seg.any(): # no objects in the block -> background id 0 at infinite distance. 201 return (point_ids, np.zeros(len(point_ids), dtype=seg_dtype), 202 np.full(len(point_ids), np.inf, dtype="float32")) 203 204 distances, indices = bic.distance.distance_transform( 205 block_seg == 0, sampling=sampling, return_distances=True, return_indices=True, 206 number_of_threads=1, 207 ) 208 coords = tuple(block_points[:, i] for i in range(ndim)) 209 object_distances = distances[coords].astype("float32") 210 nearest = tuple(indices[i][coords] for i in range(ndim)) 211 object_ids = block_seg[nearest] 212 return point_ids, object_ids, object_distances 213 214 return _compute 215 216 217def map_points_to_objects( 218 segmentation: SourceLike, 219 points: np.ndarray, 220 block_shape: Tuple[int, ...], 221 *, 222 halo: Optional[Sequence[int]] = None, 223 sampling: Sampling = None, 224 num_workers: int = 1, 225 job_type: str = "local", 226 job_config: Optional[RunnerConfig] = None, 227 block_ids: Optional[Sequence[int]] = None, 228 resume_from: Optional[str] = None, 229) -> Tuple[np.ndarray, np.ndarray]: 230 """Map point coordinates to the nearest object in a segmentation and measure the distance. 231 232 Each (halo-padded) block computes the index transform of its background and maps the points it 233 contains to the nearest object. Choose ``halo`` to cover the maximum distance of interest; points 234 near a block boundary are resolved to the assignment with the smallest distance across overlapping 235 blocks. 236 237 Args: 238 segmentation: The label image (a numpy/zarr/n5 array or a `Source`). 239 points: The integer point coordinates, an ``(n_points, ndim)`` array. 240 block_shape: Shape of the processing blocks. 241 halo: Per-axis halo enlarging each block; choose it large enough to cover the maximum 242 distance of interest. ``None`` uses non-overlapping blocks (distances are then only 243 correct within each block). 244 sampling: The per-axis voxel spacing passed to the distance transform. 245 num_workers: Number of parallel workers (threads for ``local``, tasks for distributed 246 backends). 247 job_type: Execution backend: one of ``"local"``, ``"subprocess"`` or ``"slurm"``. 248 job_config: Backend configuration (a `RunnerConfig` / `SlurmConfig`). 249 block_ids: Restrict processing to these block ids (e.g. to re-run previously failed blocks). 250 Mutually exclusive with ``resume_from``. 251 resume_from: Distributed only; the preserved temp folder of a failed run to resume (see 252 ``runner.run``). Mutually exclusive with ``block_ids``. 253 254 Returns: 255 A ``(object_ids, object_distances)`` tuple, each of length ``n_points``: the id of the nearest 256 object per point (``0`` if none was found) and the corresponding distance (``inf`` if none). 257 """ 258 check_rerun_args(job_type, resume_from, block_ids) 259 seg_src = as_source(segmentation) 260 points = np.asarray(points) 261 if points.ndim != 2 or points.shape[1] != seg_src.ndim: 262 raise ValueError( 263 f"points must be an (n_points, {seg_src.ndim}) array, got shape {points.shape}." 264 ) 265 n_points = len(points) 266 267 has_halo = halo is not None 268 runner = get_runner(job_type, job_config) 269 results = runner.run(_make_map_points(points, sampling, has_halo, np.dtype(seg_src.dtype)), 270 [segmentation], block_shape=block_shape, 271 halo=normalize_halo(halo, seg_src.ndim) if has_halo else None, 272 num_workers=num_workers, block_ids=block_ids, resume_from=resume_from, 273 has_return_val=True, name="map_points_to_objects") 274 275 object_ids = np.zeros(n_points, dtype=seg_src.dtype) 276 object_distances = np.full(n_points, np.inf, dtype="float32") 277 for res in results: 278 if res is None: 279 continue 280 pids, oids, dists = res 281 take = dists < object_distances[pids] # overlapping blocks: keep the closest assignment. 282 object_ids[pids[take]] = oids[take] 283 object_distances[pids[take]] = dists[take] 284 return object_ids, object_distances
80def distance_transform( 81 input: SourceLike, 82 halo: Sequence[int], 83 *, 84 sampling: Sampling = None, 85 return_distances: bool = True, 86 return_indices: bool = False, 87 distances: Optional[SourceLike] = None, 88 indices: Optional[SourceLike] = None, 89 block_shape: Optional[Tuple[int, ...]] = None, 90 job_type: str = "local", 91 job_config: Optional[RunnerConfig] = None, 92 num_workers: int = 1, 93 block_ids: Optional[Sequence[int]] = None, 94 resume_from: Optional[str] = None, 95) -> Union[SourceLike, Tuple[SourceLike, SourceLike]]: 96 """Compute the (halo-based) distance transform of 2D/3D data, block-wise. 97 98 Each block is computed over a halo-padded region via ``bioimage_cpp.distance.distance_transform`` 99 and the halo-free inner block is written back, so the result is exact only up to the block 100 boundary plus ``halo``. Single-stage, so it supports ``block_ids`` / ``resume_from``. 101 102 Args: 103 input: The input data (a numpy/zarr/n5 array or a `Source`); 2D or 3D. 104 halo: Per-axis halo enlarging each block; **required** for the block-wise path (choose it to 105 cover the maximum distance of interest). Ignored by the direct (single-block) path. 106 sampling: The per-axis voxel spacing passed to the distance transform (isotropic ``1`` by 107 default). 108 return_distances: Whether to compute the distance map. 109 return_indices: Whether to compute the index (feature) transform, an ``(ndim, *shape)`` array 110 holding, per voxel, the global coordinates of the nearest background voxel. 111 distances: Output array for the distances (``float32``, shape ``input.shape``). Optional for 112 local execution -- allocated and returned if omitted; **required** for distributed 113 execution. 114 indices: Output array for the indices (``int32``, shape ``(ndim, *input.shape)``). Optional 115 for local execution; **required** for distributed execution if ``return_indices``. 116 block_shape: Shape of the processing blocks. Defaults to the input chunk shape; required 117 for unchunked data. 118 job_type: Execution backend: one of ``"local"``, ``"subprocess"`` or ``"slurm"``. 119 job_config: Backend configuration (a `RunnerConfig` / `SlurmConfig`). 120 num_workers: Number of parallel workers (threads for ``local``, tasks for distributed 121 backends). 122 block_ids: Restrict processing to these block ids (e.g. to re-run previously failed blocks). 123 Mutually exclusive with ``resume_from``. 124 resume_from: Distributed only; the preserved temp folder of a failed run to resume (see 125 ``runner.run``). Mutually exclusive with ``block_ids``. 126 127 Returns: 128 The distances array if only ``return_distances``; the indices array if only 129 ``return_indices``; a ``(distances, indices)`` tuple if both. 130 """ 131 check_rerun_args(job_type, resume_from, block_ids) 132 src = as_source(input) 133 ndim = src.ndim 134 if ndim not in (2, 3): 135 raise ValueError(f"distance_transform supports 2D or 3D data, got {ndim}D.") 136 if not (return_distances or return_indices): 137 raise ValueError("At least one of 'return_distances' or 'return_indices' must be True.") 138 139 direct = (is_direct(job_type, num_workers, block_shape) 140 and block_ids is None and resume_from is None) 141 142 dist_out = (_allocate_output(distances, tuple(src.shape), "float32", job_type, "distances") 143 if return_distances else None) 144 idx_out = (_allocate_output(indices, (ndim,) + tuple(src.shape), "int32", job_type, "indices") 145 if return_indices else None) 146 outputs: List[SourceLike] = [] 147 if return_distances: 148 outputs.append(dist_out) 149 if return_indices: 150 outputs.append(idx_out) 151 152 if direct: 153 ret = bic.distance.distance_transform( 154 src[full_roi(ndim)], sampling=sampling, return_distances=return_distances, 155 return_indices=return_indices, number_of_threads=1, 156 ) 157 if return_distances and return_indices: 158 dist, ind = ret 159 elif return_distances: 160 dist, ind = ret, None 161 else: 162 dist, ind = None, ret 163 if return_distances: 164 as_source(dist_out)[full_roi(ndim)] = dist 165 if return_indices: 166 as_source(idx_out)[full_roi(ndim + 1)] = ind 167 else: 168 runner = get_runner(job_type, job_config) 169 runner.run(_make_distance_block(sampling, return_distances, return_indices, ndim), 170 [input], outputs=outputs, halo=normalize_halo(halo, ndim), block_shape=block_shape, 171 num_workers=num_workers, block_ids=block_ids, resume_from=resume_from, 172 name="distance_transform") 173 174 if return_distances and return_indices: 175 return dist_out, idx_out 176 return dist_out if return_distances else idx_out
Compute the (halo-based) distance transform of 2D/3D data, block-wise.
Each block is computed over a halo-padded region via bioimage_cpp.distance.distance_transform
and the halo-free inner block is written back, so the result is exact only up to the block
boundary plus halo. Single-stage, so it supports block_ids / resume_from.
Args:
input: The input data (a numpy/zarr/n5 array or a Source); 2D or 3D.
halo: Per-axis halo enlarging each block; required for the block-wise path (choose it to
cover the maximum distance of interest). Ignored by the direct (single-block) path.
sampling: The per-axis voxel spacing passed to the distance transform (isotropic 1 by
default).
return_distances: Whether to compute the distance map.
return_indices: Whether to compute the index (feature) transform, an (ndim, *shape) array
holding, per voxel, the global coordinates of the nearest background voxel.
distances: Output array for the distances (float32, shape input.shape). Optional for
local execution -- allocated and returned if omitted; required for distributed
execution.
indices: Output array for the indices (int32, shape (ndim, *input.shape)). Optional
for local execution; required for distributed execution if return_indices.
block_shape: Shape of the processing blocks. Defaults to the input chunk shape; required
for unchunked data.
job_type: Execution backend: one of "local", "subprocess" or "slurm".
job_config: Backend configuration (a RunnerConfig / SlurmConfig).
num_workers: Number of parallel workers (threads for local, tasks for distributed
backends).
block_ids: Restrict processing to these block ids (e.g. to re-run previously failed blocks).
Mutually exclusive with resume_from.
resume_from: Distributed only; the preserved temp folder of a failed run to resume (see
runner.run). Mutually exclusive with block_ids.
Returns:
The distances array if only return_distances; the indices array if only
return_indices; a (distances, indices) tuple if both.
218def map_points_to_objects( 219 segmentation: SourceLike, 220 points: np.ndarray, 221 block_shape: Tuple[int, ...], 222 *, 223 halo: Optional[Sequence[int]] = None, 224 sampling: Sampling = None, 225 num_workers: int = 1, 226 job_type: str = "local", 227 job_config: Optional[RunnerConfig] = None, 228 block_ids: Optional[Sequence[int]] = None, 229 resume_from: Optional[str] = None, 230) -> Tuple[np.ndarray, np.ndarray]: 231 """Map point coordinates to the nearest object in a segmentation and measure the distance. 232 233 Each (halo-padded) block computes the index transform of its background and maps the points it 234 contains to the nearest object. Choose ``halo`` to cover the maximum distance of interest; points 235 near a block boundary are resolved to the assignment with the smallest distance across overlapping 236 blocks. 237 238 Args: 239 segmentation: The label image (a numpy/zarr/n5 array or a `Source`). 240 points: The integer point coordinates, an ``(n_points, ndim)`` array. 241 block_shape: Shape of the processing blocks. 242 halo: Per-axis halo enlarging each block; choose it large enough to cover the maximum 243 distance of interest. ``None`` uses non-overlapping blocks (distances are then only 244 correct within each block). 245 sampling: The per-axis voxel spacing passed to the distance transform. 246 num_workers: Number of parallel workers (threads for ``local``, tasks for distributed 247 backends). 248 job_type: Execution backend: one of ``"local"``, ``"subprocess"`` or ``"slurm"``. 249 job_config: Backend configuration (a `RunnerConfig` / `SlurmConfig`). 250 block_ids: Restrict processing to these block ids (e.g. to re-run previously failed blocks). 251 Mutually exclusive with ``resume_from``. 252 resume_from: Distributed only; the preserved temp folder of a failed run to resume (see 253 ``runner.run``). Mutually exclusive with ``block_ids``. 254 255 Returns: 256 A ``(object_ids, object_distances)`` tuple, each of length ``n_points``: the id of the nearest 257 object per point (``0`` if none was found) and the corresponding distance (``inf`` if none). 258 """ 259 check_rerun_args(job_type, resume_from, block_ids) 260 seg_src = as_source(segmentation) 261 points = np.asarray(points) 262 if points.ndim != 2 or points.shape[1] != seg_src.ndim: 263 raise ValueError( 264 f"points must be an (n_points, {seg_src.ndim}) array, got shape {points.shape}." 265 ) 266 n_points = len(points) 267 268 has_halo = halo is not None 269 runner = get_runner(job_type, job_config) 270 results = runner.run(_make_map_points(points, sampling, has_halo, np.dtype(seg_src.dtype)), 271 [segmentation], block_shape=block_shape, 272 halo=normalize_halo(halo, seg_src.ndim) if has_halo else None, 273 num_workers=num_workers, block_ids=block_ids, resume_from=resume_from, 274 has_return_val=True, name="map_points_to_objects") 275 276 object_ids = np.zeros(n_points, dtype=seg_src.dtype) 277 object_distances = np.full(n_points, np.inf, dtype="float32") 278 for res in results: 279 if res is None: 280 continue 281 pids, oids, dists = res 282 take = dists < object_distances[pids] # overlapping blocks: keep the closest assignment. 283 object_ids[pids[take]] = oids[take] 284 object_distances[pids[take]] = dists[take] 285 return object_ids, object_distances
Map point coordinates to the nearest object in a segmentation and measure the distance.
Each (halo-padded) block computes the index transform of its background and maps the points it
contains to the nearest object. Choose halo to cover the maximum distance of interest; points
near a block boundary are resolved to the assignment with the smallest distance across overlapping
blocks.
Args:
segmentation: The label image (a numpy/zarr/n5 array or a Source).
points: The integer point coordinates, an (n_points, ndim) array.
block_shape: Shape of the processing blocks.
halo: Per-axis halo enlarging each block; choose it large enough to cover the maximum
distance of interest. None uses non-overlapping blocks (distances are then only
correct within each block).
sampling: The per-axis voxel spacing passed to the distance transform.
num_workers: Number of parallel workers (threads for local, tasks for distributed
backends).
job_type: Execution backend: one of "local", "subprocess" or "slurm".
job_config: Backend configuration (a RunnerConfig / SlurmConfig).
block_ids: Restrict processing to these block ids (e.g. to re-run previously failed blocks).
Mutually exclusive with resume_from.
resume_from: Distributed only; the preserved temp folder of a failed run to resume (see
runner.run). Mutually exclusive with block_ids.
Returns:
A (object_ids, object_distances) tuple, each of length n_points: the id of the nearest
object per point (0 if none was found) and the corresponding distance (inf if none).