bioimage_py.evaluation.centroid_matching
Object matching scores (precision / recall / segmentation accuracy / f1) by centroid distance.
This complements the overlap-based bioimage_py.evaluation.matching: instead of densifying the
contingency table and scoring object pairs by IoU, it matches objects whose centroids lie within a
distance threshold of one another. This is the natural criterion when objects are point-like (spots,
nuclei, detections) or when voxel overlap is too strict. Like dice_score it is not built on the
contingency table — it needs object centroids, not voxel overlaps.
There are two layers. coordinate_matching is the low-level form: it matches two point sets directly.
centroid_matching is the high-level form: it derives each segmentation's per-object centroids (the
center of mass, via bioimage_py.morphology.morphology, which runs block-wise / distributed) and
then defers to coordinate_matching. Both find the optimal one-to-one assignment with the Hungarian
algorithm (scipy.optimize.linear_sum_assignment) and reuse the precision / recall / f1 /
segmentation-accuracy formulas of the overlap-based matching.
1"""Object matching scores (precision / recall / segmentation accuracy / f1) by centroid distance. 2 3This complements the overlap-based :func:`bioimage_py.evaluation.matching`: instead of densifying the 4contingency table and scoring object pairs by IoU, it matches objects whose centroids lie within a 5distance threshold of one another. This is the natural criterion when objects are point-like (spots, 6nuclei, detections) or when voxel overlap is too strict. Like ``dice_score`` it is *not* built on the 7contingency table — it needs object centroids, not voxel overlaps. 8 9There are two layers. ``coordinate_matching`` is the low-level form: it matches two point sets directly. 10``centroid_matching`` is the high-level form: it derives each segmentation's per-object centroids (the 11center of mass, via :func:`bioimage_py.morphology.morphology`, which runs block-wise / distributed) and 12then defers to ``coordinate_matching``. Both find the optimal one-to-one assignment with the Hungarian 13algorithm (``scipy.optimize.linear_sum_assignment``) and reuse the precision / recall / f1 / 14segmentation-accuracy formulas of the overlap-based matching. 15""" 16from __future__ import annotations 17 18from typing import Dict, Optional, Sequence, Tuple 19 20import numpy as np 21from scipy.optimize import linear_sum_assignment 22from scipy.spatial.distance import cdist 23 24from ..morphology import morphology 25from ..runner.config import RunnerConfig 26from ..sources import SourceLike 27from .matching import _f1, _precision, _recall, _segmentation_accuracy 28 29__all__ = ["coordinate_matching", "centroid_matching"] 30 31_EPS = 1e-12 32 33 34def _count_matches(distances: np.ndarray, distance_threshold: float) -> int: 35 """The number of true positives: optimal one-to-one assignment of points within the threshold.""" 36 n_matched = min(distances.shape) 37 valid = distances <= distance_threshold 38 if n_matched == 0 or not valid.any(): 39 return 0 40 # Prioritize within-threshold pairs; the distance is a tie-breaker normalized so each term is 41 # <= 1 / (2 * n_matched) and can never outweigh the integer validity term (cf. matching._compute_tps). 42 costs = -valid.astype("float64") + distances / (2 * n_matched * (float(distances.max()) + _EPS)) 43 row_ind, col_ind = linear_sum_assignment(costs) 44 return int(np.count_nonzero(valid[row_ind, col_ind])) 45 46 47def _as_coordinates(coordinates: object, name: str) -> np.ndarray: 48 """Coerce ``coordinates`` to a ``(N, ndim)`` float64 array (an empty set becomes ``(0, 0)``).""" 49 arr = np.asarray(coordinates, dtype="float64") 50 if arr.size == 0: 51 return arr.reshape(0, 0) 52 if arr.ndim != 2: 53 raise ValueError(f"{name} must have shape (N, ndim), got shape {arr.shape}.") 54 return arr 55 56 57def _centroids( 58 segmentation: SourceLike, 59 ignore_label: Optional[int], 60 num_workers: int, 61 block_shape: Optional[Tuple[int, ...]], 62 job_type: str, 63 job_config: Optional[RunnerConfig], 64 mask: Optional[SourceLike], 65) -> np.ndarray: 66 """Per-object centers of mass of ``segmentation`` as a ``(N, ndim)`` voxel-coordinate array.""" 67 table = morphology(segmentation, num_workers=num_workers, block_shape=block_shape, 68 job_type=job_type, job_config=job_config, mask=mask) 69 if ignore_label is not None: 70 table = table[table["label"] != ignore_label] 71 com_columns = [column for column in table.columns if column.startswith("com_")] 72 return table[com_columns].to_numpy(dtype="float64") 73 74 75def coordinate_matching( 76 coordinates_a: object, 77 coordinates_b: object, 78 *, 79 distance_threshold: float, 80 resolution: Optional[Sequence[float]] = None, 81) -> Dict[str, float]: 82 """Match two point sets by centroid distance and compute the matching scores. 83 84 Each point in ``coordinates_a`` is matched to at most one point in ``coordinates_b`` (and vice 85 versa) via the optimal one-to-one assignment, counting a pair as a match only if their Euclidean 86 distance does not exceed ``distance_threshold``. ``coordinates_a`` plays the role of the prediction 87 and ``coordinates_b`` the reference, so precision is computed over ``coordinates_a`` and recall over 88 ``coordinates_b`` (matching the orientation of :func:`matching`). 89 90 Args: 91 coordinates_a: The candidate points to evaluate, an array-like of shape ``(N, ndim)``. 92 coordinates_b: The reference points, an array-like of shape ``(M, ndim)`` with the same 93 ``ndim`` as ``coordinates_a``. 94 distance_threshold: Maximum centroid distance (in the units of the coordinates, or of 95 ``resolution`` if given) for two points to count as a match. 96 resolution: Optional per-axis spacing; when given, coordinates are scaled by it before 97 distances are computed, so ``distance_threshold`` is interpreted in physical units. 98 99 Returns: 100 A mapping with keys ``precision``, ``recall``, ``segmentation_accuracy`` and ``f1``. 101 """ 102 coords_a = _as_coordinates(coordinates_a, "coordinates_a") 103 coords_b = _as_coordinates(coordinates_b, "coordinates_b") 104 n_a, n_b = coords_a.shape[0], coords_b.shape[0] 105 106 if n_a and n_b: 107 if coords_a.shape[1] != coords_b.shape[1]: 108 raise ValueError(f"coordinates_a and coordinates_b must have the same ndim, got " 109 f"{coords_a.shape[1]} and {coords_b.shape[1]}.") 110 if resolution is not None: 111 spacing = np.asarray(resolution, dtype="float64") 112 if spacing.shape != (coords_a.shape[1],): 113 raise ValueError(f"resolution must have one entry per axis ({coords_a.shape[1]}), got " 114 f"shape {spacing.shape}.") 115 coords_a, coords_b = coords_a * spacing, coords_b * spacing 116 tp = _count_matches(cdist(coords_a, coords_b), distance_threshold) 117 else: 118 tp = 0 119 120 fp, fn = n_a - tp, n_b - tp 121 return {"precision": _precision(tp, fp, fn), "recall": _recall(tp, fp, fn), 122 "segmentation_accuracy": _segmentation_accuracy(tp, fp, fn), "f1": _f1(tp, fp, fn)} 123 124 125def centroid_matching( 126 segmentation: SourceLike, 127 groundtruth: SourceLike, 128 *, 129 distance_threshold: float, 130 resolution: Optional[Sequence[float]] = None, 131 ignore_label: Optional[int] = 0, 132 num_workers: int = 1, 133 block_shape: Optional[Tuple[int, ...]] = None, 134 job_type: str = "local", 135 job_config: Optional[RunnerConfig] = None, 136 mask: Optional[SourceLike] = None, 137) -> Dict[str, float]: 138 """Compute object-matching scores between two segmentations by centroid distance. 139 140 Derives each object's center of mass with :func:`bioimage_py.morphology.morphology` (the centroid 141 extraction runs block-wise / distributed via the runner arguments) and then matches the two centroid 142 sets with :func:`coordinate_matching`. 143 144 Args: 145 segmentation: Candidate segmentation to evaluate (a numpy/zarr/n5 array or a `Source`); must be 146 integer-typed. 147 groundtruth: The groundtruth segmentation; same shape as ``segmentation``. 148 distance_threshold: Maximum centroid distance (in voxels, or physical units if ``resolution`` 149 is given) for two objects to count as a match. 150 resolution: Optional per-axis voxel spacing; when given, centroids are scaled by it before 151 distances are computed, so ``distance_threshold`` is interpreted in physical units. 152 ignore_label: Object label removed from both segmentations before matching (e.g. background). 153 ``None`` keeps all objects. Note ``morphology`` already excludes label ``0``. 154 num_workers: Number of parallel workers used to extract the centroids. 155 block_shape: Shape of the processing blocks. Defaults to the input chunk shape. 156 job_type: Execution backend: one of ``"local"``, ``"subprocess"`` or ``"slurm"``. 157 job_config: Backend configuration (a `RunnerConfig` / `SlurmConfig`). 158 mask: Optional binary mask; voxels outside the mask are excluded. 159 160 Returns: 161 A mapping with keys ``precision``, ``recall``, ``segmentation_accuracy`` and ``f1``. 162 """ 163 coords_a = _centroids(segmentation, ignore_label, num_workers, block_shape, job_type, job_config, 164 mask) 165 coords_b = _centroids(groundtruth, ignore_label, num_workers, block_shape, job_type, job_config, 166 mask) 167 return coordinate_matching(coords_a, coords_b, distance_threshold=distance_threshold, 168 resolution=resolution)
76def coordinate_matching( 77 coordinates_a: object, 78 coordinates_b: object, 79 *, 80 distance_threshold: float, 81 resolution: Optional[Sequence[float]] = None, 82) -> Dict[str, float]: 83 """Match two point sets by centroid distance and compute the matching scores. 84 85 Each point in ``coordinates_a`` is matched to at most one point in ``coordinates_b`` (and vice 86 versa) via the optimal one-to-one assignment, counting a pair as a match only if their Euclidean 87 distance does not exceed ``distance_threshold``. ``coordinates_a`` plays the role of the prediction 88 and ``coordinates_b`` the reference, so precision is computed over ``coordinates_a`` and recall over 89 ``coordinates_b`` (matching the orientation of :func:`matching`). 90 91 Args: 92 coordinates_a: The candidate points to evaluate, an array-like of shape ``(N, ndim)``. 93 coordinates_b: The reference points, an array-like of shape ``(M, ndim)`` with the same 94 ``ndim`` as ``coordinates_a``. 95 distance_threshold: Maximum centroid distance (in the units of the coordinates, or of 96 ``resolution`` if given) for two points to count as a match. 97 resolution: Optional per-axis spacing; when given, coordinates are scaled by it before 98 distances are computed, so ``distance_threshold`` is interpreted in physical units. 99 100 Returns: 101 A mapping with keys ``precision``, ``recall``, ``segmentation_accuracy`` and ``f1``. 102 """ 103 coords_a = _as_coordinates(coordinates_a, "coordinates_a") 104 coords_b = _as_coordinates(coordinates_b, "coordinates_b") 105 n_a, n_b = coords_a.shape[0], coords_b.shape[0] 106 107 if n_a and n_b: 108 if coords_a.shape[1] != coords_b.shape[1]: 109 raise ValueError(f"coordinates_a and coordinates_b must have the same ndim, got " 110 f"{coords_a.shape[1]} and {coords_b.shape[1]}.") 111 if resolution is not None: 112 spacing = np.asarray(resolution, dtype="float64") 113 if spacing.shape != (coords_a.shape[1],): 114 raise ValueError(f"resolution must have one entry per axis ({coords_a.shape[1]}), got " 115 f"shape {spacing.shape}.") 116 coords_a, coords_b = coords_a * spacing, coords_b * spacing 117 tp = _count_matches(cdist(coords_a, coords_b), distance_threshold) 118 else: 119 tp = 0 120 121 fp, fn = n_a - tp, n_b - tp 122 return {"precision": _precision(tp, fp, fn), "recall": _recall(tp, fp, fn), 123 "segmentation_accuracy": _segmentation_accuracy(tp, fp, fn), "f1": _f1(tp, fp, fn)}
Match two point sets by centroid distance and compute the matching scores.
Each point in coordinates_a is matched to at most one point in coordinates_b (and vice
versa) via the optimal one-to-one assignment, counting a pair as a match only if their Euclidean
distance does not exceed distance_threshold. coordinates_a plays the role of the prediction
and coordinates_b the reference, so precision is computed over coordinates_a and recall over
coordinates_b (matching the orientation of matching()).
Args:
coordinates_a: The candidate points to evaluate, an array-like of shape (N, ndim).
coordinates_b: The reference points, an array-like of shape (M, ndim) with the same
ndim as coordinates_a.
distance_threshold: Maximum centroid distance (in the units of the coordinates, or of
resolution if given) for two points to count as a match.
resolution: Optional per-axis spacing; when given, coordinates are scaled by it before
distances are computed, so distance_threshold is interpreted in physical units.
Returns:
A mapping with keys precision, recall, segmentation_accuracy and f1.
126def centroid_matching( 127 segmentation: SourceLike, 128 groundtruth: SourceLike, 129 *, 130 distance_threshold: float, 131 resolution: Optional[Sequence[float]] = None, 132 ignore_label: Optional[int] = 0, 133 num_workers: int = 1, 134 block_shape: Optional[Tuple[int, ...]] = None, 135 job_type: str = "local", 136 job_config: Optional[RunnerConfig] = None, 137 mask: Optional[SourceLike] = None, 138) -> Dict[str, float]: 139 """Compute object-matching scores between two segmentations by centroid distance. 140 141 Derives each object's center of mass with :func:`bioimage_py.morphology.morphology` (the centroid 142 extraction runs block-wise / distributed via the runner arguments) and then matches the two centroid 143 sets with :func:`coordinate_matching`. 144 145 Args: 146 segmentation: Candidate segmentation to evaluate (a numpy/zarr/n5 array or a `Source`); must be 147 integer-typed. 148 groundtruth: The groundtruth segmentation; same shape as ``segmentation``. 149 distance_threshold: Maximum centroid distance (in voxels, or physical units if ``resolution`` 150 is given) for two objects to count as a match. 151 resolution: Optional per-axis voxel spacing; when given, centroids are scaled by it before 152 distances are computed, so ``distance_threshold`` is interpreted in physical units. 153 ignore_label: Object label removed from both segmentations before matching (e.g. background). 154 ``None`` keeps all objects. Note ``morphology`` already excludes label ``0``. 155 num_workers: Number of parallel workers used to extract the centroids. 156 block_shape: Shape of the processing blocks. Defaults to the input chunk shape. 157 job_type: Execution backend: one of ``"local"``, ``"subprocess"`` or ``"slurm"``. 158 job_config: Backend configuration (a `RunnerConfig` / `SlurmConfig`). 159 mask: Optional binary mask; voxels outside the mask are excluded. 160 161 Returns: 162 A mapping with keys ``precision``, ``recall``, ``segmentation_accuracy`` and ``f1``. 163 """ 164 coords_a = _centroids(segmentation, ignore_label, num_workers, block_shape, job_type, job_config, 165 mask) 166 coords_b = _centroids(groundtruth, ignore_label, num_workers, block_shape, job_type, job_config, 167 mask) 168 return coordinate_matching(coords_a, coords_b, distance_threshold=distance_threshold, 169 resolution=resolution)
Compute object-matching scores between two segmentations by centroid distance.
Derives each object's center of mass with bioimage_py.morphology.morphology (the centroid
extraction runs block-wise / distributed via the runner arguments) and then matches the two centroid
sets with coordinate_matching().
Args:
segmentation: Candidate segmentation to evaluate (a numpy/zarr/n5 array or a Source); must be
integer-typed.
groundtruth: The groundtruth segmentation; same shape as segmentation.
distance_threshold: Maximum centroid distance (in voxels, or physical units if resolution
is given) for two objects to count as a match.
resolution: Optional per-axis voxel spacing; when given, centroids are scaled by it before
distances are computed, so distance_threshold is interpreted in physical units.
ignore_label: Object label removed from both segmentations before matching (e.g. background).
None keeps all objects. Note morphology already excludes label 0.
num_workers: Number of parallel workers used to extract the centroids.
block_shape: Shape of the processing blocks. Defaults to the input chunk shape.
job_type: Execution backend: one of "local", "subprocess" or "slurm".
job_config: Backend configuration (a RunnerConfig / SlurmConfig).
mask: Optional binary mask; voxels outside the mask are excluded.
Returns:
A mapping with keys precision, recall, segmentation_accuracy and f1.