segmentation-measurement
segmentation-measurement is a Python library for post-processing, measuring, and analyzing instance segmentations from microscopy images. It provides:
- Post-processing: filter small segments, remove small holes, compute ring-masks around segments.
- Intensity measurements: per-object mean, median, min, max, standard deviation, and percentile statistics.
- Morphology measurements: per-object area/volume, perimeter/surface area, sphericity, solidity, axis lengths, and equivalent diameter; supports anisotropic pixel/voxel sizes.
- Cell-nucleus measurements: per-cell nucleus count, cell-to-nucleus area/volume ratio, and optional cytoplasmic vs. nuclear intensity ratios from paired cell and nucleus segmentations.
- Threshold analysis: categorize objects into named groups based on any measurement column using automatic or manual thresholds.
- Clustering analysis: cluster objects using k-means, DBSCAN, HDBSCAN, or Mean Shift on any combination of measurement features, with an interactive 2-D feature-reduction scatter plot (UMAP, t-SNE, or PCA).
- Classification analysis: train a random forest or logistic regression classifier from interactive napari brush annotations and apply it to all segments, with optional export of the trained classifier.
- Batch processing across multiple segmentations: define named groups of layers in the napari plugin and run any measurement or analysis widget over every member of a group with a single click, with results written back per-layer.
- Napari plugin: interactive widgets for all of the above, with table visualization and export to CSV, TSV, and Excel.
- CLI: command-line interface for all functionality.
All functions support 2D and 3D inputs.
Installation
Install the core library with pip:
pip install segmentation-measurement
To also install the napari plugin and its dependencies:
pip install "segmentation-measurement[napari]"
To install from source:
git clone https://github.com/computational-cell-analytics/segmentation-measurement-tool
cd segmentation-measurement-tool
pip install -e .
# or with napari support:
pip install -e ".[napari]"
Requires Python 3.9 or later.
Napari Plugin
The segmentation-measurement napari plugin provides interactive widgets for
post-processing segmentation label layers and for computing and exploring per-segment
measurements – all without writing any Python code.
Installation
pip install "segmentation-measurement[napari]"
After installation the plugin is automatically discovered by napari.
Opening the Widgets
Open any widget from the napari menu:
Plugins → Segmentation Measurement → Postprocessing
Plugins → Segmentation Measurement → Intensity Measurement → Intensity Measurement
Plugins → Segmentation Measurement → Morphology Measurement → Morphology Measurement
Plugins → Segmentation Measurement → Threshold Analysis → Threshold Analysis
Plugins → Segmentation Measurement → Cell-Nucleus Measurement → Cell-Nucleus Measurement
Plugins → Segmentation Measurement → Clustering Analysis → Clustering Analysis
Plugins → Segmentation Measurement → Classification Analysis → Classification Analysis
Plugins → Segmentation Measurement → Table Manipulation → Table Manipulation
Plugins → Segmentation Measurement → Group Manager → Group Manager
All widgets appear as dockable panels that can be placed anywhere in the napari window.
Working with measurement tables
Measurement and analysis results are stored as the features table of the source
Labels layer. napari ships a built-in Features Table dock
(Layers → Visualize → Features table widget) that displays this table for the
currently selected layer. Whenever a widget in this plugin writes to a layer's
features (after a measurement, after applying a threshold/cluster/classifier, after
loading or editing a table), the dock is opened automatically and the source layer
is selected so the result is visible immediately. The dock supports sorting,
in-place editing, copy/paste, CSV save, and bidirectional row ↔ viewer selection
sync.
Working with groups (batch processing)
A group is a named bundle of layers that you batch over. Each group lists ordered layers under three roles:
- `segmentation` (required, ≥1 layer) — the primary label layers to process
- `nucleus_segmentation` (optional) — paired nucleus segmentations for the Cell-Nucleus widget
- `intensity_image` (optional) — paired intensity images for the Intensity widget and (optionally) the Cell-Nucleus widget
Within a group, layers across roles are paired by position:
segmentation[i] is matched with nucleus_segmentation[i] and
intensity_image[i]. An optional role's list must either be empty or have
exactly the same length as the segmentation list.
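The pairing rule can be sketched in a few lines of Python. `pair_group` is a hypothetical helper, not part of the plugin's API; it only illustrates the position-based matching and length validation described above.

```python
def pair_group(segmentation, nucleus_segmentation=(), intensity_image=()):
    """Pair layers across roles by position. Optional roles must be empty
    or have exactly the same length as the segmentation list."""
    n = len(segmentation)
    for name, role in [("nucleus_segmentation", nucleus_segmentation),
                       ("intensity_image", intensity_image)]:
        if role and len(role) != n:
            raise ValueError(f"{name} must be empty or have {n} entries")
    nuc = list(nucleus_segmentation) or [None] * n
    img = list(intensity_image) or [None] * n
    return list(zip(segmentation, nuc, img))

pairs = pair_group(["cells_01", "cells_02"], ["nuclei_01", "nuclei_02"])
# pairs[0] pairs "cells_01" with "nuclei_01" and no intensity image
```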
Groups are defined in the Group Manager Widget. Once defined, every measurement and analysis widget exposes a Target combo at the top with two options:
- `<single layer>` (default) — original single-layer behaviour. Pick one segmentation (and optional pair partners) via the existing combos.
- a group name — operate on the group's members. Measurement widgets iterate over members and write results into each member's `features`. Analysis widgets concatenate features across members for joint computation, then split results back to each member's `features` and emit one output label layer per member named `{output}_{layer}` (or just `{output}` for a single-member group).
Renaming a layer that is referenced by a group automatically updates the group definition.
Postprocessing Widget
The Postprocessing widget applies one of four post-processing operations to an existing label layer and writes the result to a new layer or back to the same layer.
Layout
┌─────────────────────────────────┐
│ Input segmentation: [combo] │
│ Output name: [combo] │
│ Method: [combo] │
│ ┌ Parameters ──────────────┐ │
│ │ <method-specific param> │ │
│ └──────────────────────────┘ │
│ [Run] │
└─────────────────────────────────┘
Controls
Input segmentation : Dropdown list of all Labels layers currently loaded in napari. Select the layer you want to process.
Output name
: Dropdown showing existing Labels layers plus the default entry postprocessed. You
can also type a new name directly. If the chosen name matches an existing layer the
data of that layer is updated in place; otherwise a new Labels layer is added. Setting
the output name equal to the input name processes the input layer in place.
Method : Choose one of the four post-processing operations (see below). The Parameters panel updates immediately to show the relevant controls.
Run : Apply the selected method with the current parameters.
Methods
Filter Small Segments
Removes segments whose pixel (2-D) or voxel (3-D) count is below the threshold.
Removed segments become background (0).
- Min size – Minimum number of pixels/voxels a segment must have to be retained (default: 100).
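The operation amounts to a label-count filter. A minimal numpy sketch (not the plugin's implementation) looks like this:

```python
import numpy as np

def filter_small_segments(seg, min_size=100):
    """Set segments with fewer than min_size pixels/voxels to background (0)."""
    labels, counts = np.unique(seg, return_counts=True)
    too_small = labels[(labels != 0) & (counts < min_size)]
    out = seg.copy()
    out[np.isin(out, too_small)] = 0
    return out

seg = np.zeros((10, 10), dtype=int)
seg[:2, :2] = 1      # 4-pixel segment, below threshold
seg[5:, 5:] = 2      # 25-pixel segment, retained
filtered = filter_small_segments(seg, min_size=10)
```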
Remove Small Holes
Fills enclosed background holes within segments when the hole size does not exceed the threshold. Other segments are never overwritten.
- Max hole size – Maximum hole size in pixels/voxels to fill (default: 50).
Ring Mask
Creates an annular ring of a fixed width around each segment. Rings are placed only on background pixels; overlapping rings resolve in favour of the smaller label ID. This is commonly used to create pseudo-cytoplasm masks around segmented nuclei.
- Ring width – Width of the ring in pixels/voxels (default: 5).
- Keep original – When checked (default), the original segment pixels are retained in
the output alongside the ring pixels. Uncheck to produce a ring-only mask where the
original segment interiors are set to
0.
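A ring mask can be approximated with per-label binary dilation. This sketch assumes scipy is available and mimics the two behaviours described above (background-only rings, smaller label ID winning in overlaps); the plugin's actual algorithm may differ.

```python
import numpy as np
from scipy import ndimage

def ring_mask(seg, ring_width=5, keep_original=True):
    """Annular rings on background pixels only; where rings would overlap,
    the smaller label ID wins."""
    rings = np.zeros_like(seg)
    background = seg == 0
    # paint larger IDs first so smaller IDs overwrite them in overlaps
    for label in sorted(np.unique(seg[seg > 0]), reverse=True):
        dilated = ndimage.binary_dilation(seg == label, iterations=ring_width)
        rings[dilated & background] = label
    if keep_original:
        rings[seg > 0] = seg[seg > 0]
    return rings

seg = np.zeros((9, 9), dtype=int)
seg[4, 4] = 1                                   # single-pixel segment
rings_only = ring_mask(seg, ring_width=1, keep_original=False)
```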
Watershed
Refines a segmentation using the watershed algorithm. The selected input segmentation is used as seed markers; a separate heatmap image layer provides the topographic landscape. The algorithm floods from low heatmap values upward, so the heatmap should have low values at segment boundaries. If your heatmap instead has high values at object centres (e.g. a distance transform or foreground-probability map), negate it before passing it to the widget.
- Heatmap – Image layer used as the watershed landscape (low values flooded first).
- Mask (optional) – Label layer whose footprint restricts processing. Pixels outside the mask are set to 0 in the output. Select None to process all pixels (default).
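The negation advice above can be demonstrated with scikit-image's `watershed`: a distance transform is high at object centres, so it is negated before flooding. This is an illustrative sketch, not the widget's code.

```python
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed

# One foreground blob with two seed markers.
fg = np.zeros((10, 10), dtype=bool)
fg[1:9, 1:9] = True
dist = ndimage.distance_transform_edt(fg)   # high at the centre
seeds = np.zeros(fg.shape, dtype=int)
seeds[3, 3] = 1
seeds[6, 6] = 2

# Negate so low values (former centres) are flooded first;
# pixels outside the mask stay 0, as in the widget.
result = watershed(-dist, markers=seeds, mask=fg)
```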
Intensity Measurement Widget
The Intensity Measurement widget computes per-segment intensity statistics from a label layer and a co-registered intensity image.
Layout
┌─────────────────────────────────┐
│ Target: [combo] │
│ Segmentation: [combo] │
│ Intensity image: [combo] │
│ [Measure intensities] │
└─────────────────────────────────┘
Workflow
- Pick a Target: `<single layer>` (default) for the original single-pair behaviour, or a previously defined group name to batch over its members. In group mode the Segmentation and Intensity image combos become read-only and show the first member; the group must define a non-empty `intensity_image` role.
- Select a Segmentation layer (Labels) from the first dropdown.
- Select an Intensity image layer (Image) from the second dropdown.
- Click Measure intensities.
In group mode the measurement is run on every
(segmentation[i], intensity_image[i]) pair in turn, writing results into
each segmentation layer's features. See the
Working with groups section for
the broader pattern.
The result is merged into the segmentation layer's features table and the napari
Features Table dock is opened automatically with the segmentation layer selected.
Re-running the measurement on the same layer silently overwrites the existing
intensity columns; running a different measurement (e.g. Morphology) on the same
layer adds its columns alongside.
The columns added to layer.features are:
| Column | Description |
|---|---|
| `index` | Integer segment label ID |
| `mean_intensity` | Mean pixel intensity |
| `median_intensity` | Median pixel intensity |
| `max_intensity` | Maximum pixel intensity |
| `min_intensity` | Minimum pixel intensity |
| `std_intensity` | Standard deviation |
| `percentile_10` | 10th percentile |
| `percentile_25` | 25th percentile (Q1) |
| `percentile_75` | 75th percentile (Q3) |
| `percentile_90` | 90th percentile |
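The same statistics can be reproduced with numpy and pandas, for example to verify results outside napari. This sketch uses the column names above but is not the library's implementation.

```python
import numpy as np
import pandas as pd

def measure_intensities(seg, img):
    """Per-segment intensity statistics, one row per label."""
    rows = []
    for label in np.unique(seg[seg > 0]):
        v = img[seg == label]
        row = {
            "index": int(label),
            "mean_intensity": v.mean(),
            "median_intensity": np.median(v),
            "max_intensity": v.max(),
            "min_intensity": v.min(),
            "std_intensity": v.std(),
        }
        row.update({f"percentile_{p}": np.percentile(v, p) for p in (10, 25, 75, 90)})
        rows.append(row)
    return pd.DataFrame(rows)

seg = np.array([[1, 1], [1, 1]])
img = np.array([[1.0, 2.0], [3.0, 4.0]])
table = measure_intensities(seg, img)
```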
Saving the table
Use the Save as CSV button in the napari Features Table dock to export the features. For TSV / XLSX export, use the Save table button in the Table Manipulation Widget below.
Morphology Measurement Widget
The Morphology Measurement widget computes per-segment shape descriptors from a label layer. Physical pixel/voxel sizes can be specified per axis to obtain measurements in real-world units.
Layout (scrollable)
┌─────────────────────────────────┐
│ Target: [combo] │
│ Segmentation: [combo] │
│ ┌ Physical pixel/voxel size ─┐ │
│ │ Y: [spinbox] │ │
│ │ X: [spinbox] │ │
│ └────────────────────────────┘ │
│ [Measure morphology] │
└─────────────────────────────────┘
For 3-D data a third spinbox Z is added automatically.
Workflow
- Pick a Target: `<single layer>` (default) for the original single-layer behaviour, or a previously defined group name to batch over its segmentation members. In group mode the Segmentation combo becomes read-only and shows the first member; the same scale settings apply to every member.
- Select a Segmentation layer (Labels) from the dropdown. The scale spinboxes are pre-populated from the layer's `scale` attribute if it has been set; otherwise they default to `1.0`.
- Adjust the per-axis scale values if needed. For isotropic data a single value applies to all axes; for anisotropic data set each axis independently.
- Click Measure morphology.
The result is merged into the segmentation layer's features table and the napari
Features Table dock is opened automatically with the segmentation layer selected.
Re-running the measurement on the same layer silently overwrites the morphology
columns; running a different measurement on the same layer adds its columns
alongside.
The columns added to layer.features (one row per segment) are:
2-D columns
| Column | Description |
|---|---|
| `index` | Integer segment label ID |
| `area` | Area in physical units (px² or µm² etc.) |
| `perimeter` | Perimeter length in physical units |
| `sphericity` | Circularity: 1.0 for a perfect circle, <1 for elongated or irregular shapes |
| `solidity` | Ratio of segment area to convex hull area |
| `axis_major_length` | Length of the major axis of the fitted ellipse |
| `axis_minor_length` | Length of the minor axis of the fitted ellipse |
| `equivalent_diameter` | Diameter of a circle with the same area |
3-D columns
| Column | Description |
|---|---|
| `index` | Integer segment label ID |
| `volume` | Volume in physical units (voxels or µm³ etc.) |
| `surface_area` | Surface area computed via marching cubes |
| `sphericity` | 1.0 for a perfect sphere, <1 for elongated or irregular shapes |
| `solidity` | Ratio of segment volume to convex hull volume |
| `axis_major_length` | Length of the major axis of the fitted ellipsoid |
| `axis_minor_length` | Length of the minor axis of the fitted ellipsoid |
| `equivalent_diameter` | Diameter of a sphere with the same volume |
Saving the table
Use the Save as CSV button in the napari Features Table dock, or the Save table button in the Table Manipulation Widget for TSV / XLSX output.
Threshold Analysis Widget
The Threshold Analysis widget categorizes segments into named groups based on one or more
thresholds applied to any numeric column of the selected layer's features table.
Layout (scrollable)
┌──────────────────────────────────────┐
│ Target: [combo] │
│ Segmentation: [combo] │
│ ┌ Column histogram ────────────────┐ │
│ │ Column: [combo] │ │
│ │ <histogram plot> │ │
│ └──────────────────────────────────┘ │
│ ┌ Categorization ──────────────────┐ │
│ │ Number of categories: [spin] │ │
│ │ Threshold 1: [spin] ... │ │
│ │ Name 1: [edit] ... │ │
│ │ [Suggest thresholds] │ │
│ │ Output layer: [edit] │ │
│ │ [Categorize] │ │
│ └──────────────────────────────────┘ │
└──────────────────────────────────────┘
Workflow
Step 1 – Select the segmentation layer or group
The widget operates on the features table of the selected Labels layer (populated
by Intensity Measurement, Morphology Measurement, Cell-Nucleus
Measurement, or by loading a CSV via the Table Manipulation
Widget).
- Run one of the measurement widgets first (or load a CSV).
- Pick a Target: `<single layer>` (default) for the original single-layer behaviour, or a previously defined group name. In group mode the Segmentation combo becomes read-only. The histogram and threshold suggestion are computed on the concatenation of all members' `features` so a single set of thresholds can be applied consistently across the batch.
- In single-layer mode, pick the layer from the Segmentation dropdown. The Column dropdown is filled with its numeric columns (excluding `index` and `category_id`).
Step 2 – Explore the histogram
The Column histogram section shows the distribution of the currently selected column. Threshold lines are drawn in red so you can evaluate the split visually before applying it.
Note: The histogram requires matplotlib. Install it with `pip install matplotlib` if it is not already available.
Step 3 – Set thresholds and categorize
- Set Number of categories (2–10). The threshold and name fields update automatically.
- Enter threshold values in the Threshold spin-boxes, or click Suggest thresholds to auto-populate them using equally-spaced quantiles of the selected column. The threshold lines on the histogram update in real time.
- Optionally rename each category in the Name fields (defaults: `category_1`, `category_2`, …).
- Enter an Output layer name (default: `categories`).
- Click Categorize.
Two things happen:
- The source layer's `features` gains two new columns: `category_id` (integer, 1-based) and `category_name` (string). The Features Table dock is opened automatically with the source layer selected so you can inspect them.
- A new Labels layer is created (or updated) in napari where each segment is assigned its category ID as the label value. Use napari's built-in colormap controls to distinguish the categories visually.
In group mode every member's features receives the new columns and one
output Labels layer is created per member named {output}_{layer_name}
(or just {output} when the group has a single member).
How thresholds are applied
Segments with a value below the first threshold are assigned category 1, segments between the first and second threshold are assigned category 2, and so on. Thresholds need not be sorted; the widget sorts them internally.
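This assignment rule is equivalent to `numpy.digitize` with sorted thresholds, shifted to 1-based IDs. A minimal sketch (`categorize` is a hypothetical helper, not the plugin's API):

```python
import numpy as np

def categorize(values, thresholds):
    """1-based categories: below the first threshold gives 1, between the
    first and second gives 2, and so on. Unsorted thresholds are sorted."""
    return np.digitize(values, sorted(thresholds)) + 1

# Thresholds given out of order, as the widget allows.
category_id = categorize([5.0, 15.0, 25.0], thresholds=[20.0, 10.0])
# → array([1, 2, 3])
```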
Cell-Nucleus Measurement Widget
The Cell-Nucleus Measurement widget computes per-cell features that combine a cell segmentation with a nucleus segmentation. It reports the number of nuclei per cell, cell-to-nucleus area/volume ratios, and optionally cytoplasmic vs. nuclear intensity statistics.
Layout (scrollable)
┌─────────────────────────────────────┐
│ Target: [combo] │
│ Cell segmentation: [combo] │
│ Nucleus segmentation: [combo] │
│ Intensity image (optional): [combo] │
│ ┌ Physical pixel/voxel size ──────┐ │
│ │ Y: [spinbox] │ │
│ │ X: [spinbox] │ │
│ └─────────────────────────────────┘ │
│ [Measure cell-nucleus] │
└─────────────────────────────────────┘
For 3-D data a third spinbox Z is added automatically.
Workflow
- Pick a Target: `<single layer>` (default) for the original per-pair behaviour, or a previously defined group name to batch over its members. In group mode the cell, nucleus, and intensity-image combos become read-only and show the first member; the group must define a non-empty `nucleus_segmentation` role, while `intensity_image` is optional and is used when present.
- Select a Cell segmentation layer (Labels) from the first dropdown. The scale spinboxes are pre-populated from the layer's `scale` attribute if it has been set; otherwise they default to `1.0`.
- Select a Nucleus segmentation layer (Labels) from the second dropdown. This layer must have the same spatial dimensions as the cell segmentation.
- Optionally select an Intensity image layer (Image) from the third dropdown. Choose `(none)` to skip intensity measurements.
- Adjust the per-axis scale values if needed (same convention as the Morphology widget).
- Click Measure cell-nucleus.
- Click Measure cell-nucleus.
In group mode the measurement is run on every
(segmentation[i], nucleus_segmentation[i], intensity_image[i] or None)
triple in turn, writing results into each cell-segmentation layer's
features.
The result is merged into the cell segmentation layer's features table and the
napari Features Table dock is opened automatically with that layer selected.
The table contains one row per cell.
Columns – without intensity image (2-D)
| Column | Description |
|---|---|
| `index` | Integer cell label ID |
| `n_nuclei` | Number of nucleus labels overlapping with this cell |
| `cell_area` | Area of the whole cell in physical units (nucleus included) |
| `nucleus_area` | Total area of nuclei within this cell in physical units |
| `area_ratio` | `cell_area / nucleus_area`; NaN for cells with no nucleus |
For 3-D data the columns are cell_volume, nucleus_volume, and volume_ratio
instead.
Additional columns – with intensity image
When an intensity image is selected, columns are added for each statistic {stat} in
mean, median, max, min, percentile_10, percentile_25, percentile_75,
percentile_90:
| Column | Description |
|---|---|
| `cell_{stat}_intensity` | Statistic over the cytoplasmic region (cell pixels where no nucleus is present) |
| `nucleus_{stat}_intensity` | Statistic over all nuclear pixels within this cell |
| `{stat}_intensity_ratio` | `cell_{stat}_intensity / nucleus_{stat}_intensity`; NaN when either region is empty or the nucleus value is zero |
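The core counting and ratio logic can be sketched with plain numpy. This illustrative helper covers only `n_nuclei` and the area ratio; the plugin additionally handles intensity statistics and 3-D volumes.

```python
import numpy as np
import pandas as pd

def cell_nucleus_2d(cells, nuclei, pixel_area=1.0):
    """Per-cell nucleus count and cell-to-nucleus area ratio."""
    rows = []
    for label in np.unique(cells[cells > 0]):
        cell_mask = cells == label
        nuc = nuclei[cell_mask]                       # nucleus labels inside cell
        cell_area = float(cell_mask.sum()) * pixel_area
        nucleus_area = float((nuc > 0).sum()) * pixel_area
        rows.append({
            "index": int(label),
            "n_nuclei": int(len(np.unique(nuc[nuc > 0]))),
            "cell_area": cell_area,
            "nucleus_area": nucleus_area,
            "area_ratio": cell_area / nucleus_area if nucleus_area else float("nan"),
        })
    return pd.DataFrame(rows)

cells = np.ones((6, 6), dtype=int)        # one cell covering the full frame
nuclei = np.zeros((6, 6), dtype=int)
nuclei[2:4, 2:4] = 1                      # one 2x2 nucleus inside it
table = cell_nucleus_2d(cells, nuclei)
```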
Saving the table
Use the Save as CSV button in the napari Features Table dock, or the Save table button in the Table Manipulation Widget for TSV / XLSX output.
Clustering Analysis Widget
The Clustering Analysis widget groups segments into clusters based on all numeric columns in a previously computed measurement table. After clustering, a 2-D feature-reduction scatter plot visualises the result, and a new label layer is created where each segment is painted with its cluster ID. The scatter-plot colours and the label-layer colours are kept in sync.
Layout (scrollable)
┌──────────────────────────────────────┐
│ Target: [combo] │
│ Segmentation: [combo] │
│ ┌ Feature reduction ───────────────┐ │
│ │ Method: [UMAP▾] [Reduce] │ │
│ │ <2-D scatter plot> │ │
│ └──────────────────────────────────┘ │
│ ┌ Clustering ──────────────────────┐ │
│ │ Method: [K-Means▾] │ │
│ │ <method-specific parameters> │ │
│ │ Output layer: [edit] │ │
│ │ [Cluster] │ │
│ └──────────────────────────────────┘ │
└──────────────────────────────────────┘
Workflow
Step 1 – Select the segmentation layer or group
The widget operates on the features table of the selected Labels layer. Run a
measurement widget first (or load a CSV via the Table Manipulation widget), then
either pick the layer from the Segmentation dropdown or pick a group
from the Target dropdown.
In group mode the Segmentation combo becomes read-only. Features
across the group's members are concatenated for joint clustering and the
2-D feature-reduction scatter plot, and one output Labels layer is
created per member named {output}_{layer_name} (or just {output} for
a single-member group). The same cluster colours are applied across all
output layers so cluster IDs match visually.
Step 2 – Explore the feature space (optional)
- Choose a Feature reduction method: UMAP (default), TSNE, or PCA.
- Click Reduce to compute a 2-D embedding of the features and display an uncoloured scatter plot.
Note: UMAP requires the optional `umap-learn` package (`pip install umap-learn`). If it is not installed the widget falls back to PCA automatically. Changing the reduction method clears the cached embedding so the next Reduce or Cluster call recomputes it.
Step 3 – Cluster
- Select a Clustering method from the dropdown (see table below).
- Adjust the method-specific parameters shown below the dropdown.
- Enter an Output layer name (default: `clusters`).
- Click Cluster.
Three things happen simultaneously:
- The source layer's `features` gains a new `cluster_id` column (1-based; `-1` for noise) and the napari Features Table dock is opened automatically with that layer selected.
- The scatter plot is redrawn with each point coloured by its cluster.
- A new Labels layer is created (or updated) where each segment is painted with its `cluster_id`. The layer colours are set to exactly match the scatter-plot colours.
If you re-run clustering, the existing cluster_id column is excluded from the feature
set so it does not affect the new result.
Clustering methods and parameters
| Method | Widget label | Key parameters (defaults) |
|---|---|---|
| scikit-learn KMeans | K-Means | N clusters (3) |
| scikit-learn DBSCAN | DBSCAN | Eps (0.5), Min samples (5) |
| scikit-learn HDBSCAN | HDBSCAN | Min cluster size (5) |
| scikit-learn MeanShift | Mean Shift | Bandwidth (0 = auto) |
Cluster IDs and label values
Cluster IDs are 1-based: the first cluster found is 1, the second is 2, and so on.
Segments that are classified as noise by DBSCAN or HDBSCAN receive cluster_id = -1 and
remain as background (0) in the output label layer.
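The clustering step itself follows the usual scikit-learn pattern: standardise the feature columns, fit a clusterer, and shift the labels to 1-based IDs. This is a sketch with hypothetical measurement columns, not the widget's code.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Two well-separated blobs in a feature table with made-up columns.
rng = np.random.default_rng(0)
features = pd.DataFrame({
    "area": np.concatenate([rng.normal(50, 2, 20), rng.normal(500, 2, 20)]),
    "mean_intensity": np.concatenate([rng.normal(10, 1, 20),
                                      rng.normal(100, 1, 20)]),
})

X = StandardScaler().fit_transform(features)          # z-score per column
raw = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
features["cluster_id"] = raw + 1                      # 1-based, as in the widget
```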
Color matching
The widget uses matplotlib's tab10 (or tab20 when there are more than 10 clusters)
colormap to assign one colour per cluster. The same colour array is applied to the
Labels layer via DirectLabelColormap, so the scatter-plot legend and the segmentation
overlay always show identical colours.
Saving the table
Use the Save as CSV button in the napari Features Table dock, or the Save table button in the Table Manipulation Widget for TSV / XLSX output.
Classification Analysis Widget
The Classification Analysis widget lets you interactively annotate a small number of segments with class labels using napari's paint tools, train a random forest or logistic regression classifier on those annotations, and then apply it to every segment in the table. The result is written to a new label layer and two new columns in the measurement table. Trained classifiers can be exported to disk for batch use from the CLI.
Layout (scrollable)
┌──────────────────────────────────────┐
│ Target: [combo] │
│ ┌ Layers ──────────────────────────┐ │
│ │ Segmentation: [combo] │ │
│ │ Annotation layer: [combo] [Create new] │
│ └──────────────────────────────────┘ │
│ ┌ Class names ─────────────────────┐ │
│ │ Label ID │ Class Name │ │
│ │ <editable rows> │ │
│ └──────────────────────────────────┘ │
│ ┌ Classifier ──────────────────────┐ │
│ │ Method: [Random Forest▾] │ │
│ │ <method-specific parameters> │ │
│ │ Output layer: [edit] │ │
│ │ [x] Live Update │ │
│ │ [Train & Apply] │ │
│ │ [Export classifier] │ │
│ └──────────────────────────────────┘ │
└──────────────────────────────────────┘
Workflow
Step 1 – Select the segmentation layer or group
The widget operates on the features table of the selected Labels layer. Run a
measurement widget first (or load a CSV via the Table Manipulation widget), then
pick a Target: <single layer> (default) for the original
single-layer behaviour, or a previously defined group name to classify
across the group's segmentation members.
In group mode the Segmentation combo is restricted to the group's members so you can step through them one at a time to annotate each. The class-names table pools detected annotation IDs across every member of the group, so detected classes are not lost as you switch members. See Per-member annotation persistence below for how brushstrokes survive member switches.
Step 2 – Create an annotation layer
The Annotation layer combo defaults to (none) so per-member
persistence cannot accidentally overwrite an unrelated label layer.
- Click Create new next to the Annotation layer dropdown. A new, empty Labels layer called `annotations` is added to napari and automatically selected.
- Alternatively, select an existing Labels layer from the Annotation layer dropdown if you already have annotations you want to use.
The active annotation layer is outlined by a thin frame in the image space. In group mode the frame follows the current member position in the grid.
Step 3 – Paint annotations
Use napari's built-in label painting tools to draw brushstrokes on the annotation layer.
- Each label value you paint (1, 2, 3, …) represents a different class.
- You can use napari's color picker and label selector to switch between classes.
- Paint at least a few representative segments from each class. You do not need to annotate every segment — the classifier will be applied to the rest automatically.
Step 4 – Review projected annotations
Whenever you finish painting, erasing, or changing labels in the annotation layer, the widget waits briefly and then automatically:
- Reads the pixel-level brushstrokes from the annotation layer.
- For each segment in the segmentation, takes the majority-vote annotation label across all annotated pixels that overlap that segment. Segments with no annotation overlap receive annotation `0` (unannotated).
- Merges an `annotation` column into the source layer's `features` (overwriting any previous annotation values).
- Populates the Class names table with all annotation label IDs detected.
Because the projection is based on the full current annotation layer, removed
brushstrokes set the corresponding segment annotation back to 0, and repainting
with another label updates the value in the table.
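The majority-vote projection can be sketched with numpy. `project_annotations` is an illustrative helper under the semantics described above, not the plugin's function.

```python
import numpy as np

def project_annotations(seg, ann):
    """Majority-vote annotation label per segment; 0 for unannotated segments."""
    projected = {}
    for label in np.unique(seg[seg > 0]):
        votes = ann[(seg == label) & (ann > 0)]
        if votes.size == 0:
            projected[int(label)] = 0
        else:
            ids, counts = np.unique(votes, return_counts=True)
            projected[int(label)] = int(ids[np.argmax(counts)])
    return projected

seg = np.zeros((4, 4), dtype=int)
seg[:2] = 1
seg[2:] = 2
ann = np.zeros((4, 4), dtype=int)
ann[0, 0] = 1          # one brushstroke pixel of class 1 on segment 1
ann[2, :2] = 2         # two pixels of class 2 on segment 2
ann[3, 0] = 1          # one stray pixel of class 1 on segment 2 (outvoted)
projected = project_annotations(seg, ann)
```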
Step 5 – Name the classes (optional)
The Class names table lists each detected annotation label ID with an editable name
field. Click a name cell and type to rename a class (e.g. change class_1 to
mitotic, class_2 to interphase). These names are written to the
classification_name output column.
Step 6 – Train and apply the classifier
- Choose a Method (default: Random Forest) and adjust the parameters if needed (see table below).
- Enter an Output layer name (default: `classification`).
- Leave Live Update enabled to retrain and apply the classifier automatically after annotation edits. Disable it to enable the Train & Apply button and run the classifier manually.
Three things happen:
- A scikit-learn classifier is trained on all rows where `annotation > 0`, using all numeric measurement columns as features (excluding `index`, `annotation`, `classification_id`, `classification_name`, `cluster_id`, `category_id`, and `category_name`). Features are z-score standardised internally.
- The classifier is applied to every row in the layer's features (including unannotated ones). Results are merged back into the source layer's `features` as two new columns: `classification_id` (1-based integer) and `classification_name` (string). Manual Train & Apply also opens the Features Table dock with that layer selected; live updates keep the current layer selection unchanged.
- A new Labels layer is created (or updated) in napari where each segment is painted with its `classification_id`. Colours are keyed off the class ID itself (using `tab10` for ≤10 classes and `tab20` beyond that), so the same class always renders in the same colour even when other classes are absent from a particular layer.
If the classifier is re-run, existing classification_id and classification_name
columns are excluded from the feature set so they do not affect the new result.
In group mode the classifier is trained on the concatenation of all
members' features (using each member's projected annotation column)
and then applied per-member. One output Labels layer is created per
member named {output}_{layer_name} (or just {output} for a
single-member group). Class colours are keyed off the class ID, so the
same class always renders in the same colour across all member outputs
even if a particular member only contains a subset of classes.
Per-member annotation persistence
In group mode the widget keeps a per-member cache of your brushstrokes so they survive switching between members:
- When you switch the Segmentation combo to a different member, the current annotation layer's contents are saved (sparse-compressed) for the previously-active member, and the cached annotations for the new member are loaded into the same annotation layer. If the new member has no cache yet, the layer is reset to zeros.
- Switching the Target combo back to `<single layer>` (or to another group) saves the current member's annotations first so they are available again when you re-enter the group.
- The cache is kept only as long as the widget is open and is keyed off the segmentation-layer name; renaming a member layer mid-session decouples its cache.
The single-layer mode is unaffected: switching segmentation layers there does not alter the annotation layer.
Classification methods and parameters
| Method | Widget label | Key parameters (defaults) |
|---|---|---|
| scikit-learn RandomForestClassifier | Random Forest | N estimators (100), Max depth (0 = unlimited) |
| scikit-learn LogisticRegression | Logistic Regression | C (1.0), Max iterations (1000) |
Class IDs
Classification IDs are 1-based and match the annotation label values painted in the
annotation layer. A segment that could not be classified (e.g. because all its feature
values were NaN) receives classification_id = 0 and an empty classification_name.
Exporting the classifier
Click Export classifier to save the trained pipeline (StandardScaler +
classifier) to a .joblib file. The exported file can be used with the
analyze classify CLI command to apply it to new tables in batch.
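The train-on-annotated, apply-to-all, export-to-joblib flow can be reproduced with plain scikit-learn. This sketch assumes only what is stated above (a StandardScaler-plus-classifier pipeline saved with joblib); the plugin's exact feature handling may differ.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Two well-separated classes in a 3-feature table; only a few rows carry
# annotations, mirroring sparse brushstrokes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 3)), rng.normal(5, 1, (30, 3))])
annotation = np.zeros(60, dtype=int)
annotation[:5] = 1
annotation[30:35] = 2

# Standardise + classify in one pipeline, trained on annotated rows only.
clf = make_pipeline(StandardScaler(),
                    RandomForestClassifier(n_estimators=100, random_state=0))
clf.fit(X[annotation > 0], annotation[annotation > 0])
classification_id = clf.predict(X)      # applied to every row

# Persist and reload the whole pipeline, as Export classifier does.
path = os.path.join(tempfile.mkdtemp(), "classifier.joblib")
joblib.dump(clf, path)
reloaded = joblib.load(path)
```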
Saving the table
Use the Save as CSV button in the napari Features Table dock, or the Save table button in the Table Manipulation Widget for TSV / XLSX output.
Group Manager Widget
The Group Manager is the single place where you define and edit the layer groups consumed by every measurement and analysis widget's Target combo. See Working with groups for the underlying data model and pairing semantics.
Layout (scrollable)
┌────────────────────────────────────────┐
│ ┌ Defined groups ────────────────────┐ │
│ │ exp_1 (3 seg, 3 nuc) │ │
│ │ exp_2 (5 seg) │ │
│ │ ... │ │
│ │ [Delete selected] │ │
│ │ [Arrange selected as grid] │ │
│ └────────────────────────────────────┘ │
│ ┌ Group editor ──────────────────────┐ │
│ │ Name: [edit] │ │
│ │ ┌ Segmentation layers (required)─┐ │ │
│ │ │ cells_01 │ │ │
│ │ │ cells_02 │ │ │
│ │ │ [Add selected][Remove][Up][Down] │ │
│ │ └────────────────────────────────┘ │ │
│ │ ┌ Nucleus layers (optional)──────┐ │ │
│ │ │ nuclei_01 │ │ │
│ │ │ nuclei_02 │ │ │
│ │ │ [Add selected][Remove][Up][Down] │ │
│ │ └────────────────────────────────┘ │ │
│ │ ┌ Intensity images (optional)────┐ │ │
│ │ │ raw_01 │ │ │
│ │ │ raw_02 │ │ │
│ │ │ [Add selected][Remove][Up][Down] │ │
│ │ └────────────────────────────────┘ │ │
│ │ ┌ Pairing preview ───────────────┐ │ │
│ │ │ Seg Nucleus Intensity │ │ │
│ │ │ cells_01 nuclei_01 raw_01 │ │ │
│ │ │ cells_02 nuclei_02 raw_02 │ │ │
│ │ └────────────────────────────────┘ │ │
│ │ [Save] │ │
│ └────────────────────────────────────┘ │
└────────────────────────────────────────┘
Workflow
- Select layers in napari's main layer list (you can multi-select with Ctrl/Shift).
- In the Segmentation layers section click Add selected to add the selected Labels layers to the role. Layers are appended in napari layer-panel order; non-Labels layers in the selection are skipped, as are duplicates already in the list.
- Optionally repeat for Nucleus layers (also Labels-only) and Intensity images (Image-only).
- Reorder entries with Up / Down so each row of the Pairing preview at the bottom matches the segmentation–nucleus–image triple you actually want. Within a group, layers across roles are paired by position.
- Optional roles must be either empty or have the same length as the segmentation list — saving with mismatched lengths is rejected.
- Type a Name and click Save. If a group with that name already exists it is replaced.
Selecting a group in the top list loads its current members back into the editor for editing. Click Delete selected to remove the highlighted group.
Click Arrange selected as grid to lay out the highlighted group in the viewer. The grid has one cell per segmentation layer. Nucleus and intensity layers, when present, are translated to the same grid cell as their position-matched segmentation layer. For each grid cell, the intensity image is stacked below the nucleus segmentation, and the segmentation layer is stacked on top. Layers are linked by role (segmentations with segmentations, nucleus segmentations with nucleus segmentations, intensity images with intensity images) while spatial transforms remain independent so the grid positions are preserved.
Renaming a layer that a group references automatically updates the group's stored layer name. Removing a layer leaves the dangling reference in place; running a measurement or analysis widget that targets the group will raise an informative error if a member layer is missing.
Table Manipulation Widget
The Table Manipulation widget edits the features table of a Labels layer. It can
load an external CSV / TSV / XLSX file (which must contain an index column) and
merge it into the layer's features, drop a column from the layer's features, and
save the layer's features to a CSV / TSV / XLSX file.
Layout
┌────────────────────────────────────────┐
│ Segmentation: [combo] │
│ ┌ Load table from file ──────────────┐ │
│ │ [Load file…] │ │
│ └────────────────────────────────────┘ │
│ ┌ Drop column ───────────────────────┐ │
│ │ Column: [combo] [Drop] │ │
│ └────────────────────────────────────┘ │
│ [Save table] │
└────────────────────────────────────────┘
Loading a table from file
Click Load file… to read a CSV, TSV, or XLSX file. The file must contain an
index column whose values are the segment label IDs; loading is rejected
otherwise. The columns of the loaded file are merged into the selected layer's
features (outer join on index). Columns present in both the file and the
existing features are overwritten with the values from the file. The Features
Table dock is opened automatically after the merge.
Tip: Multiple measurement widgets writing to the same layer follow the same merge rules — running Intensity then Morphology on the same Labels layer leaves all columns from both measurements in the layer's
features.
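The merge semantics described above can be sketched with pandas (toy tables for illustration; this is not the widget's actual code):

```python
import pandas as pd

# Toy data: existing layer features and a freshly loaded file,
# both indexed by segment label ID.
features = pd.DataFrame({"area": [100.0, 250.0, 80.0]}, index=[1, 2, 3])
loaded = pd.DataFrame({"area": [110.0, 260.0], "mean_intensity": [5.0, 7.5]}, index=[2, 4])

# Outer join on the index; where a column exists in both, the file wins.
merged = loaded.combine_first(features)
```

`combine_first` gives the file's values precedence while keeping rows and columns present on only one side, which matches the outer-join, file-overwrites behaviour described above.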
Dropping a column
Select a column from the Drop column dropdown and click Drop. The column
is removed from the selected layer's features. The index column is the
segment identifier and is never offered for dropping.
Saving the table
Click Save table to export the selected layer's features to a CSV, TSV, or
XLSX file. This complements the napari Features Table dock's CSV-only save by
also supporting TSV and Excel output.
Command Line Interface
The segmentation-measurement CLI provides utilities for post-processing segmentations,
computing measurements, and analyzing results directly from the terminal without writing
any Python code.
Installation
pip install segmentation-measurement
After installation the segmentation-measurement command is available in your shell.
Overview
The CLI exposes five top-level commands:
| Command | Description |
|---|---|
| open | Open segmentations and optional matched images in napari |
| postprocess | Apply post-processing operations to segmentation TIFF files |
| measure | Compute per-segment measurements from segmentation TIFF files |
| table | Manipulate measurement tables (merge tables, drop columns) |
| analyze | Analyze measurement tables (threshold-based categorization, clustering, classification) |
Run any command with --help to see its full usage:
segmentation-measurement --help
segmentation-measurement open --help
segmentation-measurement postprocess --help
segmentation-measurement measure morphology --help
segmentation-measurement analyze threshold --help
Open In Napari (open)
Open one or more segmentation files in napari, optionally with matched
intensity images and nucleus segmentations. Paths may be single file paths
or glob expressions. Recursive glob expressions using ** are supported.
segmentation-measurement open \
--segmentations "data/**/cells*.tif" \
--intensities "data/**/raw*.tif" \
--nuclei "data/**/nuclei*.tif"
Segmentations and nuclei are opened as Labels layers. Intensities are opened
as Image layers. When a glob expression is used, layer names are derived
relative to the directory above the first wildcard component. For example,
data/**/seg.tif creates names such as well_a/seg and well_b/seg, so
files with the same basename in different subfolders remain distinguishable.
If multiple segmentations are opened, the default behavior is to create a
group named opened_files and arrange the matched layers in a grid view. Use
--no-grid to keep the group but skip grid arrangement, or --no-group to
skip both grouping and grid arrangement.
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --segmentations | path/glob list | yes | Segmentation file path(s) or glob expression(s) |
| --intensities | path/glob list | no | Optional intensity image path(s); expanded count must match segmentations |
| --nuclei, --nucleus-segmentations | path/glob list | no | Optional nucleus segmentation path(s); expanded count must match segmentations |
| --no-group | flag | no | Do not create a group when multiple segmentations are opened |
| --no-grid | flag | no | Create the group but do not arrange it as a grid |
| --group-name | string | no | Name for the created group; default is opened_files |
Examples
Open a single segmentation:
segmentation-measurement open --segmentations cells_01.tif
Open matched files as a grouped grid:
segmentation-measurement open \
--segmentations "experiment/**/cells.tif" \
--intensities "experiment/**/raw.tif" \
--nuclei "experiment/**/nuclei.tif" \
--group-name experiment
Open multiple segmentations without grouping:
segmentation-measurement open --segmentations "cells_*.tif" --no-group
Post-processing (postprocess)
The postprocess command provides three sub-commands that each read a segmentation TIFF,
apply a transformation, and write the result to a new TIFF.
filter-small-segments
Remove segments whose size (in pixels for 2-D data, or voxels for 3-D data) is below a
minimum threshold. Removed segments are replaced by the background label 0.
segmentation-measurement postprocess filter-small-segments \
--input segmentation.tif \
--output filtered.tif \
--min-size 200
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --input | path | yes | Input segmentation TIFF file |
| --output | path | yes | Output segmentation TIFF file |
| --min-size | int | yes | Minimum segment size in pixels/voxels; segments strictly smaller than this value are removed |
Example – keep only segments with at least 500 pixels:
segmentation-measurement postprocess filter-small-segments \
--input nuclei.tif --output nuclei_filtered.tif --min-size 500
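The operation the command performs can be sketched in NumPy on a toy label array (illustrative only; the library's `filter_small_segments` function implements the same idea):

```python
import numpy as np

# Toy label image: segment 1 has 2 pixels, segment 2 has 4 pixels.
seg = np.array([
    [1, 1, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 2, 2],
])

# Equivalent of --min-size 3: segments strictly smaller than
# min_size are replaced by the background label 0.
min_size = 3
ids, counts = np.unique(seg, return_counts=True)
small = ids[(ids != 0) & (counts < min_size)]
filtered = np.where(np.isin(seg, small), 0, seg)
```

Segment 1 (2 pixels) is removed; segment 2 (4 pixels) survives unchanged.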
remove-small-holes
Fill enclosed background holes inside segments when the hole is smaller than or equal to the specified maximum size. Pixels belonging to other segments are never overwritten.
segmentation-measurement postprocess remove-small-holes \
--input segmentation.tif \
--output filled.tif \
--max-hole-size 50
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --input | path | yes | Input segmentation TIFF file |
| --output | path | yes | Output segmentation TIFF file |
| --max-hole-size | int | yes | Maximum hole size in pixels/voxels to fill |
Example – fill holes up to 100 pixels:
segmentation-measurement postprocess remove-small-holes \
--input cells.tif --output cells_filled.tif --max-hole-size 100
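The per-segment filling logic can be sketched with SciPy (toy array; the max-hole-size check is omitted for brevity, and the library's implementation may differ in detail):

```python
import numpy as np
from scipy.ndimage import binary_fill_holes

# Toy segment with a single enclosed background hole at the centre.
seg = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
])

result = seg.copy()
for label_id in np.unique(seg):
    if label_id == 0:
        continue
    mask = seg == label_id
    # Fill the segment's holes, but only write onto background pixels
    # so pixels of other segments are never overwritten.
    new_pixels = binary_fill_holes(mask) & ~mask & (seg == 0)
    result[new_pixels] = label_id
```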
ring-mask
Compute a ring (annular hull) of a specified width around each segment. By default the
original segment pixels are retained in the output alongside the ring pixels. Pass
--remove-original to produce a ring-only mask (original segment interiors set to 0).
This is useful for creating pseudo-cytoplasm masks from segmented nuclei.
Rings are placed only on background pixels; if rings from different segments overlap, the segment with the smaller label ID takes precedence.
segmentation-measurement postprocess ring-mask \
--input segmentation.tif \
--output rings.tif \
--ring-width 5
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --input | path | yes | Input segmentation TIFF file |
| --output | path | yes | Output TIFF file for the ring mask |
| --ring-width | int | yes | Width of the ring in pixels/voxels |
| --remove-original | flag | no | Remove original segment pixels; output contains only ring pixels |
Example – create 8-pixel-wide rings around nuclei (original segments kept):
segmentation-measurement postprocess ring-mask \
--input nuclei.tif --output cytoplasm_rings.tif --ring-width 8
Example – ring-only mask (original segments removed):
segmentation-measurement postprocess ring-mask \
--input nuclei.tif --output rings_only.tif --ring-width 8 --remove-original
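A minimal NumPy/SciPy sketch of the idea for a single segment (ring width approximated by one dilation step; the library may measure ring width and handle overlaps differently):

```python
import numpy as np
from scipy.ndimage import binary_dilation

# Toy mask: a single one-pixel segment in the centre.
seg = np.zeros((5, 5), dtype=int)
seg[2, 2] = 1

mask = seg == 1
# Ring = dilated mask minus the original; ring pixels are background-only.
ring = binary_dilation(mask, iterations=1) & ~mask

with_ring = seg.copy()
with_ring[ring] = 1                 # default: original segment + ring
ring_only = np.where(ring, 1, 0)    # effect of --remove-original
```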
watershed
Refine a segmentation using the watershed algorithm. The input segmentation
is used as seed markers; the heatmap is the topographic landscape that the
algorithm floods. Because skimage.segmentation.watershed floods from low
values upward, the heatmap should have low values at desired segment
boundaries and high values in the interior. If your heatmap has the
opposite convention (e.g. a distance transform or foreground-probability map
where high values indicate cell centres), pass the negated image.
An optional binary mask restricts processing to a subset of pixels; unmasked pixels are set to 0 in the output.
segmentation-measurement postprocess watershed \
--input seeds.tif \
--heatmap landscape.tif \
--output refined.tif
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --input | path | yes | Input segmentation TIFF file used as seed markers |
| --heatmap | path | yes | Heatmap image TIFF (low values flooded first) |
| --output | path | yes | Output segmentation TIFF file |
| --mask | path | no | Binary mask TIFF; only masked pixels are processed |
Example – watershed refinement with a foreground-probability heatmap (negated so high-probability regions are flooded first):
# Negate the probability map beforehand, then run watershed
segmentation-measurement postprocess watershed \
--input seeds.tif --heatmap neg_prob.tif --output refined.tif
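The heatmap convention can be sketched on a toy array: to use a foreground-probability map, negate it so desired segment boundaries become the lowest values:

```python
import numpy as np

# Toy foreground-probability map: high inside the cell, low at boundaries.
prob = np.array([
    [0.1, 0.9, 0.1],
    [0.9, 1.0, 0.9],
    [0.1, 0.9, 0.1],
])

# Watershed floods from low values upward, so invert the convention
# before passing the image as --heatmap.
heatmap = -prob  # or prob.max() - prob to stay non-negative
```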
Measurements (measure)
intensities
Compute per-segment intensity statistics from a segmentation label image and a co-registered intensity image. Both files must be TIFF and must have identical shapes.
segmentation-measurement measure intensities \
--segmentation segmentation.tif \
--intensity fluorescence.tif \
--output measurements.csv
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --segmentation | path | yes | Segmentation TIFF file (integer labels) |
| --intensity | path | yes | Intensity image TIFF file |
| --output | path | yes | Output table file; format inferred from extension |
Supported output formats
| Extension | Format |
|---|---|
| .csv (default) | Comma-separated values |
| .tsv | Tab-separated values |
| .xlsx | Excel workbook |
Output columns
| Column | Description |
|---|---|
| index | Integer segment label ID |
| mean_intensity | Mean pixel intensity within the segment |
| median_intensity | Median pixel intensity |
| max_intensity | Maximum pixel intensity |
| min_intensity | Minimum pixel intensity |
| std_intensity | Standard deviation of pixel intensities |
| percentile_10 | 10th percentile of pixel intensities |
| percentile_25 | 25th percentile (first quartile) |
| percentile_75 | 75th percentile (third quartile) |
| percentile_90 | 90th percentile of pixel intensities |
Example – save results as an Excel file:
segmentation-measurement measure intensities \
--segmentation cells.tif \
--intensity gfp_channel.tif \
--output intensity_stats.xlsx
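The statistics themselves are standard reductions over each segment's pixels; a toy sketch computing a subset of the columns (std shown with NumPy's default ddof=0, which may differ from the library's convention):

```python
import numpy as np
import pandas as pd

# Toy segmentation and co-registered intensity image of identical shape.
seg = np.array([[1, 1, 0],
                [2, 2, 2]])
img = np.array([[10.0, 20.0, 0.0],
                [5.0, 5.0, 8.0]])

rows = []
for label_id in np.unique(seg):
    if label_id == 0:
        continue
    values = img[seg == label_id]
    rows.append({
        "index": label_id,
        "mean_intensity": values.mean(),
        "median_intensity": np.median(values),
        "max_intensity": values.max(),
        "min_intensity": values.min(),
        "std_intensity": values.std(),
        "percentile_10": np.percentile(values, 10),
    })
table = pd.DataFrame(rows).set_index("index")
```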
morphology
Compute per-segment shape descriptors from a segmentation label image. Supports isotropic and anisotropic pixel/voxel sizes so that results are returned in physical units.
segmentation-measurement measure morphology \
--segmentation segmentation.tif \
--output morphology.csv
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --segmentation | path | yes | Segmentation TIFF file (integer labels) |
| --output | path | yes | Output table file; format inferred from extension |
| --scale | float(s) | no | Physical pixel/voxel size (default: 1.0) |
Scale argument
Pass a single value for isotropic spacing, or one value per spatial dimension for
anisotropic spacing. Dimension order is (Y, X) for 2-D and (Z, Y, X) for 3-D.
# isotropic: 0.5 µm per pixel
--scale 0.5
# anisotropic 2-D: 0.5 µm in Y, 0.25 µm in X
--scale 0.5 0.25
# anisotropic 3-D: 2.0 µm in Z, 0.5 µm in Y and X
--scale 2.0 0.5 0.5
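Area and volume convert from counts to physical units by multiplying with the product of the scale factors (lengths such as perimeter or axis lengths scale per axis instead), e.g.:

```python
import numpy as np

# Counts convert to physical units via the product of the scale factors.
scale_2d = (0.5, 0.25)             # Y, X in µm
area = 400 * np.prod(scale_2d)     # 400 px * 0.125 µm² per px = 50.0 µm²

scale_3d = (2.0, 0.5, 0.5)         # Z, Y, X in µm
volume = 1000 * np.prod(scale_3d)  # 1000 vox * 0.5 µm³ per vox = 500.0 µm³
```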
Output columns – 2-D
| Column | Description |
|---|---|
| index | Integer segment label ID |
| area | Area in physical units |
| perimeter | Perimeter length in physical units |
| sphericity | Circularity (1.0 = perfect circle) |
| solidity | Area / convex hull area |
| axis_major_length | Major axis of the fitted ellipse |
| axis_minor_length | Minor axis of the fitted ellipse |
| equivalent_diameter | Diameter of a circle with the same area |
Output columns – 3-D
| Column | Description |
|---|---|
| index | Integer segment label ID |
| volume | Volume in physical units |
| surface_area | Surface area via marching cubes |
| sphericity | Sphericity (1.0 = perfect sphere) |
| solidity | Volume / convex hull volume |
| axis_major_length | Major axis of the fitted ellipsoid |
| axis_minor_length | Minor axis of the fitted ellipsoid |
| equivalent_diameter | Diameter of a sphere with the same volume |
Examples
# 2-D, pixel units
segmentation-measurement measure morphology \
--segmentation cells.tif --output morphology.csv
# 2-D, anisotropic scale
segmentation-measurement measure morphology \
--segmentation cells.tif --output morphology.csv --scale 0.5 0.25
# 3-D, anisotropic scale (Z=2 µm, Y=X=0.5 µm)
segmentation-measurement measure morphology \
--segmentation nuclei_3d.tif --output morphology_3d.csv --scale 2.0 0.5 0.5
cell-nucleus
Compute per-cell measurements that combine a cell segmentation with a nucleus segmentation. For each cell, the command reports how many nuclei it contains, the cell and nucleus area/volume in physical units, and their ratio. When an optional intensity image is supplied, it also reports intensity statistics for the cytoplasmic region (cell minus nucleus) and the nuclear region, together with their ratios.
segmentation-measurement measure cell-nucleus \
--cell-segmentation cells.tif \
--nucleus-segmentation nuclei.tif \
--output cell_nucleus.csv
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --cell-segmentation | path | yes | Cell segmentation TIFF file (integer labels) |
| --nucleus-segmentation | path | yes | Nucleus segmentation TIFF file (integer labels); must have the same shape as the cell segmentation |
| --output | path | yes | Output table file; format inferred from extension |
| --scale | float(s) | no | Physical pixel/voxel size (default: 1.0); same syntax as morphology |
| --intensity | path | no | Intensity image TIFF file; when provided, per-region intensity statistics and their ratios are included |
Scale argument
Identical to the morphology sub-command: pass a single value for isotropic
spacing, or one value per spatial dimension for anisotropic spacing in
(Y, X) / (Z, Y, X) order.
Output columns – without intensity image (2-D)
| Column | Description |
|---|---|
| index | Integer cell label ID |
| n_nuclei | Number of nucleus labels overlapping with this cell |
| cell_area | Area of the cell in physical units (nucleus included) |
| nucleus_area | Total area of nuclei within this cell in physical units |
| area_ratio | cell_area / nucleus_area; NaN if the cell contains no nucleus |
For 3-D data the columns are cell_volume, nucleus_volume, and volume_ratio
instead.
Additional columns – with intensity image
When --intensity is given, the following columns are added for each statistic
{stat} in mean, median, max, min, percentile_10, percentile_25,
percentile_75, percentile_90:
| Column | Description |
|---|---|
| cell_{stat}_intensity | Statistic over the cytoplasmic region (cell pixels where no nucleus is present) |
| nucleus_{stat}_intensity | Statistic over all nuclear pixels within this cell |
| {stat}_intensity_ratio | cell_{stat}_intensity / nucleus_{stat}_intensity; NaN when either region is empty or the nucleus value is zero |
Supported output formats – same as intensities (.csv, .tsv, .xlsx).
Examples
# Basic measurements, pixel units
segmentation-measurement measure cell-nucleus \
--cell-segmentation cells.tif \
--nucleus-segmentation nuclei.tif \
--output cell_nucleus.csv
# With physical scale (0.5 µm/px, isotropic)
segmentation-measurement measure cell-nucleus \
--cell-segmentation cells.tif \
--nucleus-segmentation nuclei.tif \
--output cell_nucleus.csv \
--scale 0.5
# With intensity ratios
segmentation-measurement measure cell-nucleus \
--cell-segmentation cells.tif \
--nucleus-segmentation nuclei.tif \
--intensity gfp_channel.tif \
--output cell_nucleus_intensity.csv
# Anisotropic 3-D (Z=2 µm, Y=X=0.5 µm) with intensity
segmentation-measurement measure cell-nucleus \
--cell-segmentation cells_3d.tif \
--nucleus-segmentation nuclei_3d.tif \
--intensity gfp_3d.tif \
--output cell_nucleus_3d.csv \
--scale 2.0 0.5 0.5
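The basic quantities can be sketched in NumPy (toy arrays with a hypothetical nucleus label of 5; not the library's code):

```python
import numpy as np

# Toy cell segmentation (one cell, label 1) and nucleus segmentation
# (one nucleus, label 5) of identical shape.
cells = np.array([[1, 1, 1],
                  [1, 1, 1]])
nuclei = np.array([[0, 5, 0],
                   [0, 5, 0]])

cell_mask = cells == 1
nuc_mask = (nuclei > 0) & cell_mask

n_nuclei = len(np.unique(nuclei[nuc_mask]))  # distinct nucleus labels in the cell
cell_area = int(cell_mask.sum())             # pixel units (scale = 1.0)
nucleus_area = int(nuc_mask.sum())
area_ratio = cell_area / nucleus_area if nucleus_area else float("nan")
```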
Table manipulation (table)
The table command operates on saved measurement tables (CSV / TSV / XLSX).
merge
Merge one or more saved measurement tables on a shared key column (index by
default) and optionally drop columns from the merged result. The merge is an
outer join: label IDs that appear in only some inputs are kept, with NaNs in
the missing columns. Non-key columns must be disjoint between input tables —
drop conflicts beforehand or via --drop-columns.
When only one input is given the command becomes a column-drop utility,
useful for cleaning up an existing table. The index column is the segment
identifier and is always preserved — passing it to --drop-columns raises an
error.
segmentation-measurement table merge \
--inputs intensity.csv morphology.csv \
--output combined.csv
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --inputs | path… | yes | One or more input table files (CSV, TSV, XLSX) |
| --output | path | yes | Output table file (extension picks the format) |
| --on | str | no | Key column shared between input tables (default: index) |
| --drop-columns | str… | no | Columns to drop from the merged table |
Example — merge intensity and morphology tables and drop a column:
segmentation-measurement table merge \
--inputs intensity.csv morphology.csv \
--output combined.csv \
--drop-columns std_intensity
Example — drop a column from a single existing table:
segmentation-measurement table merge \
--inputs combined.csv \
--output trimmed.csv \
--drop-columns std_intensity max_intensity
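The outer-join behaviour can be sketched with pandas (toy tables; label IDs present in only one input get NaN in the other input's columns):

```python
import pandas as pd

# Toy tables sharing the index key but with disjoint value columns.
intensity = pd.DataFrame({"mean_intensity": [10.0, 20.0]}, index=[1, 2])
morphology = pd.DataFrame({"area": [50.0, 75.0]}, index=[2, 3])

# Outer join: label 1 gets NaN area, label 3 gets NaN mean_intensity.
combined = intensity.join(morphology, how="outer")
```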
Analysis (analyze)
cluster
Cluster segments using their measurement features. All numeric columns are used as
features (excluding index, cluster_id, category_id, and category_name).
Features are z-score standardised before clustering.
The output table is the input table with an added cluster_id column. Cluster IDs are
1-based (1, 2, 3, …). Noise points — segments that no cluster claims, as produced by
DBSCAN and HDBSCAN — are assigned cluster_id = -1.
segmentation-measurement analyze cluster \
--table measurements.csv \
--method kmeans \
--n-clusters 4 \
--output clustered.csv
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --table | path | yes | Input measurement table (CSV, TSV, or XLSX) |
| --method | str | no | Clustering method: kmeans (default), dbscan, hdbscan, or mean_shift |
| --n-clusters | int | no | K-Means: number of clusters (default: 3) |
| --eps | float | no | DBSCAN: neighbourhood radius (default: 0.5) |
| --min-samples | int | no | DBSCAN / HDBSCAN: minimum samples in a neighbourhood (default: 5) |
| --min-cluster-size | int | no | HDBSCAN: minimum cluster size (default: 5) |
| --bandwidth | float | no | Mean Shift: bandwidth; omit or set to 0 for automatic estimation |
| --output | path | yes | Output table file (CSV, TSV, or XLSX) |
| --segmentation | path | no | Segmentation TIFF; required when --output-segmentation is used |
| --output-segmentation | path | no | Output TIFF where each segment is painted with its cluster_id; noise segments are left as background (0) |
Output
The output table is the input table with one additional column:
| Column | Description |
|---|---|
| cluster_id | Integer cluster label (1-based); -1 for noise (DBSCAN / HDBSCAN only) |
When --output-segmentation is specified, each segment pixel is set to the cluster_id
of that segment (background and noise segments remain 0).
Method defaults
| Method | Key parameters and defaults |
|---|---|
| kmeans | --n-clusters 3 |
| dbscan | --eps 0.5, --min-samples 5 |
| hdbscan | --min-cluster-size 5 |
| mean_shift | bandwidth estimated automatically |
Examples
# K-Means with 5 clusters
segmentation-measurement analyze cluster \
--table morphology.csv --method kmeans --n-clusters 5 --output clustered.csv
# DBSCAN – also write a cluster segmentation TIFF
segmentation-measurement analyze cluster \
--table intensity.csv \
--method dbscan --eps 1.0 --min-samples 3 \
--output clustered.csv \
--segmentation cells.tif \
--output-segmentation clusters.tif
# HDBSCAN
segmentation-measurement analyze cluster \
--table morphology.csv --method hdbscan --min-cluster-size 10 --output clustered.csv
# Mean Shift with automatic bandwidth
segmentation-measurement analyze cluster \
--table intensity.csv --method mean_shift --output clustered.csv
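The pipeline described above (z-score standardisation, clustering, 1-based cluster IDs) can be sketched with scikit-learn on toy features; this is an illustrative sketch, not the CLI's internal code:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Toy feature matrix with two obvious blobs.
features = np.array([[1.0, 1.0], [1.2, 0.9], [10.0, 10.0], [10.1, 9.8]])

X = StandardScaler().fit_transform(features)   # z-score standardisation
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
cluster_id = labels + 1                        # shift to 1-based IDs
```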
threshold
Categorize segments into N named groups based on N-1 thresholds applied to one column of a measurement table. Thresholds can be provided explicitly or suggested automatically from the data distribution.
segmentation-measurement analyze threshold \
--table measurements.csv \
--column mean_intensity \
--n-categories 3 \
--output categorized.csv
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --table | path | yes | Input measurement table (CSV, TSV, or XLSX) |
| --column | str | yes | Column name to threshold |
| --n-categories | int | yes | Number of output categories |
| --thresholds | float(s) | no | Explicit threshold values (n_categories - 1 values); auto-suggested if omitted |
| --category-names | str(s) | no | Names for each category (n_categories values); defaults to category_1, category_2, … |
| --output | path | yes | Output table file (CSV, TSV, or XLSX) |
| --segmentation | path | no | Segmentation TIFF; required when --output-segmentation is used |
| --output-segmentation | path | no | Output TIFF where each segment is assigned its category ID |
Output
The output table is the input table with two additional columns:
| Column | Description |
|---|---|
| category_id | Integer category (1-based) |
| category_name | Human-readable category name |
Segments with a value below the first threshold are category 1; segments between the first and second threshold are category 2; and so on.
Examples
# Auto-suggest thresholds for 3 categories
segmentation-measurement analyze threshold \
--table intensity.csv \
--column mean_intensity \
--n-categories 3 \
--output categorized.csv
# Explicit thresholds with custom names
segmentation-measurement analyze threshold \
--table morphology.csv \
--column area \
--n-categories 3 \
--thresholds 500 1500 \
--category-names small medium large \
--output categorized.csv
# Also write a category segmentation TIFF
segmentation-measurement analyze threshold \
--table intensity.csv \
--column mean_intensity \
--n-categories 2 \
--thresholds 100 \
--output categorized.csv \
--segmentation cells.tif \
--output-segmentation categories.tif
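The categorization rule maps each value to a 1-based category by its position among the thresholds; a NumPy sketch using the explicit thresholds 500 and 1500 from the example above:

```python
import numpy as np

# Values of the thresholded column for three segments.
values = np.array([100.0, 800.0, 2000.0])
thresholds = [500.0, 1500.0]          # n_categories - 1 = 2 thresholds

category_id = np.digitize(values, thresholds) + 1                # 1-based
category_name = np.array(["small", "medium", "large"])[category_id - 1]
```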
train-classifier
Train a random forest or logistic regression classifier from one or more annotated
measurement tables and save the fitted pipeline to a .joblib file.
An annotated table is a measurement table that contains an integer annotation column
(or whichever column name you specify with --annotation-column). Rows with a value of
0 in that column are treated as unannotated and excluded from training. Annotated rows
are typically exported from the Classification Analysis napari widget, but you can
also create the column manually.
All numeric columns are used as features (excluding index, annotation,
classification_id, classification_name, cluster_id, category_id, and
category_name). Features are z-score standardised inside the saved pipeline so no
separate pre-processing step is needed when applying the classifier.
segmentation-measurement analyze train-classifier \
--tables annotated.csv \
--method random_forest \
--output classifier.joblib
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --tables | path(s) | yes | One or more annotated measurement tables (CSV, TSV, or XLSX); when multiple files are given they are concatenated before training |
| --output | path | yes | Output classifier file (.joblib) |
| --method | str | no | Classifier type: random_forest (default) or logistic_regression |
| --annotation-column | str | no | Column containing integer annotation labels (default: annotation) |
| --n-estimators | int | no | RF: number of trees (default: 100) |
| --max-depth | int | no | RF: maximum tree depth; omit or set to 0 for unlimited |
| --c | float | no | LR: regularisation strength C (default: 1.0) |
| --max-iter | int | no | LR: maximum number of solver iterations (default: 1000) |
Examples
# Train a random forest from a single annotated CSV
segmentation-measurement analyze train-classifier \
--tables annotated.csv \
--output classifier.joblib
# Train from two experiments combined, with 200 trees
segmentation-measurement analyze train-classifier \
--tables experiment1.csv experiment2.csv \
--n-estimators 200 \
--output classifier.joblib
# Train a logistic regression classifier
segmentation-measurement analyze train-classifier \
--tables annotated.csv \
--method logistic_regression \
--c 0.1 \
--output classifier.joblib
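A sketch of the saved artefact, consistent with the documented pipeline (StandardScaler plus classifier, serialised to .joblib): toy data, not the library's code, and it writes classifier.joblib to the working directory:

```python
import numpy as np
from joblib import dump, load
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy annotated features; annotation 0 marks unannotated rows.
X = np.array([[1.0, 2.0], [1.1, 1.9], [5.0, 6.0], [5.2, 5.8], [3.0, 3.0]])
y = np.array([1, 1, 2, 2, 0])
keep = y != 0  # rows with annotation 0 are excluded from training

pipe = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
pipe.fit(X[keep], y[keep])
dump(pipe, "classifier.joblib")

# Reloading and predicting mimics what `analyze classify` does later.
reloaded = load("classifier.joblib")
pred = reloaded.predict(np.array([[1.05, 2.05]]))
```

Because the scaler lives inside the pipeline, no separate pre-processing is needed when the file is applied to new tables.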
classify
Apply a previously trained classifier (saved with train-classifier or exported from the
napari widget) to a measurement table. The output table gains two new columns:
classification_id (1-based integer) and classification_name (string).
segmentation-measurement analyze classify \
--table measurements.csv \
--classifier classifier.joblib \
--output classified.csv
Arguments
| Argument | Type | Required | Description |
|---|---|---|---|
| --table | path | yes | Input measurement table (CSV, TSV, or XLSX) |
| --classifier | path | yes | Trained classifier file (.joblib) |
| --output | path | yes | Output table file (CSV, TSV, or XLSX) |
| --class-names | str(s) | no | Names for each class in ascending class-label order (e.g. --class-names mitotic interphase); defaults to class_1, class_2, … |
| --segmentation | path | no | Segmentation TIFF; required when --output-segmentation is used |
| --output-segmentation | path | no | Output TIFF where each segment is painted with its classification_id; unclassified segments remain background (0) |
Output columns
| Column | Description |
|---|---|
| classification_id | Integer class label (1-based); 0 for rows whose features were all NaN |
| classification_name | Human-readable class name |
Examples
# Apply classifier and save results as CSV
segmentation-measurement analyze classify \
--table new_measurements.csv \
--classifier classifier.joblib \
--output classified.csv
# Apply and assign human-readable names to classes
segmentation-measurement analyze classify \
--table new_measurements.csv \
--classifier classifier.joblib \
--class-names mitotic interphase apoptotic \
--output classified.csv
# Also write a classification segmentation TIFF
segmentation-measurement analyze classify \
--table new_measurements.csv \
--classifier classifier.joblib \
--output classified.csv \
--segmentation cells.tif \
--output-segmentation classified_seg.tif
1""" 2.. include:: ../doc/start.md 3.. include:: ../doc/napari.md 4.. include:: ../doc/cli.md 5""" 6 7from segmentation_measurement.postprocessing import ( 8 apply_watershed, 9 compute_ring_mask, 10 filter_small_segments, 11 remove_small_holes, 12) 13from segmentation_measurement.intensity import measure_intensities 14from segmentation_measurement.morphology import measure_morphology 15from segmentation_measurement.cell_nucleus import measure_cell_nucleus 16from segmentation_measurement.table_manipulation import ( 17 drop_columns, 18 merge_tables, 19) 20from segmentation_measurement.analysis import ( 21 apply_classifier, 22 categorize_by_threshold, 23 cluster_measurements, 24 suggest_thresholds, 25 train_classifier, 26) 27 28__all__ = [ 29 "filter_small_segments", 30 "remove_small_holes", 31 "compute_ring_mask", 32 "apply_watershed", 33 "measure_intensities", 34 "measure_morphology", 35 "measure_cell_nucleus", 36 "merge_tables", 37 "drop_columns", 38 "suggest_thresholds", 39 "categorize_by_threshold", 40 "cluster_measurements", 41 "train_classifier", 42 "apply_classifier", 43] 44 45__version__ = "0.1.0"
def filter_small_segments(segmentation: np.ndarray, min_size: int) -> np.ndarray:
    """Filter out segments below a minimum size threshold.

    Segments with fewer pixels/voxels than ``min_size`` are set to zero
    (background label).

    Args:
        segmentation (np.ndarray): Integer-valued label array where 0 is
            background and each positive integer represents a distinct segment.
            Supports arbitrary dimensionality.
        min_size (int): Minimum segment size in pixels/voxels. Segments
            strictly smaller than this threshold are removed.

    Returns:
        np.ndarray: Label array with small segments set to zero, same shape
            and dtype as input.
    """
    result = segmentation.copy()
    label_ids, counts = np.unique(segmentation, return_counts=True)
    for label_id, count in zip(label_ids, counts):
        if label_id == 0:
            continue
        if count < min_size:
            result[segmentation == label_id] = 0
    return result
Filter out segments below a minimum size threshold.
Segments with fewer pixels/voxels than min_size are set to zero
(background label).
Arguments:
- segmentation (np.ndarray): Integer-valued label array where 0 is background and each positive integer represents a distinct segment. Supports arbitrary dimensionality.
- min_size (int): Minimum segment size in pixels/voxels. Segments strictly smaller than this threshold are removed.
Returns:
np.ndarray: Label array with small segments set to zero, same shape and dtype as input.
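A minimal end-to-end sketch of this behavior; the function body is reproduced from the source listing above so the snippet runs standalone:

```python
import numpy as np

def filter_small_segments(segmentation: np.ndarray, min_size: int) -> np.ndarray:
    # Zero out every label whose pixel count falls below min_size.
    result = segmentation.copy()
    label_ids, counts = np.unique(segmentation, return_counts=True)
    for label_id, count in zip(label_ids, counts):
        if label_id == 0:
            continue
        if count < min_size:
            result[segmentation == label_id] = 0
    return result

seg = np.array([
    [1, 1, 0, 2],
    [1, 1, 0, 0],
    [0, 0, 3, 3],
])
filtered = filter_small_segments(seg, min_size=2)
# Label 2 (1 pixel) is removed; labels 1 (4 px) and 3 (2 px) survive.
```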
def remove_small_holes(segmentation: np.ndarray, max_hole_size: int) -> np.ndarray:
    """Remove small holes from segments.

    For each segment, enclosed background regions (holes) smaller than or
    equal to ``max_hole_size`` pixels/voxels are filled with the segment's
    label. Pixels belonging to other segments are never overwritten.

    Args:
        segmentation (np.ndarray): Integer-valued label array where 0 is
            background and each positive integer represents a distinct segment.
            Supports arbitrary dimensionality.
        max_hole_size (int): Maximum hole size in pixels/voxels. Holes smaller
            than or equal to this threshold are filled.

    Returns:
        np.ndarray: Label array with small holes filled, same shape and dtype
            as input.
    """
    result = segmentation.copy()
    for label_id in np.unique(segmentation):
        if label_id == 0:
            continue
        binary_mask = segmentation == label_id
        filled_mask = _remove_small_holes(binary_mask, area_threshold=max_hole_size)
        new_pixels = filled_mask & ~binary_mask
        # Only fill background pixels; do not overwrite other segments.
        new_pixels &= (segmentation == 0)
        result[new_pixels] = label_id
    return result
Remove small holes from segments.
For each segment, enclosed background regions (holes) smaller than or
equal to max_hole_size pixels/voxels are filled with the segment's
label. Pixels belonging to other segments are never overwritten.
Arguments:
- segmentation (np.ndarray): Integer-valued label array where 0 is background and each positive integer represents a distinct segment. Supports arbitrary dimensionality.
- max_hole_size (int): Maximum hole size in pixels/voxels. Holes smaller than or equal to this threshold are filled.
Returns:
np.ndarray: Label array with small holes filled, same shape and dtype as input.
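A simplified, self-contained sketch of the per-segment fill-and-guard pattern. It uses `scipy.ndimage.binary_fill_holes` as a stand-in for the size-thresholded skimage helper the library wraps (so it fills *all* enclosed holes, without the `max_hole_size` cutoff), but keeps the same guard against overwriting other segments:

```python
import numpy as np
from scipy.ndimage import binary_fill_holes

def fill_holes_guarded(segmentation: np.ndarray) -> np.ndarray:
    # Sketch: fill every enclosed hole of each segment, but only on
    # background pixels, never on pixels owned by another segment.
    result = segmentation.copy()
    for label_id in np.unique(segmentation):
        if label_id == 0:
            continue
        binary_mask = segmentation == label_id
        filled_mask = binary_fill_holes(binary_mask)
        new_pixels = filled_mask & ~binary_mask
        new_pixels &= segmentation == 0  # guard: background only
        result[new_pixels] = label_id
    return result

seg = np.array([
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
])
filled = fill_holes_guarded(seg)
# The single enclosed background pixel is filled with label 1.
```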
def compute_ring_mask(
    segmentation: np.ndarray, ring_width: int, keep_original: bool = True
) -> np.ndarray:
    """Compute the ring mask around each segment.

    For each segment, a ring of specified width is computed by dilating the
    segment mask by ``ring_width`` iterations and subtracting the original
    mask. Ring pixels are only placed on background pixels of the original
    segmentation. If rings from different segments overlap, the segment with
    the smaller label ID takes precedence.

    This is useful for creating pseudo-cytosol masks around segmented nuclei.

    Args:
        segmentation (np.ndarray): Integer-valued label array where 0 is
            background and each positive integer represents a distinct segment.
            Supports arbitrary dimensionality.
        ring_width (int): Width of the ring in pixels/voxels.
        keep_original (bool): If ``True`` (default), original segment pixels
            are retained in the output alongside the ring pixels. If ``False``,
            only the ring pixels are labeled and original segment pixels are
            set to zero (background).

    Returns:
        np.ndarray: Label array containing the ring regions and, when
            ``keep_original`` is ``True``, also the original segment pixels.
            Same shape and dtype as input.
    """
    result = segmentation.copy() if keep_original else np.zeros_like(segmentation)
    for label_id in np.unique(segmentation):
        if label_id == 0:
            continue
        binary_mask = segmentation == label_id
        dilated = binary_dilation(binary_mask, iterations=ring_width)
        ring = dilated & ~binary_mask
        # Only place ring on background pixels; smaller label IDs take precedence
        # over later rings via the result == 0 guard.
        ring &= (segmentation == 0) & (result == 0)
        result[ring] = label_id
    return result
Compute the ring mask around each segment.
For each segment, a ring of specified width is computed by dilating the
segment mask by ring_width iterations and subtracting the original
mask. Ring pixels are only placed on background pixels of the original
segmentation. If rings from different segments overlap, the segment with
the smaller label ID takes precedence.
This is useful for creating pseudo-cytosol masks around segmented nuclei.
Arguments:
- segmentation (np.ndarray): Integer-valued label array where 0 is background and each positive integer represents a distinct segment. Supports arbitrary dimensionality.
- ring_width (int): Width of the ring in pixels/voxels.
- keep_original (bool): If `True` (default), original segment pixels are retained in the output alongside the ring pixels. If `False`, only the ring pixels are labeled and original segment pixels are set to zero (background).
Returns:
np.ndarray: Label array containing the ring regions and, when `keep_original` is `True`, also the original segment pixels. Same shape and dtype as input.
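A runnable sketch of the ring construction, reproduced from the source listing above and assuming `binary_dilation` is `scipy.ndimage.binary_dilation` (which matches the `iterations=` keyword used there):

```python
import numpy as np
from scipy.ndimage import binary_dilation

def compute_ring_mask(segmentation, ring_width, keep_original=True):
    result = segmentation.copy() if keep_original else np.zeros_like(segmentation)
    for label_id in np.unique(segmentation):
        if label_id == 0:
            continue
        binary_mask = segmentation == label_id
        dilated = binary_dilation(binary_mask, iterations=ring_width)
        ring = dilated & ~binary_mask
        # Background pixels only; earlier (smaller) labels win overlaps.
        ring &= (segmentation == 0) & (result == 0)
        result[ring] = label_id
    return result

seg = np.zeros((5, 5), dtype=int)
seg[2, 2] = 1  # a single-pixel "nucleus"
rings_only = compute_ring_mask(seg, ring_width=1, keep_original=False)
# The four 4-connected neighbors of (2, 2) carry label 1; the center is 0.
```

With scipy's default structuring element the dilation is 4-connected, so a width-1 ring around one pixel is a plus-shape minus its center.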
def apply_watershed(
    segmentation: np.ndarray,
    heatmap: np.ndarray,
    mask: Optional[np.ndarray] = None,
) -> np.ndarray:
    """Refine a segmentation using the watershed algorithm.

    Uses the input segmentation as seed markers and ``heatmap`` as the
    topographic landscape for ``skimage.segmentation.watershed``. The
    watershed algorithm floods uphill from each marker, so pixels with
    *low* heatmap values are claimed first. For heatmaps where high values
    indicate cell interiors or distance-to-boundary (e.g. a distance
    transform), pass the negated heatmap so that high-confidence regions are
    flooded first.

    Args:
        segmentation (np.ndarray): Integer-valued label array used as seed
            markers. 0 is background; each positive integer is a distinct
            seed. Supports arbitrary dimensionality.
        heatmap (np.ndarray): Landscape image of the same spatial shape as
            ``segmentation``. Low values are flooded before high values.
        mask (Optional[np.ndarray]): Boolean or binary array of the same
            shape as ``segmentation``. Only pixels where ``mask`` is
            ``True`` are processed; all other pixels are set to 0 in the
            output. If ``None`` (default), all pixels are processed.

    Returns:
        np.ndarray: Refined label array, same shape and dtype as
            ``segmentation``.
    """
    from skimage.segmentation import watershed
    result = watershed(heatmap, markers=segmentation, mask=mask)
    return result.astype(segmentation.dtype)
Refine a segmentation using the watershed algorithm.
Uses the input segmentation as seed markers and heatmap as the
topographic landscape for skimage.segmentation.watershed. The
watershed algorithm floods uphill from each marker, so pixels with
low heatmap values are claimed first. For heatmaps where high values
indicate cell interiors or distance-to-boundary (e.g. a distance
transform), pass the negated heatmap so that high-confidence regions are
flooded first.
Arguments:
- segmentation (np.ndarray): Integer-valued label array used as seed markers. 0 is background; each positive integer is a distinct seed. Supports arbitrary dimensionality.
- heatmap (np.ndarray): Landscape image of the same spatial shape as `segmentation`. Low values are flooded before high values.
- mask (Optional[np.ndarray]): Boolean or binary array of the same shape as `segmentation`. Only pixels where `mask` is `True` are processed; all other pixels are set to 0 in the output. If `None` (default), all pixels are processed.
Returns:
np.ndarray: Refined label array, same shape and dtype as `segmentation`.
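A sketch of the negated-heatmap advice, calling `skimage.segmentation.watershed` directly (the same call this function wraps). The heatmap is a distance transform, where high values mark object interiors, so it is negated before flooding:

```python
import numpy as np
from scipy import ndimage
from skimage.segmentation import watershed

# Foreground mask with a background border, and two seed markers.
mask = np.zeros((8, 16), dtype=bool)
mask[1:7, 1:15] = True
markers = np.zeros((8, 16), dtype=np.int32)
markers[4, 3] = 1
markers[4, 12] = 2

# Distance to the background: high inside, low at the boundary.
distance = ndimage.distance_transform_edt(mask)
# Negate so high-confidence (interior) pixels are flooded first.
labels = watershed(-distance, markers=markers, mask=mask)
# Masked pixels split between seeds 1 and 2; unmasked pixels stay 0.
```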
def measure_intensities(segmentation: np.ndarray, intensity_image: np.ndarray) -> pd.DataFrame:
    """Compute per-segment intensity statistics.

    For each labeled segment, computes mean, median, maximum, minimum,
    standard deviation and common percentiles of pixel intensities.

    Args:
        segmentation (np.ndarray): Integer-valued label array where 0 is
            background. Supports arbitrary dimensionality.
        intensity_image (np.ndarray): Intensity image with the same shape as
            ``segmentation``.

    Returns:
        pd.DataFrame: One row per segment with columns ``index``,
            ``mean_intensity``, ``median_intensity``, ``max_intensity``,
            ``min_intensity``, ``std_intensity``, ``percentile_10``,
            ``percentile_25``, ``percentile_75``, ``percentile_90``.
            ``index`` holds the integer label ID of each segment.
    """
    props = regionprops(segmentation, intensity_image)
    if not props:
        return pd.DataFrame(columns=_COLUMNS)

    rows = []
    for region in props:
        # ``image_intensity`` was renamed in scikit-image 0.26; keep the old
        # ``intensity_image`` name as a fallback for older versions.
        intensity_arr = getattr(region, "image_intensity", None)
        if intensity_arr is None:
            intensity_arr = region.intensity_image
        intensities = intensity_arr[region.image].astype(float)
        rows.append({
            "index": region.label,
            "mean_intensity": float(np.mean(intensities)),
            "median_intensity": float(np.median(intensities)),
            "max_intensity": float(np.max(intensities)),
            "min_intensity": float(np.min(intensities)),
            "std_intensity": float(np.std(intensities)),
            "percentile_10": float(np.percentile(intensities, 10)),
            "percentile_25": float(np.percentile(intensities, 25)),
            "percentile_75": float(np.percentile(intensities, 75)),
            "percentile_90": float(np.percentile(intensities, 90)),
        })
    return pd.DataFrame(rows)
Compute per-segment intensity statistics.
For each labeled segment, computes mean, median, maximum, minimum, standard deviation and common percentiles of pixel intensities.
Arguments:
- segmentation (np.ndarray): Integer-valued label array where 0 is background. Supports arbitrary dimensionality.
- intensity_image (np.ndarray): Intensity image with the same shape as `segmentation`.
Returns:
pd.DataFrame: One row per segment with columns `index`, `mean_intensity`, `median_intensity`, `max_intensity`, `min_intensity`, `std_intensity`, `percentile_10`, `percentile_25`, `percentile_75`, `percentile_90`. `index` holds the integer label ID of each segment.
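The per-segment statistics can be reproduced without scikit-image; this illustrative sketch (not the library's implementation, which uses `regionprops`) computes a subset of the columns with plain NumPy masking:

```python
import numpy as np
import pandas as pd

def intensity_table(segmentation, intensity_image):
    rows = []
    for label_id in np.unique(segmentation):
        if label_id == 0:
            continue
        # All pixel values belonging to this segment.
        values = intensity_image[segmentation == label_id].astype(float)
        rows.append({
            "index": int(label_id),
            "mean_intensity": float(np.mean(values)),
            "median_intensity": float(np.median(values)),
            "max_intensity": float(np.max(values)),
            "min_intensity": float(np.min(values)),
            "std_intensity": float(np.std(values)),
        })
    return pd.DataFrame(rows)

seg = np.array([[1, 1, 2], [0, 2, 2]])
img = np.array([[10, 20, 5], [0, 7, 9]])
table = intensity_table(seg, img)
# Label 1 -> mean 15.0 over [10, 20]; label 2 -> median 7.0 over [5, 7, 9].
```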
def measure_morphology(
    segmentation: np.ndarray,
    scale: float | tuple = 1.0,
) -> pd.DataFrame:
    """Compute per-segment morphological measurements.

    Supported dimensionality is 2D and 3D. For 2D the measurements are area,
    perimeter, sphericity (circularity), solidity, major and minor axis
    lengths, and equivalent diameter. For 3D the measurements are volume,
    surface area (via marching cubes), sphericity, solidity, major and minor
    axis lengths, and equivalent diameter.

    Physical units are applied via the ``scale`` parameter. Anisotropic
    voxel sizes are supported by passing a per-dimension tuple.

    Args:
        segmentation (np.ndarray): Integer-valued label array where 0 is
            background. Must be 2D or 3D.
        scale (float | tuple): Physical size of a pixel/voxel. A single float
            is interpreted as isotropic spacing. A tuple must have one value
            per spatial dimension in ``(Y, X)`` order for 2D or ``(Z, Y, X)``
            order for 3D. Defaults to 1.0 (pixel/voxel units).

    Returns:
        pd.DataFrame: One row per segment. ``index`` holds the integer label
            ID of each segment. 2D columns: ``index``, ``area``,
            ``perimeter``, ``sphericity``, ``solidity``, ``axis_major_length``,
            ``axis_minor_length``, ``equivalent_diameter``. 3D columns:
            ``index``, ``volume``, ``surface_area``, ``sphericity``,
            ``solidity``, ``axis_major_length``, ``axis_minor_length``,
            ``equivalent_diameter``.

    Raises:
        ValueError: If ``segmentation`` is not 2D or 3D, or if ``scale``
            tuple length does not match ``ndim``.
    """
    ndim = segmentation.ndim
    if ndim not in (2, 3):
        raise ValueError(
            f"measure_morphology requires 2D or 3D input, got {ndim}D."
        )

    if isinstance(scale, (int, float)):
        scale_tuple = tuple([float(scale)] * ndim)
    else:
        scale_tuple = tuple(float(s) for s in scale)
        if len(scale_tuple) != ndim:
            raise ValueError(
                f"scale must have {ndim} elements for a {ndim}D segmentation, "
                f"got {len(scale_tuple)}."
            )

    # Pass spacing so regionprops returns all length/area/volume measurements
    # in physical units, handling anisotropic voxel sizes exactly.
    props = regionprops(segmentation, spacing=scale_tuple)
    if not props:
        columns = _COLUMNS_2D if ndim == 2 else _COLUMNS_3D
        return pd.DataFrame(columns=columns)

    rows = []
    for region in props:
        row: dict = {"index": region.label, "solidity": float(region.solidity)}

        if ndim == 2:
            area = float(region.area)  # physical area via spacing
            # region.perimeter raises NotImplementedError for anisotropic spacing.
            # Compute perimeter in pixel space then scale by the geometric mean of
            # the spacings. For isotropic spacing this is exact; for anisotropic
            # it is the best single-factor approximation of the Crofton formula.
            from skimage.measure import perimeter_crofton
            pixel_perimeter = float(perimeter_crofton(region.image))
            perimeter = pixel_perimeter * float(np.sqrt(np.prod(scale_tuple)))
            sphericity = (
                4.0 * np.pi * area / (perimeter ** 2) if perimeter > 0 else 0.0
            )
            row["area"] = area
            row["perimeter"] = perimeter
            row["sphericity"] = sphericity
            row["axis_major_length"] = float(region.axis_major_length)
            row["axis_minor_length"] = float(region.axis_minor_length)
            row["equivalent_diameter"] = float(region.equivalent_diameter_area)

        else:  # 3D
            volume = float(region.area)  # physical volume via spacing

            binary = segmentation == region.label
            padded = np.pad(binary[region.slice], 1)
            try:
                from skimage.measure import marching_cubes, mesh_surface_area
                verts, faces, _, _ = marching_cubes(
                    padded, level=0.5, spacing=scale_tuple
                )
                surface_area = float(mesh_surface_area(verts, faces))
            except (ValueError, RuntimeError):
                surface_area = 0.0

            sphericity = (
                np.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / surface_area
                if surface_area > 0 else 0.0
            )
            row["volume"] = volume
            row["surface_area"] = surface_area
            row["sphericity"] = sphericity
            row["axis_major_length"] = float(region.axis_major_length)
            row["axis_minor_length"] = float(region.axis_minor_length)
            row["equivalent_diameter"] = float(region.equivalent_diameter_area)

        rows.append(row)

    columns = _COLUMNS_2D if ndim == 2 else _COLUMNS_3D
    return pd.DataFrame(rows, columns=columns)
Compute per-segment morphological measurements.
Supported dimensionality is 2D and 3D. For 2D the measurements are area, perimeter, sphericity (circularity), solidity, major and minor axis lengths, and equivalent diameter. For 3D the measurements are volume, surface area (via marching cubes), sphericity, solidity, major and minor axis lengths, and equivalent diameter.
Physical units are applied via the scale parameter. Anisotropic
voxel sizes are supported by passing a per-dimension tuple.
Arguments:
- segmentation (np.ndarray): Integer-valued label array where 0 is background. Must be 2D or 3D.
- scale (float | tuple): Physical size of a pixel/voxel. A single float is interpreted as isotropic spacing. A tuple must have one value per spatial dimension in `(Y, X)` order for 2D or `(Z, Y, X)` order for 3D. Defaults to 1.0 (pixel/voxel units).
Returns:
pd.DataFrame: One row per segment. `index` holds the integer label ID of each segment. 2D columns: `index`, `area`, `perimeter`, `sphericity`, `solidity`, `axis_major_length`, `axis_minor_length`, `equivalent_diameter`. 3D columns: `index`, `volume`, `surface_area`, `sphericity`, `solidity`, `axis_major_length`, `axis_minor_length`, `equivalent_diameter`.
Raises:
- ValueError: If `segmentation` is not 2D or 3D, or if `scale` tuple length does not match `ndim`.
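The 2-D sphericity (circularity) column follows the formula used in the source, 4πA/P². As a quick numeric sanity check, an ideal circle scores exactly 1 and a square scores π/4 ≈ 0.785:

```python
import numpy as np

def circularity(area: float, perimeter: float) -> float:
    # 2-D sphericity as computed above: 4*pi*A / P**2.
    return 4.0 * np.pi * area / perimeter**2 if perimeter > 0 else 0.0

r = 3.0
# Analytic circle: A = pi*r^2, P = 2*pi*r  ->  circularity 1.0
circle = circularity(np.pi * r**2, 2.0 * np.pi * r)
# 2x2 square: A = 4, P = 8  ->  circularity pi/4
square = circularity(4.0, 8.0)
```

Note that scores from real label masks will be somewhat lower than the analytic values, since pixelated boundaries inflate the perimeter estimate.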
def measure_cell_nucleus(
    cell_segmentation: np.ndarray,
    nucleus_segmentation: np.ndarray,
    scale: float | tuple = 1.0,
    intensity_image: np.ndarray | None = None,
) -> pd.DataFrame:
    """Compute per-cell measurements combining cell and nucleus segmentations.

    For each cell, computes the number of nuclei it contains, the ratio of
    cell to nuclear area/volume (in physical units), and optionally the ratio
    of intensity statistics between the cytoplasmic and nuclear regions.

    The cell area/volume encompasses the nucleus (nuclear mask is **not**
    excluded). For intensity measurements, the nuclear mask **is** excluded
    from the cellular region so that cytoplasmic and nuclear intensities are
    measured independently.

    Supported dimensionality is 2D and 3D.

    Args:
        cell_segmentation (np.ndarray): Integer-valued label array of cells
            where 0 is background. Must be 2D or 3D.
        nucleus_segmentation (np.ndarray): Integer-valued label array of
            nuclei where 0 is background. Must have the same shape as
            ``cell_segmentation``.
        scale (float | tuple): Physical size of a pixel/voxel. A single
            float is interpreted as isotropic spacing. A tuple must have one
            value per spatial dimension in ``(Y, X)`` order for 2D or
            ``(Z, Y, X)`` order for 3D. Defaults to 1.0 (pixel/voxel units).
        intensity_image (np.ndarray | None): Optional intensity image with
            the same shape as ``cell_segmentation``. When provided, intensity
            statistics are computed for the cytoplasmic (cell minus nucleus)
            and nuclear regions and their ratios are reported.

    Returns:
        pd.DataFrame: One row per cell. ``index`` holds the integer cell label
            ID. Columns: ``index``, ``n_nuclei``, ``cell_area``/``cell_volume``,
            ``nucleus_area``/``nucleus_volume``, ``area_ratio``/``volume_ratio``.
            When *intensity_image* is given, additional columns are added for
            ``cell_{stat}_intensity``, ``nucleus_{stat}_intensity``, and
            ``{stat}_intensity_ratio`` for each stat in mean, median, max, min,
            percentile_10, percentile_25, percentile_75, percentile_90. The
            ``area_ratio``/``volume_ratio`` is ``NaN`` for cells with no
            detected nucleus. Intensity ratios are ``NaN`` when the cytoplasm
            or nucleus region is empty, or when the nucleus value is zero.

    Raises:
        ValueError: If ``cell_segmentation`` is not 2D or 3D, if the shapes
            of ``cell_segmentation`` and ``nucleus_segmentation`` do not match,
            if ``intensity_image`` shape does not match, or if ``scale`` tuple
            length does not match ``ndim``.
    """
    ndim = cell_segmentation.ndim
    if ndim not in (2, 3):
        raise ValueError(
            f"measure_cell_nucleus requires 2D or 3D input, got {ndim}D."
        )

    if cell_segmentation.shape != nucleus_segmentation.shape:
        raise ValueError(
            "cell_segmentation and nucleus_segmentation must have the same shape, "
            f"got {cell_segmentation.shape} and {nucleus_segmentation.shape}."
        )

    if intensity_image is not None and intensity_image.shape != cell_segmentation.shape:
        raise ValueError(
            "intensity_image must have the same shape as cell_segmentation, "
            f"got {intensity_image.shape} and {cell_segmentation.shape}."
        )

    if isinstance(scale, (int, float)):
        scale_tuple = tuple([float(scale)] * ndim)
    else:
        scale_tuple = tuple(float(s) for s in scale)
        if len(scale_tuple) != ndim:
            raise ValueError(
                f"scale must have {ndim} elements for a {ndim}D segmentation, "
                f"got {len(scale_tuple)}."
            )

    voxel_size = float(np.prod(scale_tuple))
    size_col = "area" if ndim == 2 else "volume"
    has_intensity = intensity_image is not None

    cell_ids = np.unique(cell_segmentation)
    cell_ids = cell_ids[cell_ids != 0]

    if len(cell_ids) == 0:
        return pd.DataFrame(columns=_base_columns(ndim, has_intensity))

    rows = []
    for cell_id in cell_ids:
        cell_mask = cell_segmentation == cell_id
        cell_size = float(np.sum(cell_mask)) * voxel_size

        nuc_ids = np.unique(nucleus_segmentation[cell_mask])
        nuc_ids = nuc_ids[nuc_ids != 0]
        n_nuclei = int(len(nuc_ids))

        nucleus_mask = (nucleus_segmentation != 0) & cell_mask
        nucleus_size = float(np.sum(nucleus_mask)) * voxel_size

        size_ratio = cell_size / nucleus_size if nucleus_size > 0 else float("nan")

        row: dict = {
            "index": int(cell_id),
            "n_nuclei": n_nuclei,
            f"cell_{size_col}": cell_size,
            f"nucleus_{size_col}": nucleus_size,
            f"{size_col}_ratio": size_ratio,
        }

        if has_intensity:
            cyto_mask = cell_mask & ~nucleus_mask
            cell_stats = (
                _compute_intensity_stats(intensity_image[cyto_mask])
                if np.any(cyto_mask)
                else _nan_intensity_stats()
            )
            nuc_stats = (
                _compute_intensity_stats(intensity_image[nucleus_mask])
                if np.any(nucleus_mask)
                else _nan_intensity_stats()
            )

            for stat in _INTENSITY_STATS:
                row[f"cell_{stat}_intensity"] = cell_stats[stat]
            for stat in _INTENSITY_STATS:
                row[f"nucleus_{stat}_intensity"] = nuc_stats[stat]
            for stat in _INTENSITY_STATS:
                c = cell_stats[stat]
                n = nuc_stats[stat]
                if not (np.isnan(c) or np.isnan(n)) and n != 0:
                    row[f"{stat}_intensity_ratio"] = c / n
                else:
                    row[f"{stat}_intensity_ratio"] = float("nan")

        rows.append(row)

    return pd.DataFrame(rows, columns=_base_columns(ndim, has_intensity))
Compute per-cell measurements combining cell and nucleus segmentations.
For each cell, computes the number of nuclei it contains, the ratio of cell to nuclear area/volume (in physical units), and optionally the ratio of intensity statistics between the cytoplasmic and nuclear regions.
The cell area/volume encompasses the nucleus (nuclear mask is not excluded). For intensity measurements, the nuclear mask is excluded from the cellular region so that cytoplasmic and nuclear intensities are measured independently.
Supported dimensionality is 2D and 3D.
Arguments:
- cell_segmentation (np.ndarray): Integer-valued label array of cells where 0 is background. Must be 2D or 3D.
- nucleus_segmentation (np.ndarray): Integer-valued label array of nuclei where 0 is background. Must have the same shape as `cell_segmentation`.
- scale (float | tuple): Physical size of a pixel/voxel. A single float is interpreted as isotropic spacing. A tuple must have one value per spatial dimension in `(Y, X)` order for 2D or `(Z, Y, X)` order for 3D. Defaults to 1.0 (pixel/voxel units).
- intensity_image (np.ndarray | None): Optional intensity image with the same shape as `cell_segmentation`. When provided, intensity statistics are computed for the cytoplasmic (cell minus nucleus) and nuclear regions and their ratios are reported.
Returns:
pd.DataFrame: One row per cell. `index` holds the integer cell label ID. Columns: `index`, `n_nuclei`, `cell_area`/`cell_volume`, `nucleus_area`/`nucleus_volume`, `area_ratio`/`volume_ratio`. When `intensity_image` is given, additional columns are added for `cell_{stat}_intensity`, `nucleus_{stat}_intensity`, and `{stat}_intensity_ratio` for each stat in mean, median, max, min, percentile_10, percentile_25, percentile_75, percentile_90. The `area_ratio`/`volume_ratio` is `NaN` for cells with no detected nucleus. Intensity ratios are `NaN` when the cytoplasm or nucleus region is empty, or when the nucleus value is zero.
Raises:
- ValueError: If `cell_segmentation` is not 2D or 3D, if the shapes of `cell_segmentation` and `nucleus_segmentation` do not match, if `intensity_image` shape does not match, or if `scale` tuple length does not match `ndim`.
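An illustrative pure-NumPy sketch of the nucleus-counting and size-ratio logic, mirroring the per-cell loop in the source: one cell containing two nuclei, with isotropic unit spacing:

```python
import numpy as np

# One cell (label 1) containing two nuclei (labels 1 and 2 in the
# nucleus segmentation).
cells = np.array([
    [1, 1, 1, 1],
    [1, 1, 1, 1],
])
nuclei = np.array([
    [1, 0, 0, 2],
    [0, 0, 0, 0],
])

cell_mask = cells == 1
cell_size = float(cell_mask.sum())                       # 8 pixels

# Distinct non-zero nucleus labels inside this cell.
nuc_ids = np.unique(nuclei[cell_mask])
n_nuclei = int(len(nuc_ids[nuc_ids != 0]))               # 2 nuclei

# Total nuclear pixels inside the cell (regardless of nucleus label).
nucleus_size = float(((nuclei != 0) & cell_mask).sum())  # 2 pixels
area_ratio = cell_size / nucleus_size                    # 4.0
```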
def merge_tables(
    tables: Sequence[pd.DataFrame], on: str = "index"
) -> pd.DataFrame:
    """Merge multiple measurement tables on a shared key column.

    Performs an outer join on ``on`` so that label IDs present in some but not
    all tables are preserved (missing values become NaN). All tables must
    contain the ``on`` column, and the *other* columns must be disjoint
    between tables — otherwise the merge would silently rename or duplicate
    measurement columns. Use :func:`drop_columns` first to remove conflicts
    if needed.

    Args:
        tables (Sequence[pd.DataFrame]): Two or more measurement tables to
            merge.
        on (str): Name of the key column shared between tables. Defaults to
            ``"index"``.

    Returns:
        pd.DataFrame: A single table containing the union of all rows and
            columns.

    Raises:
        ValueError: If fewer than two tables are provided, if any table is
            missing the ``on`` column, or if non-``on`` columns overlap
            between tables.
    """
    tables = list(tables)
    if len(tables) < 2:
        raise ValueError("merge_tables requires at least two tables.")

    seen_columns: set[str] = set()
    for i, table in enumerate(tables):
        if on not in table.columns:
            raise ValueError(
                f"Table at index {i} is missing the key column '{on}'."
            )
        other_columns = set(table.columns) - {on}
        conflicts = seen_columns & other_columns
        if conflicts:
            raise ValueError(
                f"Column(s) {sorted(conflicts)} appear in more than one "
                f"table; drop them before merging."
            )
        seen_columns.update(other_columns)

    merged = reduce(lambda left, right: pd.merge(left, right, on=on, how="outer"), tables)
    return merged.sort_values(on).reset_index(drop=True)
Merge multiple measurement tables on a shared key column.
Performs an outer join on `on` so that label IDs present in some but not all tables are preserved (missing values become NaN). All tables must contain the `on` column, and the *other* columns must be disjoint between tables; otherwise the merge would silently rename or duplicate measurement columns. Use `drop_columns` first to remove conflicts if needed.
Arguments:
- tables (Sequence[pd.DataFrame]): Two or more measurement tables to merge.
- on (str): Name of the key column shared between tables. Defaults to
"index".
Returns:
pd.DataFrame: A single table containing the union of all rows and columns.
Raises:
- ValueError: If fewer than two tables are provided, if any table is missing the `on` column, or if non-`on` columns overlap between tables.
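A sketch of the same `functools.reduce` outer-merge pattern used in the source, showing how a label missing from one table produces a NaN rather than dropping the row:

```python
import pandas as pd
from functools import reduce

# Label 3 has morphology but no intensity row, so after the outer
# join its mean_intensity is NaN instead of the row being dropped.
morphology = pd.DataFrame({"index": [1, 2, 3], "area": [10.0, 20.0, 30.0]})
intensity = pd.DataFrame({"index": [1, 2], "mean_intensity": [0.5, 0.7]})

merged = reduce(
    lambda left, right: pd.merge(left, right, on="index", how="outer"),
    [morphology, intensity],
)
merged = merged.sort_values("index").reset_index(drop=True)
```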
def drop_columns(
    table: pd.DataFrame, columns: Union[str, Iterable[str]]
) -> pd.DataFrame:
    """Return a copy of ``table`` with the specified columns removed.

    The ``index`` column is the standard segment-identifier key throughout
    this package and may never be dropped.

    Args:
        table (pd.DataFrame): Input measurement table.
        columns (str | Iterable[str]): Single column name or an iterable of
            column names to drop.

    Returns:
        pd.DataFrame: New DataFrame without the dropped columns.

    Raises:
        ValueError: If any requested column is not present in ``table``, or
            if a protected column (``index``) is requested.
    """
    if isinstance(columns, str):
        columns_list = [columns]
    else:
        columns_list = list(columns)
    protected = [c for c in columns_list if c in PROTECTED_COLUMNS]
    if protected:
        raise ValueError(
            f"Column(s) {protected} are protected and cannot be dropped."
        )
    missing = [c for c in columns_list if c not in table.columns]
    if missing:
        raise ValueError(f"Column(s) {missing} not found in table.")
    return table.drop(columns=columns_list)
Return a copy of `table` with the specified columns removed.
The `index` column is the standard segment-identifier key throughout
this package and may never be dropped.
Arguments:
- table (pd.DataFrame): Input measurement table.
- columns (str | Iterable[str]): Single column name or an iterable of column names to drop.
Returns:
pd.DataFrame: New DataFrame without the dropped columns.
Raises:
- ValueError: If any requested column is not present in `table`, or if a protected column (`index`) is requested.
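A condensed, self-contained sketch of the guard logic; `PROTECTED_COLUMNS` here stands in for the module-level constant the source references:

```python
import pandas as pd

PROTECTED_COLUMNS = {"index"}  # stand-in for the module constant

def drop_columns(table, columns):
    columns_list = [columns] if isinstance(columns, str) else list(columns)
    # Refuse to drop the segment-identifier key.
    protected = [c for c in columns_list if c in PROTECTED_COLUMNS]
    if protected:
        raise ValueError(f"Column(s) {protected} are protected and cannot be dropped.")
    missing = [c for c in columns_list if c not in table.columns]
    if missing:
        raise ValueError(f"Column(s) {missing} not found in table.")
    return table.drop(columns=columns_list)

table = pd.DataFrame({"index": [1, 2], "area": [10.0, 20.0], "tmp": [0, 1]})
slim = drop_columns(table, "tmp")   # fine
# drop_columns(table, "index")      # raises ValueError
```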
def suggest_thresholds(measurements: pd.DataFrame, column: str, n_categories: int) -> list[float]:
    """Suggest thresholds for categorizing segments.

    Computes ``n_categories - 1`` threshold values at equally-spaced quantiles
    of the specified column.

    Args:
        measurements (pd.DataFrame): Measurement DataFrame as returned by
            :func:`~segmentation_measurement.measure_intensities` or
            :func:`~segmentation_measurement.measure_morphology`.
        column (str): Column name to compute thresholds for.
        n_categories (int): Number of desired categories. Must be >= 2.

    Returns:
        list[float]: ``n_categories - 1`` threshold values in ascending order.

    Raises:
        ValueError: If ``n_categories`` < 2 or ``column`` is not in
            ``measurements``.
    """
    if n_categories < 2:
        raise ValueError("n_categories must be >= 2.")
    if column not in measurements.columns:
        raise ValueError(f"Column '{column}' not found in measurements.")
    values = measurements[column].dropna().values
    quantile_positions = np.linspace(0, 100, n_categories + 1)[1:-1]
    return [float(np.percentile(values, q)) for q in quantile_positions]
Suggest thresholds for categorizing segments.
Computes n_categories - 1 threshold values at equally-spaced quantiles
of the specified column.
Arguments:
- measurements (pd.DataFrame): Measurement DataFrame as returned by `measure_intensities()` or `measure_morphology()`.
- column (str): Column name to compute thresholds for.
- n_categories (int): Number of desired categories. Must be >= 2.
Returns:
list[float]: `n_categories - 1` threshold values in ascending order.
Raises:
- ValueError: If `n_categories` < 2 or `column` is not in `measurements`.
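The quantile logic reduces to a couple of lines; this sketch reproduces it from the source and demonstrates that two categories yield a single threshold at the median:

```python
import numpy as np
import pandas as pd

def suggest_thresholds(measurements, column, n_categories):
    # n_categories - 1 thresholds at equally spaced quantiles.
    values = measurements[column].dropna().values
    quantile_positions = np.linspace(0, 100, n_categories + 1)[1:-1]
    return [float(np.percentile(values, q)) for q in quantile_positions]

table = pd.DataFrame({"area": [1.0, 2.0, 3.0, 4.0, 5.0]})
thresholds = suggest_thresholds(table, "area", n_categories=2)
# One threshold at the 50th percentile of [1..5]: [3.0]
```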
def categorize_by_threshold(
    measurements: pd.DataFrame,
    column: str,
    thresholds: list[float],
    category_names: list[str] | None = None,
) -> pd.DataFrame:
    """Assign categories to segments based on thresholds.

    Segments with values below the first threshold are assigned category 1,
    between consecutive thresholds category 2, ..., N. Works with any
    measurement DataFrame (intensity or morphology).

    Args:
        measurements (pd.DataFrame): Measurement DataFrame as returned by
            :func:`~segmentation_measurement.measure_intensities` or
            :func:`~segmentation_measurement.measure_morphology`.
        column (str): Column name to apply thresholds to.
        thresholds (list[float]): ``n_categories - 1`` threshold values.
            Need not be sorted; they are sorted internally.
        category_names (list[str] | None): ``n_categories`` names, one per
            category. Defaults to ``"category_1"``, ``"category_2"``, etc.

    Returns:
        pd.DataFrame: Copy of ``measurements`` with added columns
            ``category_id`` (int, 1-based) and ``category_name`` (str).

    Raises:
        ValueError: If ``column`` is not in ``measurements`` or
            ``category_names`` has the wrong length.
    """
    if column not in measurements.columns:
        raise ValueError(f"Column '{column}' not found in measurements.")
    n_categories = len(thresholds) + 1
    if category_names is None:
        category_names = [f"category_{i + 1}" for i in range(n_categories)]
    if len(category_names) != n_categories:
        raise ValueError(
            f"Expected {n_categories} category names, got {len(category_names)}."
        )
    result = measurements.copy()
    values = result[column].values
    # NaN values (e.g. the background-padding row added so napari's Features
    # Table can map row position → label) are not categorized. They get
    # ``category_id=0`` and an empty ``category_name``.
    if np.issubdtype(np.asarray(values).dtype, np.number):
        valid_mask = ~np.isnan(np.asarray(values, dtype=float))
    else:
        valid_mask = np.ones(len(values), dtype=bool)
    category_ids = np.zeros(len(values), dtype=int)
    if valid_mask.any():
        category_ids[valid_mask] = np.digitize(
            np.asarray(values, dtype=float)[valid_mask], sorted(thresholds)
        ) + 1
    result["category_id"] = category_ids
    result["category_name"] = [
        category_names[cid - 1] if cid > 0 else "" for cid in category_ids
    ]
    return result
Assign categories to segments based on thresholds.
Segments with values below the first threshold are assigned category 1, between consecutive thresholds category 2, ..., N. Works with any measurement DataFrame (intensity or morphology).
Arguments:
- measurements (pd.DataFrame): Measurement DataFrame as returned by `measure_intensities()` or `measure_morphology()`.
- column (str): Column name to apply thresholds to.
- thresholds (list[float]): `n_categories - 1` threshold values. Need not be sorted; they are sorted internally.
- category_names (list[str] | None): `n_categories` names, one per category. Defaults to `"category_1"`, `"category_2"`, etc.
Returns:
pd.DataFrame: Copy of
measurementswith added columnscategory_id(int, 1-based) andcategory_name(str).
Raises:
- ValueError: If
columnis not inmeasurementsorcategory_nameshas the wrong length.
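The categorization step boils down to `np.digitize` over the sorted thresholds, with NaN rows left at category 0. A minimal standalone sketch of that logic (the column name `mean_intensity` and the toy values are illustrative):

```python
import numpy as np
import pandas as pd

# Toy measurement table; "mean_intensity" is a hypothetical column name.
df = pd.DataFrame({
    "label_id": [1, 2, 3, 4],
    "mean_intensity": [5.0, 12.0, 25.0, np.nan],
})

thresholds = [10.0, 20.0]  # 2 thresholds -> 3 categories
values = df["mean_intensity"].to_numpy(dtype=float)
valid = ~np.isnan(values)

# digitize returns the 0-based bin index; +1 makes categories 1-based,
# so NaN rows keep the sentinel value 0.
category_ids = np.zeros(len(values), dtype=int)
category_ids[valid] = np.digitize(values[valid], sorted(thresholds)) + 1
# → [1, 2, 3, 0]: below 10 → 1, between 10 and 20 → 2, above 20 → 3, NaN → 0
```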
```python
def cluster_measurements(
    measurements: pd.DataFrame,
    method: str = "kmeans",
    **kwargs,
) -> pd.DataFrame:
    """Apply clustering to measurement features.

    Clusters segments using all numeric measurement columns, excluding
    ``index``, ``cluster_id``, ``category_id``, and ``category_name``.
    Features are z-score standardised before clustering.

    Args:
        measurements (pd.DataFrame): Measurement DataFrame as returned by
            :func:`~segmentation_measurement.measure_intensities` or similar.
        method (str): Clustering method. One of ``'kmeans'``, ``'dbscan'``,
            ``'hdbscan'``, or ``'mean_shift'``. Defaults to ``'kmeans'``.
        **kwargs: Keyword arguments forwarded to the underlying scikit-learn
            estimator. Sensible defaults are used when not provided:
            k-means – ``n_clusters=3``; DBSCAN – ``eps=0.5``,
            ``min_samples=5``; HDBSCAN – ``min_cluster_size=5``; Mean Shift –
            bandwidth is estimated automatically.

    Returns:
        pd.DataFrame: Copy of ``measurements`` with an added ``cluster_id``
            column containing integer cluster labels. ``-1`` marks noise
            points for methods that support it (DBSCAN, HDBSCAN).

    Raises:
        ValueError: If ``method`` is unrecognised or no numeric feature
            columns are found in ``measurements``.
    """
    from sklearn.preprocessing import StandardScaler

    feature_cols = [
        c for c in measurements.select_dtypes(include="number").columns
        if c not in _CLUSTER_EXCLUDE
    ]
    if not feature_cols:
        raise ValueError("No numeric feature columns found in measurements.")

    X = measurements[feature_cols].values.astype(float)
    valid_mask = ~np.isnan(X).any(axis=1)
    X_valid = StandardScaler().fit_transform(X[valid_mask])

    model = _build_clustering_model(method, kwargs)
    labels_valid = model.fit_predict(X_valid).copy()
    # Shift to 1-based; noise (-1) stays -1.
    labels_valid[labels_valid >= 0] += 1

    # Rows with NaN features are never clustered; they also receive -1.
    labels = np.full(len(measurements), -1, dtype=int)
    labels[valid_mask] = labels_valid

    result = measurements.copy()
    result["cluster_id"] = labels
    return result
```
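The core pipeline — z-score standardisation, a scikit-learn clusterer, then a shift to 1-based labels — can be sketched as follows. The column names `area` and `mean_intensity` and the two synthetic blobs are purely illustrative:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two well-separated synthetic blobs standing in for measurement columns.
X = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(5, 0.1, (10, 2))])
df = pd.DataFrame(X, columns=["area", "mean_intensity"])

# Standardise features, cluster, then shift labels to 1-based
# (matching the cluster_id convention; -1 would stay -1 for noise).
Xs = StandardScaler().fit_transform(df.values)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Xs)
labels = labels + 1
```

Standardising first matters because k-means (and DBSCAN's `eps`) operate on Euclidean distance, so unscaled columns with large numeric ranges would otherwise dominate.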
```python
def train_classifier(
    measurements: pd.DataFrame,
    annotation_column: str = "annotation",
    method: str = "random_forest",
    **kwargs,
) -> object:
    """Train a classifier on annotated measurement features.

    Args:
        measurements (pd.DataFrame): Measurement DataFrame with annotation labels.
            Rows where the annotation value equals zero are treated as unannotated
            and excluded from training.
        annotation_column (str): Column containing integer annotation labels
            (1-based). Defaults to ``'annotation'``.
        method (str): Classifier type. One of ``'logistic_regression'`` or
            ``'random_forest'``. Defaults to ``'random_forest'``.
        **kwargs: Keyword arguments forwarded to the underlying scikit-learn
            estimator. Sensible defaults are applied when not provided:
            logistic regression – ``max_iter=1000``; random forest –
            ``n_estimators=100``.

    Returns:
        object: Trained sklearn ``Pipeline`` (``StandardScaler`` + classifier)
            that can be passed to :func:`apply_classifier` or serialised with
            ``joblib``.

    Raises:
        ValueError: If ``annotation_column`` is missing in ``measurements``,
            no annotated rows exist (all annotation values are zero), no
            numeric feature columns are found, or ``method`` is unknown.
    """
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    if annotation_column not in measurements.columns:
        raise ValueError(f"Annotation column '{annotation_column}' not found.")

    feature_cols = [
        c for c in measurements.select_dtypes(include="number").columns
        if c not in _CLASSIFY_EXCLUDE and c != annotation_column
    ]
    if not feature_cols:
        raise ValueError("No numeric feature columns found in measurements.")

    annotated = measurements[measurements[annotation_column] > 0]
    if len(annotated) == 0:
        raise ValueError("No annotated rows found (all annotation values are 0).")

    X = annotated[feature_cols].values.astype(float)
    y = annotated[annotation_column].values.astype(int)

    valid_mask = ~np.isnan(X).any(axis=1)
    X, y = X[valid_mask], y[valid_mask]
    if len(X) == 0:
        raise ValueError("All annotated rows contain NaN feature values.")

    clf = _build_classifier_model(method, kwargs)
    pipeline = Pipeline([("scaler", StandardScaler()), ("clf", clf)])
    pipeline.fit(X, y)
    return pipeline
```
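The training scheme — keep only rows with a non-zero annotation, fit a `StandardScaler` + classifier pipeline on them, then predict for every row — can be sketched with synthetic data. The single feature column `area` and the values below are illustrative, not real measurements:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
# 0 in "annotation" means unannotated; 1 and 2 are brush-annotation classes.
df = pd.DataFrame({
    "area": np.concatenate([rng.normal(10, 1, 20), rng.normal(50, 1, 20)]),
    "annotation": [1] * 10 + [0] * 10 + [2] * 10 + [0] * 10,
})

# Train only on annotated rows.
annotated = df[df["annotation"] > 0]
X = annotated[["area"]].values.astype(float)
y = annotated["annotation"].values.astype(int)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])
pipe.fit(X, y)

# Apply the trained pipeline to all rows, annotated or not.
pred = pipe.predict(df[["area"]].values)
```

Bundling the scaler into the `Pipeline` is what makes the returned object directly usable on raw features later: the same standardisation learned at training time is replayed inside `predict`.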
```python
def apply_classifier(
    measurements: pd.DataFrame,
    classifier: object,
    class_names: list[str] | None = None,
    annotation_column: str = "annotation",
) -> pd.DataFrame:
    """Apply a trained classifier to measurement features.

    Args:
        measurements (pd.DataFrame): Measurement DataFrame to classify.
        classifier (object): Trained sklearn estimator or ``Pipeline`` as
            returned by :func:`train_classifier`.
        class_names (list[str] | None): Optional names for each class, ordered
            by ascending class label (matching ``classifier.classes_``). Defaults
            to ``'class_<id>'`` strings.
        annotation_column (str): Name of the annotation column to exclude from
            features. Defaults to ``'annotation'``.

    Returns:
        pd.DataFrame: Copy of ``measurements`` with added columns
            ``classification_id`` (int, 1-based; 0 for rows with NaN features)
            and ``classification_name`` (str; empty string for unclassified rows).

    Raises:
        ValueError: If no numeric feature columns are found.
    """
    feature_cols = [
        c for c in measurements.select_dtypes(include="number").columns
        if c not in _CLASSIFY_EXCLUDE and c != annotation_column
    ]
    if not feature_cols:
        raise ValueError("No numeric feature columns found in measurements.")

    X = measurements[feature_cols].values.astype(float)
    valid_mask = ~np.isnan(X).any(axis=1)

    predictions = np.zeros(len(measurements), dtype=int)
    if valid_mask.any():
        predictions[valid_mask] = classifier.predict(X[valid_mask]).astype(int)

    classes = sorted(int(c) for c in classifier.classes_)
    if class_names is not None:
        name_map = {
            cid: (class_names[i] if i < len(class_names) else f"class_{cid}")
            for i, cid in enumerate(classes)
        }
    else:
        name_map = {cid: f"class_{cid}" for cid in classes}

    result = measurements.copy()
    result["classification_id"] = predictions
    result["classification_name"] = [
        name_map.get(int(p), f"class_{p}") if int(p) > 0 else ""
        for p in predictions
    ]
    return result
```
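The name-mapping convention is worth seeing in isolation: predictions of 0 (rows with NaN features) map to an empty name, and `class_names` are matched to the classifier's sorted class labels. A small sketch with hypothetical names:

```python
import numpy as np

# Hypothetical two-class setup; class labels come from classifier.classes_.
classes = [1, 2]
class_names = ["negative", "positive"]
name_map = {cid: class_names[i] for i, cid in enumerate(classes)}

# 0 marks a row with NaN features that was never classified.
predictions = np.array([1, 2, 0, 1])
names = [
    name_map.get(int(p), f"class_{p}") if p > 0 else ""
    for p in predictions
]
# → ["negative", "positive", "", "negative"]
```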