Add object classification for label images #2921

k-dominik · 2024-10-18T04:08:24Z

Inputs such as label images from cellpose or stardist are treated a bit poorly in ilastik - connected components is run again (with possibly different connectivity) resulting in different objects in ilastik after loading such label image data. This is super inconvenient for users: Any prior analysis they did will be hard to merge (different object ids, possibly different objects).

This PR introduces a new workflow, "Object Classification [Raw Data, Label Image]" that internally doesn't do CCA anymore but merely relabels the label image to ensure consecutive labels (required by feature extraction in vigra).

see discussion in #2416

fixes #2416

Summary:

rewrite of OpRelabelConsecutive into an operator that caches two results: the relabeled image and a dictionary that holds the mapping - both necessary in the workflow (relabeld image as input for feature computation, the mapping to export the original labelids to the table).
Serialization logic for this slot that ensures valid saving of both data to the ilp file
Added the new column to the export logic
Added the new Object Classification from Label images workflow

There are some slightly unrelated changes (first two commits):

Binary image layer and object id image have been treated as different sources when displaying, which is not really necessary and can increase cache usage (as two images will be accessed instead of one). This is changed by sourcing the binary layer also from the label image (which has to be computed anyhow for all three flavors of object classification).
Removed the Binary Image input completely from the Object Classification op: it was only used for visualization.
I renamed all usages of Binary image to Segmentation image as we used binary segmentations as well as instance segmentations as inputs there.
Needed to add another pin for pytorch - really need to investigate this: Investigate problems with pytorch on windows for 2.4.0, 2.4.1, 2.5.0 #2922

Changes I didn't include here because changeset is already pretty big:

change the order of columns in export - OriginalObjectID column should be easier to find (and fix order to some sorting)
Allow for export without an image: Currently ilastik forces exporting an image to export the table. If the Label Image already exists outside, and the table correctly links to it, it is really not necessary anymore to export any image.

Todos (before merge, didn't do it yet for the sake of the review):

remove ac9b4d5, has been fixed on main
rebase/fixup commits as indicated by commit messages

label image _has_ to be cached/accessed always, binary image not really. Binary display can be achieved with a two color colormap. As a consequence this should be friendlier for caching. Also, display does not rely anymore on a binary image being present, which could not be the case for label-type inputs.

throughout the code we were referring to the segmentation input as binary, although both, label images and binary segmentations are supported. Also connected Binary Layer Display to Segmentationlayer display with an optimized colormap

codecov · 2024-10-18T17:33:01Z

Codecov Report

Attention: Patch coverage is 77.79851% with 119 lines in your changes missing coverage. Please review.

Project coverage is 56.58%. Comparing base (dfef849) to head (3b8cdc8).
Report is 173 commits behind head on main.

Files with missing lines	Patch %	Lines
...ik/applets/base/appletSerializer/slotSerializer.py	61.16%	33 Missing and 7 partials ⚠️
...tik/applets/objectExtraction/opObjectExtraction.py	64.47%	24 Missing and 3 partials ⚠️
lazyflow/operators/opRelabelConsecutive.py	91.56%	7 Missing and 14 partials ⚠️
...jectClassification/objectClassificationWorkflow.py	70.37%	8 Missing ⚠️
...ts/objectClassification/objectClassificationGui.py	25.00%	5 Missing and 1 partial ⚠️
...ets/objectExtraction/objectExtractionSerializer.py	66.66%	6 Missing ⚠️
...Classification/blockwiseObjectClassificationGui.py	0.00%	3 Missing ⚠️
...applets/objectExtraction/objectExtractionApplet.py	81.25%	3 Missing ⚠️
...ik/applets/objectExtraction/objectExtractionGui.py	57.14%	2 Missing and 1 partial ⚠️
...tClassification/opBlockwiseObjectClassification.py	93.75%	0 Missing and 1 partial ⚠️
... and 1 more

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2921      +/-   ##
==========================================
+ Coverage   56.47%   56.58%   +0.10%     
==========================================
  Files         535      535              
  Lines       62263    62677     +414     
  Branches     7711     7763      +52     
==========================================
+ Hits        35165    35464     +299     
- Misses      25335    25432      +97     
- Partials     1763     1781      +18

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Rewritten OpRelabelConsecutive with a Cache compatible interface. OpRelabelConsecutive5D caches the relabeled array and a mapping dict in sync. Motivation is to support properly label images in object classification where we have to relabel consecutively for vigra feature computation, but also export the original labels in the table.

the `OpRelabelConsecutive` op caches it's output.

in order to _properly_ support label images as inputs in object classification, we want to treat the original labeling more respectfully (don't mess with connectivity) and also add the original object id to Object Features in order to make it available in the table for export.

k-dominik · 2024-10-22T11:52:59Z

putting this back to draft - I must have messed something up when rebasing, the new workflow currently doesn't work 🙄

btbest

Obviously a great feature to have and I can see the pain that the existing interfaces caused expressed in code. I think it's squeezed into a reasonable change set though

Review could have been quicker if the commits documented copy-pasting (bonus points for separating copying commit from subsequent modifications); as far as I could tell the SerialRelabeledDataSlot is a modified SerialBlockSlot and OpRelabelConsecutive5D is obviously mostly a modified copy-paste of its base class OpUnblockedArrayCache

There's a leftover BinaryImage in manualTrackingWorkflow#114 that completely breaks the workflow, the other comments are all just suggestions

ilastik/workflows/edgeTrainingWithMulticut/edgeTrainingWithMulticutWorkflow.py

lazyflow/operators/opRelabelConsecutive.py

btbest · 2024-10-22T13:20:23Z

lazyflow/operators/opRelabelConsecutive.py

+    SerializationOutput = OutputSlot()
+
+    # total dummy slot, there because OpObjectExtraction expects this interface
+    BypassModeEnabled = InputSlot(value=False)


I don't think this is actually necessary - if only there was a test we could run to check... 😁

(OpRelabelConsecutive.BypassModeEnabled is never used, and the workflow ran fine manually after deleting it)

I think at least with this particular operator we should also cache in headless mode. So will remove bypassmodeenabled if possible without adding a lot of special checks downstream.

The semantic dependency tracking here was too hard to understand in what sense "OpObjectExtraction expects this interface"; now I get it.

I think in this context I would keep it on OpRelabelConsecutive to document the dependency (and the fact that OpRelabelConsecutive is actually an "OpLabelVolume", even if they share no base class or naming), but remove it from OpRelabelConsecutive5D so that at least that op can focus on its job

lazyflow/operators/opRelabelConsecutive.py

btbest · 2024-10-22T13:31:17Z

lazyflow/operators/opRelabelConsecutive.py

+        pass
+
+
+class OpRelabelConsecutive5D(OpUnblockedArrayCache):


I can't really see the value of nesting the cached op here, seems to me one could just put the dtype functionality in the caching op and remove the wrapper? Then the caching op also wouldn't have to double check the dtype on its own side again...

Not sure I understand what you mean by nesting here. You mean Having OpRelabelConsecutive5D as an object being used in OpRelabelConsecutive?

The latter operates in terms of input data dimensions, and OpRelabelConsecutive5D implementation is simplified by assuming 5D...

Now I see OpRelabelConsecutive mirrors the internal architecture of OpLabelVolume pretty much 1:1, though it does read a lot cleaner here :)

This implicit relationship has tripped me up a few times during the review... I feel it could be helpful to have a noop OpObjectsABC or sth purely to document the connection, and/or rename OpRelabelConsecutive into OpRelabelVolume. I'm immediately also jumping to "well and extract the reorder-dtype-convert-MakeLabels-reorderToInput" cascade into the base class"... Or maybe OpRelabelConsecutive5D should really be a subclass of OpLabelingABC. But I remember you've mentioned going back and forth between implementations and discarding others, so I'm happy to accept what we have here was the cleanest way to solve this.

it was good you left the comment as the common interface between OpRelabelConsecutive and OpLabelVolume for object classification is pretty minimal as it turns out:

class OpLabelBase(Operator): """Minimal interface for Operators that takes an input volume and produces a consecutive labeling as an output volume""" Input = InputSlot() """Volume to (re)label, up to 5D""" CachedOutput = OutputSlot() """Cached label image - internally always whole time slices are stored Axistags and shape are the same as on the Input, dtype is an integer datatype. This slot extends the ROI to the full xyz volume (c and t are unaffected) and computes the labeling for the whole volume. As long as the input does not get dirty, subsequent requests to this slot guarantee consistent labelings. """ CleanBlocks = OutputSlot() """Expose for Cache serialization, slicings for clean blocks stored in the cache"""

will probably do the job (still shifting things around a bit). It's a pity that not more of the interface can be changed, but well...

ilastik/applets/base/appletSerializer/slotSerializer.py

tests/test_lazyflow/test_operators/testOpRelabelConsecutive.py

btbest · 2024-10-22T13:38:47Z

tests/test_lazyflow/test_operators/testOpRelabelConsecutive.py

+
+
+@patch("vigra.analysis.relabelConsecutive", wraps=vigra.analysis.relabelConsecutive)
+def test_time_slice_blocking_dict(relabel_mock: MagicMock, oprelabel, labels_t):


I'd move this up to line 174 (just so that the time_slice_blocking tests are together)

btbest · 2024-10-22T13:39:35Z

tests/test_lazyflow/test_operators/testOpRelabelConsecutive.py

-    def test_startlabel(self):
-        op = OpRelabelConsecutive(graph=Graph())
-        op.StartLabel.setValue(10)
+    freed = cache_op.freeBlock(((5, 5), (6, 6)))


What does this test ensure? I couldn't find a roi here that failed the test

it just ensures that there will be noerrors clearing cache keys. It's a feature all the caches implement and this test documents that it is working as expected.

btbest · 2024-10-22T13:39:46Z

tests/test_lazyflow/test_operators/testOpRelabelConsecutive.py

+def test_simple(oprelabel: OpRelabelConsecutive, labels: vigra.VigraArray):
+    oprelabel.Input.setValue(labels)
+    expected = labels // 2
+    relabeled: npt.ArrayLike = oprelabel.CachedOutput[:].wait()


Suggested change

relabeled: npt.ArrayLike = oprelabel.CachedOutput[:].wait()

relabeled: npt.NDArray = oprelabel.CachedOutput[:].wait()

don't have the connection to check now but I remember ArrayLike being deceptive; sounds like "something that looks like an array" but specifically means "something that implements __array__"

yeah, total oversight :)

* removed unused / cleaned-up imports * removed `_standarize_roi` from `OpRelabelConsecutive5D` - implementation identical to base class `OpUnblockedArrayCache` * Fix Manual tracking: BinaryImage -> SegmentationImage * removed a redundant return Co-authored-by: Benedikt Best <[email protected]>

Clash on Win - pyshtools installs openmp, which clobbers dlls needed by pytorch's intel-openmp since pytorch 2.4 See bfab86d (similar problem)

k-dominik added 2 commits October 17, 2024 15:59

oc: refactor BinaryAnything to SegmentationAnything

dc2c4c4

throughout the code we were referring to the segmentation input as binary, although both, label images and binary segmentations are supported. Also connected Binary Layer Display to Segmentationlayer display with an optimized colormap

k-dominik changed the title ~~Add oc for label2~~ Add object classification for label images Oct 18, 2024

k-dominik mentioned this pull request Oct 18, 2024

Add Object Classification from label image workflow #2854

Closed

8 tasks

k-dominik force-pushed the add-oc-for-label2 branch 2 times, most recently from 91f3f91 to 5f00632 Compare October 18, 2024 16:50

k-dominik force-pushed the add-oc-for-label2 branch 2 times, most recently from 9fbbdf3 to 5985f92 Compare October 19, 2024 08:46

k-dominik added 3 commits October 19, 2024 12:13

MC: no need for extra relabel cache with reworked OpRelabelConsecutive

77d8936

the `OpRelabelConsecutive` op caches it's output.

Add serializer for OpRelabelConsecutive

210d92c

k-dominik force-pushed the add-oc-for-label2 branch from c644ee6 to c05bba3 Compare October 19, 2024 10:13

k-dominik added 5 commits October 21, 2024 06:16

Blockwise-object classification: adapt test to removed binary image

5edf5d0

Cleanup/aftermath of removal of binary image from OpObjectExtraction

35393d8

Added Object Classification from Labels Workflow

1c612e7

restrict pytorch version !=2.5.0 - fails on WIN CI

ac9b4d5

k-dominik force-pushed the add-oc-for-label2 branch from e9a3883 to ac9b4d5 Compare October 21, 2024 04:17

k-dominik marked this pull request as ready for review October 21, 2024 04:45

k-dominik requested a review from btbest October 21, 2024 04:45

k-dominik marked this pull request as draft October 22, 2024 11:53

btbest requested changes Oct 22, 2024

View reviewed changes

k-dominik and others added 2 commits November 29, 2024 15:26

fixup! OpRelabelConsecutive rewrite with cache

cfc94cf

k-dominik force-pushed the add-oc-for-label2 branch from c153d2a to d735102 Compare November 29, 2024 18:19

fixup! Review suggestions

4f94d5f

k-dominik force-pushed the add-oc-for-label2 branch from ee163e9 to 3f4f69e Compare November 29, 2024 18:42

Constrain pytorch to <2.4 (ilastik#2927)

3b8cdc8

Clash on Win - pyshtools installs openmp, which clobbers dlls needed by pytorch's intel-openmp since pytorch 2.4 See bfab86d (similar problem)

k-dominik force-pushed the add-oc-for-label2 branch from 3f4f69e to 3b8cdc8 Compare December 9, 2024 15:35

k-dominik mentioned this pull request Jun 20, 2025

Add Object Classification from Label images worfklow (the sequel) #3032

Merged

2 tasks

k-dominik closed this in #3032 Jul 10, 2025



		@patch("vigra.analysis.relabelConsecutive", wraps=vigra.analysis.relabelConsecutive)
		def test_time_slice_blocking_dict(relabel_mock: MagicMock, oprelabel, labels_t):

	relabeled: npt.ArrayLike = oprelabel.CachedOutput[:].wait()
	relabeled: npt.NDArray = oprelabel.CachedOutput[:].wait()

Add object classification for label images #2921

Add object classification for label images #2921

Uh oh!

Conversation

k-dominik commented Oct 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Todos (before merge, didn't do it yet for the sake of the review):

Uh oh!

codecov bot commented Oct 18, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

k-dominik commented Oct 22, 2024

Uh oh!

btbest left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

btbest Oct 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

k-dominik Dec 9, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

btbest Oct 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

k-dominik commented Oct 18, 2024 •

edited

Loading

codecov bot commented Oct 18, 2024 •

edited

Loading

btbest left a comment •

edited

Loading

btbest Oct 24, 2024 •

edited

Loading

k-dominik Dec 9, 2024 •

edited

Loading

btbest Oct 22, 2024 •

edited

Loading