-
Notifications
You must be signed in to change notification settings - Fork 168
Add object classification for label images #2921
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
label image _has_ to be cached/accessed always, binary image not really. Binary display can be achieved with a two color colormap. As a consequence this should be friendlier for caching. Also, display does not rely anymore on a binary image being present, which could not be the case for label-type inputs.
throughout the code we were referring to the segmentation input as binary, although both, label images and binary segmentations are supported. Also connected Binary Layer Display to Segmentationlayer display with an optimized colormap
91f3f91 to
5f00632
Compare
9fbbdf3 to
5985f92
Compare
Rewritten OpRelabelConsecutive with a Cache compatible interface. OpRelabelConsecutive5D caches the relabeled array and a mapping dict in sync. Motivation is to support properly label images in object classification where we have to relabel consecutively for vigra feature computation, but also export the original labels in the table.
the `OpRelabelConsecutive` op caches it's output.
c644ee6 to
c05bba3
Compare
in order to _properly_ support label images as inputs in object classification, we want to treat the original labeling more respectfully (don't mess with connectivity) and also add the original object id to Object Features in order to make it available in the table for export.
e9a3883 to
ac9b4d5
Compare
|
putting this back to draft - I must have messed something up when rebasing, the new workflow currently doesn't work 🙄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Obviously a great feature to have and I can see the pain that the existing interfaces caused expressed in code. I think it's squeezed into a reasonable change set though
Review could have been quicker if the commits documented copy-pasting (bonus points for separating copying commit from subsequent modifications); as far as I could tell the SerialRelabeledDataSlot is a modified SerialBlockSlot and OpRelabelConsecutive5D is obviously mostly a modified copy-paste of its base class OpUnblockedArrayCache
There's a leftover BinaryImage in manualTrackingWorkflow#114 that completely breaks the workflow, the other comments are all just suggestions
ilastik/workflows/edgeTrainingWithMulticut/edgeTrainingWithMulticutWorkflow.py
Show resolved
Hide resolved
| SerializationOutput = OutputSlot() | ||
|
|
||
| # total dummy slot, there because OpObjectExtraction expects this interface | ||
| BypassModeEnabled = InputSlot(value=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is actually necessary - if only there was a test we could run to check... 😁
(OpRelabelConsecutive.BypassModeEnabled is never used, and the workflow ran fine manually after deleting it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think at least with this particular operator we should also cache in headless mode. So will remove bypassmodeenabled if possible without adding a lot of special checks downstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantic dependency tracking here was too hard to understand in what sense "OpObjectExtraction expects this interface"; now I get it.
I think in this context I would keep it on OpRelabelConsecutive to document the dependency (and the fact that OpRelabelConsecutive is actually an "OpLabelVolume", even if they share no base class or naming), but remove it from OpRelabelConsecutive5D so that at least that op can focus on its job
| pass | ||
|
|
||
|
|
||
| class OpRelabelConsecutive5D(OpUnblockedArrayCache): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't really see the value of nesting the cached op here, seems to me one could just put the dtype functionality in the caching op and remove the wrapper? Then the caching op also wouldn't have to double check the dtype on its own side again...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I understand what you mean by nesting here. You mean Having OpRelabelConsecutive5D as an object being used in OpRelabelConsecutive?
The latter operates in terms of input data dimensions, and OpRelabelConsecutive5D implementation is simplified by assuming 5D...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I see OpRelabelConsecutive mirrors the internal architecture of OpLabelVolume pretty much 1:1, though it does read a lot cleaner here :)
This implicit relationship has tripped me up a few times during the review... I feel it could be helpful to have a noop OpObjectsABC or sth purely to document the connection, and/or rename OpRelabelConsecutive into OpRelabelVolume. I'm immediately also jumping to "well and extract the reorder-dtype-convert-MakeLabels-reorderToInput" cascade into the base class"... Or maybe OpRelabelConsecutive5D should really be a subclass of OpLabelingABC. But I remember you've mentioned going back and forth between implementations and discarding others, so I'm happy to accept what we have here was the cleanest way to solve this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was good you left the comment as the common interface between OpRelabelConsecutive and OpLabelVolume for object classification is pretty minimal as it turns out:
class OpLabelBase(Operator):
"""Minimal interface for Operators that takes an input volume and produces a
consecutive labeling as an output volume"""
Input = InputSlot()
"""Volume to (re)label, up to 5D"""
CachedOutput = OutputSlot()
"""Cached label image - internally always whole time slices are stored
Axistags and shape are the same as on the Input, dtype is an integer
datatype.
This slot extends the ROI to the full xyz volume (c and t are unaffected)
and computes the labeling for the whole volume. As long as the input does
not get dirty, subsequent requests to this slot guarantee consistent
labelings.
"""
CleanBlocks = OutputSlot()
"""Expose for Cache serialization, slicings for clean blocks stored in the cache"""will probably do the job (still shifting things around a bit). It's a pity that not more of the interface can be changed, but well...
|
|
||
|
|
||
| @patch("vigra.analysis.relabelConsecutive", wraps=vigra.analysis.relabelConsecutive) | ||
| def test_time_slice_blocking_dict(relabel_mock: MagicMock, oprelabel, labels_t): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this up to line 174 (just so that the time_slice_blocking tests are together)
| def test_startlabel(self): | ||
| op = OpRelabelConsecutive(graph=Graph()) | ||
| op.StartLabel.setValue(10) | ||
| freed = cache_op.freeBlock(((5, 5), (6, 6))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this test ensure? I couldn't find a roi here that failed the test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it just ensures that there will be noerrors clearing cache keys. It's a feature all the caches implement and this test documents that it is working as expected.
| def test_simple(oprelabel: OpRelabelConsecutive, labels: vigra.VigraArray): | ||
| oprelabel.Input.setValue(labels) | ||
| expected = labels // 2 | ||
| relabeled: npt.ArrayLike = oprelabel.CachedOutput[:].wait() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| relabeled: npt.ArrayLike = oprelabel.CachedOutput[:].wait() | |
| relabeled: npt.NDArray = oprelabel.CachedOutput[:].wait() |
don't have the connection to check now but I remember ArrayLike being deceptive; sounds like "something that looks like an array" but specifically means "something that implements __array__"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, total oversight :)
* removed unused / cleaned-up imports * removed `_standarize_roi` from `OpRelabelConsecutive5D` - implementation identical to base class `OpUnblockedArrayCache` * Fix Manual tracking: BinaryImage -> SegmentationImage * removed a redundant return Co-authored-by: Benedikt Best <[email protected]>
c153d2a to
d735102
Compare
ee163e9 to
3f4f69e
Compare
Clash on Win - pyshtools installs openmp, which clobbers dlls needed by pytorch's intel-openmp since pytorch 2.4 See bfab86d (similar problem)
3f4f69e to
3b8cdc8
Compare
Inputs such as label images from cellpose or stardist are treated a bit poorly in ilastik - connected components is run again (with possibly different connectivity) resulting in different objects in ilastik after loading such label image data. This is super inconvenient for users: Any prior analysis they did will be hard to merge (different object ids, possibly different objects).
This PR introduces a new workflow, "Object Classification [Raw Data, Label Image]" that internally doesn't do CCA anymore but merely relabels the label image to ensure consecutive labels (required by feature extraction in vigra).
see discussion in #2416
fixes #2416
Summary:
OpRelabelConsecutiveinto an operator that caches two results: the relabeled image and a dictionary that holds the mapping - both necessary in the workflow (relabeld image as input for feature computation, the mapping to export the original labelids to the table).There are some slightly unrelated changes (first two commits):
Changes I didn't include here because changeset is already pretty big:
Todos (before merge, didn't do it yet for the sake of the review):