-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use label id as segment numbers #485
Use label id as segment numbers #485
Conversation
Make sure input data has been initialized by user before public methods are called. Made logging message more consistent. Made code more readable.
When converting from ITK to DICOM, allow to use Label IDs as Segment Numbers so that even if reading from various NRRD inputs, the Segments in the DICOM file will still use the Label IDs from the NRRD files. This is particularly helpful for testing since it allows a roundtrip test from DICOM -> multiple NRRD files using potentially multiple segments each -> back to DICOM. If no mapping to the Label IDs takes place, the converter will use ascending Segment Numbers, so the order will depend on the read order, which again usually depends on the order in the JSON meta information. This sorting behavior is disabled by default. It is available as cli option (--sortByLabelID) which is handed to the library through the Itk2DicomConverter::itkimage2dcmSegmentation() call.
Added missing test dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent quite some time with this PR, and I have a lot of questions. I am not sure it will work.
Also, note that MultiFrameDimensionModule
utilizes ReferencedSegmentNumber
in the per-frame FGs, see: https://github.com/QIICR/dcmqi/blob/master/libsrc/Itk2DicomConverter.cpp#L58-L62.
I would think that all of the frames would need to be reordered in the general case if the segments are renumbered.
Let me know if it is easier to have a screen share to go over this!
@@ -64,6 +64,15 @@ | |||
<description>Skip empty slices while encoding segmentation image. By default, empty slices will not be encoded, resulting in a smaller output file size.</description> | |||
</boolean> | |||
|
|||
<boolean> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michaelonken do we really need this flag? I do not see any downsides in always sorting label IDs and always reusing those if possible in SEG. Since you already implemented this option, why not keep it but set the default to true
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used this approach for two reasons:
- Don't break backward compatibility
- Make sure that conversion succeeds also in cases where label IDs cannot be sorted (since then the call will fail). I assume the user does often not know beforehand, whether the labelmaps will allow sorting. So the idea is that you only turn on this functionality if the user knows the prerequisites for this are fulfilled.
If 1. does not matter, one could also enable it by default and fall back to the old behavior if re-sorting does not work. I am not sure whether this can be implemented well, since the "rollback" after half of the re-resorting is probably difficult. And finding this out before actually starting the sorting work costs extra time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree! I made this comment in the beginning of my review, and if I revisited it after the review was done, I would most definitely revert it!
libsrc/Itk2DicomConverter.cpp
Outdated
@@ -367,16 +372,16 @@ namespace dcmqi { | |||
CHECK_COND(fgder->addDerivationImageItem(CodeSequenceMacro(code_seg.CodeValue,code_seg.CodingSchemeDesignator, | |||
code_seg.CodeMeaning),"",derimgItem)); | |||
|
|||
//cout << "Total of " << siVector.size() << " source image items will be added" << endl; | |||
// cout << "Total of " << siVector.size() << " source image items will be added" << endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you either remove this and the following commented out line, or swap for a logging command? I know this is most likely my code, not yours ....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sure. There are several places IIRC where I was not sure what to do with old "debug-like" commands. If you say I can do what I find useful, I will over all log messages and try to bring them into good shape.
@@ -285,6 +291,7 @@ namespace dcmqi { | |||
|
|||
Uint16 segmentNumber; | |||
CHECK_COND(segdoc->addSegment(segment, segmentNumber /* returns logical segment number */)); | |||
segNum2Label.insert(make_pair(segmentNumber, label)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, different values of segmentNumber
can map to the same value of label
, which may not be unique across input segmentation files. Is this ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, looking further, this will definitely be a problem, since this map is later used for renumbering segments!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe an easy thing to do is to check if label value is already present in the map prior to inserting it, and if it is indeed present, reset sortByLabelID
to false
, since sorting becomes impossible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assumed that labels are unique in the input. If this assumption does not hold, then indeed the sorting does not make sense (since the idea to use the original labels as Segment Numbers cannot work then if two labels use the same number)
libsrc/Itk2DicomConverter.cpp
Outdated
|
||
if (sortByLabelID) | ||
{ | ||
sortByLabel(segdocDataset.get(), segNum2Label); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not see how this could possibly work. I do not see any checks for non-consecutive segment numbering, so if there are gaps in label numbers, the result will violate the conventions for having consecutive values for SegmentNumber
. Did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right, this is not addressed. Probably it can easily by done by either checking the labels in in segNum2Label, or iterating over the segments in Segment Sequence once mapping has been performed.
Thank you for the thorough review @fedorov!
I will check soon and provide feedback.
…--
Sent from mobile
14.12.2023 22:35:48 Andrey Fedorov ***@***.***>:
***@***.**** requested changes on this pull request.
I spent quite some time with this PR, and I have a lot of questions. I am not sure it will work.
Also, note that *MultiFrameDimensionModule* utilizes *ReferencedSegmentNumber* in the per-frame FGs, see: https://github.com/QIICR/dcmqi/blob/master/libsrc/Itk2DicomConverter.cpp#L58-L62.
I would think that all of the frames would need to be reordered in the general case if the segments are renumbered.
Let me know if it is easier to have a screen share to go over this!
----------------------------------------
In apps/seg/itkimage2segimage.xml[#485 (comment)]:
> @@ -64,6 +64,15 @@
<description>Skip empty slices while encoding segmentation image. By default, empty slices will not be encoded, resulting in a smaller output file size.</description>
</boolean>
+ <boolean>
@michaelonken[https://github.com/michaelonken] do we really need this flag? I do not see any downsides in always sorting label IDs and always reusing those if possible in SEG. Since you already implemented this option, why not keep it but set the default to *true*?
----------------------------------------
In libsrc/Itk2DicomConverter.cpp[#485 (comment)]:
> @@ -367,16 +372,16 @@ namespace dcmqi {
CHECK_COND(fgder->addDerivationImageItem(CodeSequenceMacro(code_seg.CodeValue,code_seg.CodingSchemeDesignator,
code_seg.CodeMeaning),"",derimgItem));
- //cout << "Total of " << siVector.size() << " source image items will be added" << endl;
+ // cout << "Total of " << siVector.size() << " source image items will be added" << endl;
Can you either remove this and the following commented out line, or swap for a logging command? I know this is most likely my code, not yours ....
----------------------------------------
In libsrc/Itk2DicomConverter.cpp[#485 (comment)]:
> @@ -285,6 +291,7 @@ namespace dcmqi {
Uint16 segmentNumber;
CHECK_COND(segdoc->addSegment(segment, segmentNumber /* returns logical segment number */));
+ segNum2Label.insert(make_pair(segmentNumber, label));
Here, different values of *segmentNumber* can map to the same value of *label*, which may not be unique across input segmentation files. Is this ok?
----------------------------------------
In libsrc/Itk2DicomConverter.cpp[#485 (comment)]:
> @@ -483,10 +494,117 @@ namespace dcmqi {
segmentsOverlap = "NO";
else
segmentsOverlap = "UNDEFINED";
- CHECK_COND(segdocDataset.putAndInsertString(DCM_SegmentsOverlap, segmentsOverlap.c_str()));
+ CHECK_COND(segdocDataset->putAndInsertString(DCM_SegmentsOverlap, segmentsOverlap.c_str()));
+ }
+
+ if (sortByLabelID)
+ {
+ sortByLabel(segdocDataset.get(), segNum2Label);
I do not see how this could possibly work. I do not see any checks for non-consecutive segment numbering, so if there are gaps in label numbers, the result will violate the conventions for having consecutive values for *SegmentNumber*. Did I miss something?
----------------------------------------
In libsrc/Itk2DicomConverter.cpp[#485 (comment)]:
> @@ -285,6 +291,7 @@ namespace dcmqi {
Uint16 segmentNumber;
CHECK_COND(segdoc->addSegment(segment, segmentNumber /* returns logical segment number */));
+ segNum2Label.insert(make_pair(segmentNumber, label));
Yes, looking further, this will definitely be a problem, since this map is later used for renumbering segments!
—
Reply to this email directly, view it on GitHub[#485 (review)], or unsubscribe[https://github.com/notifications/unsubscribe-auth/ABL56T6F2P24KZ5DNYYRDPLYJNWLHAVCNFSM6AAAAABAMMVHZCVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTOOBQGE3TMMZSGY].
You are receiving this because you were mentioned.
[Verfolgungsbild][https://github.com/notifications/beacon/ABL56T7GUQIWXK3DIMLBYG3YJNWLHA5CNFSM6AAAAABAMMVHZCWGG33NNVSW45C7OR4XAZNRKB2WY3CSMVYXKZLTORJGK5TJMV32UY3PNVWWK3TUL5UWJTTKDNK4M.gif]
|
Good catch, i missed the dimension assignment. The After thinking about it, I assume it would be enough to also update the Comment 2023-12-18: I gave it some more thought and we could even keep it as is. The number of frames for display must not change at all if we re-number the segments (only). The display would not start any more with the old (unmapped) Segment 1 but with whatever Segment this has been mapped to, but this is not invalid or wrong. Of course one can adapt the order to start with new (mapped) 1 by applying the mapping to the Dimension Index Values as well in a second step. |
Aha! That's the key. Yes, I do not think we can assume anything about the label uniqueness. So what do we do with this PR? |
apps/seg/itkimage2segimage.xml
Outdated
@@ -64,6 +64,15 @@ | |||
<description>Skip empty slices while encoding segmentation image. By default, empty slices will not be encoded, resulting in a smaller output file size.</description> | |||
</boolean> | |||
|
|||
<boolean> | |||
<name>sortByLabelID</name> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about renaming this as useLabelIDasSegmentNumber
?
As discussed (offline), I am pretty sure I know how to fix this. But we will make the flag not the default behavior for now. |
Renamed feature from sortByLabel to useLabelIDAsSegmentNumber. Added check for unique, monotonically increasing label IDs before starting reassignment. Added missing update of ReferencedFrameNumber.
This pull request contains two parts:
New functionality:
Also, a new test group is added that uses some (more than before) sophisticated segment setup with overlapping segments. A first test produces a DICOM file from three NRRD inputs, which are merged together in one DICOM file. A second test splits them again into 3 NRRD files (where 2 NRRD files will contain more than one segment) and compares them to the original inputs from test 1. A third test compares metadata input used in test 1 to metadata output produced by test 2.
Various small enhancements in ITK to DICOM conversion context: