-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve validation frame distribution in tasks with honeypots #8776
Conversation
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe changes in this pull request focus on enhancing the management of validation frames within honeypot tasks in the CVAT application. Key updates include the introduction of a more uniform selection process for validation frames, improvements in the handling of honeypot frames, and refinements to the validation logic in various serializers and task management functions. Additionally, the test suite has been expanded and updated to ensure robust coverage of these changes, particularly concerning the interaction between validation frames and task annotations. Changes
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
cvat/apps/engine/serializers.py (1)
1356-1358
: Simplify the assignment ofvalidation_layout
The assignment of
validation_layout
can be simplified by removing unnecessary parentheses.Apply this diff to improve readability:
- validation_layout: Optional[models.ValidationLayout] = ( - getattr(instance.data, 'validation_layout', None) - ) + validation_layout: Optional[models.ValidationLayout] = getattr(instance.data, 'validation_layout', None)tests/python/rest_api/test_tasks.py (2)
2268-2276
: Refactor duplicated validation distribution test into a helper functionThe code for testing the uniform distribution of validation frames is duplicated. To improve maintainability and reduce redundancy, consider extracting this logic into a helper function.
Apply this diff to replace the duplicated code with a helper function call:
- assert max(validation_frame_counts.values()) <= 1 + min( - validation_frame_counts.values() - ) + assert_uniform_distribution(validation_frame_counts)Define the helper function outside the test methods:
def assert_uniform_distribution(counts): assert max(counts.values()) <= 1 + min(counts.values())
4480-4488
: Refactor duplicated validation distribution test using the helper functionThe code for testing the uniform distribution of validation frames is duplicated here. Use the helper function to improve code maintainability.
Apply this diff to replace the duplicated code:
- assert max(validation_frame_counts.values()) <= 1 + min( - validation_frame_counts.values() - ) + assert_uniform_distribution(validation_frame_counts)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
changelog.d/20241205_161129_mzhiltso_update_honeypot_selection.md
(1 hunks)cvat/apps/engine/serializers.py
(5 hunks)cvat/apps/engine/task.py
(2 hunks)tests/python/rest_api/test_tasks.py
(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- changelog.d/20241205_161129_mzhiltso_update_honeypot_selection.md
🔇 Additional comments (2)
cvat/apps/engine/task.py (1)
1282-1283
: Validation frames allocation logic is correct
The implementation for assigning validation frames to jobs using itertools.islice
and adding them to job_frames
ensures a uniform distribution of validation frames across jobs, as intended.
cvat/apps/engine/serializers.py (1)
1417-1418
: Verify that assigning -1
to db_image.real_frame
is acceptable
Setting db_image.real_frame
to -1
may cause issues if negative values are not handled appropriately elsewhere in the codebase. Please verify that this assignment is intended and that all dependent code can handle real_frame
being -1
.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #8776 +/- ##
===========================================
- Coverage 73.96% 73.89% -0.07%
===========================================
Files 409 409
Lines 43942 43930 -12
Branches 3985 3986 +1
===========================================
- Hits 32500 32464 -36
- Misses 11442 11466 +24
|
cvat/apps/engine/quality_control.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a name related to honeypots, since we already have a quality control app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't suppose this file to contain only honeypot-related logic. Ideally, I'd extract some common bits from regular gt job logic handling as well.
@@ -1354,6 +1360,8 @@ def update(self, instance: models.Task, validated_data: dict[str, Any]) -> model | |||
gt_job_meta_serializer.is_valid(raise_exception=True) | |||
gt_job_meta_serializer.save() | |||
|
|||
validation_layout.refresh_from_db() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify why we need to make one more request to the database if we can use the data already returned after the update?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I got the idea. gt_job_meta_serializer
can change task validation layout. How do you suggest to update the validation_layout
variable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e.g. this way should work:
gt_job = gt_job_meta_serializer.save()
instance = gt_job.segment.task
validation_layout = instance.data.validation_layout
I guess we need to follow the approach: serializer changes data and returns actual object -> use them without making requests to refresh data from a database.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, but why a GT job meta serializer should be expected to return some deeply nested object in the actual state? I'd understand if it was just the returned value or it's direct member field, but in this case it's 4 levels deep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but here it at least mentioned in the serializer name. Actually, the argument can probably be changed to be validation layout.
Co-authored-by: Maria Khrustaleva <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, check these 3 comments:
#8776 (comment)
#8776 (comment)
#8776 (comment)
Otherwise LGTM
…to zm/update-honeypot-selection
|
||
if not rng: | ||
# Use a known uniform distribution | ||
rng = np.random.Generator(np.random.MT19937()) |
Check notice
Code scanning / SonarCloud
Results that depend on random number generation should be reproducible Low
Quality Gate passedIssues Measures |
Motivation and context
How has this been tested?
Checklist
develop
branch(cvat-canvas,
cvat-core,
cvat-data and
cvat-ui)
License
Feel free to contact the maintainers if that's a concern.
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Tests