Torchvision API - ColorJitter and Grayscale operators #6272

Open
mdabek-nvidia wants to merge 10 commits into NVIDIA:main from mdabek-nvidia:torchvision_color
@mdabek-nvidia
Collaborator

Category:

New feature

Description:

Implementation of Torchvision OO API operators:

  • ColorJitter
  • Grayscale

Implementation of Torchvision functional operators:

  • rgb_to_grayscale
  • to_grayscale

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: Marek Dabek <[email protected]>
Signed-off-by: Marek Dabek <[email protected]>
@mdabek-nvidia
Collaborator Author

!build

@mdabek-nvidia
Collaborator Author

@greptileai - please review

@dali-automaton
Collaborator

CI MESSAGE: [46735031]: BUILD STARTED

@greptile-apps
Contributor

greptile-apps bot commented Mar 22, 2026

Greptile Summary

This PR adds ColorJitter and Grayscale operators (OO API) and to_grayscale / rgb_to_grayscale functional API entries to the experimental torchvision compatibility layer. As part of the change, layout-dimension extraction logic previously duplicated in resize.py and centercrop.py is consolidated into two new helpers in operator.py (get_HWC_from_layout_dynamic and get_HWC_from_layout_pipeline), and both existing operators are updated to use the new 4-tuple unpacking.
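The layout-dimension extraction that the summary describes can be illustrated with a minimal sketch. The helper name `hwc_from_layout` and its signature are hypothetical; the PR's `get_HWC_from_layout_dynamic` and `get_HWC_from_layout_pipeline` return a 4-tuple and operate on DALI data, which this plain-Python version does not attempt to reproduce.

```python
from typing import Sequence, Tuple


def hwc_from_layout(layout: str, shape: Sequence[int]) -> Tuple[int, int, int]:
    """Return (height, width, channels) for a shape described by a layout string.

    Hypothetical helper: it only mirrors the idea of looking up dimensions
    by their layout letter, not the actual code in operator.py.
    """
    if len(layout) != len(shape):
        raise ValueError(f"layout {layout!r} does not match shape {tuple(shape)}")
    if "H" not in layout or "W" not in layout:
        raise ValueError(f"layout {layout!r} has no H or W dimension")
    h = shape[layout.index("H")]
    w = shape[layout.index("W")]
    # A layout without a channel axis (e.g. "HW") is treated as single-channel.
    c = shape[layout.index("C")] if "C" in layout else 1
    return h, w, c
```

Looking dimensions up by letter rather than by fixed position is what lets one helper serve HWC, CHW, and HW inputs alike.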

Key points:

  • ColorJitter delegates to fn.color_twist and supports both CPU and GPU devices. Parameters are validated in __init__ via VerificationBCS / VerificationHue before the DALI pipeline is constructed.
  • Grayscale uses fn.color_space_conversion (RGB→GRAY), fn.cat (1→3 channels), or fn.hsv with saturation=0 (3→3 desaturate), covering all four input/output channel combinations.
  • Validation gap in VerificationHue: a negative scalar hue (e.g. hue=-0.1) silently passes validation but causes __init__ to produce an inverted range (0.1, -0.1). This in turn causes fn.random.uniform(range=(0.1, -0.1)) to fail at runtime. The fix is a simple non-negativity guard on the scalar branch, mirroring torchvision's own check.
  • Dead else branch in _create_param: returns a bare float instead of a list, which would break _get_BCSH if reached (it is currently unreachable).
  • The test suite is otherwise solid, covering all device/channel combinations and invalid-parameter cases — just missing the negative-scalar-hue case.
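The four input/output channel combinations handled by Grayscale can be sketched per pixel in plain Python. This is an illustration only: the function name `grayscale_pixel` is hypothetical, the BT.601 luma weights are an assumption (the coefficients DALI actually uses may differ), and the 3→3 case is modelled as replicated luma rather than a real `fn.hsv` call.

```python
from typing import Tuple

# Assumed ITU-R BT.601 luma weights for R, G, B.
_LUMA = (0.299, 0.587, 0.114)


def grayscale_pixel(pixel: Tuple[float, ...], num_output_channels: int) -> Tuple[float, ...]:
    """Convert one 1- or 3-channel pixel to 1 or 3 gray output channels."""
    if num_output_channels not in (1, 3):
        raise ValueError("num_output_channels must be 1 or 3")
    if len(pixel) == 3:
        # 3 input channels: weighted sum of R, G, B gives the gray value.
        luma = sum(w * v for w, v in zip(_LUMA, pixel))
    elif len(pixel) == 1:
        # 1 input channel: the value is already gray.
        luma = pixel[0]
    else:
        raise ValueError("expected a 1- or 3-channel pixel")
    # 1 output channel keeps a single value; 3 output channels replicate it.
    return (luma,) * num_output_channels
```

In every case the output is a gray value repeated `num_output_channels` times, so the combinations differ only in how that gray value is obtained.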

Confidence Score: 4/5

  • Safe to merge once the negative-scalar hue validation gap in VerificationHue is addressed — all other changes are clean refactors or well-tested new features.
  • The PR is well-structured and previous review concerns (integer inputs, HW layout IndexError) have been resolved. One targeted P1 fix remains: negative scalar hue bypasses validation and creates an inverted range at runtime. That path is easy to reproduce and easy to fix, making this a 4 rather than a 5.
  • dali/python/nvidia/dali/experimental/torchvision/v2/color.py — specifically VerificationHue.verify and the _create_param else-branch.

Important Files Changed

Filename Overview
| Filename | Overview |
| --- | --- |
| dali/python/nvidia/dali/experimental/torchvision/v2/color.py | New file implementing ColorJitter and Grayscale operators. Contains a P1 bug: negative scalar hue (e.g. hue=-0.1) passes VerificationHue but creates an inverted range (0.1, -0.1) in __init__, causing fn.random.uniform to fail at runtime. Also has a dead else-branch in _create_param that would return a scalar instead of a list. |
| dali/python/nvidia/dali/experimental/torchvision/v2/operator.py | Adds VerifyIfNonNegative, get_input_shape_dynamic, get_HWC_from_layout_dynamic, and get_HWC_from_layout_pipeline helpers, consolidating layout-parsing logic previously duplicated across resize.py and centercrop.py. The new get_HWC_from_layout_dynamic correctly handles "HW" layout returning c=1; the pipeline variant does not but that path is unreachable from adjust_input. |
| dali/python/nvidia/dali/experimental/torchvision/v2/functional/color.py | New functional API file exposing to_grayscale and rgb_to_grayscale. Both delegate to the same _grayscale helper, correctly decorated with @adjust_input. Logic mirrors the Grayscale operator and looks correct. |
| dali/test/python/torchvision/test_tv_color.py | New test file covering Grayscale and ColorJitter. Good coverage for PIL inputs across devices, but missing a test case for negative scalar hue, which is currently an undetected validation gap. GPU path for functional API (to_grayscale/rgb_to_grayscale) is not exercised in the parametrised tests. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["User input\n(PIL.Image / torch.Tensor)"] --> B["adjust_input decorator\ntransform_input()"]
    B --> C{"Input type"}
    C -->|"PIL.Image"| D["ndd.Tensor\nlayout=HWC"]
    C -->|"torch.Tensor 3D"| E["ndd.Tensor\nlayout=CHW"]
    C -->|"torch.Tensor >3D"| F["ndd.Batch\nlayout=CHW"]

    D & E & F --> G{"Operator"}

    G -->|"ColorJitter"| H["VerificationBCS + VerificationHue\n(in super().__init__)"]
    H --> I["_create_param(brightness/contrast/saturation)\n→ list[float, float]"]
    I --> J["hue scalar → (-hue, hue) tuple"]
    J --> K["_kernel: _get_BCSH\nsamples random factors"]
    K --> L["fn.color_twist()\nbright/contrast/sat/hue"]

    G -->|"Grayscale"| M["VerificationGSOutputChannels\n(in super().__init__)"]
    M --> N["preprocess_data:\nget_HWC_from_layout_pipeline\n→ (h, w, c, tensor)"]
    N --> O{"num_output_channels × c"}
    O -->|"1×3: RGB→Gray"| P["fn.color_space_conversion\nRGB→GRAY"]
    O -->|"1×1: no-op"| Q["pass"]
    O -->|"3×1: replicate"| R["fn.cat × 3 on C axis"]
    O -->|"3×3: desaturate"| S["fn.hsv saturation=0"]

    L & P & Q & R & S --> T["output ndd.Tensor/Batch"]
    T --> U["adjust_output\n→ PIL.Image or torch.Tensor"]
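The ColorJitter path in the flowchart (scalar parameter → range → random factor) can be sketched as follows. The range convention `[max(0, 1 - v), 1 + v]` is the one torchvision documents for brightness, contrast, and saturation; whether the PR's `_create_param` and `_get_BCSH` compute exactly this is an assumption, and the names `jitter_range` and `sample_factors` are hypothetical. The sketch samples with Python's `random` module rather than `fn.random.uniform`.

```python
import random
from typing import Dict, Optional, Tuple


def jitter_range(value: float) -> Tuple[float, float]:
    """Turn a scalar jitter strength into a (low, high) factor range around 1."""
    if value < 0:
        raise ValueError("jitter strength must be non-negative")
    return (max(0.0, 1.0 - value), 1.0 + value)


def sample_factors(
    brightness: float,
    contrast: float,
    saturation: float,
    hue: Tuple[float, float],
    rng: Optional[random.Random] = None,
) -> Dict[str, float]:
    """Draw one random factor per parameter; hue is an already validated (low, high) pair."""
    rng = rng or random.Random()
    ranges = {
        "brightness": jitter_range(brightness),
        "contrast": jitter_range(contrast),
        "saturation": jitter_range(saturation),
        "hue": hue,
    }
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
```

A strength of 0 collapses the range to (1.0, 1.0), so the corresponding factor leaves the image unchanged.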

Comments Outside Diff (1)

  1. dali/python/nvidia/dali/experimental/torchvision/v2/color.py, lines 79-85

    Negative scalar hue bypasses validation and creates an inverted range

    When a negative scalar is passed (e.g., hue=-0.1), VerificationHue.verify converts it to [-0.1, -0.1] and then checks hue[0] < -0.5 or hue[1] > 0.5 — both conditions are false, so validation passes silently.

    Meanwhile, in __init__ the scalar is converted via (-float(hue), float(hue)), producing (0.1, -0.1) — an inverted range where min > max. When _get_BCSH later calls fn.random.uniform(range=(0.1, -0.1)), the inverted bounds will cause a runtime error.

    Torchvision enforces 0 <= hue <= 0.5 for scalar inputs. The verification should mirror this:
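A minimal sketch of such a check, written as a standalone function. The name `verify_hue` is hypothetical and this is not the PR's `VerificationHue` code; it only illustrates the non-negativity guard on the scalar branch together with an ordered-range check for pairs.

```python
from typing import Sequence, Tuple, Union


def verify_hue(hue: Union[float, Sequence[float]]) -> Tuple[float, float]:
    """Validate hue and return it as an ordered (low, high) range."""
    if isinstance(hue, (int, float)):
        # Scalar branch: rejecting negatives keeps (-hue, hue) from inverting.
        if not 0 <= hue <= 0.5:
            raise ValueError(f"scalar hue must be in [0, 0.5], got {hue}")
        return (-float(hue), float(hue))
    lo, hi = (float(v) for v in hue)
    # Sequence branch: bounds must be ordered and stay within [-0.5, 0.5].
    if not -0.5 <= lo <= hi <= 0.5:
        raise ValueError(f"hue range must satisfy -0.5 <= min <= max <= 0.5, got {(lo, hi)}")
    return (lo, hi)
```

With this guard, `hue=-0.1` fails at construction time instead of surfacing later as an inverted range passed to the sampler.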

    The test suite in test_tv_color.py doesn't include hue=-0.1 among the invalid-param cases, so this path is currently untested.

Reviews (4): Last reviewed commit: "Review fixes"

@dali-automaton
Collaborator

CI MESSAGE: [46735031]: BUILD PASSED

@mdabek-nvidia mdabek-nvidia force-pushed the torchvision_color branch 2 times, most recently from ac68f8c to 994315b Compare March 23, 2026 08:40
@mdabek-nvidia
Collaborator Author

!build

@dali-automaton
Collaborator

CI MESSAGE: [46765427]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [46765427]: BUILD FAILED

mdabek-nvidia and others added 7 commits March 24, 2026 11:34
Signed-off-by: Marek Dabek <[email protected]>
Signed-off-by: Marek Dabek <[email protected]>
Co-authored-by: Kamil Tokarski <[email protected]>
Signed-off-by: Marek Dabek <[email protected]>
Signed-off-by: Marek Dabek <[email protected]>
Signed-off-by: Marek Dabek <[email protected]>
Signed-off-by: Marek Dabek <[email protected]>
@mdabek-nvidia
Collaborator Author

@greptileai please re-review

@mdabek-nvidia
Collaborator Author

!build

@dali-automaton
Collaborator

CI MESSAGE: [46875703]: BUILD STARTED

@mdabek-nvidia mdabek-nvidia marked this pull request as ready for review March 24, 2026 14:02
Signed-of-by: Marek Dabek <[email protected]>
@dali-automaton
Collaborator

CI MESSAGE: [46875703]: BUILD PASSED

Labels

None yet

Projects

None yet

4 participants