Adds an experimental one-ahead prefetching utility#1525
Adds an experimental one-ahead prefetching utility#1525peterdsharpe wants to merge 3 commits intoNVIDIA:mainfrom
Conversation
…ssingDataset - Introduced a new `prefetch_map` function to optimize data loading by overlapping CPU-bound preparation with GPU-bound processing. - Updated `CachedPreprocessingDataset` to raise a ValueError if `sample_paths` is empty, ensuring robust error handling. - Added `prefetch_map` to the module's exports for accessibility.
Greptile SummaryThis PR adds a
Important Files Changed
Last reviewed commit: "Add prefetch_map uti..." |
| with ThreadPoolExecutor(max_workers=1) as pool: | ||
| it = iter(iterable) | ||
| try: | ||
| future = pool.submit(fn, next(it)) | ||
| except StopIteration: | ||
| return | ||
|
|
||
| for item in it: | ||
| next_future = pool.submit(fn, item) | ||
| yield future.result() | ||
| future = next_future | ||
|
|
||
| yield future.result() |
There was a problem hiding this comment.
Blocking cleanup on early exit is undocumented
When a consumer does not fully exhaust the iterator (e.g., break, or an exception in the consumer), Python calls generator.close(), which throws GeneratorExit at the suspended yield on line 60. This triggers ThreadPoolExecutor.__exit__, which calls shutdown(wait=True) — blocking until the in-flight next_future completes.
Since fn is documented as a potentially expensive CPU-bound operation (subsampling, geometry precomputation, host-to-device transfer), this can cause a silent and surprising stall at loop teardown time. For example:
for item in prefetch_map(huge_dataloader, expensive_fn): # fn takes ~5 s per item
if error_condition:
break # silently blocks for up to ~5 s during cleanupIt would be worth adding a note to the docstring warning about this behavior, e.g.:
Notes
-----
If the iterator is not fully consumed (e.g., due to an early ``break`` or
an exception in the caller), cleanup will block until the background
thread's in-flight task completes.
…cumentation - Updated the `prefetch_map` function to ensure the background thread is properly shut down by using a `finally` block for `pool.shutdown()`, allowing for non-blocking cancellation of in-flight tasks. - Added notes to the docstring to clarify behavior when the iterator is not fully consumed, improving documentation for better user understanding.
prefetch_mapfunction to optimize data loading by overlapping CPU-bound preparation with GPU-bound processing.CachedPreprocessingDatasetto raise a ValueError ifsample_pathsis empty, ensuring robust error handling.prefetch_mapto the module's exports for accessibility.PhysicsNeMo Pull Request
Description
Used downstream in GLOBE training on DrivAerML (PR to come; this is just setting the stage in nice chunks).
Checklist
Dependencies
Review Process
All PRs are reviewed by the PhysicsNeMo team before merging.
Depending on which files are changed, GitHub may automatically assign a maintainer for review.
We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.
AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.