Task / Feature request: native in-memory batched inference API for Model Zoo #3218

@deruyter92

Description

Summary
Add a first-class inference pipeline that accepts in-memory image/frame arrays (np.ndarray / optionally torch.Tensor) and runs batched prediction without requiring files on disk.
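As a sketch of the intended input, frames already in memory could be stacked into the batched layout described above (shapes here are purely illustrative):

```python
import numpy as np

# Hypothetical usage: collect per-frame HxWx3 uint8 arrays (e.g. decoded
# video frames) and stack them into the [N, H, W, 3] batch the proposed
# API would accept -- no files on disk involved.
frames = [np.zeros((480, 640, 3), dtype=np.uint8) for _ in range(4)]
batch = np.stack(frames)  # shape (4, 480, 640, 3)
```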

Motivation
Current high-level Model Zoo flows are mainly file-path based (video/image paths, output files). This is awkward for services, pipelines, and notebooks where frames already exist in memory.

Problem today
There is no clear public API for batched in-memory inputs. Existing convenience APIs emphasize disk I/O and artifact writing. For array-based predictions, users end up stitching together lower-level runners or relying on the web-app inference class, which requires extra glue code, is undocumented, and can be error-prone.

Requested API shape (example)

```python
preds = deeplabcut.modelzoo.video_analysis(
    frames=frames,  # list/iterable of HxWx3 arrays, or a single [N, H, W, 3] array
    superanimal_name="superanimal_topviewmouse",
    model_name="hrnet_w32",
    detector_name="fasterrcnn_resnet50_fpn_v2",
    batch_size=16,
    detector_batch_size=16,
    max_individuals=3,
    device="cuda:0",
)
```
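To make the frame handling concrete, here is a minimal sketch of the input normalization such an API could perform; the `validate_frames` helper is hypothetical, not an existing DeepLabCut function:

```python
import numpy as np

def validate_frames(frames):
    """Hypothetical validation for the proposed `frames` argument:
    accept a list/iterable of HxWx3 arrays or a single [N, H, W, 3]
    array, and fail early with clear errors on mismatches."""
    if not isinstance(frames, np.ndarray):
        frames = np.stack(list(frames))  # list of HxWx3 -> [N, H, W, 3]
    if frames.ndim != 4 or frames.shape[-1] != 3:
        raise ValueError(f"expected frames of shape [N, H, W, 3], got {frames.shape}")
    if frames.dtype != np.uint8:
        raise TypeError(f"expected uint8 frames, got dtype {frames.dtype}")
    return np.ascontiguousarray(frames)
```

Normalizing to a single contiguous array up front keeps the detector and pose stages simple, since both can then slice fixed-size batches out of one buffer.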

Acceptance criteria
- Supports top-down and bottom-up SuperAnimal models.
- Fully in-memory path (no required writes to image/video/h5/json files).
- True batching for both the detector and pose stages.
- Returns structured predictions per frame.
- Clear validation errors for shape/dtype/device mismatches.
- Documented example usage.
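For illustration, the structured per-frame output could look like the following; the field names are hypothetical, loosely modeled on a top-down detector + pose pipeline, and not an existing return format:

```python
# Hypothetical return value: one dict per input frame, with one entry
# per detected individual in each field.
preds = [
    {
        "bboxes": [[12.0, 34.0, 120.0, 210.0]],   # [x1, y1, x2, y2] per individual
        "bbox_scores": [0.98],                     # detector confidence per individual
        "keypoints": [[[56.0, 78.0, 0.97]]],       # [x, y, confidence] per keypoint
    }
    for _ in range(2)  # two frames in this toy example
]
```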

Impact
This would make DLC Model Zoo much easier to use in production / integrated projects, reduce boilerplate, and improve reliability/performance for non-file-based workflows.
