Task / Feature request: native in-memory batched inference API for Model Zoo
Summary
Add a first-class inference pipeline that accepts in-memory image/frame arrays (np.ndarray / optionally torch.Tensor) and runs batched prediction without requiring files on disk.
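To make the input side concrete, here is a minimal sketch of how such an API could canonicalize the accepted inputs; as_frame_batch is a hypothetical helper written for this issue, not existing DeepLabCut code, and it assumes all frames share the same HxW:

import numpy as np

def as_frame_batch(frames):
    # Accept a list of HxWx3 arrays or a single [N,H,W,3] array and
    # return a contiguous uint8 [N,H,W,3] batch.
    arr = np.asarray(frames)
    if arr.ndim == 3:  # single frame -> add a batch dimension
        arr = arr[np.newaxis]
    if arr.ndim != 4 or arr.shape[-1] != 3:
        raise ValueError(f"expected [N,H,W,3] RGB frames, got shape {arr.shape}")
    if arr.dtype != np.uint8:
        raise TypeError(f"expected uint8 pixel data, got {arr.dtype}")
    return np.ascontiguousarray(arr)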
Motivation
Current high-level Model Zoo flows are mainly file-path based (video/image paths, output files). This is awkward for services, pipelines, and notebooks where frames already exist in memory.
Problem today
No clear public API exists for batched in-memory inputs. Existing convenience APIs emphasize disk I/O and artifact writing, so for array-based predictions users end up stitching together lower-level runners or relying on the web-app inference class, which needs extra glue code, is undocumented, and can be error-prone.
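For reference, this is the kind of glue the current design forces. The sketch below round-trips in-memory frames through a temporary video file to reach the file-based API; it assumes deeplabcut.video_inference_superanimal is the current entry point (the exact name and signature may differ across DLC versions, so treat this as illustrative):

import os
import tempfile

import cv2
import deeplabcut

def predict_frames_via_tempfile(frames, fps=30):
    # Write in-memory RGB frames to a throwaway video just to reach
    # the file-path-based Model Zoo API.
    tmp = tempfile.NamedTemporaryFile(suffix=".mp4", delete=False)
    tmp.close()
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(
        tmp.name, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h)
    )
    for frame in frames:
        writer.write(cv2.cvtColor(frame, cv2.COLOR_RGB2BGR))
    writer.release()
    try:
        # File-based call; signature hedged, see the DLC docs.
        deeplabcut.video_inference_superanimal(
            [tmp.name],
            superanimal_name="superanimal_topviewmouse",
            model_name="hrnet_w32",
            detector_name="fasterrcnn_resnet50_fpn_v2",
        )
        # Predictions land on disk next to the video (h5/json),
        # so reading them back is yet more glue code.
    finally:
        os.remove(tmp.name)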
Requested API shape (example)
preds = deeplabcut.modelzoo.video_analysis(
    frames=frames,  # list/iterable of HxWx3 arrays or [N,H,W,3]
    superanimal_name="superanimal_topviewmouse",
    model_name="hrnet_w32",
    detector_name="fasterrcnn_resnet50_fpn_v2",
    batch_size=16,
    detector_batch_size=16,
    max_individuals=3,
    device="cuda:0",
)

Acceptance criteria
Supports top-down and bottom-up SuperAnimal models.
Fully in-memory path (no required writes to image/video/h5/json).
True batching for detector and pose stages.
Returns structured predictions per frame (one possible shape is sketched after this list).
Clear validation/errors for shape/dtype/device mismatches.
Documented example usage.
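One possible shape for the per-frame result, purely illustrative of the structured-predictions criterion above; the field names and layout are assumptions, not a proposed final schema:

from dataclasses import dataclass

import numpy as np

@dataclass
class FramePrediction:
    # One detected individual per row; all names are illustrative.
    bboxes: np.ndarray       # [num_individuals, 4] detector boxes (xyxy)
    keypoints: np.ndarray    # [num_individuals, num_bodyparts, 2] (x, y)
    confidences: np.ndarray  # [num_individuals, num_bodyparts] scores

# The API would return one entry per input frame:
# preds: list[FramePrediction], with len(preds) == len(frames)

Under this shape, preds[10].keypoints[0] would be the first animal's keypoints in frame 10, with no intermediate h5/json round trip.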
Impact
This would make the DLC Model Zoo much easier to use in production and integrated projects, reduce boilerplate, and improve reliability and performance for non-file-based workflows.