
Conversation

@tom-jerr
Contributor

Motivation

Supporting Wan Animate (https://huggingface.co/Wan-AI/Wan2.2-Animate-14B-Diffusers)

See issue #13867

Based on PR #15419

Modifications

This PR introduces a customized orchestration pipeline for the Wan Animate model. By implementing a Segment Loop mechanism, it overcomes VRAM limitations for long video generation. Additionally, it integrates a Data Preprocessing stage that supports direct extraction of pose and face features from raw videos, significantly lowering the barrier to entry for end-users.

Pipeline Design

WanAnimatePipeline:

Preprocessing -> Validation -> Text/Image Encoding -> VAE Encoding -> Video Processing -> Condition Construction -> Segment Loop
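
As a rough illustration of this flow, the stages could be composed linearly as in the sketch below. The Stage base class, the concrete stage class names, and the execute() interface are assumptions for illustration only, not the actual SGLang Diffusion API.

# A minimal sketch of the WanAnimatePipeline stage sequence above.
# The Stage base class and the execute() contract are illustrative
# assumptions, not the actual SGLang implementation.
class Stage:
    def execute(self, batch):
        # Each stage consumes the batch, augments it, and hands it on.
        return batch

class PreprocessingStage(Stage): pass          # optional pose/face extraction
class ValidationStage(Stage): pass             # sanity-check inputs
class TextImageEncodingStage(Stage): pass      # text + reference-image encoders
class VAEEncodingStage(Stage): pass            # encode frames into latents
class VideoProcessingStage(Stage): pass        # resize/normalize condition videos
class ConditionConstructionStage(Stage): pass  # assemble pose/face conditions
class SegmentLoopStage(Stage): pass            # per-segment denoise/decode loop

class WanAnimatePipeline:
    def __init__(self):
        self.stages = [
            PreprocessingStage(), ValidationStage(), TextImageEncodingStage(),
            VAEEncodingStage(), VideoProcessingStage(),
            ConditionConstructionStage(), SegmentLoopStage(),
        ]

    def execute(self, batch):
        for stage in self.stages:
            batch = stage.execute(batch)
        return batch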

SegmentLoopStage

  • The SegmentLoopStage manages the cur_segment index and orchestrates a sub-pipeline of five stages: Conditioning → Timestep → Latent → Denoising → Decoding (a minimal sketch follows after this list).
  • The WanAnimateConditioningStage dynamically slices the pose/face conditions for the current segment.
  • The DecodingStage triggers postprocess_decoded_frames for incremental frame stitching, then either returns a Req object to drive the next iteration or an OutputBatch to terminate.
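A rough sketch of that control flow: Req and OutputBatch are the objects named above, while the field names and loop structure here are illustrative assumptions, not the actual implementation in this PR.

# Rough sketch of the SegmentLoopStage control flow described above.
# Req and OutputBatch are the objects mentioned in this PR; the field
# names and loop structure are illustrative assumptions.
class Req: pass          # carries state into the next segment iteration
class OutputBatch: pass  # final stitched frames; terminates the loop

class SegmentLoopStage:
    def __init__(self, sub_stages):
        # Sub-pipeline: Conditioning -> Timestep -> Latent -> Denoising -> Decoding
        self.sub_stages = sub_stages

    def execute(self, batch):
        batch.cur_segment = 0
        while True:
            result = batch
            for stage in self.sub_stages:
                # WanAnimateConditioningStage slices pose/face conditions
                # for batch.cur_segment; DecodingStage stitches frames via
                # postprocess_decoded_frames and picks the return type.
                result = stage.execute(result)
            if isinstance(result, OutputBatch):
                return result        # all segments decoded and stitched
            batch = result           # a Req drives the next iteration
            batch.cur_segment += 1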

Data Preprocessing Stage

We provide a Data Preprocessing Stage. If users provide the ONNX model path of the preprocessing model, the pipeline can process the original video passed in by the user directly, instead of requiring preprocessed pose and face videos as input.
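
For reference, per-frame condition extraction with an ONNX model might look roughly like the sketch below; the model's input/output layout (one image in, pose and face maps out) is an assumption for illustration, not the actual preprocessor in this PR.

# Rough sketch of ONNX-based pose/face extraction from a raw video.
# The model's I/O layout is an illustrative assumption.
import cv2
import numpy as np
import onnxruntime as ort

def extract_conditions(video_path, preprocess_model_path):
    session = ort.InferenceSession(preprocess_model_path)
    input_name = session.get_inputs()[0].name
    cap = cv2.VideoCapture(video_path)
    pose_frames, face_frames = [], []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # HWC uint8 BGR -> NCHW float32 RGB in [0, 1]
        x = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
        x = np.transpose(x, (2, 0, 1))[None]
        pose, face = session.run(None, {input_name: x})  # assumed two outputs
        pose_frames.append(pose)
        face_frames.append(face)
    cap.release()
    return pose_frames, face_frames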

With preprocess_model_path:

sglang generate --model-path Wan-AI/Wan2.2-Animate-14B-Diffusers \
 --prompt='People in the video are doing actions.' --image-path [ref image path] \
 --video-path [video path] --preprocess-model-path [preprocess model path] \
 --width 1280 --height 720 --save-output

Without preprocess_model_path:

sglang generate --model-path Wan-AI/Wan2.2-Animate-14B-Diffusers \
 --prompt='People in the video are doing actions.' --image-path [ref image path] \
 --pose-video-path [processed pose video path] --face-video-path [processed face video path] \
 --width 1280 --height 720 --save-output

TODO

  • replace mode for Wan Animate
  • retargeting, and using flux for data preprocessing

Accuracy Tests

Replicate.com wan animate animation:

replicate-prediction-h3jf38znwnrma0csgty9ejwmbg.mp4

Our results:

People_in_the_video_are_doing_actions._20251229-141238_91e44799.mp4



github-actions bot added the npu and diffusion (SGLang Diffusion) labels on Dec 30, 2025
@tom-jerr
Contributor Author

I have a question: should we do data preprocessing inside the inference pipeline?


# 2. send requests to scheduler, one at a time
# TODO: send batch when supported
# import debugpy
Collaborator
remove
