Feat; Add support for Wan/Qwen TAEHV decoding #937
Sorry for the unrelated whitespace changes and the debug spam; I'll fix them later.

TAE decoding now works for the outputs of Wan2.1 models (and Wan2.2 A14B) in txt2img mode. Video decoding runs as well, but the results are obviously incorrect (flashing lights warning). If anyone can see what I'm doing wrong when decoding videos, let me know.

Co-authored-by: Ollin Boer Bohan <[email protected]>

Video is still completely broken, but image decoding works very well now.



https://github.com/madebyollin/taehv
Model weights: https://github.com/madebyollin/taehv/blob/main/taew2_1.pth
Only tested "successfully" for decoding Qwen-Image outputs; supporting video models still needs some work. Encoding seems to work too, at least in img2img mode.
.\bin\Release\sd.exe --diffusion-model ..\..\ComfyUI\models\diffusion_models\qwen-image-Q8_0.gguf --vae ..\..\ComfyUI\models\vae\qwen_image_vae.safetensors --qwen2vl ..\..\ComfyUI\models\text_encoders\Qwen2.5-VL-7B-Instruct-Q8_0.gguf -p '一个穿着"QWEN"标志的T恤的中国美女正拿着黑色的马克笔面相镜头微笑。她身后的玻璃板上手写体写着 “一、Qwen-Image的技术路线: 探索视觉生成基础模型的极限,开创理解与生成一体化的未来。二、Qwen-Image的模型特色:1、复杂文字渲染。支持中英渲染、自动布局; 2、精准图像编辑。支持文字编辑、物体增减、风格变换。三、Qwen-Image的未来愿景:赋能专业内容创作、助力生成式AI发展。”' --cfg-scale 2.5 --sampling-method euler -v --offload-to-cpu -H 1024 -W 1024 --diffusion-fa --flow-shift 3 --tae ..\ComfyUI\models\vae_approx\taew2_1.pth --vae-conv-direct

The speedup and memory savings aren't that impressive yet; maybe they can be improved further?