Alibaba Wan 2.6 Models are now live - Cinematic, Multi-shot 1080P with Flash Version

Alibaba’s Wan 2.6 Models are a compact, production-ready set of video generation endpoints covering the three core workflows: text-to-video, image-to-video, and reference-to-video. Built for fast iteration and reliable results, Wan 2.6 emphasizes stronger visual consistency, smoother motion, and more controllable cinematic style across generations.

Wan 2.6 Series — Text, Image & Reference to Video

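Wan 2.6 covers three video workflows, so you can generate from scratch, animate a still, or guide generation with a reference video. Image-to-video and reference-to-video also come in Flash variants, and the series adds text-to-image and image-edit endpoints for still images. The set is suited to commercial pipelines, creative testing, and repeatable content production.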

Wan 2.6 Text-to-Video — Generate coherent, cinematic motion from text prompts for story beats, ads, and creative shorts.
alibaba/wan-2.6/text-to-video
Wan 2.6 Image-to-Video — Animate a still image into natural motion while preserving subject and composition for product shots and visual assets.
alibaba/wan-2.6/image-to-video
alibaba/wan-2.6/image-to-video-flash
Wan 2.6 Reference-to-Video — Use a reference video to guide motion style, pacing, and framing to create new, high-consistency scenes.
alibaba/wan-2.6/reference-to-video
alibaba/wan-2.6/reference-to-video-flash
Wan 2.6 Text-to-Image — Generate high-quality images from text prompts for key visuals, concept frames, and rapid creative exploration.
alibaba/wan-2.6/text-to-image
Wan 2.6 Image-Edit — Edit and refine images with precise, controlled changes while preserving structure and subject consistency.
alibaba/wan-2.6/image-edit
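
As a rough illustration of how the model IDs above might be organized in client code, here is a minimal Python sketch. The model ID strings are copied from the list above; the helper name `pick_model`, the task keys, and the `flash` flag are hypothetical names chosen for this example and are not part of any official SDK. The page does not describe the request format, so no API call is shown.

```python
# Model IDs copied verbatim from the list above, keyed by workflow.
MODEL_IDS = {
    "text-to-video": "alibaba/wan-2.6/text-to-video",
    "image-to-video": "alibaba/wan-2.6/image-to-video",
    "reference-to-video": "alibaba/wan-2.6/reference-to-video",
    "text-to-image": "alibaba/wan-2.6/text-to-image",
    "image-edit": "alibaba/wan-2.6/image-edit",
}

# Only these two workflows are listed with a "-flash" variant.
FLASH_TASKS = {"image-to-video", "reference-to-video"}


def pick_model(task: str, flash: bool = False) -> str:
    """Return the model ID for a workflow (hypothetical helper)."""
    if task not in MODEL_IDS:
        raise ValueError(f"unknown task: {task!r}")
    if flash:
        if task not in FLASH_TASKS:
            raise ValueError(f"no Flash variant listed for {task!r}")
        return MODEL_IDS[task] + "-flash"
    return MODEL_IDS[task]


if __name__ == "__main__":
    print(pick_model("text-to-video"))
    print(pick_model("image-to-video", flash=True))
```

Keeping the IDs in one table makes it easy to switch between the standard and Flash variants while testing, and to fail early if a workflow has no Flash variant listed.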

Highlights

5s Video Reference → Character + Voice Consistency: Recreate a character and voice from a short reference clip for more stable identity across shots.
Cinematic Multi-Shot Storytelling: Turn prompts or storyboards into coherent, multi-scene sequences with stronger continuity.
1080P Output: Generate high-fidelity videos suitable for social and production workflows.
Sound-Picture Sync: Native audio alignment with visuals—dialogue, music, and sound effects that land on beat.
Fast Iteration, More Control: Built for rapid creative testing while keeping style, framing, and motion more consistent.

