Bytedance Seedance 2.0 Image To Video
Seedance 2.0 (Image-to-Video) generates Hollywood-grade cinematic videos from reference images and text prompts with native audio-visual synchronization, director-level camera and lighting control, and exceptional motion stability. Built on Seed’s unified multimodal architecture, it preserves the input image’s subject and composition while adding expressive, physically accurate motion.
Features
Seedance 2.0 Image-to-Video
Seedance 2.0 is Seed’s latest video generation model, built on a unified multimodal architecture. The Image-to-Video mode generates production-grade cinematic videos from reference images and text prompts — preserving the input image’s subject, composition, and style while adding expressive motion with native audio synchronization.
Key Features
- Unified multimodal architecture: A single model that handles text, image, audio, and video inputs for comprehensive creative flexibility.
- Image-faithful generation: Preserves the reference image’s subject identity, composition, lighting, and style while animating it into motion.
- Multi-image reference support: Guide generation with up to 4 reference images for consistent style, characters, or scenes.
- Native audio-visual synchronization: Generates video with synchronized audio in a single pass.
- Director-level control: Granular control over camera movement, lighting, shadows, and character performance through prompts.
- Exceptional motion stability: Industry-leading motion coherence with stable subjects, consistent physics, and fluid transitions.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Detailed description of the cinematic scene |
| image | Yes | Start image URL to guide the video generation |
| last_image | No | Last frame image URL for video continuation |
| duration | No | Video length in seconds: 4-15 (default: 5) |
| aspect_ratio | No | Output format: 16:9, 9:16, 4:3, 3:4, 1:1, 21:9 (default: adaptive) |
| resolution | No | Output resolution: 480p, 720p (default), or 1080p |
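The ranges in the table above can be checked locally before a request is sent. The following is a minimal sketch of such a check; `seedance_check_options` is a hypothetical helper name, not part of any WaveSpeedAI tooling, and treating the duration as a whole number of seconds is an assumption based on the integer type listed in the API reference.

```shell
# Hypothetical helper: validate options against the documented ranges.
# Usage: seedance_check_options DURATION [RESOLUTION] [ASPECT_RATIO]
# Defaults mirror the table: 5 seconds, 720p, adaptive aspect ratio.
seedance_check_options() {
  duration=${1:-5}
  resolution=${2:-720p}
  aspect_ratio=${3:-}

  # Reject anything that is not a non-negative whole number.
  case $duration in
    ''|*[!0-9]*) echo "duration must be a whole number of seconds" >&2; return 1 ;;
  esac
  if [ "$duration" -lt 4 ] || [ "$duration" -gt 15 ]; then
    echo "duration must be between 4 and 15 seconds" >&2
    return 1
  fi

  case $resolution in
    480p|720p|1080p) ;;
    *) echo "resolution must be 480p, 720p, or 1080p" >&2; return 1 ;;
  esac

  # An empty aspect ratio means "adapt to the input image".
  case $aspect_ratio in
    ''|16:9|9:16|4:3|3:4|1:1|21:9) ;;
    *) echo "unsupported aspect_ratio: $aspect_ratio" >&2; return 1 ;;
  esac
  return 0
}

# Example: seedance_check_options 10 1080p 21:9 && echo "options look valid"
```

The function only prints to stderr and returns a non-zero status on invalid input, so it can guard a later `curl` call with `&&`.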
How to Use
- Upload a start image — provide an image to guide the video generation.
- Write your prompt — describe the scene with cinematic detail: action, camera movement, lighting, mood.
- Set duration — choose any duration from 4 to 15 seconds.
- Run — submit and download your cinematic video with synchronized audio.
Pricing
| Resolution | Duration | Cost |
|---|---|---|
| 480p | 5 s | $0.60 |
| 480p | 10 s | $1.20 |
| 480p | 15 s | $1.80 |
| 720p | 5 s | $1.20 |
| 720p | 10 s | $2.40 |
| 720p | 15 s | $3.60 |
| 1080p | 5 s | $3.00 |
| 1080p | 10 s | $6.00 |
| 1080p | 15 s | $9.00 |
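Prices scale linearly with duration across the 4-15 second range, which works out to $0.12 per second at 480p, $0.24 per second at 720p, and $0.60 per second at 1080p.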
Billing Rules
- Base rate (480p): $0.60 per 5 seconds
- 720p: 2x the 480p price
- 1080p: 5x the 480p price (2.5x the 720p price)
- Duration range: 4-15 seconds (continuous)
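The billing rules above reduce to a per-second rate multiplied by the duration. The sketch below estimates a job's cost that way; `seedance_price` is a hypothetical helper name, and it assumes billing is strictly pro rata per second with the result rounded to cents, which the pricing table implies but does not state outright.

```shell
# Hypothetical helper: estimate the cost in USD of one job.
# Usage: seedance_price RESOLUTION DURATION_SECONDS
# Per-second rates are derived from the table: $0.60 per 5 s at 480p,
# 2x that at 720p, and 5x that at 1080p.
seedance_price() {
  resolution=$1
  duration=$2
  case $resolution in
    480p)  rate=0.12 ;;
    720p)  rate=0.24 ;;
    1080p) rate=0.60 ;;
    *) echo "unsupported resolution: $resolution" >&2; return 1 ;;
  esac
  case $duration in
    ''|*[!0-9]*) echo "duration must be a whole number of seconds" >&2; return 1 ;;
  esac
  if [ "$duration" -lt 4 ] || [ "$duration" -gt 15 ]; then
    echo "duration must be between 4 and 15 seconds" >&2
    return 1
  fi
  # awk handles the decimal arithmetic that plain sh cannot do.
  awk -v r="$rate" -v d="$duration" 'BEGIN { printf "%.2f\n", r * d }'
}

# seedance_price 720p 10    prints 2.40, matching the table row for 720p / 10 s
# seedance_price 1080p 15   prints 9.00
```

Durations between the table rows, such as 7 seconds at 720p, come out at the interpolated value (1.68) under the same assumption.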
Best Use Cases
- Product Demos — Animate product shots into cinematic showcase videos.
- Ad Creatives — Turn storyboard frames into polished commercial footage.
- Character Animation — Bring character art or portraits to life with natural motion.
- Scene Extension — Transform a single keyframe into a full cinematic sequence.
- Style-Consistent Series — Use reference images to maintain visual consistency across multiple clips.
Pro Tips
- Upload high-quality reference images for the best subject preservation.
- Write prompts like a film director — include lighting, camera angles, and mood.
- Use multiple reference images for better style and character consistency.
- Start with a short duration (4-5s) to iterate, then extend up to 15s for the final cut.
- Describe character expressions and actions for more engaging scenes.
Notes
- Native audio generation is included — videos come with synchronized sound.
- Up to 4 reference images can be uploaded.
- Duration range: 4-15 seconds (continuous).
- If aspect_ratio is not set, the output follows the input image’s aspect ratio.
Related Models
- Seedance 2.0 Text-to-Video — Generate video from text prompts alone.
- Seedance 2.0 Fast Image-to-Video — Faster generation at lower cost.
- Seedance 2.0 Fast Text-to-Video — Fast text-guided video generation.
- Seedance V1.5 Pro Image-to-Video — Previous generation Seedance model.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
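# Submit the task (prompt and image are required; the values below are placeholders)
curl --location --request POST "https://api.wavespeed.ai/api/v3/bytedance/seedance-2.0/image-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"prompt": "Slow dolly-in on the subject as warm evening light shifts across the scene",
"image": "https://example.com/start-frame.png",
"resolution": "720p",
"duration": 5,
"enable_web_search": false
}'
# Get the result, where requestId is the data.id value returned by the submit call
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"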
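Because the task status moves through created and processing before reaching completed or failed, the result endpoint usually has to be called more than once. The following is a minimal polling sketch under stated assumptions: `seedance_status` and `seedance_wait` are hypothetical helper names, the 5-second interval and 120-attempt limit are arbitrary choices, and the `sed` extraction is used only to avoid extra dependencies (a real JSON parser such as `jq` is more robust).

```shell
# Hypothetical helper: print the value of the "status" string field
# found in the JSON document on stdin (last match if several exist).
seedance_status() {
  tr -d '\n' | sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: poll the result endpoint until the task finishes.
# Usage: seedance_wait REQUEST_ID [INTERVAL_SECONDS] [MAX_ATTEMPTS]
# Prints the final JSON body on success; returns non-zero on failure or timeout.
seedance_wait() {
  request_id=$1
  interval=${2:-5}      # assumed poll interval, not taken from the docs
  attempts=${3:-120}    # assumed upper bound on the number of polls
  while [ "$attempts" -gt 0 ]; do
    body=$(curl --silent --show-error --location \
      --header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
      "https://api.wavespeed.ai/api/v3/predictions/${request_id}/result") || return 1
    status=$(printf '%s' "$body" | seedance_status)
    case $status in
      completed) printf '%s\n' "$body"; return 0 ;;
      failed)    printf '%s\n' "$body" >&2; return 1 ;;
    esac
    # created, processing, or anything unrecognized: wait and try again.
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "timed out waiting for task ${request_id}" >&2
  return 1
}

# Example: seedance_wait "$requestId" 5 60 > result.json
```

On success the printed body contains `data.outputs`, the array of URLs for the generated video.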
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | Describe the scene, action, camera movement, and mood for the video. |
| image | string | Yes | - | - | Start image URL to guide the video generation. |
| last_image | string | No | - | - | Last frame image URL for video continuation. |
| aspect_ratio | string | No | - | 16:9, 9:16, 4:3, 3:4, 1:1, 21:9 | The aspect ratio of the generated video. If not specified, adapts to the input image. |
| resolution | string | No | 720p | 480p, 720p, 1080p | The output video resolution. |
| duration | integer | No | 5 | 4 ~ 15 | The duration of the generated video in seconds (4-15s). |
| enable_web_search | boolean | No | false | - | Enable web search for real-time information. |
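When the prompt comes from user input, quotes and backslashes have to be escaped before they are placed in the JSON body. The sketch below builds a request body from the required fields plus two optional ones; `json_escape` and `seedance_payload` are hypothetical helper names, and the escaping covers only backslashes and double quotes (control characters such as newlines are not handled).

```shell
# Hypothetical helper: escape backslashes and double quotes so a value
# can be embedded in a JSON string. Newlines and tabs are not handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: print a JSON request body.
# Usage: seedance_payload PROMPT IMAGE_URL [RESOLUTION] [DURATION]
# aspect_ratio is left out so the output adapts to the input image.
seedance_payload() {
  prompt=$(json_escape "$1")
  image=$(json_escape "$2")
  resolution=${3:-720p}   # documented default
  duration=${4:-5}        # documented default
  printf '{"prompt":"%s","image":"%s","resolution":"%s","duration":%s}\n' \
    "$prompt" "$image" "$resolution" "$duration"
}

# Example: pass the output to curl with --data-raw "$(seedance_payload "$prompt" "$image_url")"
```

The function does not validate its arguments, so out-of-range values are passed through unchanged.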
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID that was queried) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |