Kling V3.0 4K generates 4K video from a still image, with smooth motion, cinematic visuals, accurate prompt adherence, and optional audio. It supports start/end frame control, multi-prompt, and element references, and is available through a ready-to-use REST inference API with no cold starts and per-second pricing.
$2.10 per run at the default 5-second duration.
Kling V3.0 4K Image-to-Video is Kuaishou's premium image animation model delivering 4K output. Upload a reference image and describe the motion; the model generates cinematic video with fine detail, optional start-to-end frame guidance, and synchronized sound.
4K quality: The highest visual fidelity and motion realism in the Kling V3.0 family.
Flexible duration: Generate videos from 3 to 15 seconds.
Start-end frame guidance: Optional end image for controlled transitions between two frames.
Sound generation: Optional synchronized sound effects generated alongside the video.
Multi-prompt and element list support: Chain prompt segments for scene transitions and lock in specific visual elements for consistency.
| Parameter | Required | Description |
|---|---|---|
| image | Yes | Start frame image to animate (URL or upload). |
| prompt | Yes | Text description of the desired motion and action. |
| negative_prompt | No | Elements to exclude from the video. |
| end_image | No | End frame image for guided transitions. |
| duration | No | Video length in seconds (3-15, default: 5). |
| cfg_scale | No | Prompt guidance strength (0-1, default: 0.5). |
| sound | No | Generate synchronized sound alongside the video. Default: disabled. |
| shot_type | No | Editing mode: customize (default) or intelligent. |
| multi_prompt | No | Additional prompts for complex scene compositions. |
| element_list | No | List of visual elements to maintain consistency throughout the clip. |
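
The parameter table above can be turned into a small request-body builder that checks the documented ranges before anything is sent. The following is a minimal sketch, not the official client: the function name `build_payload` is hypothetical, and it assumes that `duration` is a whole number of seconds, that `sound` is a boolean, and that `image` and `end_image` are passed as URLs. The structure of `multi_prompt` and `element_list` is not described on this page, so they are left out. No endpoint or authentication scheme is shown because the page does not specify one.

```python
import json
from typing import Any, Dict, Optional

# Values listed for shot_type in the parameter table.
ALLOWED_SHOT_TYPES = ("customize", "intelligent")


def build_payload(
    image: str,
    prompt: str,
    *,
    negative_prompt: Optional[str] = None,
    end_image: Optional[str] = None,
    duration: int = 5,         # assumed to be whole seconds
    cfg_scale: float = 0.5,
    sound: bool = False,       # assumed to be a boolean flag
    shot_type: str = "customize",
) -> Dict[str, Any]:
    """Build a request body using the documented names, ranges, and defaults."""
    if not image:
        raise ValueError("image is required")
    if not prompt:
        raise ValueError("prompt is required")
    if not 3 <= duration <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    if not 0.0 <= cfg_scale <= 1.0:
        raise ValueError("cfg_scale must be between 0 and 1")
    if shot_type not in ALLOWED_SHOT_TYPES:
        raise ValueError(f"shot_type must be one of {ALLOWED_SHOT_TYPES}")

    payload: Dict[str, Any] = {
        "image": image,
        "prompt": prompt,
        "duration": duration,
        "cfg_scale": cfg_scale,
        "sound": sound,
        "shot_type": shot_type,
    }
    # Optional fields are only included when the caller supplies them.
    if negative_prompt:
        payload["negative_prompt"] = negative_prompt
    if end_image:
        payload["end_image"] = end_image
    return payload


if __name__ == "__main__":
    body = build_payload(
        "https://example.com/start.png",  # placeholder URL
        "The camera slowly pushes in as leaves drift past",
        duration=10,
        sound=True,
    )
    print(json.dumps(body, indent=2))
```

Validating on the client side avoids paying for a request that the service would reject or silently clamp; how the service actually handles out-of-range values is not stated here.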
$0.42 per second of video, regardless of whether audio is on or off.
| Duration | Cost |
|---|---|
| 3s | $1.26 |
| 5s | $2.10 |
| 10s | $4.20 |
| 15s | $6.30 |
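
The rows in the pricing table follow directly from the per-second rate. As a quick sketch, the helper below (the name `estimate_cost` is hypothetical) multiplies the $0.42 rate by the clip length, using `Decimal` to avoid floating-point rounding in currency values. It assumes the duration is a whole number of seconds within the documented 3 to 15 second range.

```python
from decimal import Decimal

# Rate stated on this page: $0.42 per second, with or without audio.
PRICE_PER_SECOND = Decimal("0.42")


def estimate_cost(duration_seconds: int) -> Decimal:
    """Return the estimated price in USD for a clip of the given length."""
    if not 3 <= duration_seconds <= 15:
        raise ValueError("duration must be between 3 and 15 seconds")
    return PRICE_PER_SECOND * duration_seconds


if __name__ == "__main__":
    for seconds in (3, 5, 10, 15):
        print(f"{seconds}s -> ${estimate_cost(seconds)}")
```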