Grok Imagine Video API
Grok Imagine Video is xAI's video generation model, covering text-to-video, image-to-video, reference-to-video, video-extend, and edit-video. Duration, aspect ratio, and resolution are customizable, and the generation variants produce synchronized audio.
Five workflows: text-to-video, image-to-video, reference-to-video (preserved identity, style, and scene composition), video-extend (smooth continuation of short clips), and edit-video (transform existing videos with text prompts). A separate v1.5 image-to-video endpoint brings the total to six endpoints.
About the Grok Imagine Video API
What Grok Imagine Video does, how it fits in the xAI model lineup, and why teams reach for it.
Grok Imagine Video is a video generation model from xAI, available through the WaveSpeedAI REST API.
The Grok Imagine Video family on WaveSpeedAI ships six REST endpoints covering image-to-video, video-extend, video-to-video, and text-to-video workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several with the same API key to compose multi-step pipelines.
Run Grok Imagine Video through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.
All Grok Imagine Video API endpoints
6 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Reference To Video
xAI Grok Imagine Video Reference-to-Video generates videos from multiple reference images with preserved identity, style, and scene composition. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Video Extend
xAI Grok Imagine Video Extend turns short clips into longer videos with smooth motion continuity and natural scene extension. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Edit Video
xAI Grok Imagine Video Edit transforms and modifies existing videos with text prompts for seamless AI-powered edits. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Image To Video
xAI Grok Imagine Video Image-to-Video animates still images with natural motion, scene continuity, and synchronized audio. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Text To Video
xAI Grok Imagine Video Text-to-Video generates high-quality videos from text descriptions with customizable duration, aspect ratio, and resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

Image To Video (v1.5)
xAI Grok Imagine Video v1.5 is a fast image-to-video model that turns a reference image into a short video guided by a text prompt, with 480p and 720p output options. Ready-to-use REST inference API for animating images, social media clips, creative storytelling, product visuals, marketing videos, and concept videos, with simple integration, no cold starts, and affordable pricing.
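
Choosing among the endpoints listed above mostly comes down to what inputs you have. The sketch below is one possible routing rule, not an official one: the function name `pick_endpoint` and its arguments are hypothetical, and it ignores the separate v1.5 image-to-video endpoint, which you would select explicitly.

```python
BASE = "x-ai/grok-imagine-video"


def pick_endpoint(image_count=0, has_video=False, extend=False):
    """Map the available inputs to one of the five core endpoint paths.

    image_count: number of input images (hypothetical argument)
    has_video:   True when the input is an existing clip
    extend:      with has_video, True to continue the clip, False to edit it
    """
    if has_video:
        return f"{BASE}/video-extend" if extend else f"{BASE}/edit-video"
    if image_count == 0:
        return f"{BASE}/text-to-video"
    if image_count == 1:
        return f"{BASE}/image-to-video"
    # Several images: treat them as references for identity and style.
    return f"{BASE}/reference-to-video"
```

A single image could also be sent to reference-to-video; routing it to image-to-video here is a design choice for the sketch.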
How to use the Grok Imagine Video API
Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.
1. Get an API key
Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.
2. Submit a prediction
POST your input as JSON to https://api.wavespeed.ai/api/v3/x-ai/grok-imagine-video/text-to-video. The endpoint returns a prediction id immediately. Generations are asynchronous, so you don't hold an open connection during inference.
3. Poll for completion
GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result every 1-2 seconds, substituting the prediction id from step 2 for {request_id}. The response includes a status field; keep polling until it changes from "queued" or "processing" to "completed".
4. Read the output URL
Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated video on the WaveSpeedAI CDN.
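
The submit-and-poll flow in the steps above can be sketched with only the Python standard library. This is a minimal sketch under stated assumptions, not the official client: the helper names are hypothetical, the status field is assumed to sit at data.status next to data.outputs, and any status other than "queued", "processing", or "completed" is treated as a terminal failure.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def submit_request(model_path, payload, api_key):
    """Build the POST request for step 2 (does not send it)."""
    return urllib.request.Request(
        f"{API_BASE}/{model_path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def result_request(request_id, api_key):
    """Build the GET request for step 3 (does not send it)."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_output(request_id, api_key, fetch=send, interval=1.5,
                    timeout=600.0, sleep=time.sleep):
    """Poll until status is "completed", then return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch(result_request(request_id, api_key))["data"]
        status = data["status"]  # assumption: status lives under "data"
        if status == "completed":
            return data["outputs"][0]
        if status not in ("queued", "processing"):
            # Assumption: any other status is terminal and unrecoverable.
            raise RuntimeError(f"prediction {request_id} ended as {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction {request_id} still {status!r}")
        sleep(interval)
```

To run it end to end, send `submit_request(...)` with `send`, take the prediction id from the response (its exact location in the response body is not stated here, so check the API reference), and pass it to `wait_for_output`. The `fetch` and `sleep` parameters exist so the loop can be exercised without network access.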
What you can build with Grok Imagine Video
Common workflows developers and creators use the Grok Imagine Video API for.
Text-to-video with customizable output
x-ai/grok-imagine-video/text-to-video generates videos from text descriptions with customizable duration, aspect ratio, and resolution — xAI's general-purpose Grok Imagine Video generation endpoint.
Image-to-video with scene continuity
x-ai/grok-imagine-video/image-to-video animates still images with natural motion, scene continuity, and synchronized audio — useful when the starting frame is fixed and motion must feel connected to the source.
Reference-to-video for identity preservation
x-ai/grok-imagine-video/reference-to-video generates videos from multiple reference images with preserved identity, style, and scene composition — the variant when referenced subjects must stay recognizable.
Video-extend for longer sequences
x-ai/grok-imagine-video/video-extend turns short clips into longer videos with smooth motion continuity and natural scene extension — chain extends to build narrative sequences beyond a single generation window.
Edit-video with text prompts
x-ai/grok-imagine-video/edit-video transforms and modifies existing videos with text prompts for seamless AI-powered edits — prompt-driven changes on source footage without re-generating from scratch.
Flexible duration and aspect ratio
The generation variants accept customizable duration, aspect ratio, and resolution. Set delivery parameters up front rather than cropping or re-encoding after generation.
Tips for prompting Grok Imagine Video
Practical advice for getting better outputs from Grok Imagine Video — drawn from the patterns that work across video models in production pipelines.
Set duration and aspect ratio up front
Duration, aspect ratio, and resolution are customizable per request. Set delivery parameters in the API call rather than generating landscape and cropping to 9:16 afterward.
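
One way to make delivery parameters explicit is to build the request input in a single place and check it before submitting. The sketch below is illustrative only: the field names `prompt`, `duration`, `aspect_ratio`, and `resolution` are assumptions, as are both helper names, so confirm the real parameter names and accepted values on each variant's playground page.

```python
def orientation(aspect_ratio):
    """Classify a "W:H" string as "landscape", "portrait", or "square"."""
    width, height = (int(part) for part in aspect_ratio.split(":"))
    if width <= 0 or height <= 0:
        raise ValueError(f"invalid aspect ratio: {aspect_ratio!r}")
    if width == height:
        return "square"
    return "landscape" if width > height else "portrait"


def build_input(prompt, duration, aspect_ratio, resolution):
    """Assemble a request input dict (field names are hypothetical)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if duration <= 0:
        raise ValueError("duration must be positive")
    orientation(aspect_ratio)  # raises ValueError on a malformed ratio
    return {
        "prompt": prompt,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
    }
```

For a vertical social clip, `build_input("a paper boat on a stream", 6, "9:16", "720p")` requests portrait output directly instead of cropping a landscape render.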
Reference-to-video for ensemble scenes
When multiple referenced subjects must stay coherent, use reference-to-video and supply all of the reference images in one request. Identity and scene composition carry over from the references to the output.
Video-extend for narrative continuity
Generate a strong 5-8s base clip, then extend with video-extend for smooth continuation. Chain extends for longer sequences rather than one long generation.
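
Chaining extends amounts to feeding each output URL back in as the next input. The sketch below assumes a `run` callable shaped like `wavespeed.run` from the Python example on this page (it takes a model path and an input dict and returns a dict with an "outputs" list). The input field names `video` and `prompt` are assumptions, as is the premise that each extend output is accepted as input to the next call.

```python
EXTEND_ENDPOINT = "x-ai/grok-imagine-video/video-extend"


def extend_chain(base_clip_url, prompts, run):
    """Extend a clip once per prompt, feeding each result into the next call.

    Returns every clip URL in order, starting with the base clip, so any
    intermediate stage can be inspected or reused.
    """
    clips = [base_clip_url]
    for prompt in prompts:
        result = run(
            EXTEND_ENDPOINT,
            {"video": clips[-1], "prompt": prompt},  # hypothetical field names
        )
        clips.append(result["outputs"][0])
    return clips
```

Keeping the intermediate URLs lets you restart from the last good stage if one extend drifts from the intended narrative.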
Edit-video for targeted changes
Use edit-video when you have existing footage that needs prompt-driven modification — faster than re-generating the full scene from scratch.
Image-to-video preserves scene continuity
The image-to-video variant is positioned around scene continuity. Supply a clean, well-lit reference still, since the model anchors motion to the starting composition.
Grok Imagine Video API pricing
Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).
| Endpoint | Type | Starting price (USD per run) |
|---|---|---|
| x-ai/grok-imagine-video/reference-to-video | image-to-video | $0.050 |
| x-ai/grok-imagine-video/video-extend | video-extend | $0.050 |
| x-ai/grok-imagine-video/edit-video | video-to-video | $0.065 |
| x-ai/grok-imagine-video/image-to-video | image-to-video | $0.050 |
| x-ai/grok-imagine-video/text-to-video | text-to-video | $0.050 |
| x-ai/grok-imagine-video-v1.5/image-to-video | image-to-video | $0.120 |
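
For budgeting a multi-step pipeline, the starting prices in the table give a lower bound. The sketch below sums them per call; the function name `minimum_cost` is hypothetical, and the result is a floor rather than a quote, because the final charge scales with resolution, duration, output count, and references.

```python
from decimal import Decimal

# Starting prices in USD per run, copied from the table above.
STARTING_PRICE = {
    "x-ai/grok-imagine-video/reference-to-video": Decimal("0.050"),
    "x-ai/grok-imagine-video/video-extend": Decimal("0.050"),
    "x-ai/grok-imagine-video/edit-video": Decimal("0.065"),
    "x-ai/grok-imagine-video/image-to-video": Decimal("0.050"),
    "x-ai/grok-imagine-video/text-to-video": Decimal("0.050"),
    "x-ai/grok-imagine-video-v1.5/image-to-video": Decimal("0.120"),
}


def minimum_cost(call_counts):
    """Return the lowest possible charge for a mapping of endpoint -> runs."""
    return sum(
        (STARTING_PRICE[endpoint] * runs for endpoint, runs in call_counts.items()),
        Decimal("0"),
    )
```

For example, one text-to-video run, two extends, and one edit come to at least 0.050 + 2 × 0.050 + 0.065 = 0.215 USD.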
Call the Grok Imagine Video API
Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.
HTTP example
# 1. Submit a prediction
curl -X POST "https://api.wavespeed.ai/api/v3/x-ai/grok-imagine-video/text-to-video" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{}'

# 2. Poll the result until status = "completed"
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# 3. Read the output URL from data.outputs[0].
Node.js example
// npm install wavespeed
const WaveSpeed = require('wavespeed');

// await needs an async function in a CommonJS script
async function main() {
  const client = new WaveSpeed(); // reads WAVESPEED_API_KEY
  const result = await client.run("x-ai/grok-imagine-video/text-to-video", {});
  console.log(result.outputs[0]); // → URL of the generated output
}
main().catch(console.error);
Python example
# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "x-ai/grok-imagine-video/text-to-video",
    {},
)
print(output["outputs"][0])  # → URL of the generated output
Grok Imagine Video vs alternatives
When to pick Grok Imagine Video over similar models on WaveSpeedAI.
Grok Imagine Video vs Seedance 2.0
Seedance 2.0 ships native audio across every tier plus Fast/Standard/Turbo pricing tiers and video-edit as a dedicated restyle endpoint. Grok Imagine Video covers a similar five-variant surface (including edit-video and video-extend) under xAI's Grok branding.
Grok Imagine Video vs Wan 2.7
Wan 2.7 adds image-edit and text-to-image variants in the same family for cross-modal pipelines. Grok Imagine Video stays video-focused with edit-video and video-extend as first-class endpoints.
Grok Imagine Video vs Veo 3.1
Veo 3.1 has a stronger reputation for photorealistic human faces and comes in three Google pricing tiers. Grok Imagine Video offers customizable duration and resolution and a five-variant surface that includes edit-video. The provider differs, but the API integration pattern is comparable.
Grok Imagine Video API — Frequently asked questions
Pricing, license, integration — common questions about running Grok Imagine Video on WaveSpeedAI.
What is the Grok Imagine Video API?
Grok Imagine Video is an xAI video generation model exposed as a REST API on WaveSpeedAI. It covers text-to-video, image-to-video, reference-to-video, video-extend, and edit-video, with customizable duration, aspect ratio, and resolution, plus synchronized audio on generation variants. You can call it programmatically or try it in the playground.
How do I call the Grok Imagine Video API?
Sign up for a WaveSpeedAI account, copy your API key from wavespeed.ai/accesskey, then POST to https://api.wavespeed.ai/api/v3/x-ai/grok-imagine-video/text-to-video with your input as JSON. The endpoint returns a prediction id; poll the prediction result endpoint until status changes to "completed", then read the output URL from data.outputs[0]. Full Python, Node.js, and cURL examples are above.
How much does the Grok Imagine Video API cost?
Grok Imagine Video starts at $0.050 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.
Which Grok Imagine Video variants are available?
WaveSpeedAI hosts 6 Grok Imagine Video endpoints: x-ai/grok-imagine-video/reference-to-video, x-ai/grok-imagine-video/video-extend, x-ai/grok-imagine-video/edit-video, x-ai/grok-imagine-video/image-to-video, x-ai/grok-imagine-video/text-to-video, x-ai/grok-imagine-video-v1.5/image-to-video. Each variant has its own playground page and pricing.
Can I use Grok Imagine Video outputs commercially?
Commercial usage rights follow the xAI model license. Most xAI models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.
Why use Grok Imagine Video on WaveSpeedAI instead of going direct?
One API key and one billing account cover Grok Imagine Video and 1,000+ other AI models from other providers. There is no per-vendor SDK setup, no separate rate-limit envelope, and no integration code to rewrite for each vendor. Pricing is typically at parity with or below xAI's direct API.
About xAI
The team behind Grok Imagine Video and the broader xAI model lineup on WaveSpeedAI.
xAI is Elon Musk's AI company, shipping Grok models for chat, reasoning, and multimodal generation. Grok Imagine Video covers text-to-video, image-to-video, reference-to-video, video-extend, and edit-video with customizable duration, aspect ratio, and resolution — available through WaveSpeedAI's unified API alongside models from every major provider.
Start building with Grok Imagine Video on WaveSpeedAI
Free starter credits on signup. One API key across 1,000+ AI models from xAI and every other provider.