Grok Imagine Video Reference-to-Video
Grok Imagine Video Reference-to-Video is xAI's multi-image reference model that generates videos from up to 7 reference images. Provide reference images and describe the desired motion; the model generates a video that preserves the identity, style, and composition of your references with smooth, natural movement.
Why Choose This?
- Multi-image reference: Use up to 7 reference images to guide video generation with rich visual context.
- Identity preservation: Characters, objects, and scenes maintain consistent appearance across generated frames.
- Flexible duration: Generate videos at 6 or 10 seconds to match your scene pacing.
- Resolution options: Output in 720p or 480p based on your quality and speed requirements.
Parameters
| Parameter | Required | Description |
|---|---|---|
| images | Yes | Array of reference image URLs (1-7 images). |
| prompt | Yes | Text description of the desired motion, camera movement, and scene. |
| duration | No | Video length in seconds. Options: 6, 10. |
| resolution | No | Output resolution: 720p (default) or 480p. |
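
The constraints in the table can be checked on the client before a request is submitted. The following is a minimal sketch, not the service's official client: the function name `build_request` is hypothetical, and the payload keys simply mirror the parameter names in the table. Because the table does not state a default for `duration`, the sketch omits that field when it is not set.

```python
from typing import Any, Dict, List, Optional

ALLOWED_DURATIONS = {6, 10}             # seconds, per the parameter table
ALLOWED_RESOLUTIONS = {"720p", "480p"}  # 720p is the documented default
MAX_IMAGES = 7


def build_request(
    images: List[str],
    prompt: str,
    duration: Optional[int] = None,
    resolution: str = "720p",
) -> Dict[str, Any]:
    """Validate inputs against the parameter table and return a payload dict.

    Hypothetical helper: it only builds a dictionary and sends nothing.
    """
    if not 1 <= len(images) <= MAX_IMAGES:
        raise ValueError(f"images must contain 1 to {MAX_IMAGES} URLs")
    if not prompt.strip():
        raise ValueError("prompt is required")
    if duration is not None and duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be 6 or 10")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be '720p' or '480p'")

    payload: Dict[str, Any] = {
        "images": list(images),
        "prompt": prompt,
        "resolution": resolution,
    }
    if duration is not None:
        # Left out when unset, since the default duration is not documented here.
        payload["duration"] = duration
    return payload
```

Calling `build_request(["https://example.com/a.png"], "Slow pan around the subject", duration=6)` returns a dictionary with the four fields filled in, while an empty image list or a duration of 8 raises `ValueError`.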
How to Use
- Upload your reference images — provide 1 to 7 reference images via URL or drag-and-drop upload.
- Write your prompt — describe the motion, camera movement, and scene details. Reference the uploaded images in your prompt using @image1, @image2, etc.
- Set duration — choose 6 or 10 seconds based on your scene length.
- Select resolution — 720p for higher quality, 480p for faster processing.
- Run — submit and download your video.
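
Since prompts refer to uploads as @image1, @image2, and so on, it can help to confirm that every reference points at an image that was actually provided. The sketch below assumes that numbering is 1-based and follows upload order, which the examples above suggest but do not state outright. The helper name `find_invalid_references` is hypothetical.

```python
import re
from typing import List


def find_invalid_references(prompt: str, image_count: int) -> List[int]:
    """Return the @imageN indices in the prompt that match no uploaded image.

    Assumes 1-based numbering in upload order (an assumption, not confirmed
    by the documentation).
    """
    indices = [int(n) for n in re.findall(r"@image(\d+)", prompt)]
    return sorted({i for i in indices if not 1 <= i <= image_count})


prompt = "@image1 walks toward the car from @image3 as the camera pulls back"
print(find_invalid_references(prompt, image_count=2))
```

With two uploaded images, the reference to @image3 is reported as invalid, so the prompt or the image list should be corrected before running.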
Pricing
| Duration | Cost |
|---|---|
| 6s | $0.30 |
| 10s | $0.50 |
Billing Rules
- Rate: $0.05 per second
- Duration options: 6 or 10 seconds
- Billing is based on the selected duration, not actual playback length
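
The billing rules reduce to a single multiplication: selected duration times the per-second rate. A small sketch of that arithmetic follows. The function name `estimate_cost` and the `count` parameter for batches are illustrative additions, and `Decimal` is used only to avoid floating-point rounding in currency values.

```python
from decimal import Decimal

RATE_PER_SECOND = Decimal("0.05")  # USD per second, from the billing rules
ALLOWED_DURATIONS = (6, 10)


def estimate_cost(duration: int, count: int = 1) -> Decimal:
    """Estimate the cost in USD of `count` generations at the selected duration.

    Billing follows the selected duration, not the actual playback length.
    """
    if duration not in ALLOWED_DURATIONS:
        raise ValueError("duration must be 6 or 10")
    if count < 1:
        raise ValueError("count must be at least 1")
    return RATE_PER_SECOND * duration * count


print(estimate_cost(6))   # matches the 6s row of the pricing table
print(estimate_cost(10))  # matches the 10s row of the pricing table
```

For example, testing a prompt once at 6 seconds and then rendering the final clip at 10 seconds would cost `estimate_cost(6) + estimate_cost(10)`, which is $0.80.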
Best Use Cases
- Character Consistency — Generate videos with consistent character appearance across multiple shots using reference images.
- Product Showcases — Create dynamic product videos from multiple product photos.
- Multi-angle References — Use different angles of the same subject to generate richer, more accurate video.
- Social Media Content — Create engaging video clips from image collections for Reels, TikTok, and Shorts.
- Creative Projects — Combine multiple visual references to create unique video compositions.
Pro Tips
- Use high-quality, well-lit reference images for better identity preservation.
- Reference uploaded images in your prompt using @image1, @image2, etc. for precise control.
- Keep reference content and prompt aligned — if references show a character, describe that character's actions.
- Start with fewer references and add more if needed for richer context.
- Use 6-second generations to test your prompt before committing to 10 seconds.
Notes
- Both images and prompt are required fields.
- Up to 7 reference images are supported.
- Ensure image URLs are publicly accessible.
- Maximum duration is 10 seconds.
Related Models
- Grok Imagine Video Image-to-Video — Generate video from a single image.
- Grok Imagine Video Extend — Extend existing videos with smooth continuation.
- Grok Imagine Video Edit — Edit existing videos with text instructions.