OpenAI Sora 2 Text-to-Video
Playground
Try it on WavespeedAI! OpenAI Sora 2 is a state-of-the-art text-to-video model with realistic visuals, accurate physics, synchronized audio, and strong steerability. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
Features
Notice — Service Stability
The Sora 2 family is currently unstable. Generations may fall back to alternative models without notice, and the service can be temporarily unavailable. OpenAI is also expected to discontinue this model in the future.
If you need an equally capable, stable alternative, we recommend Seedance 2: bytedance/seedance-2.0/text-to-video.
Sora 2 Text-to-Video
Sora 2 Text-to-Video is OpenAI’s text-to-video model purpose-built for scenes featuring multiple distinct characters simultaneously. Describe the scene in natural language, reference your pre-defined character IDs, and the model renders a cohesive, temporally consistent video where every character looks and moves exactly as intended — no manual compositing required.
Why Choose This?
- True multi-character consistency: Reference two or more character IDs in a single generation. Each character retains its unique appearance, proportions, and style throughout every frame.
- Natural-language scene control: Describe interactions, environments, and actions in plain text. The model understands spatial relationships and character dynamics to produce believable compositions.
- Flexible aspect ratio support: Choose between portrait (720×1280) and landscape (1280×720) orientations to match your target platform.
- Scalable duration: Generate clips from 4 to 20 seconds in 4-second steps, giving you full control over pacing and output cost.
- Production-ready output: Delivers smooth, artifact-free motion suitable for marketing content, storytelling, game cinematics, and social media video.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the scene, characters, actions, and environment. |
| size | No | Output resolution: 720×1280 (portrait) or 1280×720 (landscape). |
| duration | No | Clip length in seconds. Options: 4, 8, 12, 16, 20. |
| characters | No | List of character IDs to include. Add one or more char_… identifiers. |
How to Use
- Write your prompt — describe what the characters are doing and where the scene takes place.
- Select size — portrait (720×1280) for mobile/social, landscape (1280×720) for widescreen.
- Set duration — choose 4, 8, 12, 16, or 20 seconds based on your scene length.
- Add character IDs — click Add Item under the characters section to include each character by their unique identifier.
- Submit — generate, preview, and download your video.
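Putting these steps together, a minimal multi-character request might look like the sketch below. The prompt, size, duration, and character IDs are illustrative placeholders (the char_… values are hypothetical and must be replaced with IDs created via Sora 2 Characters), and the characters field is shown as a plain list of ID strings, which is an assumption about the exact element format; the endpoint and WAVESPEED_API_KEY variable match the API example further down this page.
# Example: submit a two-character scene (placeholder values)
curl --location --request POST "https://api.wavespeed.ai/api/v3/openai/sora-2/text-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "Two friends walk through a neon-lit night market, pointing at stalls and laughing",
    "size": "1280*720",
    "duration": 8,
    "characters": ["char_example_a", "char_example_b"]
}'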
Pricing
| Duration | Cost per Generation |
|---|---|
| 4s | $0.40 |
| 8s | $0.80 |
| 12s | $1.20 |
| 16s | $1.60 |
| 20s | $2.00 |
Billing Rules
- Rate: $0.10 per second
- Duration options: 4, 8, 12, 16, or 20 seconds
- Billing is based on the selected duration, not actual playback length
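- Example: selecting a 12-second duration is billed as 12 × $0.10 = $1.20 (matching the pricing table above), regardless of the actual playback length of the returned clip.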
Best Use Cases
- Brand & Marketing Videos — Feature multiple characters or spokespeople in a single scene without manual compositing.
- Social Media Content — Produce portrait-format multi-character clips optimized for Reels, TikTok, and Shorts.
- Game & IP Storytelling — Render in-world scenes with established characters maintaining consistent visual identity.
- Educational & Explainer Content — Animate two or more characters interacting to illustrate concepts or narratives.
- Advertising & Campaigns — Generate diverse cast scenarios rapidly for A/B testing creative variations.
Pro Tips
- Be specific about character positions and actions in your prompt for better spatial composition.
- Use portrait mode (720×1280) for mobile-first platforms and landscape (1280×720) for cinematic or desktop use.
- Start with a 4-second generation to validate composition and character rendering before committing to a longer duration.
- Ensure all referenced character IDs are valid and accessible in your account before submitting.
Notes
- Character IDs must be created and saved in advance — this model references existing character profiles and does not create new definitions.
- Only prompt is a required field; size, duration, and characters are optional.
- Complex multi-character scenes benefit from concise, clearly structured prompts.
Related Models
- Sora 2 Characters — Create and save reusable character IDs for use in this model.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/openai/sora-2/text-to-video" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "Two friends share an umbrella on a rainy city street",
    "size": "720*1280",
    "duration": 4
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| size | string | No | 720*1280 | 720*1280, 1280*720 | The size of the generated media in pixels (width*height). |
| duration | integer | No | 4 | 4, 8, 12, 16, 20 | The duration of the generated video in seconds. |
| characters | array | No | - | - | Element reference list. To get available elements and their IDs, visit: https://wavespeed.ai/models/openai/sora-2/characters |
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
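For reference, a freshly submitted task returns a response shaped roughly as follows; every value below is an illustrative placeholder, and the exact set of fields present before completion (for example timings) may vary.
{
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "model": "openai/sora-2/text-to-video",
        "outputs": [],
        "urls": {
            "get": "https://api.wavespeed.ai/api/v3/predictions/example-task-id/result"
        },
        "status": "created",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {
            "inference": 0
        }
    }
}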
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID used to fetch this result) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
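A completed result follows the same shape, with status set to completed and outputs populated; again, all values below are illustrative placeholders.
{
    "code": 200,
    "message": "success",
    "data": {
        "id": "example-task-id",
        "model": "openai/sora-2/text-to-video",
        "outputs": [
            "https://example.com/path/to/generated-video.mp4"
        ],
        "urls": {
            "get": "https://api.wavespeed.ai/api/v3/predictions/example-task-id/result"
        },
        "status": "completed",
        "created_at": "2023-04-01T12:34:56.789Z",
        "error": "",
        "timings": {
            "inference": 42000
        }
    }
}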