OpenAI Sora 2 Text-to-Video Pro
OpenAI Sora 2 Text-to-Video Pro creates high-fidelity videos with synchronized audio, realistic physics, and enhanced steerability. It is available on WavespeedAI as a ready-to-use REST inference API with no cold starts and affordable pricing.
Features
Notice — Service Stability
The Sora 2 family is currently unstable. Generations may fall back to alternative models without notice and the service can be temporarily unavailable. OpenAI is also expected to discontinue this model in the future.
If you need an equally capable, stable alternative, we recommend Seedance 2: bytedance/seedance-2.0/text-to-video.
Sora 2 Text-to-Video Pro
Sora 2 Text-to-Video Pro is OpenAI’s premium text-to-video model. Describe any scene in natural language, and the model renders it as a cinematic, high-resolution video with physics-aware motion, temporal consistency, and optional multi-character support. Compared to the standard version, Pro delivers higher-fidelity output, broader resolution choices, and enhanced motion coherence for production-grade results.
- Need to animate an existing image? Try Sora 2 Image-to-Video Pro
- Looking for a lower-cost option? Try Sora 2 Text-to-Video
Why Choose This?
- Premium cinematic quality: Higher-fidelity output with enhanced detail, motion coherence, and richer scene composition than the standard version.
- Physics-aware motion: Understands contact, inertia, and momentum so objects, people, and environments move and interact believably.
- Multi-character scene support: Reference pre-defined character IDs to maintain consistent character identity across a single generation, with no manual compositing required.
- Broad resolution support: Six output sizes spanning portrait and landscape orientations, from 720p up to 1080p-class resolutions, suitable for social, cinematic, and broadcast workflows.
- Temporal consistency: Stable identities, minimal flicker and ghosting, and clean frame-to-frame transitions throughout.
- Scalable duration: Generate clips from 4 seconds up to 20 seconds to match your pacing and production needs.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the scene, action, environment, camera style, and mood. |
| size | No | Output resolution (width×height). Options: 720×1280, 1280×720, 1024×1792, 1792×1024, 1080×1920, 1920×1080. API requests write the value as width*height, e.g. 720*1280. |
| duration | No | Clip length in seconds. Options: 4, 8, 12, 16, 20. |
| characters | No | List of character IDs to include. Add one or more char_… identifiers for consistent characters. |
How to Use
- Write your prompt — describe the scene, characters, actions, camera angle, lighting, and style in detail.
- Select size — choose portrait or landscape orientation and resolution tier based on your delivery target.
- Set duration — choose 4, 8, 12, 16, or 20 seconds based on your scene length.
- Add character IDs (optional) — click Add Item under the characters section to reference pre-defined characters.
- Submit — generate, preview, and download your video.
Example Prompt
In a 90s documentary-style interview, an old Swedish man sits in a study and says, “I still remember when I was young.”
Pricing
| Duration | 720×1280 / 1280×720 | 1024×1792 / 1792×1024 | 1080×1920 / 1920×1080 |
|---|---|---|---|
| 4s | $1.20 | $2.00 | $2.80 |
| 8s | $2.40 | $4.00 | $5.60 |
| 12s | $3.60 | $6.00 | $8.40 |
| 16s | $4.80 | $8.00 | $11.20 |
| 20s | $6.00 | $10.00 | $14.00 |
Billing Rules
- 720×1280 / 1280×720: $0.30 per second
- 1024×1792 / 1792×1024: $0.50 per second
- 1080×1920 / 1920×1080: $0.70 per second
- Duration options: 4, 8, 12, 16, or 20 seconds
- Billing is based on the selected duration and size, not actual playback length
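The billing rules above reduce to a per-second rate multiplied by the selected duration. The following is a minimal sketch of that arithmetic; the function name `estimate_cost` and the `RATE_PER_SECOND` table are illustrative helpers, not part of the WavespeedAI API, and sizes are keyed in the `width*height` form used by the request parameters.

```python
from decimal import Decimal

# Per-second rates copied from the billing rules, keyed by width*height.
RATE_PER_SECOND = {
    "720*1280": Decimal("0.30"),
    "1280*720": Decimal("0.30"),
    "1024*1792": Decimal("0.50"),
    "1792*1024": Decimal("0.50"),
    "1080*1920": Decimal("0.70"),
    "1920*1080": Decimal("0.70"),
}
DURATIONS = (4, 8, 12, 16, 20)


def estimate_cost(size: str, duration: int) -> Decimal:
    """Return the expected charge in USD for one generation (hypothetical helper)."""
    if size not in RATE_PER_SECOND:
        raise ValueError(f"unsupported size: {size!r}")
    if duration not in DURATIONS:
        raise ValueError(f"unsupported duration: {duration!r}")
    # Billing follows the selected duration, not the actual playback length.
    return RATE_PER_SECOND[size] * duration


print(estimate_cost("1920*1080", 12))  # prints 8.40
```

Using `Decimal` rather than floats keeps the result aligned with the two-decimal prices in the pricing table.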
Best Use Cases
- Cinematic Storytelling — Render rich, narrative-driven scenes from detailed text descriptions.
- Commercial & Brand Video — Produce premium-quality footage for marketing campaigns without a film crew.
- Social Media Content — Generate portrait-format clips optimized for Reels, TikTok, and Shorts.
- Documentary & Interview Style — Recreate specific camera aesthetics and era-accurate visual styles.
- Multi-Character Scenes — Animate ensemble casts with consistent identity across the full clip.
Pro Tips
- The more specific your prompt, the better the result — include camera style, lighting, era, mood, and character behavior.
- Use portrait sizes (720×1280, 1024×1792, 1080×1920) for mobile-first platforms and landscape for cinematic or desktop formats.
- Start with a 4-second generation at a lower resolution to validate your prompt before committing to longer, higher-resolution runs.
- Character IDs must be created in advance — ensure they are saved and accessible in your account before adding them.
Notes
- Only prompt is required; size, duration, and characters are optional.
- Character IDs reference existing character profiles — this model does not create new character definitions.
- Please follow OpenAI’s usage policies when crafting prompts.
Related Models
- Sora 2 Text-to-Video — Standard version at lower cost for faster iteration.
- Sora 2 Image-to-Video Pro — Animate a still image into a cinematic video with the same Pro quality.
- Sora 2 Characters — Create and save reusable character IDs for multi-character generations.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/openai/sora-2/text-to-video-pro" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
    "prompt": "In a 90s documentary-style interview, an old Swedish man sits in a study.",
    "size": "720*1280",
    "duration": 4
}'
# Get the result (requestId is the task ID returned as data.id by the submit call)
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
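For readers calling the API from code rather than the shell, the two curl calls above might be mirrored in Python roughly as follows. This is a sketch using only the standard library; the helper names (`build_submit_request`, `build_result_request`, `send`) are hypothetical, and the example assumes the API key is available in the `WAVESPEED_API_KEY` environment variable as in the curl commands.

```python
import json
import os
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build the POST request that submits a generation task."""
    return urllib.request.Request(
        f"{API_BASE}/openai/sora-2/text-to-video-pro",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build the GET request that queries a task by its ID."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Only runs when executed as a script with an API key configured.
if __name__ == "__main__" and os.environ.get("WAVESPEED_API_KEY"):
    key = os.environ["WAVESPEED_API_KEY"]
    payload = {
        "prompt": "A paper boat drifting down a rainy street",
        "size": "720*1280",
        "duration": 4,
    }
    submitted = send(build_submit_request(key, payload))
    print(send(build_result_request(key, submitted["data"]["id"])))
```

Separating request construction from sending makes the URL, headers, and body easy to inspect before any billable call is made.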
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| size | string | No | 720*1280 | 720*1280, 1280*720, 1024*1792, 1792*1024, 1920*1080, 1080*1920 | The size of the generated media in pixels (width*height). |
| duration | integer | No | 4 | 4, 8, 12, 16, 20 | The duration of the generated video in seconds. |
| characters | array | No | - | - | Character reference list. To get available characters and their IDs, visit: https://wavespeed.ai/models/openai/sora-2/characters |
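Validating a request body locally against the table above can catch an unsupported size or duration before a billable call is made. The sketch below is one possible approach; `build_payload` is a hypothetical helper, and treating `characters` as a plain list of ID strings is an assumption, since the table only specifies its type as an array.

```python
ALLOWED_SIZES = {
    "720*1280", "1280*720",
    "1024*1792", "1792*1024",
    "1080*1920", "1920*1080",
}
ALLOWED_DURATIONS = {4, 8, 12, 16, 20}


def build_payload(prompt, size=None, duration=None, characters=None):
    """Return a request body containing only the fields that were set."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    payload = {"prompt": prompt}
    if size is not None:
        # Accept "720×1280" or "720x1280" and normalize to the API's width*height form.
        size = size.replace("×", "*").replace("x", "*")
        if size not in ALLOWED_SIZES:
            raise ValueError(f"unsupported size: {size!r}")
        payload["size"] = size
    if duration is not None:
        if duration not in ALLOWED_DURATIONS:
            raise ValueError(f"unsupported duration: {duration!r}")
        payload["duration"] = duration
    if characters:
        # Assumption: each element is a character ID string.
        payload["characters"] = list(characters)
    return payload
```

Omitted optional fields are left out of the body so that the server-side defaults listed in the table (720*1280 and 4 seconds) apply.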
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
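The fields in the table above can be read into a small typed record so that later code does not index nested dictionaries directly. The following is a sketch only: the `Prediction` class and `parse_prediction` function are hypothetical, and the sample body at the end is made-up placeholder data shaped like the table, not a captured API response.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    id: str
    status: str
    outputs: list
    error: str
    result_url: str


def parse_prediction(body: dict) -> Prediction:
    """Extract the commonly used fields from a response body."""
    if body.get("code") != 200:
        raise RuntimeError(f"API error {body.get('code')}: {body.get('message')}")
    data = body["data"]
    return Prediction(
        id=data["id"],
        status=data["status"],
        outputs=list(data.get("outputs") or []),  # empty until status is completed
        error=data.get("error") or "",
        result_url=(data.get("urls") or {}).get("get", ""),
    )


# Placeholder body for illustration; all values are invented.
sample = {
    "code": 200,
    "message": "success",
    "data": {
        "id": "task-id-placeholder",
        "status": "created",
        "outputs": [],
        "urls": {"get": "https://api.wavespeed.ai/api/v3/predictions/task-id-placeholder/result"},
        "error": "",
    },
}
print(parse_prediction(sample).status)
```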
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID that was queried) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed). |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
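Because `data.status` moves through created and processing before reaching completed or failed, a client typically queries the result endpoint repeatedly. The sketch below shows one way to structure that loop; `wait_for_result` is a hypothetical helper, the 2-second interval and 10-minute timeout are arbitrary assumptions rather than documented limits, and `fetch` stands for any callable that returns the decoded JSON body of the result request.

```python
import time


def wait_for_result(fetch, interval=2.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the task completes and return its output URLs.

    fetch: callable returning the decoded result response as a dict.
    interval, timeout: seconds; assumed values, not documented limits.
    """
    deadline = clock() + timeout
    while True:
        data = fetch()["data"]
        status = data["status"]
        if status == "completed":
            return data["outputs"]
        if status == "failed":
            raise RuntimeError(data.get("error") or "generation failed")
        if clock() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout} seconds")
        # created or processing: wait before asking again.
        sleep(interval)
```

Passing `fetch`, `sleep`, and `clock` as arguments keeps the loop independent of any particular HTTP client and lets it be exercised with canned responses.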