OpenAI Sora 2 Text-to-Video Pro
OpenAI Sora 2 Text-to-Video Pro creates high-fidelity videos with synchronized audio, realistic physics, and enhanced steerability. It is available on WavespeedAI through a ready-to-use REST inference API with no cold starts and affordable pricing.
Features
Sora 2 Text-to-Video Pro
Sora 2 Text-to-Video Pro is OpenAI’s premium text-to-video model. Describe a scene in natural language and the model renders it as a cinematic, high-resolution video with physics-aware motion, temporal consistency, and optional multi-character support. Compared to the standard version, Pro delivers higher-fidelity output, more resolution choices, and better motion coherence for production-grade results.
- Need to animate an existing image? Try Sora 2 Image-to-Video Pro
- Looking for a lower-cost option? Try Sora 2 Text-to-Video
Why Choose This?
- Premium cinematic quality: Higher-fidelity output with enhanced detail, motion coherence, and richer scene composition than the standard version.
- Physics-aware motion: Understands contact, inertia, and momentum so objects, people, and environments move and interact believably.
- Multi-character scene support: Reference pre-defined character IDs to maintain consistent character identity across a single generation, with no manual compositing required.
- Broad resolution support: Six output sizes spanning portrait and landscape orientations from 720p up to 1080p-class resolutions, suitable for social, cinematic, and broadcast workflows.
- Temporal consistency: Stable identities, minimal flicker and ghosting, and clean frame-to-frame transitions throughout.
- Scalable duration: Generate clips from 4 seconds up to 20 seconds to match your pacing and production needs.
Parameters
| Parameter | Required | Description |
|---|---|---|
| prompt | Yes | Text description of the scene, action, environment, camera style, and mood. |
| size | No | Output resolution. Options: 720×1280, 1280×720, 1024×1792, 1792×1024, 1080×1920, 1920×1080. The API expects the form width*height, e.g. 720*1280. |
| duration | No | Clip length in seconds. Options: 4, 8, 12, 16, 20. |
| characters | No | List of character IDs to include. Add one or more char_… identifiers for consistent characters. |
How to Use
- Write your prompt — describe the scene, characters, actions, camera angle, lighting, and style in detail.
- Select size — choose portrait or landscape orientation and resolution tier based on your delivery target.
- Set duration — choose 4, 8, 12, 16, or 20 seconds based on your scene length.
- Add character IDs (optional) — click Add Item under the characters section to reference pre-defined characters.
- Submit — generate, preview, and download your video.
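When calling the API instead of using the form, the same steps amount to assembling a JSON body. The sketch below is one way to validate the options before submitting. `build_payload` is a hypothetical helper name, not part of any WaveSpeed SDK. The allowed values are copied from the parameter tables on this page, and accepting the display form `720×1280` is a convenience assumption.

```python
from typing import Optional

# Allowed values as listed on this page; the API writes sizes as "width*height".
ALLOWED_SIZES = {
    "720*1280", "1280*720",
    "1024*1792", "1792*1024",
    "1080*1920", "1920*1080",
}
ALLOWED_DURATIONS = {4, 8, 12, 16, 20}


def build_payload(
    prompt: str,
    size: Optional[str] = None,
    duration: Optional[int] = None,
    characters: Optional[list] = None,
) -> dict:
    """Hypothetical helper: validate the inputs and return the request body as a dict."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    payload = {"prompt": prompt}

    if size is not None:
        # Accept "720×1280" or "720x1280" and normalise to the API form "720*1280".
        normalised = size.replace("×", "*").replace("x", "*")
        if normalised not in ALLOWED_SIZES:
            raise ValueError(f"unsupported size: {size!r}")
        payload["size"] = normalised

    if duration is not None:
        if duration not in ALLOWED_DURATIONS:
            raise ValueError(f"unsupported duration: {duration!r}")
        payload["duration"] = duration

    if characters:
        # Character IDs must already exist in the account; they are passed through as-is.
        payload["characters"] = list(characters)

    return payload
```

Optional fields that are left out are simply omitted from the body, so the server-side defaults apply.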
Example Prompt
In a 90s documentary-style interview, an old Swedish man sits in a study and says, “I still remember when I was young.”
Pricing
| Duration | 720×1280 / 1280×720 | 1024×1792 / 1792×1024 | 1080×1920 / 1920×1080 |
|---|---|---|---|
| 4s | $1.20 | $2.00 | $2.80 |
| 8s | $2.40 | $4.00 | $5.60 |
| 12s | $3.60 | $6.00 | $8.40 |
| 16s | $4.80 | $8.00 | $11.20 |
| 20s | $6.00 | $10.00 | $14.00 |
Billing Rules
- 720×1280 / 1280×720: $0.30 per second
- 1024×1792 / 1792×1024: $0.50 per second
- 1080×1920 / 1920×1080: $0.70 per second
- Duration options: 4, 8, 12, 16, or 20 seconds
- Billing is based on the selected duration and size, not actual playback length
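The billing rules reduce to rate × duration, which is easy to check before submitting a job. The following is a minimal sketch. `estimate_cost` is a hypothetical helper name, and the rates are copied from the list above, so they would need updating if the published pricing changes.

```python
from decimal import Decimal

# Per-second rates in USD, copied from the billing rules on this page.
RATE_PER_SECOND = {
    "720*1280": Decimal("0.30"),
    "1280*720": Decimal("0.30"),
    "1024*1792": Decimal("0.50"),
    "1792*1024": Decimal("0.50"),
    "1080*1920": Decimal("0.70"),
    "1920*1080": Decimal("0.70"),
}
DURATIONS = (4, 8, 12, 16, 20)


def estimate_cost(size: str, duration: int) -> Decimal:
    """Hypothetical helper: price of one generation for the selected size and duration."""
    if size not in RATE_PER_SECOND:
        raise ValueError(f"unsupported size: {size!r}")
    if duration not in DURATIONS:
        raise ValueError(f"unsupported duration: {duration!r}")
    # Billing uses the selected duration, not the actual playback length.
    return RATE_PER_SECOND[size] * duration


print(estimate_cost("720*1280", 4))    # 1.20
print(estimate_cost("1920*1080", 20))  # 14.00
```

`Decimal` is used rather than `float` so that the results match the pricing table exactly.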
Best Use Cases
- Cinematic Storytelling — Render rich, narrative-driven scenes from detailed text descriptions.
- Commercial & Brand Video — Produce premium-quality footage for marketing campaigns without a film crew.
- Social Media Content — Generate portrait-format clips optimized for Reels, TikTok, and Shorts.
- Documentary & Interview Style — Recreate specific camera aesthetics and era-accurate visual styles.
- Multi-Character Scenes — Animate ensemble casts with consistent identity across the full clip.
Pro Tips
- The more specific your prompt, the better the result — include camera style, lighting, era, mood, and character behavior.
- Use portrait sizes (720×1280, 1024×1792, 1080×1920) for mobile-first platforms and landscape for cinematic or desktop formats.
- Start with a 4-second generation at a lower resolution to validate your prompt before committing to longer, higher-resolution runs.
- Character IDs must be created in advance — ensure they are saved and accessible in your account before adding them.
Notes
- Only prompt is required; size, duration, and characters are optional.
- Character IDs reference existing character profiles — this model does not create new character definitions.
- Please follow OpenAI’s usage policies when crafting prompts.
Related Models
- Sora 2 Text-to-Video — Standard version at lower cost for faster iteration.
- Sora 2 Image-to-Video Pro — Animate a still image into a cinematic video with the same Pro quality.
- Sora 2 Characters — Create and save reusable character IDs for multi-character generations.
Authentication
For authentication details, please refer to the Authentication Guide.
API Endpoints
Submit Task & Query Result
# Submit the task
curl --location --request POST "https://api.wavespeed.ai/api/v3/openai/sora-2/text-to-video-pro" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}" \
--data-raw '{
"prompt": "In a 90s documentary-style interview, an old Swedish man sits in a study.",
"size": "720*1280",
"duration": 4
}'
# Get the result
curl --location --request GET "https://api.wavespeed.ai/api/v3/predictions/${requestId}/result" \
--header "Authorization: Bearer ${WAVESPEED_API_KEY}"
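The two curl calls can be mirrored with the Python standard library. This is a sketch under stated assumptions, not an official client: the function names are hypothetical, and the polling interval and maximum wait are arbitrary choices rather than documented limits. The endpoints, headers, and status values are taken from this page.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
SUBMIT_URL = f"{API_BASE}/openai/sora-2/text-to-video-pro"


def build_submit_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (without sending) the POST request from the first curl call."""
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def build_result_request(api_key: str, request_id: str) -> urllib.request.Request:
    """Build (without sending) the GET request from the second curl call."""
    return urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def _send(request: urllib.request.Request) -> dict:
    # Send the request and decode the JSON response body.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))


def generate_video(
    api_key: str,
    payload: dict,
    poll_seconds: float = 5.0,        # assumed interval, not a documented value
    max_wait_seconds: float = 1800.0,  # assumed ceiling, not a documented value
) -> list:
    """Submit a task, then poll the result endpoint until it completes or fails."""
    task = _send(build_submit_request(api_key, payload))["data"]
    deadline = time.monotonic() + max_wait_seconds
    while time.monotonic() < deadline:
        result = _send(build_result_request(api_key, task["id"]))["data"]
        if result["status"] == "completed":
            return result["outputs"]
        if result["status"] == "failed":
            raise RuntimeError(result.get("error") or "generation failed")
        time.sleep(poll_seconds)
    raise TimeoutError(f"task {task['id']} did not finish in {max_wait_seconds} seconds")
```

A call such as `generate_video(os.environ["WAVESPEED_API_KEY"], {"prompt": "...", "size": "720*1280", "duration": 4})` would return the list of output URLs once the task completes. Keeping request construction separate from sending makes the headers and URLs easy to inspect without touching the network.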
Parameters
Task Submission Parameters
Request Parameters
| Parameter | Type | Required | Default | Range | Description |
|---|---|---|---|---|---|
| prompt | string | Yes | - | - | The positive prompt for the generation. |
| size | string | No | 720*1280 | 720*1280, 1280*720, 1024*1792, 1792*1024, 1920*1080, 1080*1920 | The size of the generated media in pixels (width*height). |
| duration | integer | No | 4 | 4, 8, 12, 16, 20 | The duration of the generated video in seconds. |
| characters | array | No | - | - | List of character IDs to reference. To see available characters and their IDs, visit: https://wavespeed.ai/models/openai/sora-2/characters |
Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data.id | string | Unique identifier for the prediction (the task ID) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.has_nsfw_contents | array | Array of boolean values indicating NSFW detection for each output |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
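The fields a client usually needs from the submission response are the task ID, the initial status, and the result URL. The sketch below pulls those out of an already-decoded response body. `SubmittedTask` and `parse_submit_response` are hypothetical names, and treating any `code` other than 200 as a rejection is an assumption based on the example in the table.

```python
from dataclasses import dataclass


@dataclass
class SubmittedTask:
    task_id: str     # data.id
    status: str      # data.status: created, processing, completed, or failed
    result_url: str  # data.urls.get


def parse_submit_response(body: dict) -> SubmittedTask:
    """Hypothetical helper: extract the polling details from a decoded submission response."""
    # Assumption: a successful submission carries code 200, as in the table's example.
    if body.get("code") != 200:
        raise RuntimeError(f"submission rejected: {body.get('message')!r}")
    data = body["data"]
    return SubmittedTask(
        task_id=data["id"],
        status=data["status"],
        result_url=data["urls"]["get"],
    )
```

The returned `result_url` can be requested directly, which avoids rebuilding the result endpoint from the task ID.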
Result Request Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| id | string | Yes | - | Task ID |
Result Response Parameters
| Parameter | Type | Description |
|---|---|---|
| code | integer | HTTP status code (e.g., 200 for success) |
| message | string | Status message (e.g., “success”) |
| data | object | The prediction data object containing all details |
| data.id | string | Unique identifier for the prediction (the task ID that was queried) |
| data.model | string | Model ID used for the prediction |
| data.outputs | array | Array of URLs to the generated content (empty when status is not completed) |
| data.urls | object | Object containing related API endpoints |
| data.urls.get | string | URL to retrieve the prediction result |
| data.status | string | Status of the task: created, processing, completed, or failed |
| data.created_at | string | ISO timestamp of when the request was created (e.g., “2023-04-01T12:34:56.789Z”) |
| data.error | string | Error message (empty if no error occurred) |
| data.timings | object | Object containing timing details |
| data.timings.inference | integer | Inference time in milliseconds |
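For logging or display, the result response can be condensed into a few fields. This is a minimal sketch that assumes the body has already been decoded from JSON. `summarize_result` is a hypothetical name, and the set of terminal statuses is inferred from the status values listed in the table.

```python
# Statuses after which the result no longer changes (inferred from the table above).
TERMINAL_STATUSES = {"completed", "failed"}


def summarize_result(body: dict) -> dict:
    """Hypothetical helper: condense a decoded result response into a flat summary."""
    data = body["data"]
    timings = data.get("timings") or {}
    inference_ms = timings.get("inference")
    return {
        "id": data["id"],
        "done": data["status"] in TERMINAL_STATUSES,
        "ok": data["status"] == "completed",
        # outputs is empty until the task has completed
        "outputs": list(data.get("outputs") or []),
        # the error field is an empty string when nothing went wrong
        "error": data.get("error") or None,
        # data.timings.inference is reported in milliseconds
        "inference_seconds": None if inference_ms is None else inference_ms / 1000,
    }
```

A caller can keep polling while `done` is false, then read `outputs` when `ok` is true or `error` otherwise.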