Hailuo 2.3 T2V Pro

MiniMax Hailuo 2.3 Pro is a text-to-video model that generates 1080p videos, with a stated 2.5x efficiency gain and 85% accuracy on complex instructions. It is offered as a ready-to-use REST inference API with no cold starts and affordable pricing.

With prompt expansion enabled, the model automatically optimizes incoming prompts to enhance output quality. Enabling it also activates the safety checker, which detects and filters potentially risky content.

Examples

Camera: A slow, steady wide shot (as if gently floating) that moves through a dense, lush, sun-dappled forest. The camera pauses slightly as it reveals a small, friendly forest spirit. Effect: Tiny, glowing dust motes (tree spirits / Kodama) slowly drift and sparkle through the shafts of sunlight. Leaves on the trees gently sway in a soft, visible breeze. A small, forest spirit (like a Kodama or Totoro-esque creature) blinks slowly and turns its head, then nods gently to the camera. Sounds/Voices: Soft, ambient forest sounds: the gentle chirping of unseen birds, the distant trickle of water, and the rustling of leaves in the breeze. A delicate, whimsical flute melody plays softly, accompanied by a faint, magical "tinkle" when the spirit nods. Mood: Whimsical, peaceful, magical, enchanting, and serene. A sense of wonder and gentle calm. Lighting: Warm, golden, dappled sunlight filters through the dense tree canopy, creating soft, glowing patches on the forest floor and highlighting the lush greenery. Subtle lens flares appear in the brightest areas.

Camera: A high-angle helicopter/drone shot overlooking a coastal city, shaking violently. The camera pans from the panicking crowds in the streets to the horizon, revealing the approaching wave. Effect: A colossal tsunami wave, as wide as the city itself and hundreds of feet tall, fills the entire horizon. It moves with terrifying speed, violently impacting the outermost buildings, sending water, cars, and debris exploding hundreds of feet into the air. Sounds/Voices: A deafening, low-frequency "ROAR" of the ocean. The piercing sound of city-wide emergency sirens. The massive, crunching, and crashing sounds of thousands of buildings breaking and collapsing. Mood: Utterly terrifying, apocalyptic, unstoppable, and catastrophic. Lighting: Sickly, grey, overcast daylight. The water is a dark, murky blue-green. Visibility is low due to the mist and spray kicked up by the wave.

Camera: A playful 360-degree orbit shot (medium shot) around three dancers in a bright, candy-themed, pastel-colored set. They are smiling and laughing. Effect: As they perform their signature "heart-hands" point dance (a key move), cartoon-style sparkles and small, colorful hearts pop and animate around their hands. Sounds/Voices: Upbeat, bubbly, fast-paced K-pop or J-pop music. A cute "chime" or "boing" sound effect when the sparkles appear. Audible, light giggles from the members. Mood: Joyful, energetic, sweet, playful, and infectious. Lighting: Extremely bright, high-key, shadowless studio lighting. Soft pink, lavender, and mint-green colors flood the set. Warm, glowing lens flares.

Camera: First-person perspective (POV), the beam of a flashlight is the only viewpoint. The camera moves tensely and slowly down a pitch-black, decaying hospital corridor. The camera suddenly jerks to the right. Effect: The flashlight beam only illuminates a few feet ahead, catching dust motes in the air. As the camera jerks right, the beam briefly illuminates a pale face that vanishes in less than half a second. Voices/Sounds: Only the character's shaky, shallow breathing and the distant echo of a single water drop. A short, sharp violin screech (stinger) hits the moment the face appears. Mood: Extreme tension, claustrophobic, jump-scare, deep unease. Lighting: Total darkness, punctuated only by the narrow, cold-white beam of the unstable handheld flashlight.

A detective stands on a rainy street corner, looking down at a mysterious brass compass in his palm. The needle is spinning wildly. Camera pulls back from a close-up of the compass to reveal the detective's puzzled face. Film noir, neon reflections on wet streets, heavy shadows.

README

MiniMax Hailuo 2.3 — Text-to-Video (T2V) Pro

Hailuo 2.3 Pro is the premium text-to-video model from MiniMax, engineered for creators who demand cinematic realism, dynamic motion, and superior visual coherence. It transforms text prompts into richly detailed 5-second 1080p videos — merging professional-grade quality with cutting-edge physical simulation.

🎬 Why It Looks Great

  • Cinematic Fidelity – Generates ultra-smooth motion, realistic lighting, and lifelike shadows in every frame.
  • Advanced Physics & Scene Logic – Accurately models object dynamics, reflections, and camera movement.
  • High Prompt Accuracy – Faithfully interprets natural-language descriptions with exceptional semantic precision.
  • Consistent Characters – Maintains subject identity and spatial layout throughout the clip.
  • Refined Aesthetic – Tuned for film-like color grading, depth, and atmosphere.

⚙️ Limits and Performance

  • Input: text prompt only
  • Output duration: fixed — 5 seconds
  • Resolution: up to 1080p
  • Processing time: approximately 40–70 seconds per job (depending on complexity and queue load)

💰 Pricing

Duration | Resolution | Cost per Job
5 seconds | 1080p | $0.49
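
As a quick budgeting aid, the flat per-job price can be turned into a run count. The following is a minimal sketch, assuming the $0.49 price listed above applies to every job. It works in integer cents to avoid floating-point rounding. `runs_for_budget` and `cost_of_runs` are hypothetical helpers, not part of any WaveSpeedAI SDK.

```python
PRICE_PER_JOB_CENTS = 49  # $0.49 per 5-second 1080p job, taken from the pricing table


def runs_for_budget(budget_cents: int, price_cents: int = PRICE_PER_JOB_CENTS) -> int:
    """Return how many whole jobs fit into the budget."""
    if budget_cents < 0 or price_cents <= 0:
        raise ValueError("budget must be >= 0 and price must be > 0")
    return budget_cents // price_cents


def cost_of_runs(runs: int, price_cents: int = PRICE_PER_JOB_CENTS) -> int:
    """Return the total cost in cents for the given number of jobs."""
    if runs < 0:
        raise ValueError("runs must be >= 0")
    return runs * price_cents


print(runs_for_budget(1000))  # a $10 budget -> 20 jobs
print(cost_of_runs(20))       # 20 jobs -> 980 cents ($9.80)
```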

🚀 How to Use

  1. Write a clear text prompt describing your scene, characters, lighting, and motion. Example: “A traveler walks through a neon-lit rainy street at night, reflections glowing on wet pavement.”
  2. Submit your job — no reference image required.
  3. Wait for processing (approximately 40–70 seconds, longer under heavy queue load).
  4. Download your completed 5-second cinematic video.
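
The final download step can also be scripted. This is a minimal sketch using only the Python standard library. It assumes the output URL returned by the API can be fetched with a plain GET request and no extra headers, which this page does not confirm. `download_video` is a hypothetical helper name.

```python
import shutil
import urllib.request
from pathlib import Path


def download_video(url: str, dest: Path, timeout: float = 60.0) -> Path:
    """Stream the file at `url` to `dest` and return the destination path."""
    # Create the target folder if it does not exist yet.
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so the whole clip is never held in memory at once.
    with urllib.request.urlopen(url, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example call (the URL is a placeholder for the value read from data.outputs[0]):
# download_video("https://example.com/output.mp4", Path("clips/rainy-street.mp4"))
```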

💡 Pro Tips

  • Use film-style language — include camera direction (wide shot, slow zoom, tracking).
  • Mention lighting type (sunset glow, neon reflections, soft cinematic light).
  • Keep prompts concise (1–2 sentences) for best fidelity.
  • For stable subjects, include descriptors like same person or consistent background.
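
The tips above can be applied mechanically when prompts are generated from structured data. The sketch below is one possible way to do it, not a format the model requires: it joins a subject with optional camera, lighting, and consistency phrases into a single concise sentence. `build_prompt` and its parameter names are hypothetical.

```python
from typing import Optional


def build_prompt(
    subject: str,
    camera: Optional[str] = None,
    lighting: Optional[str] = None,
    consistency: Optional[str] = None,
) -> str:
    """Join the non-empty parts into one comma-separated prompt sentence."""
    if not subject or not subject.strip():
        raise ValueError("subject must be a non-empty description")
    parts = [subject, camera, lighting, consistency]
    # Drop empty parts and trailing punctuation so the pieces join cleanly.
    cleaned = [p.strip().rstrip(".,") for p in parts if p and p.strip()]
    return ", ".join(cleaned) + "."


prompt = build_prompt(
    "A traveler walks through a neon-lit rainy street at night",
    camera="slow tracking shot",
    lighting="neon reflections on wet pavement",
    consistency="same person throughout",
)
print(prompt)
```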

Hailuo 2.3 T2V Pro API: Quick start

Grab a WaveSpeedAI API key, then send a POST request to https://api.wavespeed.ai/api/v3/minimax/hailuo-2.3/t2v-pro with your input as JSON. The endpoint returns a prediction ID; poll the prediction result endpoint until status becomes completed, then read the output URL from data.outputs[0]. Examples for Hailuo 2.3 T2V Pro follow.

HTTP example
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/minimax/hailuo-2.3/t2v-pro" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "enable_prompt_expansion": true
}'

# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# When status is "completed", read the output from data.outputs[0].

Node.js example
// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

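// Top-level await is not available in CommonJS, so the call runs inside an async function.
async function main() {
  const result = await client.run("minimax/hailuo-2.3/t2v-pro", {
    prompt: "A cinematic shot of a city at sunset, soft golden light",
    enable_prompt_expansion: true,
  });

  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);

Python example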
# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "minimax/hailuo-2.3/t2v-pro",
    {
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        "enable_prompt_expansion": True,  # Python boolean; lowercase true is JSON only
    },
)

print(output["outputs"][0])  # → URL of the generated output

Hailuo 2.3 T2V Pro API: Frequently asked questions

What is the Hailuo 2.3 T2V Pro API?

Hailuo 2.3 T2V Pro is a MiniMax text-to-video model exposed as a REST API on WaveSpeedAI. It generates 1080p videos, with a stated 2.5x efficiency gain and 85% accuracy on complex instructions. You can call it programmatically or try it from the playground above.

How do I call the Hailuo 2.3 T2V Pro API?

POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/minimax/minimax-hailuo-2.3-t2v-pro.
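
The submit-then-poll flow described above can be sketched with the Python standard library. The result URL and the `data.outputs[0]` path come from this page. Everything else is an assumption to check against the API reference: that `status` sits under `data`, that a `"failed"` status and an `error` field exist, and the polling interval and timeout. `fetch_result` and `poll_until_completed` are hypothetical helper names.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

API_BASE = "https://api.wavespeed.ai/api/v3"


def fetch_result(request_id: str, api_key: str) -> Dict[str, Any]:
    """GET the prediction result and return the parsed JSON body."""
    req = urllib.request.Request(
        f"{API_BASE}/predictions/{request_id}/result",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def poll_until_completed(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `fetch` repeatedly until the prediction completes; return the output URL."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch().get("data") or {}
        status = data.get("status")  # assumed location of the status field
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumed failure status name
            raise RuntimeError(f"prediction failed: {data.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction not completed after {timeout} seconds")
        sleep(interval)


# Example call, with request_id taken from the submit response:
# url = poll_until_completed(lambda: fetch_result(request_id, api_key))
```

Passing the fetch step in as a callable keeps the loop independent of the HTTP client, so it can be exercised with canned responses before spending credits on real jobs.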

How much does Hailuo 2.3 T2V Pro cost per run?

Hailuo 2.3 T2V Pro starts at $0.49 per run, which is the base price. On WaveSpeedAI the final charge scales with the parameters a model exposes (output size, length, count, or references). The pricing table for this model lists a single configuration, 5 seconds at 1080p, so $0.49 is the expected charge for a standard job. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.

What inputs does Hailuo 2.3 T2V Pro accept?

Key inputs: `prompt`, `enable_prompt_expansion`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/minimax/minimax-hailuo-2.3-t2v-pro.
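
To illustrate the two key inputs, here is a minimal sketch that validates them and serializes the JSON request body. The field names come from this page. The default of `True` for `enable_prompt_expansion` only mirrors the examples here and is an assumption, so check the schema for the real default. `build_request_body` is a hypothetical helper.

```python
import json


def build_request_body(prompt: str, enable_prompt_expansion: bool = True) -> str:
    """Validate the inputs and return the JSON string to send as the POST body."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if not isinstance(enable_prompt_expansion, bool):
        raise TypeError("enable_prompt_expansion must be a bool")
    return json.dumps(
        {
            "prompt": prompt.strip(),
            # Python's True/False are serialized as JSON true/false.
            "enable_prompt_expansion": enable_prompt_expansion,
        }
    )


print(build_request_body("A cinematic shot of a city at sunset, soft golden light"))
```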

How long does Hailuo 2.3 T2V Pro take to generate?

Average end-to-end generation time on WaveSpeedAI is around 166 seconds per request, measured across recent runs. This is longer than the 40–70 second processing estimate given in the README above. Queue time scales with global demand, so check the live status in the prediction record.

Can I use Hailuo 2.3 T2V Pro outputs commercially?

Commercial usage rights depend on the model's license, set by its provider (MiniMax). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.