MiniMax Hailuo 2.3 Standard is an image-to-video model producing physics-aware 768p output with a 2.5x efficiency improvement. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.
$0.28 per run · ~35 runs per $10
Camera: A ground-level, low-angle shot that shakes violently (like an earthquake). The camera rapidly pulls back (dolly out) as Hulk raises both fists high above his head. Effect: Hulk lets out a furious, wide-mouthed roar, then smashes both fists down onto the street in front of him. The asphalt explodes upwards in a shockwave of shattered concrete chunks and dust. Nearby parked cars are thrown into the air. Sounds/Voices: An earth-shattering, deep-chested "ROOOOAAAAR!" followed immediately by the deafening "CRACK-BOOM" of the impact. The sounds of twisting metal and shattering glass echo. Mood: Uncontrollable rage, catastrophic destruction, and immense, raw power. Lighting: Bright, harsh daylight (like the image). The air fills with thick concrete dust, catching the sunbeams. Strong shadows emphasize Hulk's massive muscles.
A medieval archer in chainmail and helmet, with intense, focused eyes, fully draws his longbow and releases the arrow directly towards the enemy lines. Instantly, the camera locks onto the arrow, rapidly accelerating and flying with it in a thrilling 'arrow-cam' perspective. The arrow soars high over the chaotic battlefield, revealing clashing armies, burning siege engines, and the crumbling castle ruins below. After a few seconds of high-speed flight, the arrow makes a dramatic, hard impact, embedding itself deep into the wooden shield of an advancing enemy soldier. Cinematic, high-action, blockbuster movie scene, projectile POV, dynamic camera, epic medieval battle, intense, high-speed, dramatic impact.
A vibrant K-pop girl group of four members, each with unique, colorful hairstyles and stylish stage outfits, performs a high-energy, perfectly synchronized dance routine on a brightly lit concert stage. They execute sharp, precise movements with powerful grace, transitioning seamlessly between formations. Their expressions are confident and charismatic, engaging directly with the audience. Dynamic camera angles capture their full body choreography and close-ups of their captivating visuals. Fast-paced, energetic, pop concert atmosphere, glamorous, bold fashion, professional choreography, bright spotlights, fan cheering implied, blockbuster music video quality.
Camera: A slow, deliberate low-angle orbit shot around the armored warrior. The camera then subtly pulls back (dolly out), revealing more of the desolate, burning cityscape behind them. Effect: The warrior's steam-punk-esque goggles subtly fog and clear. The burning car in the background flickers with intense orange flames and billows thick, dark smoke that slowly drifts across the frame. Loose debris (dust, embers) gently shifts and floats in the polluted air. The neon sign flickers erratically. Sounds/Voices: A low, tense, ambient drone mixed with the crackling and roaring of the burning car. The distant, hollow sound of wind whistling through derelict buildings. The erratic, soft "BZZZT" of the neon sign. Mood: Gritty, desolate, dangerous, and atmospheric. A sense of weary vigilance in a ruined world. Lighting: The scene is dominated by the sickly, ominous green glow of the sky and the harsh, flickering orange-red light from the burning car, casting dynamic shadows. The neon sign emits a stark, pulsing pink and blue glow, reflecting off the rusty armor.
Camera: A dynamic, low-angle orbit shot that quickly circles the robot. As the laser fires, the camera suddenly whips over and tracks the laser beam to the alien warlord, then violently shakes upon impact. Effect: The robot's laser weapons fire intense, glowing red beams that visibly scorch the air as they cut across the ruined cityscape. The alien warlord hovers menacingly, surrounded by crackling purple energy that pulses and flares as it deflects the laser. Debris from the surrounding ruined buildings continuously falls and shifts with dramatic smoke and sparks. Sounds/Voices: The sharp, powerful "ZAP!" and continuous "HISSSS" of the robot's laser fire. The alien warlord emits a deep, guttural, distorted growl/chant as it deflects the energy. The creaking and groaning of collapsing metal and distant explosions fill the soundscape. Mood: Intense, climactic, desperate, and visually spectacular. A desperate, high-stakes final battle. Lighting: The scene is dominated by the blinding red glow of the robot's lasers and the vibrant, pulsating purple energy of the alien warlord. These primary light sources cast dynamic, contrasting shadows across the ruined city, highlighting the textures of metal and debris. Flames from burning wreckage provide flickering orange highlights.
Hailuo 2.3 I2V Standard is MiniMax’s latest image-to-video model, designed to animate static images into smooth, cinematic video clips. It combines natural motion synthesis, physical realism, and character consistency, so creators can bring still visuals to life.
| Duration | Resolution | Cost per Job |
|---|---|---|
| 6 seconds | 768p | $0.28 |
| 10 seconds | 768p | $0.56 |
Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/minimax/hailuo-2.3/i2v-standard with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Hailuo 2.3 I2v Standard below.
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/minimax/hailuo-2.3/i2v-standard" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $WAVESPEED_API_KEY" \
-d '{
"image": "https://example.com/your-input.jpg",
"prompt": "A cinematic shot of a city at sunset, soft golden light",
"duration": 6,
"enable_prompt_expansion": false
}'
# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
-H "Authorization: Bearer $WAVESPEED_API_KEY"
# When status is "completed", read the output from data.outputs[0].

// npm install wavespeed
const WaveSpeed = require('wavespeed');

const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// CommonJS modules do not allow top-level await, so the call is wrapped in an async function.
async function main() {
  const result = await client.run("minimax/hailuo-2.3/i2v-standard", {
    "image": "https://example.com/your-input.jpg",
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "duration": 6,
    "enable_prompt_expansion": false
  });
  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);

# pip install wavespeed
import wavespeed
output = wavespeed.run(
"minimax/hailuo-2.3/i2v-standard",
{
"image": "https://example.com/your-input.jpg",
"prompt": "A cinematic shot of a city at sunset, soft golden light",
"duration": 6,
"enable_prompt_expansion": false
}
)
print(output["outputs"][0]) # → URL of the generated output

Hailuo 2.3 I2v Standard is a MiniMax model for video generation from images, exposed as a REST API on WaveSpeedAI. You can call it programmatically or try it from the playground above.
POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/minimax/minimax-hailuo-2.3-i2v-standard.
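The submit-then-poll flow described above can be sketched in Python using only the standard library. This is a minimal sketch, not WaveSpeedAI's official client. The endpoint URLs, the `completed` status, and `data.outputs[0]` come from this page. The `data.id` field, the `failed` status, the `error` field, the helper names, and the timing defaults are assumptions that should be checked against the API reference.

```python
import json
import time
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"
MODEL_PATH = "minimax/hailuo-2.3/i2v-standard"


def call_api(method, url, api_key, body=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_output(fetch_result, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_result() until status is "completed"; return the first output URL.

    fetch_result is any zero-argument callable returning the decoded result JSON.
    """
    deadline = time.monotonic() + timeout
    while True:
        data = fetch_result().get("data", {})
        status = data.get("status")
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumed failure status, not confirmed by this page
            raise RuntimeError(f"prediction failed: {data.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"prediction still {status!r} after {timeout} s")
        sleep(interval)


def generate_video(payload, api_key):
    """Submit a job, then poll the result endpoint until the video is ready."""
    submitted = call_api("POST", f"{API_BASE}/{MODEL_PATH}", api_key, payload)
    prediction_id = submitted["data"]["id"]  # assumed location of the prediction id
    result_url = f"{API_BASE}/predictions/{prediction_id}/result"
    return wait_for_output(lambda: call_api("GET", result_url, api_key))
```

Calling `generate_video(payload, os.environ["WAVESPEED_API_KEY"])` with the same JSON body as the cURL example would return the output URL once the job completes. Passing the fetch step in as a callable keeps the polling loop testable without network access.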
Hailuo 2.3 I2v Standard starts at $0.28 per run. That figure is the base price for a 6-second clip; the final charge scales with the parameters you set, and per the pricing table above a 10-second clip costs $0.56. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.
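For budgeting, the pricing arithmetic can be written out as a small helper. This is an illustrative sketch: the two prices are copied from the pricing table on this page, while the function names are hypothetical and the authoritative charge remains the one recorded on each prediction.

```python
# Prices per job in US cents, copied from the pricing table on this page (768p output).
PRICE_CENTS_BY_DURATION = {6: 28, 10: 56}


def job_cost(duration_seconds):
    """Return the listed price in USD for one job of the given duration."""
    if duration_seconds not in PRICE_CENTS_BY_DURATION:
        raise ValueError(f"no listed price for {duration_seconds} s clips")
    return PRICE_CENTS_BY_DURATION[duration_seconds] / 100


def runs_within_budget(budget_usd, duration_seconds=6):
    """Return how many whole jobs of one duration fit into a budget."""
    if duration_seconds not in PRICE_CENTS_BY_DURATION:
        raise ValueError(f"no listed price for {duration_seconds} s clips")
    # Integer cents avoid floating-point rounding surprises in the division.
    budget_cents = round(budget_usd * 100)
    return budget_cents // PRICE_CENTS_BY_DURATION[duration_seconds]
```

With these figures, `runs_within_budget(10)` gives 35, which matches the roughly 35 runs per $10 quoted at the top of the page, and `runs_within_budget(10, 10)` gives 17 for 10-second clips.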
Key inputs: `prompt`, `image`, `duration`, `enable_prompt_expansion`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/minimax/minimax-hailuo-2.3-i2v-standard.
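A request body with these four inputs can be assembled and sanity-checked before submission. The sketch below is an assumption-laden illustration rather than the official schema: the field names come from this page, the allowed durations are inferred from the pricing table, and the types, defaults, and the `build_payload` helper itself are hypothetical. The JSON schema in the API reference is the source of truth.

```python
ALLOWED_DURATIONS = (6, 10)  # the two durations listed in the pricing table


def build_payload(image, prompt="", duration=6, enable_prompt_expansion=False):
    """Assemble the JSON body for the i2v-standard endpoint.

    Types and defaults are assumptions based on the examples on this page.
    """
    if not isinstance(image, str) or not image.strip():
        raise ValueError("image must be a non-empty URL string")
    if duration not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS}, got {duration!r}")
    return {
        "image": image,
        "prompt": prompt,
        "duration": duration,
        "enable_prompt_expansion": bool(enable_prompt_expansion),
    }
```

Validating locally catches an unsupported duration before a request is sent, though the server may apply further checks that this sketch does not model.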
Average end-to-end generation time on WaveSpeedAI is around 116 seconds per request, measured across recent runs. Queue time scales with global demand; live status is visible in the prediction record.
Commercial usage rights depend on the model's license, set by its provider (MiniMax). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.