Alibaba·image·From $0.020/run

Qwen Image API

Alibaba Qwen-Image — 20B MMDiT next-gen text-to-image and editing toolkit with bilingual Chinese/English support, multi-image editing, LoRA customization, layered compositing, and a 96-pose camera-angle system.

Text-to-image at base, enhanced 2512, and 2.0-pro variants. Edit endpoints include Edit, Edit-Plus (multi-image, ControlNet), Edit-LoRA, Edit-Multiple-Angles (96-pose camera system), and Layered (prompt-guided decomposition). Qwen Image 2.0 family variants included under the same prefix.

About the Qwen Image API

What Qwen Image does, how it fits in the Alibaba model lineup, and why teams reach for it.

Qwen Image is an image generation and editing model from Alibaba, available through the WaveSpeedAI REST API.

The Qwen Image family on WaveSpeedAI ships 21 REST endpoints covering image-to-image, text-to-image, training, and LoRA-support workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several with the same API key to compose multi-step pipelines.

Run Qwen Image through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.

All Qwen Image API endpoints

21 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.

Edit

Qwen Image 2.0 Edit is an advanced image-editing model with improved quality and better understanding of instructions. Up to 2K resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.030

Edit

Qwen Image 2.0 Pro Edit is a professional-grade image editing model with superior quality and advanced instruction understanding. Up to 2K resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.070

Text To Image

Qwen Image 2.0 is an advanced text-to-image model with enhanced image quality and improved prompt understanding. Up to 2K resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.030

Text To Image

Qwen Image 2.0 Pro is a professional-grade text-to-image model with superior quality and advanced prompt understanding. Up to 2K resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.070

Edit 2509 Multiple Angles

Qwen Image Edit 2509 Multiple Angles is an AI image editing model that generates multiple-angle views of objects or scenes from a single image. Transform perspectives and create diverse viewpoints with text prompts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.025

Edit

Qwen Image Max Edit is an AI model for image editing with text prompts, supporting both Chinese and English. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.070

Text To Image

Qwen Image Max is a text-to-image model for high-quality image generation from Chinese and English prompts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.070

Edit Multiple Angles

Generate specific camera angles from a single image using a 96-pose camera system. Control horizontal rotation, vertical tilt, and zoom to create front, side, and back views, and more. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.025

Qwen Image 2512 Lora Trainer

Qwen-Image-2512 LoRA Trainer lets you train custom LoRA models 10x faster with style, character, and object training. From concept to model in minutes, not hours—upload a ZIP file containing images to start. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

training · from $1.00

Text To Image 2512 Lora

Qwen-Image-2512 LoRA is an enhanced 20B MMDiT text-to-image model with LoRA support for fast customization and refined image generation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

lora-support · from $0.025

Text To Image 2512

Qwen Image 2512 is Qwen's latest text-to-image model with enhanced prompt understanding, superior text rendering, and versatile aspect ratio support. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.020

Edit 2511 Lora

Qwen Image Edit 2511 LoRA is an enhanced version with custom LoRA support for personalized styles. It delivers stronger edit consistency, robust multi-person identity/pose consistency, custom LoRA styles, enhanced industrial/product design, and improved geometric reasoning for structure-preserving edits. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.

lora-support · from $0.025

Edit 2511

Qwen Image Edit 2511 is a major upgrade over 2509 for real-world image editing and design. It delivers stronger edit consistency, robust multi-person identity/pose consistency, built-in LoRA styles, enhanced industrial/product design, and improved geometric reasoning for structure-preserving edits. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.

image-to-image · from $0.020

Layered

Qwen-Image Layered is a unified image-layer decomposition model for prompt-guided compositing. Provide points, boxes, or rough masks to isolate subjects and regions, and the model splits a single image into multiple RGBA layers with clean alpha, soft edges, and correct occlusion order. Ready-to-use REST inference API with fast response, no cold starts, and affordable pricing.

image-to-image · from $0.025

Edit Plus Lora

Qwen-Image-Edit-Plus (2509) is a 20B MMDiT image-to-image editor supporting multi-image edits, single-image consistency, and native ControlNet. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

lora-support · from $0.025

Edit Plus

Qwen-Image-Edit-Plus (2509) is a 20B MMDiT image editor with multi-image editing, single-image consistency, and native ControlNet support. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.020

Edit Lora

Qwen-Image-Edit LoRA (20B) enables bilingual Chinese/English image-to-image editing with style preservation, covering both semantic and appearance edits. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

lora-support · from $0.025

Edit

Qwen-Image-Edit is a 20B MMDiT image-to-image model offering precise bilingual (Chinese and English) text edits while preserving style. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.020

Qwen Image Lora Trainer

Train custom Qwen-Image LoRA models 10x faster. Style training, character training, object training. From concept to model in minutes, not hours. Upload a ZIP file containing images to start!

training · from $1.00

Text To Image Lora

Qwen-Image LoRA is a 20B MMDiT next-gen text-to-image model with LoRA support for fast customization and refined image generation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

lora-support · from $0.025

Text To Image

Qwen-Image is a 20B MMDiT next-gen text-to-image model that generates images from text prompts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.020

How to use the Qwen Image API

Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.

1. Get an API key
Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.

2. Submit a prediction
POST your input as JSON to https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image. The endpoint returns a prediction id immediately; generations are asynchronous, so you don't hold a connection open during inference.

3. Poll for completion
GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result every 1-2 seconds, substituting the prediction id for {request_id}. The response includes a status field; keep polling until it flips from "queued" or "processing" to "completed".

4. Read the output URL
Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated output on the WaveSpeedAI CDN.
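
The polling in steps 3 and 4 can be sketched in Python roughly as follows. This is a minimal sketch, not the official client: `fetch` is a hypothetical callable that performs the GET on the result URL and returns the decoded JSON body, and it assumes the status field sits under `data` next to `outputs`. Treating any status other than "queued" or "processing" as terminal is also an assumption.

```python
import time
from typing import Callable

RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"
PENDING = {"queued", "processing"}  # the two in-progress statuses named in step 3


def wait_for_output(fetch: Callable[[], dict], interval: float = 1.5,
                    timeout: float = 120.0, sleep=time.sleep) -> str:
    """Poll until status is "completed", then return data.outputs[0].

    `fetch` (hypothetical) performs the GET on RESULT_URL and returns the
    parsed JSON body. `timeout` is counted in slept seconds, not wall-clock.
    """
    waited = 0.0
    while True:
        data = fetch()["data"]  # assumption: status and outputs live under "data"
        status = data["status"]
        if status == "completed":
            return data["outputs"][0]
        if status not in PENDING:
            # Assumption: anything else (for example a failure state) is terminal.
            raise RuntimeError(f"prediction ended with status {status!r}")
        if waited >= timeout:
            raise TimeoutError(f"still {status!r} after {waited:.0f}s of polling")
        sleep(interval)
        waited += interval
```

Passing `fetch` and `sleep` in as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned responses.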

What you can build with Qwen Image

Common workflows developers and creators use the Qwen Image API for.

20B MMDiT text-to-image

wavespeed-ai/qwen-image/text-to-image is a 20B MMDiT next-gen text-to-image model that generates images from text prompts — the base generation endpoint in the Qwen Image family.

text-to-image · 20b · mmdit

Enhanced 2512 with superior text rendering

wavespeed-ai/qwen-image/text-to-image-2512 is listed as Qwen's latest text-to-image model, with enhanced prompt understanding, superior text rendering, and versatile aspect ratio support.

2512 · typography · aspect-ratio

Multi-image editing with Edit-Plus

wavespeed-ai/qwen-image/edit-plus is a 20B MMDiT editor with multi-image editing, single-image consistency, and native ControlNet support — useful for complex edits that reference multiple source images.

edit-plus · multi-image · controlnet

96-pose camera angle control

wavespeed-ai/qwen-image/edit-multiple-angles generates specific camera angles from a single image using a 96-pose camera system — control horizontal rotation, vertical tilt, and zoom for front, side, back views and more.

camera-angles · 96-pose · 3d-views

Layered compositing decomposition

wavespeed-ai/qwen-image/layered is a unified image-layer decomposition model for prompt-guided compositing — provide points, boxes, or rough masks to isolate subjects and split a single image into layers.

layered · compositing · decomposition

LoRA customization

LoRA-supported variants (text-to-image-2512-lora, edit-lora, edit-plus-lora) enable fast customization and refined generation — train or apply LoRA checkpoints for style, character, or brand consistency.

lora · customization · consistency

Tips for prompting Qwen Image

Practical advice for getting better outputs from Qwen Image — drawn from the patterns that work across image models in production pipelines.

Use 2512 for latest text rendering

text-to-image-2512 is the enhanced variant with superior text rendering and prompt understanding. Pick it over the base text-to-image endpoint when typography matters.

Edit-Multiple-Angles for product views

edit-multiple-angles generates front, side, back, and custom camera angles from a single product photo — useful for e-commerce catalogs without a multi-camera shoot.

Edit-Plus for multi-image references

When an edit depends on multiple source images, use edit-plus, which adds native ControlNet support, rather than the single-image edit endpoint.

Layered for compositing workflows

Use layered to decompose an image into prompt-guided layers — isolate subjects with points, boxes, or rough masks for downstream compositing.

LoRA variants for brand consistency

Apply trained LoRA checkpoints via text-to-image-2512-lora or edit-plus-lora for recurring style, character, or brand identity across generations.

Bilingual Chinese/English editing

Edit and Edit-LoRA support bilingual Chinese/English image-to-image editing with style preservation — useful for localization workflows.
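
The tips above amount to a small routing rule, which could be sketched as below. The function name, the task labels ("generate", "edit", "angles", "layers"), and the keyword arguments are all hypothetical; only the endpoint names come from the endpoint list on this page.

```python
PREFIX = "wavespeed-ai/qwen-image/"


def pick_endpoint(task: str, *, source_images: int = 1, lora: bool = False) -> str:
    """Map a workflow to a Qwen Image endpoint, following the tips above.

    `task` is a hypothetical label: "generate", "edit", "angles", or "layers".
    """
    if task == "angles":
        # Front, side, back, and custom camera angles from a single photo.
        return PREFIX + "edit-multiple-angles"
    if task == "layers":
        # Decompose one image into layers for compositing.
        return PREFIX + "layered"
    if task == "generate":
        # The 2512 variant is the one recommended when typography matters.
        name = "text-to-image-2512"
    elif task == "edit":
        # Edits that reference several source images go to edit-plus.
        name = "edit-plus" if source_images > 1 else "edit"
    else:
        raise ValueError(f"unknown task: {task!r}")
    # LoRA-enabled variants share the base name plus a "-lora" suffix.
    return PREFIX + name + ("-lora" if lora else "")


print(pick_endpoint("edit", source_images=3, lora=True))
```

Every string this helper can return appears in the endpoint list, so a typo in a task label surfaces as a `ValueError` rather than a request to a nonexistent model.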

Qwen Image API pricing

Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).

Endpoint	Type	Starting price
wavespeed-ai/qwen-image-2.0/edit	image-to-image	$0.030
wavespeed-ai/qwen-image-2.0-pro/edit	image-to-image	$0.070
wavespeed-ai/qwen-image-2.0/text-to-image	text-to-image	$0.030
wavespeed-ai/qwen-image-2.0-pro/text-to-image	text-to-image	$0.070
wavespeed-ai/qwen-image/edit-2509-multiple-angles	image-to-image	$0.025
wavespeed-ai/qwen-image-max/edit	image-to-image	$0.070
wavespeed-ai/qwen-image-max/text-to-image	text-to-image	$0.070
wavespeed-ai/qwen-image/edit-multiple-angles	image-to-image	$0.025
wavespeed-ai/qwen-image-2512-lora-trainer	training	$1.00
wavespeed-ai/qwen-image/text-to-image-2512-lora	lora-support	$0.025
wavespeed-ai/qwen-image/text-to-image-2512	text-to-image	$0.020
wavespeed-ai/qwen-image/edit-2511-lora	lora-support	$0.025
wavespeed-ai/qwen-image/edit-2511	image-to-image	$0.020
wavespeed-ai/qwen-image/layered	image-to-image	$0.025
wavespeed-ai/qwen-image/edit-plus-lora	lora-support	$0.025
wavespeed-ai/qwen-image/edit-plus	image-to-image	$0.020
wavespeed-ai/qwen-image/edit-lora	lora-support	$0.025
wavespeed-ai/qwen-image/edit	image-to-image	$0.020
wavespeed-ai/qwen-image-lora-trainer	training	$1.00
wavespeed-ai/qwen-image/text-to-image-lora	lora-support	$0.025
wavespeed-ai/qwen-image/text-to-image	text-to-image	$0.020
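
Because the table lists starting prices and the final charge scales with parameters, multiplying a starting price by a run count gives a lower bound on spend rather than an exact bill. A rough sketch of that arithmetic, using a few rows copied from the table (the function name and the example batch sizes are illustrative):

```python
from decimal import Decimal

# Starting prices in USD per run, copied from the pricing table above.
STARTING_PRICE = {
    "wavespeed-ai/qwen-image/text-to-image": Decimal("0.020"),
    "wavespeed-ai/qwen-image/edit-plus": Decimal("0.020"),
    "wavespeed-ai/qwen-image/layered": Decimal("0.025"),
    "wavespeed-ai/qwen-image-2.0-pro/text-to-image": Decimal("0.070"),
}


def minimum_cost(runs: dict) -> Decimal:
    """Lower bound on spend: starting price times run count, summed per endpoint."""
    return sum(
        (STARTING_PRICE[model] * count for model, count in runs.items()),
        Decimal("0"),
    )


# Illustrative batch: 500 generations, 200 multi-image edits, 100 decompositions.
batch = {
    "wavespeed-ai/qwen-image/text-to-image": 500,
    "wavespeed-ai/qwen-image/edit-plus": 200,
    "wavespeed-ai/qwen-image/layered": 100,
}
print(minimum_cost(batch))  # 10.00 + 4.00 + 2.50 = 16.50 USD at minimum
```

`Decimal` is used instead of floats so that thousandths of a dollar add up exactly; the real charge for a given input is whatever the playground's cost preview shows.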

Call the Qwen Image API

Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.

HTTP example

# 1. Submit a prediction. Replace {} with your input JSON; the playground
#    generates the exact payload for each variant.
curl -X POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{}'

# 2. Poll the result until status = "completed".
#    Set REQUEST_ID to the prediction id returned by step 1.
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/$REQUEST_ID/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# 3. Read the output URL from data.outputs[0].

Node.js example

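// npm install wavespeed
const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY

// CommonJS modules cannot use top-level await, so wrap the call in an async function.
async function main() {
  const result = await client.run("wavespeed-ai/qwen-image/text-to-image", {});
  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);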

Python example

# pip install wavespeed
import wavespeed

output = wavespeed.run(
    "wavespeed-ai/qwen-image/text-to-image",
    {}
)
print(output["outputs"][0])  # → URL of the generated output
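
For environments where installing the SDK is not an option, the cURL submit call can be mirrored with the Python standard library. The sketch below only builds the request, using the URL and headers shown in the HTTP example. The `build_submit_request` helper is hypothetical, and the `prompt` field in the sample payload is an assumption; take the real input schema from the playground.

```python
import json
import urllib.request

API_BASE = "https://api.wavespeed.ai/api/v3"


def build_submit_request(model_id: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST that submits a prediction."""
    return urllib.request.Request(
        f"{API_BASE}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_submit_request(
    "wavespeed-ai/qwen-image/text-to-image",
    {"prompt": "a red bicycle leaning on a brick wall"},  # assumed field name
    api_key="YOUR_API_KEY",
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), then json-decode the response body.
```

Keeping construction separate from sending lets you inspect or log the exact URL, headers, and body before spending credits on a run.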

Qwen Image vs alternatives

When to pick Qwen Image over similar models on WaveSpeedAI.

Qwen Image vs Seedream 4.5

Seedream 4.5 emphasizes typography and ships Sequential variants for multi-image identity locking. Qwen Image covers a broader editing surface — Edit-Plus, multiple-angles, layered compositing, and bilingual Chinese/English editing.

Qwen Image vs GPT Image 2

GPT Image 2 has explicit quality tiers and a reference-image edit workflow. Qwen Image ships native ControlNet support, 96-pose camera angles, and layered decomposition: different editing primitives at a lower per-call cost.

Qwen Image vs Nano Banana 2

Nano Banana 2 ships multi-character consistency (up to 5) and web-search grounding. Qwen Image wins on editing depth — multi-image Edit-Plus, camera-angle generation, and prompt-guided layer decomposition.

Qwen Image API — Frequently asked questions

Pricing, license, integration — common questions about running Qwen Image on WaveSpeedAI.

What is the Qwen Image API?

Qwen Image is an Alibaba image generation and editing model exposed as a REST API on WaveSpeedAI. It is a 20B MMDiT text-to-image and editing toolkit with bilingual Chinese/English support, multi-image editing, LoRA customization, layered compositing, and a 96-pose camera-angle system. You can call it programmatically or try it from the playground.

How do I call the Qwen Image API?

Sign up for a WaveSpeedAI account, copy your API key from wavespeed.ai/accesskey, then POST to https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image with your input as JSON. The endpoint returns a prediction id; poll the prediction result endpoint until status flips to "completed", then read the output URL from data.outputs[0]. Full Python, Node.js, and cURL examples are above.

How much does the Qwen Image API cost?

Qwen Image starts at $0.020 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.

Which Qwen Image variants are available?

WaveSpeedAI hosts 21 Qwen Image endpoints: wavespeed-ai/qwen-image-2.0/edit, wavespeed-ai/qwen-image-2.0-pro/edit, wavespeed-ai/qwen-image-2.0/text-to-image, wavespeed-ai/qwen-image-2.0-pro/text-to-image, wavespeed-ai/qwen-image/edit-2509-multiple-angles, wavespeed-ai/qwen-image-max/edit, wavespeed-ai/qwen-image-max/text-to-image, wavespeed-ai/qwen-image/edit-multiple-angles, and more. Each variant has its own playground page and pricing.

Can I use Qwen Image outputs commercially?

Commercial usage rights follow the Alibaba model license. Most Alibaba models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.

Why use Qwen Image on WaveSpeedAI instead of going direct?

One API key and one billing account cover Qwen Image and 1,000+ other AI models from other providers. No per-vendor SDK setup, no separate rate-limit envelopes, no integration code rewritten for each vendor. Pricing is typically at parity with or below Alibaba's direct API.

About Alibaba

The team behind Qwen Image and the broader Alibaba model lineup on WaveSpeedAI.

Alibaba's Tongyi Lab produces the Wan family of video models and the Qwen family of LLMs. Wan is notable for being released with open weights, broad variant coverage (text-to-video, image-to-video, reference-to-video, video-edit, video-extend, image-edit, text-to-image), and consistent strength on motion stability and prompt adherence across multilingual prompts.

Related model APIs on WaveSpeedAI

Other AI APIs from Alibaba and the rest of the image model lineup — one API key, one billing account.

Wan 2.7 API

Alibaba

Alibaba WAN 2.7 — coherent cinematic video with crisp detail, stable motion, and strong instruction-following. Separate endpoints for text-to-video, image-to-video, reference-to-video, video-edit, video-extend, plus image-edit and text-to-image variants in the same family.

Happy Horse 1.0 API

Alibaba

Alibaba Happy Horse 1.0 — cinematic 720p / 1080p video with smooth camera movement, expressive motion, and strong prompt fidelity. Includes reference-to-video for consistent character/style identity across generations.

Wan 2.2 API

Alibaba

Alibaba's Wan 2.2 — open-weight video toolkit deployed on WaveSpeedAI with 35+ first-party variants: Animate (120s character animation), Video Edit, Speech-to-Video (10-min audio-driven), Fun-Control (Apache 2.0 licensed), plus image-to-video and text-to-video at multiple model sizes (5B, A14B) and resolutions (480p / 720p).

Wan 2.6 API

Alibaba

Alibaba WAN 2.6 — text-to-video and image-to-video with synced audio at 720p/1080p, plus reference-to-video, video-extend, image-edit, and text-to-image in the same family. Flash and Spicy tiers for speed and scalable content generation.

Nano Banana Pro API

Google

Google Nano Banana Pro (Gemini 3.0 Pro Image) — high-res 4K text-to-image and image editing optimized for phones. Standard, Ultra (higher-res), and Multi (multi-output) variants for both generation and edit.

Nano Banana 2 API

Google

Google Nano Banana 2 (Gemini 3.1 Flash Image) — Pro-quality image generation at Flash speed. 512px to 4K resolution, improved text rendering, character consistency for up to 5 characters, and real-world knowledge integration.

Start building with Qwen Image on WaveSpeedAI

Free starter credits on signup. One API key across 1,000+ AI models from Alibaba and every other provider.
