Alibaba · image · From $0.020/run

Qwen Image API

Alibaba Qwen-Image — 20B MMDiT next-gen text-to-image and editing toolkit with bilingual Chinese/English support, multi-image editing, LoRA customization, layered compositing, and a 96-pose camera-angle system.

Text-to-image is available in base, enhanced 2512, and 2.0-pro variants. Edit endpoints include Edit, Edit-Plus (multi-image, ControlNet), Edit-LoRA, Edit-Multiple-Angles (96-pose camera system), and Layered (prompt-guided decomposition). Qwen Image 2.0 family variants are included under the same prefix.


About the Qwen Image API

What Qwen Image does, how it fits in the Alibaba model lineup, and why teams reach for it.

Qwen Image is an image generation and editing model from Alibaba, available through the WaveSpeedAI REST API.

The Qwen Image family on WaveSpeedAI ships 21 REST endpoints covering image-to-image, text-to-image, training, and LoRA-support workflows. Each variant carries its own pricing, parameters, and example outputs. Pick the one that matches your input modality and production constraints, or call several with the same API key to compose multi-step pipelines.
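A two-step pipeline of this kind might be sketched as follows. This is only an illustration, assuming a `run(model_id, payload)` callable that returns a dict with an `outputs` list (the shape the Python example on this page prints). The input field names `prompt` and `image` are assumptions rather than documented parameters, so check each variant's playground for the real schema.

```python
from typing import Callable, Dict, List

# Model ids taken from this page.
TEXT_TO_IMAGE = "wavespeed-ai/qwen-image/text-to-image"
MULTI_ANGLE = "wavespeed-ai/qwen-image/edit-multiple-angles"

Runner = Callable[[str, dict], Dict[str, List[str]]]


def generate_then_reangle(run: Runner, prompt: str, angle_prompt: str) -> str:
    """Generate an image, then pass its URL to the multiple-angles editor."""
    # "prompt" is an assumed input field name.
    first = run(TEXT_TO_IMAGE, {"prompt": prompt})
    source_url = first["outputs"][0]
    # "image" is an assumed field name for the source image URL.
    second = run(MULTI_ANGLE, {"image": source_url, "prompt": angle_prompt})
    return second["outputs"][0]
```

Taking `run` as an argument keeps the sketch independent of any particular client; if the SDK behaves as the Python example on this page suggests, `wavespeed.run` could be passed in directly.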

Run Qwen Image through the same API key, billing account, and rate-limit envelope you use for the other 1,000+ AI models on WaveSpeedAI. No separate vendor setup, no per-provider SDKs, no per-vendor rate-limit envelopes — one integration covers everything from text-to-image and text-to-video through audio synthesis, 3D generation, upscaling, and editing.

All Qwen Image API endpoints

21 endpoints available now on WaveSpeedAI — pick the variant that matches your workflow.


Edit

Qwen Image 2.0 Edit is an advanced image-editing model with improved quality and better understanding of instructions. Up to 2K resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.030

Edit

Qwen Image 2.0 Pro Edit is a professional-grade image editing model with superior quality and advanced instruction understanding. Up to 2K resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.070

Text To Image

Qwen Image 2.0 is an advanced text-to-image model with enhanced image quality and improved prompt understanding. Up to 2K resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.030

Text To Image

Qwen Image 2.0 Pro is a professional-grade text-to-image model with superior quality and advanced prompt understanding. Up to 2K resolution. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.070

Edit 2509 Multiple Angles

Qwen Image Edit 2509 Multiple Angles is an AI image editing model that generates multiple-angle views of objects or scenes from a single image. Transform perspectives and create diverse viewpoints with text prompts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.025

Edit

Qwen Image Max Edit is an AI model for image editing with text prompts in both Chinese and English. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.070

Text To Image

Qwen Image Max is a text-to-image model for high-quality image generation from Chinese and English prompts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.070

Edit Multiple Angles

Generate specific camera angles from a single image using a 96-pose camera system. Control horizontal rotation, vertical tilt, and zoom to create front, side, and back views, among others. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.025

Qwen Image 2512 Lora Trainer

Qwen-Image-2512 LoRA Trainer lets you train custom LoRA models 10x faster with style, character, and object training. From concept to model in minutes, not hours. Upload a ZIP file containing images to start. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

training · from $1.00

Text To Image 2512 Lora

Qwen-Image-2512 LoRA is an enhanced 20B MMDiT text-to-image model with LoRA support for fast customization and refined image generation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

lora-support · from $0.025

Text To Image 2512

Qwen Image 2512 is Qwen's latest text-to-image model with enhanced prompt understanding, superior text rendering, and versatile aspect ratio support. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.020

Edit 2511 Lora

Qwen Image Edit 2511 LoRA is an enhanced version with custom LoRA support for personalized styles. It delivers stronger edit consistency, robust multi-person identity/pose consistency, custom LoRA styles, enhanced industrial/product design, and improved geometric reasoning for structure-preserving edits. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.

lora-support · from $0.025

Edit 2511

Qwen Image Edit 2511 is a major upgrade over 2509 for real-world image editing and design. It delivers stronger edit consistency, robust multi-person identity/pose consistency, built-in LoRA styles, enhanced industrial/product design, and improved geometric reasoning for structure-preserving edits. Built for stable production use with a ready-to-use REST API, no cold starts, and predictable pricing.

image-to-image · from $0.020

Layered

Qwen-Image Layered is a unified image-layer decomposition model for prompt-guided compositing. Provide points, boxes, or rough masks to isolate subjects and regions, and the model splits a single image into multiple RGBA layers with clean alpha, soft edges, and correct occlusion order. Ready-to-use REST inference API with fast response, no cold starts, and affordable pricing.

image-to-image · from $0.025

Edit Plus Lora

Qwen-Image-Edit-Plus (2509) is a 20B MMDiT image-to-image editor supporting multi-image edits, single-image consistency, and native ControlNet. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

lora-support · from $0.025

Edit Plus

Qwen-Image-Edit-Plus (2509) is a 20B MMDiT image editor with multi-image editing, single-image consistency, and native ControlNet support. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.020

Edit Lora

Qwen-Image-Edit LoRA (20B) enables bilingual Chinese/English image-to-image editing with style preservation and both semantic and appearance edits. Ready-to-use REST API, best performance, no cold starts, affordable pricing.

lora-support · from $0.025

Edit

Qwen-Image-Edit is a 20B MMDiT image-to-image model offering precise bilingual (Chinese and English) text edits while preserving style. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

image-to-image · from $0.020

Qwen Image Lora Trainer

Train custom Qwen-Image LoRA models 10x faster. Style training, character training, object training. From concept to model in minutes, not hours. Upload a ZIP file containing images to start.

training · from $1.00

Text To Image Lora

Qwen-Image LoRA is a 20B MMDiT next-gen text-to-image model with LoRA support for fast customization and refined image generation. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

lora-support · from $0.025

Text To Image

Qwen-Image is a 20B MMDiT next-gen text-to-image model that generates images from text prompts. Ready-to-use REST inference API, best performance, no cold starts, affordable pricing.

text-to-image · from $0.020


How to use the Qwen Image API

Four steps from signup to a finished generation. Full Python, Node.js, and cURL examples are in the API section below.

  1. Get an API key

    Sign up for a WaveSpeedAI account and copy your API key from the dashboard. New accounts come with free starter credits, enough to run the playground a few dozen times before billing kicks in.

  2. Submit a prediction

    POST your input as JSON to https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image. The endpoint returns a prediction id immediately. Generations are asynchronous, so you don't hold a connection open during inference.

  3. Poll for completion

    GET https://api.wavespeed.ai/api/v3/predictions/{request_id}/result every 1-2 seconds, substituting the prediction id for {request_id}. The response includes a status field; keep polling until it changes from "queued" or "processing" to "completed".

  4. Read the output URL

    Once status is "completed", read the URL from data.outputs[0]. The URL points to your generated output on the WaveSpeedAI CDN (an image for the text-to-image and edit variants).
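The polling loop in steps 3 and 4 might look like the sketch below. It assumes the status field sits next to outputs under `data` in the JSON body, and it treats any status other than "queued", "processing", or "completed" as terminal; neither detail is confirmed on this page. The `fetch` callable and the `wait_for_output` name are hypothetical, and `fetch` stands in for whatever HTTP client you use to GET the result URL.

```python
import time
from typing import Callable

PENDING = {"queued", "processing"}  # in-flight statuses named on this page


def wait_for_output(
    fetch: Callable[[str], dict],
    request_id: str,
    interval: float = 1.5,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the prediction completes, then return the first output URL.

    `fetch(request_id)` should GET /api/v3/predictions/{request_id}/result
    and return the parsed JSON body.
    """
    waited = 0.0
    while True:
        data = fetch(request_id)["data"]  # assumed location of status/outputs
        status = data["status"]
        if status == "completed":
            return data["outputs"][0]
        if status not in PENDING:
            # Assumption: any other status is terminal, for example a failure.
            raise RuntimeError(f"prediction {request_id} ended with status {status!r}")
        if waited >= timeout:
            raise TimeoutError(f"prediction {request_id} still {status!r} after {timeout}s")
        sleep(interval)
        waited += interval
```

The default interval of 1.5 seconds falls inside the 1-2 second range suggested above; the 120-second timeout is an arbitrary choice for the sketch.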

What you can build with Qwen Image

Common workflows developers and creators use the Qwen Image API for.

20B MMDiT text-to-image

wavespeed-ai/qwen-image/text-to-image is a 20B MMDiT next-gen text-to-image model that generates images from text prompts — the base generation endpoint in the Qwen Image family.

text-to-image · 20b · mmdit

Enhanced 2512 with superior text rendering

wavespeed-ai/qwen-image/text-to-image-2512 is Qwen's latest text-to-image model, with enhanced prompt understanding, superior text rendering, and versatile aspect ratio support.

2512 · typography · aspect-ratio

Multi-image editing with Edit-Plus

wavespeed-ai/qwen-image/edit-plus is a 20B MMDiT editor with multi-image editing, single-image consistency, and native ControlNet support — useful for complex edits that reference multiple source images.

edit-plus · multi-image · controlnet

96-pose camera angle control

wavespeed-ai/qwen-image/edit-multiple-angles generates specific camera angles from a single image using a 96-pose camera system — control horizontal rotation, vertical tilt, and zoom for front, side, back views and more.

camera-angles · 96-pose · 3d-views
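To make the three controls concrete, here is one way a 96-pose grid could be enumerated. The page states only the total (96) and the three axes (horizontal rotation, vertical tilt, zoom); the 8 x 4 x 3 split and every value below are hypothetical, chosen because they multiply to 96, and may not match the endpoint's real parameters.

```python
from itertools import product

# Hypothetical grid: the page gives the pose count and the axes, not the steps.
AZIMUTHS = [0, 45, 90, 135, 180, 225, 270, 315]  # horizontal rotation, degrees
ELEVATIONS = [-30, 0, 30, 60]                    # vertical tilt, degrees
ZOOMS = ["close", "medium", "wide"]              # zoom level labels

# Every combination of one azimuth, one elevation, and one zoom level.
POSES = list(product(AZIMUTHS, ELEVATIONS, ZOOMS))
print(len(POSES))  # 8 * 4 * 3 = 96
```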

Layered compositing decomposition

wavespeed-ai/qwen-image/layered is a unified image-layer decomposition model for prompt-guided compositing — provide points, boxes, or rough masks to isolate subjects and split a single image into layers.

layered · compositing · decomposition

LoRA customization

LoRA-supported variants (text-to-image-2512-lora, edit-lora, edit-plus-lora) enable fast customization and refined generation — train or apply LoRA checkpoints for style, character, or brand consistency.

lora · customization · consistency

Tips for prompting Qwen Image

Practical advice for getting better outputs from Qwen Image — drawn from the patterns that work across image models in production pipelines.

Use 2512 for latest text rendering

text-to-image-2512 is the enhanced variant with superior text rendering and prompt understanding. Pick it over the base text-to-image endpoint when typography matters.

Edit-Multiple-Angles for product views

edit-multiple-angles generates front, side, back, and custom camera angles from a single product photo — useful for e-commerce catalogs without a multi-camera shoot.

Edit-Plus for multi-image references

When an edit depends on multiple source images, use edit-plus with native ControlNet support rather than the single-image edit endpoint.

Layered for compositing workflows

Use layered to decompose an image into prompt-guided layers — isolate subjects with points, boxes, or rough masks for downstream compositing.

LoRA variants for brand consistency

Apply trained LoRA checkpoints via text-to-image-2512-lora or edit-plus-lora for recurring style, character, or brand identity across generations.

Bilingual Chinese/English editing

Edit and Edit-LoRA support bilingual Chinese/English image-to-image editing with style preservation — useful for localization workflows.
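The tips above amount to a lookup from need to variant, which could be sketched as a small table. The variant names come from this page; the dictionary keys, the `pick_model` helper, and the assumption that every variant sits under the `wavespeed-ai/qwen-image/` prefix are illustrative only.

```python
# Variant names as they appear on this page; the keys are illustrative labels.
VARIANT_FOR_NEED = {
    "typography": "text-to-image-2512",
    "product-views": "edit-multiple-angles",
    "multi-image-edit": "edit-plus",
    "compositing": "layered",
    "brand-consistency": "text-to-image-2512-lora",
    "bilingual-edit": "edit-lora",
}

# Assumed shared prefix, based on the full ids shown earlier on the page.
PREFIX = "wavespeed-ai/qwen-image/"


def pick_model(need: str) -> str:
    """Return a full model id for a need, defaulting to base text-to-image."""
    return PREFIX + VARIANT_FOR_NEED.get(need, "text-to-image")
```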

Qwen Image API pricing

Pricing is per-output. The final charge scales with the parameters you set in each variant's playground (resolution, duration, output count, references).
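Because the listed prices are starting prices, they can be used to compute a floor for a batch, not an exact bill. The sketch below uses a subset of the "from" prices shown on this page; the `minimum_cost` helper is hypothetical, and the real charge may be higher depending on the parameters you set.

```python
from decimal import Decimal

# "From" prices listed on this page, in USD per run (a subset).
FROM_PRICE = {
    "text-to-image": Decimal("0.020"),
    "text-to-image-2512": Decimal("0.020"),
    "edit-plus": Decimal("0.020"),
    "edit-multiple-angles": Decimal("0.025"),
    "layered": Decimal("0.025"),
}


def minimum_cost(runs: dict) -> Decimal:
    """Lower bound on spend: listed starting price times the number of runs."""
    return sum(
        (FROM_PRICE[name] * count for name, count in runs.items()),
        Decimal("0"),
    )


# 500 text-to-image-2512 runs plus 200 edit-multiple-angles runs.
print(minimum_cost({"text-to-image-2512": 500, "edit-multiple-angles": 200}))
```

For that batch the floor works out to $15.00 (500 x $0.020 plus 200 x $0.025). `Decimal` is used so that cent-level sums are not distorted by binary floating point.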

Call the Qwen Image API

Sign up for an API key at wavespeed.ai/accesskey, then submit a prediction via REST. The playground generates ready-to-paste samples for any combination of inputs.

HTTP example

# 1. Submit a prediction. The "prompt" field name is an assumption; copy the exact body from the playground.
curl -X POST "https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY" \
  -d '{"prompt": "A lighthouse at dusk, watercolor style"}'

# 2. Poll the result until status = "completed".
#    Replace {request_id} with the prediction id returned by step 1.
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
  -H "Authorization: Bearer $WAVESPEED_API_KEY"

# 3. Read the output URL from data.outputs[0].

Node.js example
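// npm install wavespeed
const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY

// `await` is not allowed at the top level of a CommonJS module, so wrap the call.
async function main() {
  const result = await client.run("wavespeed-ai/qwen-image/text-to-image", { prompt: "A lighthouse at dusk" }); // "prompt" is an assumed field name
  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);

Python example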
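# pip install wavespeed
import wavespeed

# The "prompt" field name is an assumption; copy the exact input from the playground.
output = wavespeed.run(
    "wavespeed-ai/qwen-image/text-to-image",
    {"prompt": "A lighthouse at dusk, watercolor style"},
)
print(output["outputs"][0])  # → URL of the generated output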
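Where installing the SDK is not an option, the submit call from the HTTP example can also be sketched with Python's standard library. The URL and headers mirror the cURL sample; the `prompt` field name and the `build_submit_request` and `submit` helper names are assumptions made for this illustration.

```python
import json
import urllib.request

SUBMIT_URL = "https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image"


def build_submit_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the POST request without sending it."""
    # "prompt" is an assumed input field name.
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        SUBMIT_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


def submit(prompt: str, api_key: str) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_submit_request(prompt, api_key)) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes it possible to inspect the URL, headers, and body before spending credits on a real call.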

Qwen Image vs alternatives

When to pick Qwen Image over similar models on WaveSpeedAI.

Qwen Image vs Seedream 4.5

Seedream 4.5 emphasizes typography and ships Sequential variants for multi-image identity locking. Qwen Image covers a broader editing surface — Edit-Plus, multiple-angles, layered compositing, and bilingual Chinese/English editing.

Qwen Image vs GPT Image 2

GPT Image 2 has explicit quality tiers and a reference-image edit workflow. Qwen Image ships native ControlNet support, 96-pose camera angles, and layered decomposition, which are different editing primitives at a lower per-call cost.

Qwen Image vs Nano Banana 2

Nano Banana 2 ships multi-character consistency (up to 5) and web-search grounding. Qwen Image wins on editing depth — multi-image Edit-Plus, camera-angle generation, and prompt-guided layer decomposition.

Qwen Image API — Frequently asked questions

Pricing, license, integration — common questions about running Qwen Image on WaveSpeedAI.

What is the Qwen Image API?

Qwen Image is an Alibaba image generation model exposed as a REST API on WaveSpeedAI. It is a 20B MMDiT text-to-image and editing toolkit with bilingual Chinese/English support, multi-image editing, LoRA customization, layered compositing, and a 96-pose camera-angle system. You can call it programmatically or try it in the playground.

How do I call the Qwen Image API?

Sign up for a WaveSpeedAI account, copy your API key from wavespeed.ai/accesskey, then POST to https://api.wavespeed.ai/api/v3/wavespeed-ai/qwen-image/text-to-image with your input as JSON. The endpoint returns a prediction id; poll the prediction result endpoint until status changes to "completed", then read the output URL from data.outputs[0]. Full Python / Node.js / cURL examples are above.

How much does the Qwen Image API cost?

Qwen Image starts at $0.020 per run. The exact cost scales with the parameters you set (resolution, duration, output count, references). The live cost preview next to the Generate button in the playground shows the exact price for your current input.

Which Qwen Image variants are available?

WaveSpeedAI hosts 21 Qwen Image endpoints: wavespeed-ai/qwen-image-2.0/edit, wavespeed-ai/qwen-image-2.0-pro/edit, wavespeed-ai/qwen-image-2.0/text-to-image, wavespeed-ai/qwen-image-2.0-pro/text-to-image, wavespeed-ai/qwen-image/edit-2509-multiple-angles, wavespeed-ai/qwen-image-max/edit, wavespeed-ai/qwen-image-max/text-to-image, wavespeed-ai/qwen-image/edit-multiple-angles, and more. Each variant has its own playground page and pricing.
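If every variant follows the same URL pattern as the text-to-image endpoint shown on this page (an assumption, since only that one URL is spelled out), submit and result URLs can be derived from a model id as in this sketch. The helper names are hypothetical.

```python
API_BASE = "https://api.wavespeed.ai/api/v3"

# Model ids named in the answer above.
MODEL_IDS = [
    "wavespeed-ai/qwen-image-2.0/edit",
    "wavespeed-ai/qwen-image-2.0-pro/edit",
    "wavespeed-ai/qwen-image-2.0/text-to-image",
    "wavespeed-ai/qwen-image-2.0-pro/text-to-image",
    "wavespeed-ai/qwen-image/edit-2509-multiple-angles",
    "wavespeed-ai/qwen-image-max/edit",
    "wavespeed-ai/qwen-image-max/text-to-image",
    "wavespeed-ai/qwen-image/edit-multiple-angles",
]


def submit_url(model_id: str) -> str:
    """URL to POST a prediction to, assuming the pattern of the documented example."""
    return f"{API_BASE}/{model_id}"


def result_url(request_id: str) -> str:
    """URL to poll for a prediction's result."""
    return f"{API_BASE}/predictions/{request_id}/result"


for model_id in MODEL_IDS:
    print(submit_url(model_id))
```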

Can I use Qwen Image outputs commercially?

Commercial usage rights follow the Alibaba model license. Most Alibaba models permit commercial output use; see each model's playground page for the specific license summary, and WaveSpeedAI's Terms of Service for platform-level conditions.

Why use Qwen Image on WaveSpeedAI instead of going direct?

One API key and one billing account cover Qwen Image and 1,000+ other AI models from other providers. No per-vendor SDK setup, no separate rate-limit envelopes, no rewrite-per-vendor integration code. Pricing is typically at parity with or below Alibaba's direct API.

About Alibaba

The team behind Qwen Image and the broader Alibaba model lineup on WaveSpeedAI.

Alibaba's Tongyi Lab produces the Wan family of video models and the Qwen family of LLMs. Wan is notable for being released with open weights, broad variant coverage (text-to-video, image-to-video, reference-to-video, video-edit, video-extend, image-edit, text-to-image), and consistent strength on motion stability and prompt adherence across multilingual prompts.

Start building with Qwen Image on WaveSpeedAI

Free starter credits on signup. One API key across 1,000+ AI models from Alibaba and every other provider.