Google's Nano Banana Pro (Gemini 3.0 Pro Image) is a cutting-edge text-to-image model that generates high-resolution images up to 4K and is optimized for phones. It is available through a ready-to-use REST inference API with no cold starts and affordable pricing.
$0.14 per run · ~71 runs per $10

Create a vibrant and modern magazine cover for Women’s Health, themed for April 2025. The main background is a warm, orange gradient with soft shadows, evoking a fresh spring mood. Centered is a stylish young woman sitting confidently on color-blocked orange cubes. She has long, voluminous, wavy blonde hair and a natural, glowing complexion. She’s dressed in a forest green zip-up windbreaker jacket with loose sleeves and an orange top underneath, paired with white athletic crew socks branded ‘SAMOLA’ and retro-style white sneakers with thick black stripes and tan soles. One leg is propped up, creating a confident, athletic pose. Her expression is calm and poised. Include magazine headlines in stylish fonts, balancing black, white, and lime green text, placed thoughtfully around the subject: • Top left: ‘Covid: five years on’ in pale lime green with subtext in black: ‘Has the pandemic reshaped your identity?’ • Top right: ‘Spring forward’ in bold black with subtext: ‘How to eat, travel and sweat for your healthiest season yet’ • Center right: ‘15 skincare habits beauty founders swear by’ with large lime green ‘15’ • Bottom left: ‘FAKE VIEWS: Inside the scroll holes telling women how to “fix” themselves’ in black and pale pink • Bottom left corner with a green plus sign: ‘The workout that experts are calling a magic pill’ • Bottom right over the box: ‘Em the nutritionist’ in elegant white serif font, with yellow subheading: ‘In the kitchen with wellness’s favourite foodie’ Design should reflect an empowering, clean, editorial style, with an emphasis on health, wellness, and bold femininity. Lighting should be studio-bright, shadows soft and controlled.

Same female character, 3-angle turnaround (front, side, 3/4), consistent face and lighting, detailed outfit texture, animation design sheet.

Generate a screenshot of a windows 11 desktop, with google chrome open, showing a YouTube thumbnail of Mr. Beast on YouTube.com

Please create a solution layout for a mathematics problem with a paper-texture background. Requirements: split the canvas into left and right sections—\emph{left:} schematic of the plan (arrows/notes, scale, directions); \emph{right:} step-by-step derivation. Use consistent annotations in the figure: known quantities, unknowns, key relations, and coordinate axes or normals. Box the final answer and include a check. \textbf{Problem:} Given $v_0=20\,\text{m/s}$ and $\theta=30^\circ$, find the time of flight, the maximum height, and the range, and output the position at $t=1\,\text{s}$. Take $g=10\,\text{m/s}^2$. draw the question and solution.

A perfectly reflective chrome (Chrome) mirror ball placed on a black and white checkerboard.

High-quality flat lay photography creating a DIY infographic that simply explains how solar energy works, arranged on a clean, light gray textured background. The visual story flows from left to right in clear steps: Content is based on this:https://wavespeed.ai/. Simple, clean black arrows are hand-drawn onto the background to guide the viewer's eye from the sun to the house, clearly marking the flow of energy. The overall mood is educational, modern, and easy to understand. The image is shot from a top-down, bird's-eye view with soft, even lighting that minimizes shadows and keeps the focus on the process. Format 16:9

An intense, cinematic 3D animation style render of a boxing match taking place inside a large sizzling frying pan. The main characters are an anthropomorphic French fry wearing a red, white, and blue sweatband and boxing gloves, fighting against an anthropomorphic onion wearing a blue wrestling singlet and a mustache. They are in a dynamic fighting pose with hot oil splashing around their feet and flying vegetable particles. In the background, a crowd of anthropomorphic burgers, potatoes, and hot dogs are watching and cheering. The lighting is professional studio lighting with a kitchen background, high quality, octane render, hyper-realistic food textures, 8k resolution.

actor standing on set surrounded by two large cinema cameras, LED walls behind creating a sci-fi backdrop illusion, crew marking positions on the floor, realistic production lighting, ultra-real cinematic style

behind-the-scenes of a high-end commercial shoot, fashion model under giant soft lights, photographers, stylists fixing details, production assistants holding reflectors, studio filled with pro equipment, crisp realistic image

Generate a black-and-white comic for me about a Japanese high school student being late for school.

a realistic person blended into an artistic collage of textures, shapes, torn paper layers, bold contrasting typography, experimental poster layout, overlapping elements, vibrant color palette, modern graphic design aesthetic, high-resolution details
Nano Banana Pro Text-to-Image (Gemini 3.0 Pro Image) is Google’s lightweight yet powerful AI image generation model, built for creators who need fast, high-quality visuals from simple text prompts. It transforms words into expressive, realistic images with remarkable clarity, composition, and style diversity — all within seconds.
Natural-language, context-aware editing: Modify images using simple text instructions. The model understands scene structure, objects, and relationships for realistic edits.
Multilingual on-image text with auto translation: Generate and edit text inside images in multiple languages, with improved font clarity and layout.
Camera-style controls: Support for camera-related parameters such as angle, focus, depth of field, and color adjustment for more photographic results.
Aspect ratio flexibility: Supports formats from 1:1 to 9:16 (and beyond, such as 4:3, 16:9, 21:9), suitable for feeds, stories, banners, and print concepts.
Consistent character and style rendering: Maintain character identity, brand elements, and overall style across related images.
Input: text prompt
Output: high-quality image (JPEG/PNG)
Supports multiple aspect ratios and output formats
Compatible with descriptive prompts such as:
“A golden retriever playing in a field of sunflowers at sunset.”
“A futuristic city skyline with neon reflections on wet streets.”
“An elegant still-life photo of coffee and croissants by a window.”
| Resolution | Cost per image |
|---|---|
| 1k | $0.14 |
| 2k | $0.14 |
| 4k | $0.24 |
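
The table above can be turned into a quick budget estimate. The sketch below is only illustrative: the prices are copied from the table, while the function names `batch_cost` and `images_per_budget` are hypothetical helpers, not part of any WaveSpeedAI SDK. It uses `Decimal` so that cent amounts are not distorted by floating-point rounding.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the pricing table on this page.
PRICE_PER_IMAGE = {
    "1k": Decimal("0.14"),
    "2k": Decimal("0.14"),
    "4k": Decimal("0.24"),
}


def batch_cost(resolution: str, count: int) -> Decimal:
    """Total cost in USD for `count` images at the given resolution."""
    return PRICE_PER_IMAGE[resolution] * count


def images_per_budget(resolution: str, budget: str) -> int:
    """Whole number of images that a budget (USD, given as a string) pays for."""
    return int(Decimal(budget) // PRICE_PER_IMAGE[resolution])


print(batch_cost("4k", 10))           # 2.40
print(images_per_budget("1k", "10"))  # 71
print(images_per_budget("4k", "10"))  # 41
```

At the 1k and 2k price, $10 covers 71 images, which matches the "~71 runs per $10" figure shown with the per-run price.
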
Please ensure your prompts comply with Google’s Safety Guidelines. If an error occurs, review your prompt for restricted content, adjust it, and try again.
Compare Nano Banana Pro T2I with:
FLUX.1 [dev] – Nano Banana Pro Edit focuses on semantic understanding and layout-aware editing via Gemini 3’s reasoning, making it ideal for complex, text-driven transformations without manual masking. FLUX.1 [dev] emphasizes maximum resolution control and fine detail preservation for highly technical workflows.
GPT-Image-1 (OpenAI) – Nano Banana Pro Edit emphasizes layout control, multilingual on-image text, and tightly directed edits for design and marketing workflows. openai/gpt-image-1 shines as a general-purpose creative generator, with strong style variety and fast, natural-language image synthesis for broad consumer and developer use.
Original Nano Banana (Gemini 2.5 Flash Image) – Nano Banana Pro Edit trades pure speed for quality, delivering better reasoning, sharper text, improved character consistency, and richer camera controls at a higher unit cost. Original Nano Banana remains ideal for rapid, low-latency iterations and lightweight edits.
Seedream – Nano Banana Pro Edit is tuned for reliable typography, photo-real edits, and mixed media layouts, while Seedream excels at fast, stylized T2I generation with strong anime and illustration aesthetics, making it a good choice for heavily stylized concept art.
Qwen Image 2509 – Nano Banana Pro Edit focuses on high-fidelity 4K outputs and multilingual on-image design control, whereas Qwen Image shines in open-source ecosystems and document-style rendering, offering flexible integration for developer-centric and research workflows.
Grab a WaveSpeedAI API key, then call POST https://api.wavespeed.ai/api/v3/google/nano-banana-pro/text-to-image with your input as JSON. The endpoint returns a prediction id; poll the prediction endpoint until status flips to completed, then read the output URL from data.outputs[0]. Examples for Nano Banana Pro Text To Image below.
# Submit the prediction
curl -X POST "https://api.wavespeed.ai/api/v3/google/nano-banana-pro/text-to-image" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $WAVESPEED_API_KEY" \
-d '{
"prompt": "A cinematic shot of a city at sunset, soft golden light",
"aspect_ratio": "1:1",
"resolution": "1k",
"output_format": "png",
"enable_sync_mode": false,
"enable_base64_output": false
}'
# Response includes a prediction id. Poll for the result:
curl -X GET "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result" \
-H "Authorization: Bearer $WAVESPEED_API_KEY"
# When status is "completed", read the output from data.outputs[0].
// npm install wavespeed
const WaveSpeed = require('wavespeed');
const client = new WaveSpeed(); // reads WAVESPEED_API_KEY from env

// CommonJS modules have no top-level await, so run the call inside an async function.
async function main() {
  const result = await client.run("google/nano-banana-pro/text-to-image", {
    "prompt": "A cinematic shot of a city at sunset, soft golden light",
    "aspect_ratio": "1:1",
    "resolution": "1k",
    "output_format": "png",
    "enable_sync_mode": false,
    "enable_base64_output": false
  });
  console.log(result.outputs[0]); // → URL of the generated output
}

main().catch(console.error);
# pip install wavespeed
import wavespeed
output = wavespeed.run(
    "google/nano-banana-pro/text-to-image",
    {
        "prompt": "A cinematic shot of a city at sunset, soft golden light",
        "aspect_ratio": "1:1",
        "resolution": "1k",
        "output_format": "png",
        # Python booleans are capitalized; lowercase `false` raises a NameError.
        "enable_sync_mode": False,
        "enable_base64_output": False,
    },
)
print(output["outputs"][0])  # → URL of the generated output
Nano Banana Pro Text To Image is a Google model for image generation, exposed as a REST API on WaveSpeedAI. You can call it programmatically or try it from the playground above.
POST your input parameters to the model's REST endpoint (shown in the API tab of this playground) with your WaveSpeedAI API key in the Authorization header. Submission returns a prediction ID; poll the prediction endpoint until status flips to "completed", then read the output URL from the result. The playground generates a ready-to-paste code sample in Python, JavaScript, or cURL for whatever inputs you've set. Full request/response shape is documented at https://wavespeed.ai/docs/docs-api/google/google-nano-banana-pro-text-to-image.
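
The submit-then-poll flow described above can be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions, not the official client: the result URL and the `data.outputs[0]` path come from this page, but the exact location of the `status` field (assumed to be `data.status`), the `"failed"` status value, and the `error` field are assumptions, and `fetch_result` and `wait_for_output` are hypothetical helper names. The defaults for `interval` and `timeout` are arbitrary choices.

```python
import json
import time
import urllib.request
from typing import Any, Callable, Dict

# Result endpoint as shown in the cURL example on this page.
RESULT_URL = "https://api.wavespeed.ai/api/v3/predictions/{request_id}/result"


def fetch_result(request_id: str, api_key: str) -> Dict[str, Any]:
    """GET the prediction result once and return the parsed JSON body."""
    req = urllib.request.Request(
        RESULT_URL.format(request_id=request_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def wait_for_output(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call `fetch` until the prediction completes, then return data.outputs[0]."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch().get("data", {})
        status = data.get("status")  # assumed to live under "data"
        if status == "completed":
            return data["outputs"][0]
        if status == "failed":  # assumed failure status, not confirmed by this page
            raise RuntimeError(f"prediction failed: {data.get('error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("prediction did not complete in time")
        sleep(interval)
```

A call would look like `wait_for_output(lambda: fetch_result(request_id, api_key))`, where `request_id` is the prediction ID returned by the submission. Passing the fetch step in as a callable keeps the loop easy to test without network access.
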
Nano Banana Pro Text To Image starts at $0.14 per run. That figure is the base price; the final charge scales with the parameters you set in the form. For this model, the pricing table lists resolution as the cost driver: 1k and 2k images cost $0.14 each, and 4k images cost $0.24 each. The exact cost for your current input is shown live next to the Generate button before you submit, and the actual per-call charge is recorded on the prediction afterwards.
Key inputs: `prompt`, `aspect_ratio`, `resolution`, `enable_base64_output`, `enable_sync_mode`, `output_format`. The full JSON schema (types, defaults, allowed values) is rendered above the Generate button and mirrored in the API reference at https://wavespeed.ai/docs/docs-api/google/google-nano-banana-pro-text-to-image.
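
As a rough illustration of how these inputs fit together, the sketch below builds a request body and rejects values this page does not mention. The allowed-value sets are assumptions collected from this page (the aspect ratios and resolutions listed above, plus `"jpeg"` as a guessed spelling for JPEG output); the API reference is the source of truth and may accept more. `build_payload` is a hypothetical helper, not an SDK function.

```python
import json
from typing import Any, Dict

# Values mentioned on this page; the real schema may allow others.
ASPECT_RATIOS = {"1:1", "4:3", "16:9", "21:9", "9:16"}
RESOLUTIONS = {"1k", "2k", "4k"}
OUTPUT_FORMATS = {"png", "jpeg"}  # "jpeg" spelling is an assumption


def build_payload(
    prompt: str,
    aspect_ratio: str = "1:1",
    resolution: str = "1k",
    output_format: str = "png",
    enable_sync_mode: bool = False,
    enable_base64_output: bool = False,
) -> Dict[str, Any]:
    """Return a request body using the six key inputs, after basic checks."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    checks = (
        ("aspect_ratio", aspect_ratio, ASPECT_RATIOS),
        ("resolution", resolution, RESOLUTIONS),
        ("output_format", output_format, OUTPUT_FORMATS),
    )
    for name, value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {sorted(allowed)}")
    return {
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "resolution": resolution,
        "output_format": output_format,
        "enable_sync_mode": enable_sync_mode,
        "enable_base64_output": enable_base64_output,
    }


# json.dumps converts Python's False into the JSON literal false.
print(json.dumps(build_payload("A cinematic shot of a city at sunset"), indent=2))
```

Validating locally catches typos such as `"8k"` before a request is sent, though the server-side schema remains the final authority.
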
Average end-to-end generation time on WaveSpeedAI is around 57 seconds per request — measured across recent runs. Queue time scales with global demand; live status is visible in the prediction record.
Commercial usage rights depend on the model's license, set by its provider (Google). The license summary appears on the model card above; see WaveSpeedAI's Terms of Service for platform-level conditions.