Supported LLM Models

Use this guide to choose a model family before you build with WaveSpeedAI LLM. For the latest list of available models, context windows, and token prices, check the LLM Playground before you implement.

How to Choose an LLM Model

| Need | Start with |
|---|---|
| Lower latency | Fast or flash models |
| Strong coding ability | Claude, GPT, Qwen Coder, DeepSeek |
| Long documents | Models with a large context window |
| Creative writing | Claude or GPT-style models |
| Cost-sensitive usage | Compare input and output token prices in the Playground |
| Production reliability | Test two candidate models with your real prompts |

LLM Pricing and Model Details in the Playground

When you select a model in the Playground, check:

| Property | Why it matters |
|---|---|
| Context window | Maximum amount of prompt and conversation history the model can accept |
| Input price | Cost for prompts, history, and tool context |
| Output price | Cost for generated responses |
| Capabilities | Whether the model supports reasoning, coding, vision, or other modes |

LLM Model ID Format

WaveSpeedAI model IDs use a provider prefix:

provider/model-name

Examples:

anthropic/claude-opus-4.7
openai/gpt-5.5
bytedance-seed/seed-1.6-flash
qwen/qwen3-coder
deepseek/deepseek-chat

Use the full model ID in API requests and agent configuration.
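As a minimal sketch of how the full model ID is passed in a request, the example below assumes an OpenAI-compatible chat completions endpoint; the base URL, environment variable name, and exact request shape are placeholders rather than confirmed WaveSpeedAI details, so check the API reference for the real values.

```python
import os
import requests

# Assumption: an OpenAI-compatible chat completions endpoint.
# The base URL below is a placeholder -- use the one from the WaveSpeedAI API docs.
BASE_URL = "https://api.wavespeed.ai/v1"
API_KEY = os.environ["WAVESPEED_API_KEY"]  # hypothetical environment variable name

def chat(model_id: str, prompt: str) -> str:
    """Send one user message to the given model and return the reply text."""
    response = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={
            # Always pass the full provider-prefixed model ID.
            "model": model_id,
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

print(chat("deepseek/deepseek-chat", "Summarize what a context window is."))
```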

Model Selection Workflow

  1. Pick two or three candidate models from the Playground.
  2. Test them with prompts that match your actual workflow.
  3. Compare quality, latency, context size, and token cost (see the comparison sketch after this list).
  4. Use a cheaper model for simple tasks and a stronger model for complex reasoning or coding.
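The sketch below runs the same prompts against two candidate models and records wall-clock latency alongside a preview of each reply. It reuses the hypothetical chat() helper from the earlier model ID sketch; the candidate IDs and prompts are illustrative only.

```python
import time

# Assumes the hypothetical chat(model_id, prompt) helper defined in the sketch above.
CANDIDATES = ["bytedance-seed/seed-1.6-flash", "qwen/qwen3-coder"]  # example picks
PROMPTS = [
    "Refactor this function to remove the nested loops: ...",
    "Extract the invoice number and total from this email: ...",
]

for model_id in CANDIDATES:
    for prompt in PROMPTS:
        start = time.perf_counter()
        reply = chat(model_id, prompt)
        latency = time.perf_counter() - start
        # Log latency and a reply preview so quality can be compared side by side.
        print(f"{model_id} | {latency:.2f}s | {reply[:80]!r}")
```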

Cost Notes

LLM billing is token-based:

| Token type | Meaning |
|---|---|
| Input tokens | Your system prompt, user messages, history, and tool context |
| Output tokens | Text generated by the model |
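A request's cost is the sum of its input and output token usage multiplied by the respective prices. The sketch below assumes prices are quoted per million tokens; the figures used are hypothetical, so substitute the real prices and units shown in the Playground.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Rough cost of one request, assuming prices are quoted per million tokens."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Hypothetical prices -- replace with the figures from the Playground.
print(estimate_cost(input_tokens=12_000, output_tokens=800,
                    input_price_per_m=0.50, output_price_per_m=1.50))
```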

Long conversations can become expensive because previous messages are usually resent as context on every request. Trim history, summarize old turns, and choose smaller models when quality requirements allow it.
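One common way to trim history is a sliding window over the message list. The sketch below keeps the system prompt plus the most recent turns that fit a crude character budget; real token counts come from the provider's usage data or tokenizer, so the budget here is illustrative only.

```python
def trim_history(messages, max_chars=8_000, keep_system=True):
    """Keep the system prompt plus the newest messages that fit a rough character budget."""
    system = [m for m in messages if m["role"] == "system"] if keep_system else []
    rest = [m for m in messages if m["role"] != "system"]

    kept, used = [], 0
    for message in reversed(rest):            # walk from the newest turn backwards
        size = len(message["content"])
        if used + size > max_chars:
            break
        kept.append(message)
        used += size
    return system + list(reversed(kept))      # restore chronological order

history = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Long earlier question..."},
    {"role": "assistant", "content": "Long earlier answer..."},
    {"role": "user", "content": "Latest question."},
]
print(trim_history(history, max_chars=200))
```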
