# Supported LLM Models

Use this guide to choose a model family before you build with WaveSpeedAI LLM. For the latest available models, context windows, and token prices, check the LLM Playground before you implement.
## How to Choose an LLM Model
| Need | Start with |
|---|---|
| Lower latency | Fast or flash models |
| Strong coding ability | Claude, GPT, Qwen Coder, DeepSeek |
| Long documents | Models with a large context window |
| Creative writing | Claude or GPT-style models |
| Cost-sensitive usage | Compare input and output token prices in the Playground |
| Production reliability | Test two candidate models with your real prompts |
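The table above can be sketched as a simple routing helper. The mapping below is illustrative only: the model IDs are the examples listed later in this guide, not a definitive or current list, so check the Playground before hard-coding any of them.

```python
# Illustrative routing table: need -> candidate model ID.
# Model IDs are placeholders taken from this guide's examples;
# verify current IDs in the LLM Playground.
CANDIDATES = {
    "low_latency": "bytedance-seed/seed-1.6-flash",
    "coding": "qwen/qwen3-coder",
    "long_context": "anthropic/claude-opus-4.7",
    "creative": "openai/gpt-5.5",
}

def pick_model(need: str, default: str = "deepseek/deepseek-chat") -> str:
    """Return a candidate model ID for a given need, falling back to a default."""
    return CANDIDATES.get(need, default)
```

In practice you would keep such a table in configuration rather than code, so candidates can change without a redeploy.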
## LLM Pricing and Model Details in the Playground
When you select a model in the Playground, check:
| Property | Why it matters |
|---|---|
| Context window | Maximum number of tokens the model accepts for the prompt and conversation history |
| Input price | Cost for prompts, history, and tool context |
| Output price | Cost for generated responses |
| Capabilities | Whether the model supports reasoning, coding, vision, or other modes |
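The input and output prices combine into a per-request cost estimate. A minimal sketch, assuming prices are quoted per million tokens (the prices in the example call are made up; use the real figures from the Playground):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate request cost in USD from token counts and per-million-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Hypothetical prices of $1 (input) and $3 (output) per million tokens:
cost = estimate_cost(10_000, 2_000, 1.0, 3.0)  # ~0.016 USD
```

Note that output tokens are typically priced higher than input tokens, which is why long generated responses often dominate the bill.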
## LLM Model ID Format

WaveSpeedAI model IDs use a provider prefix in the form `provider/model-name`. Examples:

- `anthropic/claude-opus-4.7`
- `openai/gpt-5.5`
- `bytedance-seed/seed-1.6-flash`
- `qwen/qwen3-coder`
- `deepseek/deepseek-chat`

Use the full model ID in API requests and agent configuration.
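As a sketch of what "use the full model ID" means in a request body, the snippet below builds a chat payload. The payload shape shown (an OpenAI-style `model` + `messages` object) is an assumption, not the documented WaveSpeedAI request format; consult the API reference for the real schema and endpoint.

```python
import json

def build_chat_payload(model_id: str, user_message: str) -> str:
    """Serialize a chat request that references the full provider-prefixed model ID.

    The field names here are assumptions modeled on common chat APIs.
    """
    return json.dumps({
        "model": model_id,  # full ID, including the provider prefix
        "messages": [{"role": "user", "content": user_message}],
    })

payload = build_chat_payload("anthropic/claude-opus-4.7", "Hello")
```

Dropping the provider prefix (sending only `claude-opus-4.7`) is a common source of "model not found" errors.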
## Recommended Evaluation Flow

1. Pick two or three candidate models from the Playground.
2. Test them with prompts that match your actual workflow.
3. Compare quality, latency, context size, and token cost.
4. Use a cheaper model for simple tasks and a stronger model for complex reasoning or coding.
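The steps above can be sketched as a small evaluation harness. `call_model` is a placeholder for your actual API call; the harness only records latency and replies, leaving quality judgment to you.

```python
import time

def evaluate(models, prompts, call_model):
    """Run each candidate model over the same prompts, recording reply and latency.

    `call_model(model_id, prompt)` is a stand-in for your real API call.
    """
    results = []
    for model_id in models:
        for prompt in prompts:
            start = time.perf_counter()
            reply = call_model(model_id, prompt)
            results.append({
                "model": model_id,
                "prompt": prompt,
                "latency_s": time.perf_counter() - start,
                "reply": reply,
            })
    return results
```

Running the same prompt set against every candidate keeps the comparison fair; averaging latency over several runs smooths out network noise.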
## Cost Notes
LLM billing is token-based:
| Token type | Meaning |
|---|---|
| Input tokens | Your system prompt, user messages, history, and tool context |
| Output tokens | Text generated by the model |
Long conversations can become expensive because previous messages are usually sent again as context. Trim history, summarize old turns, and choose smaller models when quality requirements allow it.
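A minimal sketch of history trimming, using characters as a crude stand-in for tokens (swap in your provider's tokenizer for accurate counts):

```python
def trim_history(messages, max_chars=4000):
    """Drop the oldest non-system turns until the history fits a rough size budget.

    Character counts approximate token counts here; a real implementation
    should use the model's tokenizer.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(len(m["content"]) for m in system + rest) > max_chars:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```

Keeping the system prompt while dropping old turns preserves behavior instructions; for long sessions, summarizing dropped turns into a single short message retains more context per token.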