# Supported LLM Models

Use this guide to choose a model family before you build with WaveSpeedAI LLM. For the latest available models, context windows, and token prices, check the LLM Playground before you implement.
## How to Choose an LLM Model
| Need | Start with |
|---|---|
| Lower latency | Fast or flash models |
| Strong coding ability | Claude, GPT, Qwen Coder, DeepSeek |
| Long documents | Models with a large context window |
| Creative writing | Claude or GPT-style models |
| Cost-sensitive usage | Compare input and output token prices in the Playground |
| Production reliability | Test two candidate models with your real prompts |
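The table above can be sketched as a simple routing helper. The mapping below is illustrative only: the model IDs are the examples listed later in this guide, not a definitive or current list, so check the Playground before hard-coding any of them.

```python
# Illustrative routing table: need -> candidate model ID.
# Model IDs are placeholders taken from this guide's examples;
# verify current IDs in the LLM Playground.
CANDIDATES = {
    "low_latency": "bytedance-seed/seed-1.6-flash",
    "coding": "qwen/qwen3-coder",
    "long_context": "anthropic/claude-opus-4.7",
    "creative": "openai/gpt-5.5",
}

def pick_model(need: str, default: str = "deepseek/deepseek-chat") -> str:
    """Return a candidate model ID for a given need, falling back to a default."""
    return CANDIDATES.get(need, default)
```

In practice you would keep such a table in configuration rather than code, so candidates can change without a redeploy.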
## LLM Pricing and Model Details in the Playground
When you select a model in the Playground, check:
| Property | Why it matters |
|---|---|
| Context window | Maximum number of tokens the model accepts for the prompt and conversation history |
| Input price | Cost for prompts, history, and tool context |
| Output price | Cost for generated responses |
| Capabilities | Whether the model supports reasoning, coding, vision, or other modes |
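The input and output prices combine into a per-request cost estimate. A minimal sketch, assuming prices are quoted per million tokens (the prices in the example call are made up; use the real figures from the Playground):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate request cost in USD from token counts and per-million-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Hypothetical prices of $1 (input) and $3 (output) per million tokens:
cost = estimate_cost(10_000, 2_000, 1.0, 3.0)  # ~0.016 USD
```

Note that output tokens are typically priced higher than input tokens, which is why long generated responses often dominate the bill.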
## LLM Model ID Format

WaveSpeedAI model IDs use a provider prefix in the form `provider/model-name`. Examples:

- `anthropic/claude-opus-4.7`
- `openai/gpt-5.5`
- `bytedance-seed/seed-1.6-flash`
- `qwen/qwen3-coder`
- `deepseek/deepseek-chat`

Use the full model ID in API requests and agent configuration.
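As a sketch of what "use the full model ID" means in a request body, the snippet below builds a chat payload. The payload shape shown (an OpenAI-style `model` + `messages` object) is an assumption, not the documented WaveSpeedAI request format; consult the API reference for the real schema and endpoint.

```python
import json

def build_chat_payload(model_id: str, user_message: str) -> str:
    """Serialize a chat request that references the full provider-prefixed model ID.

    The field names here are assumptions modeled on common chat APIs.
    """
    return json.dumps({
        "model": model_id,  # full ID, including the provider prefix
        "messages": [{"role": "user", "content": user_message}],
    })

payload = build_chat_payload("anthropic/claude-opus-4.7", "Hello")
```

Dropping the provider prefix (sending only `claude-opus-4.7`) is a common source of "model not found" errors.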
## Recommended Evaluation Flow

1. Pick two or three candidate models from the Playground.
2. Test them with prompts that match your actual workflow.
3. Compare quality, latency, context size, and token cost.
4. Use a cheaper model for simple tasks and a stronger model for complex reasoning or coding.
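The steps above can be sketched as a small evaluation harness. `call_model` is a placeholder for your actual API call; the harness only records latency and replies, leaving quality judgment to you.

```python
import time

def evaluate(models, prompts, call_model):
    """Run each candidate model over the same prompts, recording reply and latency.

    `call_model(model_id, prompt)` is a stand-in for your real API call.
    """
    results = []
    for model_id in models:
        for prompt in prompts:
            start = time.perf_counter()
            reply = call_model(model_id, prompt)
            results.append({
                "model": model_id,
                "prompt": prompt,
                "latency_s": time.perf_counter() - start,
                "reply": reply,
            })
    return results
```

Running the same prompt set against every candidate keeps the comparison fair; averaging latency over several runs smooths out network noise.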
## Cost Notes
LLM billing is token-based:
| Token type | Meaning |
|---|---|
| Input tokens | Your system prompt, user messages, history, and tool context |
| Output tokens | Text generated by the model |
Long conversations can become expensive because previous messages are usually sent again as context. Trim history, summarize old turns, and choose smaller models when quality requirements allow it.
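A minimal sketch of history trimming, using characters as a crude stand-in for tokens (swap in your provider's tokenizer for accurate counts):

```python
def trim_history(messages, max_chars=4000):
    """Drop the oldest non-system turns until the history fits a rough size budget.

    Character counts approximate token counts here; a real implementation
    should use the model's tokenizer.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(len(m["content"]) for m in system + rest) > max_chars:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```

Keeping the system prompt while dropping old turns preserves behavior instructions; for long sessions, summarizing dropped turns into a single short message retains more context per token.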