LLM API Quick Start
Use this quick start to test WaveSpeedAI LLM and make your first OpenAI-compatible Chat Completions request.
Option 1: Try It in the Playground
The fastest way to understand the service is through the web UI.
1. Open wavespeed.ai/llm.
2. Pick a model from the model selector.
3. Send a short prompt.
4. Adjust temperature, max tokens, or streaming if needed.
5. Open View Code to copy an API example for the current model and settings.
The Playground is also the recommended place to confirm the latest model IDs, context windows, and prices.
Option 2: Call the OpenAI-Compatible API
WaveSpeedAI LLM uses the OpenAI Chat Completions format.
| Field | Value |
|---|---|
| Base URL | https://llm.wavespeed.ai/v1 |
| Chat endpoint | https://llm.wavespeed.ai/v1/chat/completions |
| API key | Your WaveSpeedAI API key |
| Protocol | OpenAI-compatible Chat Completions |
Minimal cURL Request for the LLM API
```bash
curl https://llm.wavespeed.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_WAVESPEED_API_KEY" \
  -d '{
    "model": "anthropic/claude-opus-4.7",
    "messages": [
      {
        "role": "user",
        "content": "Write a concise product description for WaveSpeedAI LLM."
      }
    ]
  }'
```

Python OpenAI SDK Example
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_WAVESPEED_API_KEY",
    base_url="https://llm.wavespeed.ai/v1",
)

response = client.chat.completions.create(
    model="anthropic/claude-opus-4.7",
    messages=[
        {"role": "user", "content": "Give me three ways to reduce LLM cost."}
    ],
)
print(response.choices[0].message.content)
```

JavaScript OpenAI SDK Example
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_WAVESPEED_API_KEY",
  baseURL: "https://llm.wavespeed.ai/v1",
});

const response = await client.chat.completions.create({
  model: "anthropic/claude-opus-4.7",
  messages: [
    { role: "user", content: "Summarize what an OpenAI-compatible API means." }
  ],
});
console.log(response.choices[0].message.content);
```

What to Change Next
| Goal | What to adjust |
|---|---|
| Try another model | model |
| Make output shorter or longer | max_tokens |
| Make output more focused | Lower temperature |
| Make output more creative | Raise temperature |
| Stream output as it is generated | stream: true |
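As a sketch, a Python request combining several of these fields might look like the following; the prompt and parameter values are illustrative, and the model ID is the same placeholder used above.

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_WAVESPEED_API_KEY",
    base_url="https://llm.wavespeed.ai/v1",
)

# Optional fields added on top of the required model and messages.
response = client.chat.completions.create(
    model="anthropic/claude-opus-4.7",
    messages=[
        {"role": "user", "content": "Name three uses for embeddings."}
    ],
    max_tokens=200,   # cap output length; value is illustrative
    temperature=0.2,  # lower = more focused, higher = more creative
)
print(response.choices[0].message.content)
```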
Common Request Fields
Support for optional generation fields can vary by model. Start with model and messages, then add optional fields after testing the selected model.
| Field | Required | Description |
|---|---|---|
| model | Yes | Model ID, such as anthropic/claude-opus-4.7 |
| messages | Yes | Conversation messages in OpenAI chat format |
| stream | No | Return incremental chunks |
| max_tokens | No | Limit output length |
| temperature | No | Control randomness |
| top_p | No | Nucleus sampling |
Messages can include system, user, and assistant roles. Send previous turns again when you want the model to keep context.
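For example, a multi-turn request in Python might look like this sketch; the prompts and the resent assistant reply are illustrative:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_WAVESPEED_API_KEY",
    base_url="https://llm.wavespeed.ai/v1",
)

# Resend earlier turns so the model keeps the conversation context.
response = client.chat.completions.create(
    model="anthropic/claude-opus-4.7",
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "What is nucleus sampling?"},
        {"role": "assistant", "content": "Nucleus (top-p) sampling draws from the smallest set of tokens whose probabilities sum to p."},
        {"role": "user", "content": "How does it differ from temperature?"},
    ],
)
print(response.choices[0].message.content)
```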
Streaming Example
Use streaming when you want to show text as soon as it is generated. A Python sketch follows; the equivalent JavaScript version is shown after it.
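In Python, a minimal streaming sketch looks like the following; the prompt and model ID match the examples above, and exact chunk contents can vary by model:

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_WAVESPEED_API_KEY",
    base_url="https://llm.wavespeed.ai/v1",
)

# stream=True returns an iterator of chunks instead of one response.
stream = client.chat.completions.create(
    model="anthropic/claude-opus-4.7",
    messages=[
        {"role": "user", "content": "Write a short onboarding checklist."}
    ],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta; content may be None.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```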
```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_WAVESPEED_API_KEY",
  baseURL: "https://llm.wavespeed.ai/v1",
});

const stream = await client.chat.completions.create({
  model: "anthropic/claude-opus-4.7",
  messages: [
    { role: "user", content: "Write a short onboarding checklist." }
  ],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || "");
}
```

Next Steps
- Supported LLM Models for model selection guidance
- Connect Coding Agents for Claude Code, Codex, and OpenClaw setup