Enjoy 50% OFF Vidu Q3 & Q3 Pro models • Only on WaveSpeedAI | May 20 – Jun 2

Dashboard Explore AI GeneratorHOT Desktop App

LLM

Settings

Object Detection and Segmentation

Detect, identify, and segment objects in images and videos with AI models on WaveSpeed

Our selection

image-to-text

wavespeed-ai/moondream3-preview/point

Moondream3 Point finds objects in images and returns precise coordinate points for computer vision tasks, enabling accurate point localization. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

Try it now!See docs

All models

10 models

image-to-text

wavespeed-ai/moondream3-preview/point

image-to-text

wavespeed-ai/moondream3-preview/detect

Moondream3 Detect: Precise object bounding boxes in images for accurate computer vision localization. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-3d

wavespeed-ai/sam-3d-body

Advanced SAM 3D body generation model for creating detailed 3D human body models from images with optional mask-based segmentation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-3d

wavespeed-ai/sam-3d-objects

Advanced SAM 3D objects generation model for creating detailed 3D object models from images with text prompts and optional mask-based segmentation. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

video-to-video

wavespeed-ai/sam3-video

SAM3 Video is a unified foundation model for prompt-based video segmentation. Provide text, point, box, or mask prompts and the model segments and tracks targets across frames with strong temporal consistency. Supports concept-level (“segment anything with concepts”) and multi-object masks for editing, analytics, and VFX. Ready-to-use REST inference API with fast response, no cold starts, and affordable pricing.

image-to-image

wavespeed-ai/sam3-image

SAM 3 is a unified foundation model for promptable image segmentation using text, points, or boxes to detect and segment objects. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

video-to-text

wavespeed-ai/sam3-video-rle

SAM 3 Video RLE is a unified foundation model for prompt-based segmentation in video. Track and segment objects across frames using text, points, or boxes, returning RLE encoded masks for efficient processing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-text

wavespeed-ai/sam3-image-rle

SAM 3 RLE is a unified foundation model for promptable image segmentation using text, points, or boxes to detect and segment objects. Returns RLE (Run-Length Encoding) encoded masks for efficient storage and processing. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

image-to-image

bria/embed-product

Bria Embed Product seamlessly integrates product images into scene backgrounds with natural lighting and perspective matching. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

video-to-video

wavespeed-ai/void-video-inpainting/mask

VOID Video Inpainting removes objects from videos using mask-guided inpainting. Supports quad-mask or auto-generated SAM-3 masks, optional Pass 2 refinement for temporal consistency, adjustable denoising steps, guidance scale, and temporal window size. Ready-to-use REST inference API, best performance, no coldstarts, affordable pricing.

Object Detection and Segmentation API — pricing & performance

Run any model in the Object Detection and Segmentation collection through a single REST API. Pay per generation — no subscriptions, no minimums — with industry-leading latency on a 99.9% uptime infrastructure.

Why run Object Detection and Segmentation on WaveSpeedAI

Transparent pricing

Per-call pricing for every Object Detection and Segmentation model. The price is listed on each model page — no platform fees on top.

Optimized for low latency

Most Object Detection and Segmentation image models complete in under 2 seconds. Video and 3D models run several times faster than self-hosted alternatives.

99.9% uptime

Multi-region failover and automatic retries keep your production traffic online — even during provider outages.

Frequently asked questions

How much does the Object Detection and Segmentation API cost?+

Each model has its own per-call price listed on the model page. We bill per successful generation, with no subscription fees or minimums.

How fast are Object Detection and Segmentation models on WaveSpeedAI?+

Image models in this collection typically complete in under 2 seconds. Video and 3D models depend on duration and resolution but are usually several times faster than self-hosted runs.

Can I try the API without a credit card?+

Yes — every account gets $1 in free credits on signup, enough to try most Object Detection and Segmentation models without a credit card.

Are there rate limits?+

Standard accounts have generous concurrent-job limits. Enterprise plans offer custom RPM, higher concurrency, and dedicated capacity — contact sales for details.

Explore 1,000+ AI Models

Browse our full catalog of state-of-the-art AI models — image, video, 3D, audio, LLM, and more.

wavespeed.ai/models →

Build with the API

Integrate AI into your own apps. RESTful API with client libraries — no cold starts, pay per use.

wavespeed.ai/docs →