Segment

The /segment endpoint generates precise SVG path segmentation masks for specific objects in images. It returns an SVG path string that outlines the object, along with a bounding box.

Example Request

import moondream as md
from PIL import Image

# Initialize with API key
model = md.vl(api_key="your-api-key")

# Load an image
image = Image.open("path/to/image.jpg")

# Segment an object
result = model.segment(image, "cat")
svg_path = result["path"]
bbox = result["bbox"]

print(f"SVG Path: {svg_path[:100]}...")
print(f"Bounding box: {bbox}")

# With spatial hint (point) to guide segmentation
result = model.segment(image, "cat", spatial_refs=[[0.5, 0.3]])

# With spatial hint (bounding box)
result = model.segment(image, "cat", spatial_refs=[[0.2, 0.1, 0.8, 0.9]])

Example Response

{
  "path": "M 0 0.76 L 0 0.32 L 0.03 0.31 C 0.04 0.30 0.09 0.28 0.12 0.27...",
  "bbox": {
    "x_min": 0.0002,
    "y_min": 0.0039,
    "x_max": 0.7838,
    "y_max": 0.9971
  }
}
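
The bbox values in this response appear to be normalized to the image's width and height (0 to 1). Under that assumption, a minimal sketch of converting them to a pixel box, for example to pass to Pillow's Image.crop, could look like the following. The helper name bbox_to_pixels is hypothetical and not part of the moondream client.

```python
def bbox_to_pixels(bbox: dict, image_size: tuple[int, int]) -> tuple[int, int, int, int]:
    """Convert a normalized bbox to a (left, top, right, bottom) pixel box.

    Assumes bbox values are fractions of the full image width and height.
    """
    width, height = image_size
    left = round(bbox["x_min"] * width)
    top = round(bbox["y_min"] * height)
    right = round(bbox["x_max"] * width)
    bottom = round(bbox["y_max"] * height)
    return left, top, right, bottom


# Values taken from the example response above, on a hypothetical 800x600 image
bbox = {"x_min": 0.0002, "y_min": 0.0039, "x_max": 0.7838, "y_max": 0.9971}
print(bbox_to_pixels(bbox, (800, 600)))  # (0, 2, 627, 598)
```

With a PIL image, the returned tuple can be passed directly as image.crop(bbox_to_pixels(result["bbox"], image.size)).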

Spatial References

You can guide segmentation by passing spatial references, either points or bounding boxes, in normalized coordinates (0-1):

  • Point: [x, y] - A single point indicating the object's location
  • Bounding box: [x1, y1, x2, y2] - A region containing the object

# Point hint - segment object near center
result = model.segment(image, "person", spatial_refs=[[0.5, 0.5]])

# Bounding box hint - segment object within region
result = model.segment(image, "person", spatial_refs=[[0.1, 0.2, 0.6, 0.8]])

# Multiple hints
result = model.segment(image, "person", spatial_refs=[[0.3, 0.4], [0.7, 0.6]])
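
Spatial references are normalized, so coordinates taken from a click or an annotation tool in pixels need to be divided by the image size first. The sketch below shows one way to do that. The helpers point_ref and box_ref are hypothetical names, and the box ordering (top-left corner, then bottom-right corner) is an assumption based on the [x1, y1, x2, y2] format above.

```python
def point_ref(x_px: float, y_px: float, image_size: tuple[int, int]) -> list[float]:
    """Convert a pixel point to a normalized [x, y] spatial reference."""
    width, height = image_size
    return [x_px / width, y_px / height]


def box_ref(left: float, top: float, right: float, bottom: float,
            image_size: tuple[int, int]) -> list[float]:
    """Convert a pixel box to a normalized [x1, y1, x2, y2] spatial reference.

    Assumes (x1, y1) is the top-left corner and (x2, y2) the bottom-right.
    """
    width, height = image_size
    return [left / width, top / height, right / width, bottom / height]


# A click at pixel (400, 180) on a hypothetical 800x600 image
print(point_ref(400, 180, (800, 600)))          # [0.5, 0.3]
print(box_ref(160, 60, 640, 540, (800, 600)))   # [0.2, 0.1, 0.8, 0.9]
```

The results can then be passed as spatial_refs=[point_ref(400, 180, image.size)].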

Streaming

With stream=True, segment returns the bounding box immediately, followed by coarse path chunks as they are generated and then the final refined path:

import moondream as md
from PIL import Image

model = md.vl(api_key="your-api-key")
image = Image.open("path/to/image.jpg")

# Stream segmentation updates
for update in model.segment(image, "cat", stream=True):
    if "bbox" in update and not update.get("completed"):
        # Bounding box available in first message
        print(f"Bbox: {update['bbox']}")
    if "chunk" in update:
        # Coarse path chunks
        print(update["chunk"], end="")
    if update.get("completed"):
        # Final refined path
        print(f"\nFinal path: {update['path'][:100]}...")

Streaming Response Format

When streaming, you'll receive Server-Sent Events with different message types:

  1. Bounding box (first message):
{"type": "bbox", "bbox": {"x_min": 0.0, "y_min": 0.0, "x_max": 0.78, "y_max": 0.99}}
  2. Path chunks (coarse path updates):
{"type": "path_delta", "chunk": "M 0 0.76", "completed": false}
  3. Final message (refined path):
{"type": "final", "path": "M 0 0.76 L 0 0.32...", "bbox": {...}, "completed": true}
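
The three message types above can be folded into a single result as they arrive. The following is a minimal sketch that assumes each message has already been decoded into a Python dict with the fields shown, and that chunks concatenate without a separator (as the streaming example does with end=""). The function name collect_segment_stream is hypothetical.

```python
from typing import Iterable, Optional


def collect_segment_stream(messages: Iterable[dict]) -> dict:
    """Fold decoded stream messages into one dict.

    Keeps the latest bbox seen, the concatenated coarse path chunks,
    and the refined path from the message marked completed.
    """
    bbox: Optional[dict] = None
    coarse_chunks: list[str] = []
    final_path: Optional[str] = None

    for msg in messages:
        if msg.get("bbox") is not None:
            bbox = msg["bbox"]                   # first message, repeated in the final one
        if "chunk" in msg:
            coarse_chunks.append(msg["chunk"])   # coarse path update
        if msg.get("completed"):
            final_path = msg.get("path")         # refined path

    return {"bbox": bbox, "coarse_path": "".join(coarse_chunks), "path": final_path}


# Hand-written messages in the shapes listed above (illustrative values only)
demo = [
    {"type": "bbox", "bbox": {"x_min": 0.0, "y_min": 0.0, "x_max": 0.78, "y_max": 0.99}},
    {"type": "path_delta", "chunk": "M 0 0.76", "completed": False},
    {"type": "path_delta", "chunk": " L 0 0.32", "completed": False},
    {"type": "final", "path": "M 0 0.76 L 0 0.32 Z",
     "bbox": {"x_min": 0.0, "y_min": 0.0, "x_max": 0.78, "y_max": 0.99}, "completed": True},
]
result = collect_segment_stream(demo)
print(result["coarse_path"])  # M 0 0.76 L 0 0.32
print(result["path"])         # M 0 0.76 L 0 0.32 Z
```

Keeping the coarse path separate from the final one lets a UI draw a preview early and swap in the refined path once the completed message arrives.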

Using the SVG Path

The path coordinates are normalized to [0, 1] relative to the bounding box, not the full image. To render the mask correctly, you need to position and scale the path within the bounding box region:

<div style="position: relative; width: 800px; height: 600px;">
  <img src="your-image.jpg" style="width: 100%; height: 100%;" />
  <!-- Position SVG within the bounding box region -->
  <svg
    viewBox="0 0 1 1"
    preserveAspectRatio="none"
    style="
      position: absolute;
      left: calc(x_min * 100%);
      top: calc(y_min * 100%);
      width: calc((x_max - x_min) * 100%);
      height: calc((y_max - y_min) * 100%);
    "
  >
    <path d="M 0 0.76 L 0 0.32..." fill="rgba(255,0,0,0.3)" stroke="red" stroke-width="0.01"/>
  </svg>
</div>

In practice, replace the calc() expressions with actual values from the bbox:

const { bbox, path } = result;
const container = document.querySelector('.image-container');
const svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
svg.setAttribute('viewBox', '0 0 1 1');
svg.setAttribute('preserveAspectRatio', 'none');
svg.style.position = 'absolute';
svg.style.left = `${bbox.x_min * 100}%`;
svg.style.top = `${bbox.y_min * 100}%`;
svg.style.width = `${(bbox.x_max - bbox.x_min) * 100}%`;
svg.style.height = `${(bbox.y_max - bbox.y_min) * 100}%`;

const pathEl = document.createElementNS('http://www.w3.org/2000/svg', 'path');
pathEl.setAttribute('d', path);
pathEl.setAttribute('fill', 'rgba(255,0,0,0.3)');
svg.appendChild(pathEl);
container.appendChild(svg);
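
The same mapping can be written out as arithmetic. Since path coordinates are relative to the bounding box and the bounding box is relative to the image, a path point (px, py) lands at x_min + px * (x_max - x_min) horizontally, and likewise vertically, before scaling by the image size. A minimal sketch, with bbox_point_to_pixels as a hypothetical helper name:

```python
def bbox_point_to_pixels(px: float, py: float, bbox: dict,
                         image_size: tuple[int, int]) -> tuple[float, float]:
    """Map a point from bbox space (0-1 within the bbox) to image pixels."""
    width, height = image_size
    x = (bbox["x_min"] + px * (bbox["x_max"] - bbox["x_min"])) * width
    y = (bbox["y_min"] + py * (bbox["y_max"] - bbox["y_min"])) * height
    return x, y


# Illustrative bbox covering the lower-middle of a hypothetical 800x600 image
bbox = {"x_min": 0.25, "y_min": 0.5, "x_max": 0.75, "y_max": 1.0}
print(bbox_point_to_pixels(0.0, 0.0, bbox, (800, 600)))  # (200.0, 300.0)
print(bbox_point_to_pixels(0.5, 0.5, bbox, (800, 600)))  # (400.0, 450.0)
print(bbox_point_to_pixels(1.0, 1.0, bbox, (800, 600)))  # (600.0, 600.0)
```

The path's (0, 0) corresponds to the top-left corner of the bounding box and (1, 1) to its bottom-right corner, which is what the viewBox="0 0 1 1" overlay relies on.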

Rasterizing and Overlaying in Python

To produce a raster mask in Python, render the path with an SVG renderer such as CairoSVG, then composite the mask over the image with Pillow.

import io
from PIL import Image
import cairosvg


def rasterize_mask(svg_path: str, bbox: dict, image_size: tuple[int, int]) -> Image.Image:
    """Return a PIL mask image (0 = background, 255 = mask)."""
    img_w, img_h = image_size
    x_min, y_min = bbox["x_min"], bbox["y_min"]
    w = bbox["x_max"] - bbox["x_min"]
    h = bbox["y_max"] - bbox["y_min"]

    # The viewBox spans the whole image in normalized units. The transform
    # moves the bbox-relative path to the bbox origin and scales it to the
    # bbox size (CSS left/top/width/height do not position SVG groups).
    svg = f"""<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1 1" width="{img_w}" height="{img_h}" preserveAspectRatio="none">
<g transform="translate({x_min} {y_min}) scale({w} {h})">
<path d="{svg_path}" fill="white" />
</g>
</svg>"""

    png_bytes = cairosvg.svg2png(
        bytestring=svg.encode("utf-8"),
        output_width=img_w,
        output_height=img_h,
    )
    return Image.open(io.BytesIO(png_bytes)).convert("L")


# Overlay + save
# `model` is the client created earlier with md.vl(api_key="your-api-key")
image = Image.open("input.jpg").convert("RGB")
result = model.segment(image, "cat")

mask = rasterize_mask(result["path"], result["bbox"], image.size)

red = Image.new("RGB", image.size, (255, 0, 0))
overlay = Image.composite(red, image, mask)
overlay.save("output.png")

This keeps the path in bbox space (viewBox 0–1) and lets the SVG renderer place and scale it onto the full image before rasterizing.

Segment vs. Detect

Feature | Segment | Detect
Output | SVG path (pixel-level mask) | Bounding box
Precision | Exact object boundaries | Rectangular approximation
Use case | Cutouts, masks, precise selection | Object counting, localization
Streaming | Yes (bbox + path chunks) | No
