Caption

The /caption endpoint generates natural language descriptions of images, from brief summaries to detailed explanations of visual content.

Example Request

import moondream as md
from PIL import Image

# Initialize with API key
model = md.vl(api_key="your-api-key")

# Load an image
image = Image.open("path/to/image.jpg")

# Generate a caption
result = model.caption(image)
caption = result["caption"]
request_id = result["request_id"]
print(f"Caption: {caption}")
print(f"Request ID: {request_id}")

# Generate a short caption
short_result = model.caption(image, length="short")
short_caption = short_result["caption"]
print(f"Short Caption: {short_caption}")

# Stream the response
stream_result = model.caption(image, stream=True)
for chunk in stream_result["chunk"]:
    print(chunk, end="", flush=True)

Example Response

For non-streaming responses:

{
  "request_id": "2025-03-25_caption_2025-03-25-21:00:39-715d03",
  "caption": "A detailed caption describing the image..."
}

For streaming responses, you'll receive a series of data events:

{data: {"chunk": "A scene ", "completed": false, "request_id": "2025-03-25_caption_123456"}}
{data: {"chunk": "showing a mountain", "completed": false, "request_id": "2025-03-25_caption_123456"}}
{data: {"chunk": " landscape.", "completed": false, "request_id": "2025-03-25_caption_123456"}}
{data: {"completed": true, "chunk": "", "org_id": "a349504c-8006-54f3-8862-ba0c41d2b4d7", "request_id": "2025-03-25_caption_123456"}}
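The events above can be reassembled into the full caption by concatenating each "chunk" value until an event with "completed" set to true arrives. The following is a minimal sketch, assuming each event has already been decoded into a Python dict with the fields shown above; `assemble_caption` is a hypothetical helper for illustration, not part of the moondream client.

```python
from typing import Iterable


def assemble_caption(events: Iterable[dict]) -> str:
    """Concatenate streamed chunks until the completed event arrives."""
    parts = []
    for event in events:
        # Accept either a {"data": {...}} wrapper or the bare payload.
        data = event.get("data", event)
        parts.append(data.get("chunk", ""))
        if data.get("completed"):
            break
    return "".join(parts)


# Events mirroring the streaming example above (request_id omitted).
events = [
    {"data": {"chunk": "A scene ", "completed": False}},
    {"data": {"chunk": "showing a mountain", "completed": False}},
    {"data": {"chunk": " landscape.", "completed": False}},
    {"data": {"chunk": "", "completed": True}},
]
print(assemble_caption(events))  # A scene showing a mountain landscape.
```

Stopping at the first completed event means any trailing data is ignored, which seems the safer default for a sketch like this.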

Caption Length Options

  • "short": Brief 1-2 sentence summary (e.g., "A red car parked on a street.")
  • "normal" (default): Detailed description covering elements, context, colors, positioning, etc.

Use Cases

  • Generating alt text for accessibility
  • Content indexing and organization
  • Image search functionality
  • Social media content creation
  • Automated reporting and documentation

Error Handling

Common error responses:

Status Code    Description
400            Bad Request - Invalid parameters or image format
401            Unauthorized - Invalid or missing API key
413            Payload Too Large - Image size exceeds limits
429            Too Many Requests - Rate limit exceeded
500            Internal Server Error - Server-side issue
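One common way to act on these codes is to retry only the ones that are likely to be transient. The sketch below assumes that 429 and 500 are worth retrying with exponential backoff, while 400, 401, and 413 are not. That policy is an assumption rather than documented behavior of this API, and `should_retry` and `backoff_delays` are hypothetical helper names.

```python
# Assumption: rate-limit and server-side errors are transient; the rest are not.
RETRYABLE_STATUS_CODES = {429, 500}


def should_retry(status_code: int) -> bool:
    """Return True if a request with this status is worth retrying."""
    return status_code in RETRYABLE_STATUS_CODES


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff delays in seconds, doubling each attempt up to cap."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


for code in (400, 401, 413, 429, 500):
    print(code, "retry" if should_retry(code) else "fail fast")
```

A caller would sleep for each delay in turn between attempts and give up once the list is exhausted.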

Error Response Format

Error responses are returned in the following format:

{
  "error": {
    "message": "Detailed error description",
    "type": "error_type",
    "param": "parameter_name",
    "code": "error_code"
  }
}
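An error body in this shape can be converted into a Python exception so that calling code can inspect the individual fields. This is a minimal sketch, assuming the body arrives as a JSON string; `CaptionAPIError` and `parse_error` are hypothetical names, not part of the moondream client.

```python
import json


class CaptionAPIError(Exception):
    """Carries the fields of the "error" object in an error response."""

    def __init__(self, message, type_=None, param=None, code=None):
        super().__init__(message)
        self.message = message
        self.type = type_
        self.param = param
        self.code = code


def parse_error(body: str) -> CaptionAPIError:
    """Build an exception from a JSON error body; missing fields become None."""
    err = json.loads(body).get("error", {})
    return CaptionAPIError(
        message=err.get("message", "Unknown error"),
        type_=err.get("type"),
        param=err.get("param"),
        code=err.get("code"),
    )


# Placeholder values copied from the format shown above.
body = json.dumps({
    "error": {
        "message": "Detailed error description",
        "type": "error_type",
        "param": "parameter_name",
        "code": "error_code",
    }
})
error = parse_error(body)
print(f"{error.code}: {error.message} (param: {error.param})")
```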

Limitations

  • Maximum image size: 10 MB
  • Supported image formats: JPEG, PNG, GIF (first frame only)
  • Rate limits apply based on your plan
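
Checking an image against these limits before uploading can avoid a 400 or 413 response. The sketch below makes two assumptions: the 10 MB limit is read as 10 × 1024 × 1024 bytes (the exact byte count is not stated here), and the format name follows the style of Pillow's `Image.format` attribute ("JPEG", "PNG", "GIF"). `check_image` is a hypothetical helper.

```python
# Assumption: "10 MB" interpreted as 10 MiB; the API may count differently.
MAX_IMAGE_BYTES = 10 * 1024 * 1024
SUPPORTED_FORMATS = {"JPEG", "PNG", "GIF"}


def check_image(size_bytes: int, image_format: str) -> list:
    """Return a list of problems; an empty list means the image looks acceptable."""
    problems = []
    if size_bytes > MAX_IMAGE_BYTES:
        problems.append(f"image is {size_bytes} bytes, limit is {MAX_IMAGE_BYTES}")
    if image_format.upper() not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {image_format}")
    return problems


print(check_image(2_000_000, "PNG"))
print(check_image(12 * 1024 * 1024, "BMP"))
```

With a file on disk, `size_bytes` could come from `os.path.getsize(path)` and `image_format` from the `format` attribute of the opened PIL image.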
