Query

The /query endpoint lets you ask natural-language questions about an image and receive a text answer. This task is known as Visual Question Answering (VQA).

Example Request

import moondream as md
from PIL import Image

# Initialize with API key
model = md.vl(api_key="your-api-key")

# Load an image
image = Image.open("path/to/image.jpg")

# Ask a question
result = model.query(image, "What's in this image?")
answer = result["answer"]
request_id = result["request_id"]
print(f"Answer: {answer}")
print(f"Request ID: {request_id}")

# Stream the response
stream_result = model.query(image, "What's in this image?", stream=True)
for chunk in stream_result["chunk"]:
    print(chunk, end="", flush=True)
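When streaming, you often want the complete answer as well as the incremental output. The sketch below is one way to do that. It assumes each chunk is a plain string, as the print call in the example above suggests. `collect_stream` is a hypothetical helper and not part of the moondream package.

```python
from typing import Iterable


def collect_stream(chunks: Iterable[str], echo: bool = True) -> str:
    """Consume streamed text chunks and return the joined answer.

    Hypothetical helper: assumes every chunk is a str.
    """
    parts = []
    for chunk in chunks:
        if echo:
            # Show partial output as soon as it arrives.
            print(chunk, end="", flush=True)
        parts.append(chunk)
    if echo:
        print()  # finish the line once the stream ends
    return "".join(parts)
```

With the request above, this would be called as `answer = collect_stream(stream_result["chunk"])`.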

Example Response

{
  "request_id": "2025-03-25_query_2025-03-25-21:00:39-715d03",
  "answer": "Detailed text answer to your question..."
}

Best Practices

  • Ask specific questions rather than general ones.
  • Ask one question per request; a single question yields better results than several combined.
  • If you want a detailed answer, explicitly ask for details in your question.
  • Moondream can produce structured output when the question asks for it. For example: "describe people in the image using JSON with keys 'hair_color', 'shirt_color', 'person_type'"
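A structured answer still arrives as text in the "answer" field, so it has to be parsed before use. The following is a minimal sketch, assuming the model may wrap the JSON in extra prose or a Markdown fence. `parse_json_answer` and `describe_people` are hypothetical helpers, not moondream functions. Only `model.query(image, question)` and the "answer" key come from the example request above.

```python
import json
from typing import Any, Optional

PROMPT = (
    "Describe people in the image using JSON with keys "
    "'hair_color', 'shirt_color', 'person_type'"
)


def parse_json_answer(answer: str) -> Optional[Any]:
    """Return the first JSON object or array found in the text, or None."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(answer):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any text that follows it.
            value, _ = decoder.raw_decode(answer, start)
            return value
        except json.JSONDecodeError:
            continue  # not valid JSON here, try the next candidate
    return None


def describe_people(model, image) -> Optional[Any]:
    """Ask for structured output and parse it (hypothetical wrapper)."""
    result = model.query(image, PROMPT)
    return parse_json_answer(result["answer"])
```

Returning None rather than raising keeps the caller in control of what to do when the answer is not valid JSON, for example retrying with a more explicit question.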
