Streaming

curl -N -X POST https://api.moondream.ai/v1/query \
-H 'Content-Type: application/json' \
-H 'X-Moondream-Auth: YOUR_API_KEY' \
-d '{
"image_url": "data:image/jpeg;base64,/9j//gAQTGF2YzYxLjE5LjEwMQD/2wBDAAg+Pkk+SVVVVVVVVWRdZGhoaGRkZGRoaGhwcHCDg4NwcHBoaHBwfHyDg4+Tj4eHg4eTk5ubm7q6srLZ2eD/////xABZAAADAQEBAQAAAAAAAAAAAAAABgcFCAECAQEAAAAAAAAAAAAAAAAAAAAAEAADAAMBAQEBAAAAAAAAAAAAAQIDIREEURKBEQEAAAAAAAAAAAAAAAAAAAAA/8AAEQgAGQAZAwESAAISAAMSAP/aAAwDAQACEQMRAD8A5/PQAAABirHyVS2mUip/Pm4/vQAih9ABuRUrVLqMEALVNead7/pFgAfc+d5NLSEEAAAA/9k=",
"question": "What is in this image?",
"stream": true
}'
data: {"text": "The"}
data: {"text": " image"}

Streaming lets you receive AI responses as they're being generated, word by word, instead of waiting for the complete answer. This creates a more responsive experience for your users. Streaming is available for the query and caption endpoints.
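If you are not using the SDK, you can consume the stream directly over HTTP. The sketch below is a minimal example using the third-party requests library (an assumption; any HTTP client that can read the response body incrementally will do). It parses each data: line as JSON, matching the output format shown in the curl example above, and reads until the connection closes.

import json
import requests  # assumed HTTP client; any streaming-capable client works

payload = {
    "image_url": "data:image/jpeg;base64,<YOUR_BASE64_IMAGE>",
    "question": "What is in this image?",
    "stream": True,
}

with requests.post(
    "https://api.moondream.ai/v1/query",
    headers={
        "Content-Type": "application/json",
        "X-Moondream-Auth": "YOUR_API_KEY",
    },
    json=payload,
    stream=True,
) as response:
    response.raise_for_status()
    for line in response.iter_lines(decode_unicode=True):
        # Each event arrives as a line of the form: data: {"text": "..."}
        if line and line.startswith("data: "):
            chunk = json.loads(line[len("data: "):])
            if "text" in chunk:
                print(chunk["text"], end="", flush=True)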

Using with the SDK

View Python SDK Documentation →

Query

import moondream as md
from PIL import Image

# Initialize with your API key
model = md.vl(api_key="YOUR_API_KEY")

# Load an image
image = Image.open("path/to/image.jpg")

# Stream a query response
for chunk in model.query(image, question="What is in this image?", stream=True)["answer"]:
print(chunk, end="", flush=True)

Caption

import moondream as md
from PIL import Image

# Initialize with your API key
model = md.vl(api_key="YOUR_API_KEY")

# Load an image
image = Image.open("path/to/image.jpg")

# Stream a caption
for chunk in model.caption(image, stream=True)["caption"]:
print(chunk, end="", flush=True)
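The streamed chunks are plain text fragments, so if you also need the complete response once streaming finishes, you can accumulate the chunks as they arrive. A minimal sketch, reusing the model and image from the example above:

# Collect the full caption while printing each chunk as it arrives
full_caption = ""
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="", flush=True)
    full_caption += chunk

print()  # newline after the streamed output
# full_caption now holds the complete caption text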