OpenAI Compatibility

Moondream Cloud exposes an OpenAI-compatible chat endpoint at /v1/chat/completions. Point any OpenAI client or SDK at Moondream and you get multi-turn conversations, image inputs, streaming, and reasoning — no Moondream-specific client required.

Setup

Use the OpenAI base URL https://api.moondream.ai/v1, your Moondream API key as the bearer token, and moondream/moondream3-preview as the model (the moondream/ prefix is optional for this first-party model, so moondream3-preview also works). Grab a key from the Moondream Cloud Console.

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.moondream.ai/v1",
)

response = client.chat.completions.create(
    model="moondream/moondream3-preview",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
)
print(response.choices[0].message.content)
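
Because the endpoint speaks the OpenAI wire format, the same call can in principle be made without any SDK. The sketch below builds (but does not send) the raw HTTP request using only the Python standard library. It assumes the endpoint accepts the standard OpenAI JSON body and returns the standard response shape; build_chat_request is a hypothetical helper name, not part of any Moondream or OpenAI package.

```python
import json
import urllib.request

BASE_URL = "https://api.moondream.ai/v1"


def build_chat_request(api_key, messages, model="moondream/moondream3-preview"):
    """Build a POST request for /chat/completions (hypothetical helper)."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            # The Moondream API key is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request(
    "YOUR_API_KEY",
    [{"role": "user", "content": "What is 2 + 2?"}],
)
# To send it (assuming the standard OpenAI response shape):
# reply = json.load(urllib.request.urlopen(req))
# print(reply["choices"][0]["message"]["content"])
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.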

Image input

Pass images as image_url content parts. Images must be base64-encoded data URLs — remote http(s) URLs are rejected with a 400. You can include more than one image in a turn.

import base64

with open("image.jpg", "rb") as f:
    data_url = "data:image/jpeg;base64," + base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="moondream/moondream3-preview",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": data_url}},
            {"type": "text", "text": "What is in this image?"},
        ],
    }],
)
print(response.choices[0].message.content)
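
For several images in one turn, it can help to wrap the encoding and the data-URL check in small helpers. The following is a minimal sketch: to_data_url and image_message are hypothetical names, the MIME type is guessed from the file extension, and the client-side check simply mirrors the rule above that remote URLs are rejected.

```python
import base64
import mimetypes


def to_data_url(path):
    """Encode a local image file as a base64 data URL."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"cannot infer an image MIME type for {path!r}")
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_message(prompt, image_urls):
    """Build one user message: all images first, then the text prompt."""
    for url in image_urls:
        if not url.startswith("data:"):
            # Remote http(s) URLs would be rejected by the API with a 400,
            # so fail early on the client side instead.
            raise ValueError("images must be base64 data URLs")
    parts = [{"type": "image_url", "image_url": {"url": u}} for u in image_urls]
    parts.append({"type": "text", "text": prompt})
    return {"role": "user", "content": parts}
```

The returned dictionary can be passed directly inside the messages list, for example messages=[image_message("What differs between these?", [to_data_url("a.jpg"), to_data_url("b.jpg")])], where a.jpg and b.jpg are placeholder file names.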

Multi-turn conversations

Send the full message history — earlier turns are kept as context.

response = client.chat.completions.create(
    model="moondream/moondream3-preview",
    messages=[
        {"role": "user", "content": "My name is Alice."},
        {"role": "assistant", "content": "Nice to meet you, Alice!"},
        {"role": "user", "content": "What is my name?"},
    ],
)
print(response.choices[0].message.content) # -> "Alice"
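
Since the full history has to be resent on every call, the caller is responsible for keeping it. One way to do that is a small wrapper that appends each user turn and each reply to a running list. This is only a sketch: Conversation is a hypothetical class, and the complete callable is injected so the bookkeeping stays independent of any particular client.

```python
class Conversation:
    """Keeps the running message list for a chat (hypothetical helper)."""

    def __init__(self, complete):
        # `complete` takes the full message list and returns the reply text.
        self.complete = complete
        self.messages = []

    def ask(self, text):
        """Append the user turn, get a reply for the whole history, store it."""
        self.messages.append({"role": "user", "content": text})
        reply = self.complete(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


# Stand-in for a real model call, used here only to show the bookkeeping.
def echo_turn_count(messages):
    return f"I have seen {len(messages)} message(s)."


chat = Conversation(echo_turn_count)
chat.ask("My name is Alice.")
chat.ask("What is my name?")
```

To use it against the API, pass a function that forwards the list to the client configured in Setup, for example lambda msgs: client.chat.completions.create(model="moondream/moondream3-preview", messages=msgs).choices[0].message.content.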

Streaming

Set stream=True to receive the response as Server-Sent Events.

stream = client.chat.completions.create(
    model="moondream/moondream3-preview",
    messages=[{"role": "user", "content": "Write a short poem about the moon."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)

By default a stream does not include token usage. To receive it, set stream_options — a final chunk with empty choices then carries the usage totals (standard OpenAI behavior).

stream = client.chat.completions.create(
    model="moondream/moondream3-preview",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True},
)
for chunk in stream:
    if chunk.usage:
        print(chunk.usage)  # prompt_tokens / completion_tokens / total_tokens
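
When usage is enabled, a loop that reads chunk.choices[0] on every chunk would fail on the final chunk, because its choices list is empty. A sketch of a loop that collects both the text and the usage is shown below. collect_stream is a hypothetical helper, and the fake chunks (including the token count) are stand-ins built with SimpleNamespace so the example runs without a network call.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Return (text, usage) from a chat-completions stream."""
    pieces, usage = [], None
    for chunk in stream:
        # The usage chunk arrives with an empty `choices` list, so guard the index.
        if chunk.choices:
            delta = chunk.choices[0].delta.content
            if delta:
                pieces.append(delta)
        if getattr(chunk, "usage", None):
            usage = chunk.usage
    return "".join(pieces), usage


def fake_chunk(text=None, usage=None):
    """Build a stand-in chunk with the same attribute layout as an SDK chunk."""
    choices = []
    if text is not None:
        choices = [SimpleNamespace(delta=SimpleNamespace(content=text))]
    return SimpleNamespace(choices=choices, usage=usage)


# Illustrative values only; a real stream comes from the client call above.
fake_stream = [
    fake_chunk("Hel"),
    fake_chunk("lo"),
    fake_chunk(usage={"total_tokens": 7}),
]
text, usage = collect_stream(fake_stream)
```

With a real stream, pass the object returned by client.chat.completions.create(..., stream=True, stream_options={"include_usage": True}) in place of fake_stream.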

Reasoning

Moondream 3 can produce an explicit reasoning trace before its answer. Enable it with the reasoning parameter; the trace is returned on message.reasoning, separate from message.content.

response = client.chat.completions.create(
    model="moondream/moondream3-preview",
    messages=[{"role": "user", "content": "If I have 5 apples and give away 2, how many are left?"}],
    extra_body={"reasoning": True},
)
message = response.choices[0].message
print(message.reasoning) # the step-by-step trace
print(message.content) # -> "3"

See Reasoning for more on how Moondream 3 reasons.

Parameters

| Parameter | Type | Notes |
| --- | --- | --- |
| model | string | moondream/moondream3-preview. See Models for the available list. |
| messages | array | OpenAI chat messages. content may be a string or an array of text / image_url parts. |
| temperature | number | Sampling temperature. |
| top_p | number | Nucleus sampling. |
| max_completion_tokens | integer | Maximum number of tokens to generate, including reasoning tokens (up to 4096). |
| reasoning | boolean | Enable the reasoning trace (returned on message.reasoning). |
| stream | boolean | Stream the response as SSE. |
| stream_options.include_usage | boolean | Emit a final usage chunk in a stream. |

Image URLs must be base64 data URLs (no remote URLs). stop sequences and tool/function calling are not currently supported.
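
To show how these parameters fit together with the OpenAI Python SDK, the sketch below assembles the keyword arguments for client.chat.completions.create. build_params is a hypothetical helper. Raising an error above 4096 tokens is a client-side choice made here for illustration; how the server itself handles larger values is not stated above. The helper deliberately has no arguments for stop sequences or tools, since those are not supported.

```python
MAX_COMPLETION_TOKENS = 4096  # documented upper bound, reasoning tokens included


def build_params(messages, model="moondream/moondream3-preview", *,
                 temperature=None, top_p=None, max_completion_tokens=None,
                 reasoning=False, stream=False, include_usage=False):
    """Assemble keyword arguments for client.chat.completions.create()."""
    if (max_completion_tokens is not None
            and max_completion_tokens > MAX_COMPLETION_TOKENS):
        raise ValueError(
            f"max_completion_tokens must be at most {MAX_COMPLETION_TOKENS}"
        )

    params = {"model": model, "messages": messages}
    # Only send optional sampling settings that the caller actually set.
    if temperature is not None:
        params["temperature"] = temperature
    if top_p is not None:
        params["top_p"] = top_p
    if max_completion_tokens is not None:
        params["max_completion_tokens"] = max_completion_tokens
    if reasoning:
        # Passed through extra_body, as in the Reasoning example.
        params["extra_body"] = {"reasoning": True}
    if stream:
        params["stream"] = True
        if include_usage:
            params["stream_options"] = {"include_usage": True}
    return params


params = build_params(
    [{"role": "user", "content": "Hello!"}],
    max_completion_tokens=256,
    reasoning=True,
    stream=True,
    include_usage=True,
)
```

The result can then be unpacked into the call: client.chat.completions.create(**params).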

Models

List the available models with the standard OpenAI models endpoint:

curl https://api.moondream.ai/v1/models \
  -H 'Authorization: Bearer YOUR_API_KEY'