Technical Specifications

Moondream offers a range of models optimized for different use cases, from edge devices to high-performance servers. All models support the same core capabilities with different performance characteristics.

Model Variants

INT8 Model

Recommended for most use cases. Balanced performance and quality.

Runtime memory: 2,624 MiB
Model size: 1,900 MiB (compressed: 1,733 MiB)

Best For

• Production APIs
• Cloud deployment
• High-throughput services

onnx branch

wget:

wget https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz

curl (macOS/Linux):

curl -L -o moondream-2b-int8.mf.gz https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz

curl.exe (Windows):

curl.exe -L -o moondream-2b-int8.mf.gz https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz

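Python:

import requests

url = "https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz"
# Stream to disk in chunks; the archive is too large to buffer in memory.
with requests.get(url, stream=True, timeout=60) as response:
    response.raise_for_status()  # stop on HTTP errors instead of saving an error page
    with open("moondream-2b-int8.mf.gz", "wb") as f:
        for chunk in response.iter_content(chunk_size=1 << 20):
            f.write(chunk)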

Downloads: moondream-2b-int8.mf.gz
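
The download is a gzip archive, which is why the compressed size is smaller than the model size. If a tool in your pipeline expects the uncompressed file, the sketch below shows one way to unpack it with the Python standard library. Whether your client needs this step at all is an assumption, and `decompress_gz` is a hypothetical helper name, not part of any Moondream package.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(archive: str) -> Path:
    """Decompress a .gz file next to itself and return the new path."""
    src = Path(archive)
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src.name}")
    dst = src.with_suffix("")  # moondream-2b-int8.mf.gz -> moondream-2b-int8.mf
    # Copy in chunks so the whole model never has to sit in memory at once.
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


if __name__ == "__main__":
    archive = Path("moondream-2b-int8.mf.gz")
    if archive.exists():
        print(decompress_gz(str(archive)))
```

The original archive is left in place, so you need enough free disk space for both files.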

INT4 Model

Maximum compression.

Runtime memory: 2,002 MiB
Model size: 1,290 MiB (compressed: 1,167 MiB)

Best For

• Memory constraints
• Local development
• Testing environments

onnx branch

wget:

wget https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int4.mf.gz

curl (macOS/Linux):

curl -L -o moondream-2b-int4.mf.gz https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int4.mf.gz

curl.exe (Windows):

curl.exe -L -o moondream-2b-int4.mf.gz https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int4.mf.gz

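Python:

import requests

url = "https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int4.mf.gz"
# Stream to disk in chunks; the archive is too large to buffer in memory.
with requests.get(url, stream=True, timeout=60) as response:
    response.raise_for_status()  # stop on HTTP errors instead of saving an error page
    with open("moondream-2b-int4.mf.gz", "wb") as f:
        for chunk in response.iter_content(chunk_size=1 << 20):
            f.write(chunk)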

Downloads: moondream-2b-int4.mf.gz
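
The runtime memory figures above (2,624 MiB for INT8, 2,002 MiB for INT4) can drive a simple selection rule. The following is a minimal sketch, assuming you already know how much memory is free on the target machine. `pick_variant` and `headroom_mib` are hypothetical names introduced for illustration, and preferring INT8 whenever it fits is an assumption based on it being the recommended variant.

```python
from typing import Optional

# Runtime memory requirements in MiB, copied from the variant descriptions.
RUNTIME_MEMORY_MIB = {"int8": 2624, "int4": 2002}


def pick_variant(available_mib: int, headroom_mib: int = 0) -> Optional[str]:
    """Return the largest variant that fits, or None if neither fits.

    headroom_mib reserves memory for the rest of the application.
    """
    usable = available_mib - headroom_mib
    # Check INT8 first (assumed preferred), then fall back to INT4.
    for name in ("int8", "int4"):
        if usable >= RUNTIME_MEMORY_MIB[name]:
            return name
    return None


if __name__ == "__main__":
    for free in (4096, 2300, 1024):
        print(free, "MiB free ->", pick_variant(free))
```

A return value of `None` signals that the machine cannot hold either 2B variant at the listed memory requirements.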

Benchmarks

Release	ChartQA (total)	TextVQA	DocVQA	CountBench	TallyQA (simple/full)	POPE (rand/pop/adv)
2025-01-09 (latest)	73.2	73.4	76.6	80.0	82.6 / 76.5	92.4 / 90.3 / 87.2
2024-08-26	-	65.2	70.5	-	82.6 / 77.6	89.6 / 88.8 / 87.2
2024-07-23	-	60.2	61.9	-	82.0 / 76.8	91.3 / 89.7 / 86.9
2024-05-20	-	57.2	30.5	-	82.1 / 76.6	91.5 / 89.6 / 86.2
2024-05-08	-	53.1	30.5	-	81.6 / 76.1	90.6 / 88.3 / 85.0
2024-04-02	-	49.7	24.3	-	80.1 / 74.2	-
2024-03-13	-	46.4	22.2	-	79.6 / 73.3	-
2024-03-06	-	43.1	20.9	-	79.5 / 73.2	-
2024-03-04	-	36.4	-	-	-	-
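
Some cells in the table hold several scores separated by slashes (for example TallyQA and POPE), and a dash marks a score that was not reported. If you want to compare releases programmatically, a small parser along these lines may help. This is a sketch only: `parse_cell` is a hypothetical helper, and the two TextVQA values are copied from the table rows for 2025-01-09 and 2024-03-04.

```python
from typing import List, Optional


def parse_cell(cell: str) -> Optional[List[float]]:
    """Turn a table cell such as "82.6 / 76.5" into a list of floats.

    Returns None for "-" or an empty cell (score not reported).
    """
    cell = cell.strip()
    if cell in {"-", ""}:
        return None
    return [float(part) for part in cell.split("/")]


if __name__ == "__main__":
    # TextVQA scores copied from the first and last rows of the table.
    latest = parse_cell("73.4")[0]
    earliest = parse_cell("36.4")[0]
    print(f"TextVQA gain: {latest - earliest:.1f} points")
    print(parse_cell("92.4 / 90.3 / 87.2"))
```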

Benchmark Details

ChartQA: Chart understanding and question answering
TextVQA: Reading and understanding text in natural images
DocVQA: Document understanding and question answering

CountBench: Specialized counting questions benchmark
TallyQA: Object counting (simple vs. full questions)
POPE: Object presence verification (random/popular/adversarial)

Model	TextVQA	DocVQA	TallyQA (simple/full)	POPE (rand/pop/adv)	NaturalBench
Moondream 0.5B latest (12-04-24)	68.92	57.24	71.82 / 67.91	85.1 / 79.43 / 77.53	44.9

Feature Support

Model	Visual Q&A	Captioning	Detection	Pointing
2B	✅	✅	✅	✅
0.5B	✅	✅	✅	❌
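
An application that can run either model size may want to check this table before issuing a request, so that asking the 0.5B model for pointing fails early with a clear message. The sketch below encodes the table as a lookup. The feature keys and the `supports` function are hypothetical names chosen for illustration and are not part of the Moondream client.

```python
# Feature support per model size, transcribed from the table.
FEATURES = {
    "2B": {"visual_qa", "captioning", "detection", "pointing"},
    "0.5B": {"visual_qa", "captioning", "detection"},
}


def supports(model: str, feature: str) -> bool:
    """Return True if the given model size supports the feature."""
    if model not in FEATURES:
        raise ValueError(f"unknown model: {model!r}")
    return feature in FEATURES[model]


if __name__ == "__main__":
    for model in FEATURES:
        print(model, "pointing:", supports(model, "pointing"))
```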
