Technical Specifications

Moondream offers a range of models optimized for different use cases, from edge devices to high-performance servers. All models support the same core capabilities with different performance characteristics.

Model Variants

INT8 Model

Recommended for most use cases. Balanced performance and quality.
Runtime memory: 2,624 MiB
Model size: 1,900 MiB
Compressed: 1,733 MiB

Best For

  • Production APIs
  • Cloud deployment
  • High-throughput services

Download (onnx branch):
wget https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz

Downloads: moondream-2b-int8.mf.gz
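
The download is a gzip archive, which accounts for the gap between the compressed and full sizes listed above. If a tool needs the unpacked file, the following is a minimal sketch using only the Python standard library. The helper name `decompress_gz` is hypothetical, and whether the Moondream runtime needs this step or can read the `.gz` directly is not stated here, so treat it as optional.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gz(src, dst=None, chunk_size=1024 * 1024):
    """Stream-decompress a .gz file to disk and return the output path.

    Hypothetical helper, not part of any Moondream package.
    If dst is omitted, the trailing ".gz" is dropped from src
    (e.g. "moondream-2b-int8.mf.gz" -> "moondream-2b-int8.mf").
    """
    src = Path(src)
    dst = Path(dst) if dst is not None else src.with_suffix("")
    # Copy in chunks so the whole model never has to fit in memory at once.
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    return dst
```

For example, `decompress_gz("moondream-2b-int8.mf.gz")` would write `moondream-2b-int8.mf` next to the archive.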

INT4 Model

Maximum compression
Runtime memory: 2,002 MiB
Model size: 1,290 MiB
Compressed: 1,167 MiB

Best For

  • Memory constraints
  • Local development
  • Testing environments

Download (onnx branch):
wget https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int4.mf.gz

Downloads: moondream-2b-int4.mf.gz
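
The runtime memory figures above suggest a simple selection rule: use INT8 when it fits, and fall back to INT4 otherwise. Here is a minimal sketch of that rule. The function name `pick_variant` and the `headroom_mib` safety margin are assumptions made for illustration, not part of any Moondream tooling.

```python
# Runtime memory requirements in MiB, as listed above for the two variants.
RUNTIME_MIB = {"int8": 2624, "int4": 2002}


def pick_variant(available_mib, headroom_mib=0):
    """Return "int8" if it fits, else "int4" if it fits, else None.

    headroom_mib is a hypothetical margin reserved for the rest of the
    application; it is subtracted from the available memory first.
    """
    budget = available_mib - headroom_mib
    # Prefer INT8, which the listing above recommends for most use cases.
    for name in ("int8", "int4"):
        if RUNTIME_MIB[name] <= budget:
            return name
    return None
```

With these numbers, a 4,096 MiB budget selects INT8, a 2,500 MiB budget selects INT4, and a 1,024 MiB budget fits neither.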

Benchmarks

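Release             | VQAv2 | GQA  | TextVQA | DocVQA | TallyQA (simple/full) | POPE (rand/pop/adv)
--------------------|-------|------|---------|--------|-----------------------|--------------------
2024-08-26 (latest) | 80.3  | 64.3 | 65.2    | 70.5   | 82.6 / 77.6           | 89.6 / 88.8 / 87.2
2024-07-23          | 79.4  | 64.9 | 60.2    | 61.9   | 82.0 / 76.8           | 91.3 / 89.7 / 86.9
2024-05-20          | 79.4  | 63.1 | 57.2    | 30.5   | 82.1 / 76.6           | 91.5 / 89.6 / 86.2
2024-05-08          | 79.0  | 62.7 | 53.1    | 30.5   | 81.6 / 76.1           | 90.6 / 88.3 / 85.0
2024-04-02          | 77.7  | 61.7 | 49.7    | 24.3   | 80.1 / 74.2           | -
2024-03-13          | 76.8  | 60.6 | 46.4    | 22.2   | 79.6 / 73.3           | -
2024-03-06          | 75.4  | 59.8 | 43.1    | 20.9   | 79.5 / 73.2           | -
2024-03-04          | 74.2  | 58.5 | 36.4    | -      | -                     | -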
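
To see where a release moved the scores, subtract one release's numbers from another's. The sketch below does this for the single-valued columns of the three most recent releases, with the values copied from the listing above. The names `SCORES` and `score_deltas` are illustrative only.

```python
# Scores copied from the benchmark listing above (single-valued columns only).
SCORES = {
    "2024-08-26": {"VQAv2": 80.3, "GQA": 64.3, "TextVQA": 65.2, "DocVQA": 70.5},
    "2024-07-23": {"VQAv2": 79.4, "GQA": 64.9, "TextVQA": 60.2, "DocVQA": 61.9},
    "2024-05-20": {"VQAv2": 79.4, "GQA": 63.1, "TextVQA": 57.2, "DocVQA": 30.5},
}


def score_deltas(newer, older, scores=SCORES):
    """Return newer-minus-older score per benchmark, rounded to one decimal."""
    return {
        bench: round(scores[newer][bench] - scores[older][bench], 1)
        for bench in scores[newer]
    }


if __name__ == "__main__":
    # Print the signed change from the previous release to the latest one.
    for bench, delta in score_deltas("2024-08-26", "2024-07-23").items():
        print(f"{bench}: {delta:+.1f}")
```

Between 2024-07-23 and 2024-08-26 the largest gains are on DocVQA and TextVQA, while GQA dips slightly.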

Benchmark Details:

  • VQAv2: Visual Question Answering v2 dataset - General visual reasoning
  • GQA: Question answering over real-world image scene graphs - Compositional visual reasoning
  • TextVQA: Text Visual Question Answering - Reading text in images
  • DocVQA: Document Visual Question Answering - Understanding documents
  • TallyQA: Counting questions (simple vs. full complexity) - Object counting
  • POPE: Polling-based Object Probing Evaluation - Object presence verification (random/popular/adversarial)

Feature Support

Model | Visual Q&A | Captioning | Detection | Pointing
2B Models (FP16/INT8/INT4)
0.5B Models (INT8/INT4)