Batch API

The Batch API lets you process large volumes of images asynchronously. Upload a JSONL file with thousands of requests, and download results when processing completes. Batch processing is ideal for offline workloads like dataset annotation, bulk captioning, or large-scale image analysis.

Pricing: Batch API requests are billed at 50% off standard API pricing. When using finetuned models, batch pricing is 50% off finetune pricing. See pricing for details.

When to use Batch API

Processing thousands to 100,000 images
Offline workloads where latency isn't critical
Dataset annotation and labeling
Cost-sensitive bulk processing

For real-time applications, use the standard API instead.

Data privacy

Your data is never used for training. Input files are deleted immediately after processing completes, and results are automatically purged after 7 days.

Workflow overview

Prepare a JSONL file with one request per line
Upload the file using multipart upload
Poll for completion status
Download result files

Quickstart

1. Prepare your input file

Create a JSONL file with one JSON object per line. Each line specifies a skill and its parameters:

batch_input.jsonl
{"id": "img_001", "skill": "caption", "image": "<base64>", "length": "normal"}
{"id": "img_002", "skill": "query", "image": "<base64>", "question": "What color is the car?"}
{"id": "img_003", "skill": "detect", "image": "<base64>", "object": "person"}

The id field is optional but recommended for correlating results with inputs.

2. Upload and submit the batch

cURL
Python
Node.js

# Step 1: Initialize multipart upload
INIT=$(curl -s -X POST "https://api.moondream.ai/v1/batch?action=mpu-create" \
  -H "X-Moondream-Auth: YOUR_API_KEY")

FILE_ID=$(echo $INIT | jq -r '.fileId')
UPLOAD_ID=$(echo $INIT | jq -r '.uploadId')

# Step 2: Split file into 50MB chunks and upload each part
CHUNK_SIZE=$((50 * 1024 * 1024))
split -b $CHUNK_SIZE batch_input.jsonl chunk_

PARTS="[]"
PART_NUM=1
for CHUNK in chunk_*; do
  PART=$(curl -s -X PUT "https://api.moondream.ai/v1/batch/$FILE_ID?action=mpu-uploadpart&uploadId=$UPLOAD_ID&partNumber=$PART_NUM" \
    -H "X-Moondream-Auth: YOUR_API_KEY" \
    -H "Content-Type: application/octet-stream" \
    --data-binary @$CHUNK)
  ETAG=$(echo $PART | jq -r '.etag')
  PARTS=$(echo $PARTS | jq ". + [{\"partNumber\": $PART_NUM, \"etag\": \"$ETAG\"}]")
  rm $CHUNK
  PART_NUM=$((PART_NUM + 1))
done

# Step 3: Complete upload and start processing
BATCH=$(curl -s -X POST "https://api.moondream.ai/v1/batch/$FILE_ID?action=mpu-complete&uploadId=$UPLOAD_ID" \
  -H "X-Moondream-Auth: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"parts\": $PARTS}")

BATCH_ID=$(echo $BATCH | jq -r '.id')
echo "Batch submitted: $BATCH_ID"

import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.moondream.ai/v1/batch"
headers = {"X-Moondream-Auth": API_KEY}
CHUNK_SIZE = 50 * 1024 * 1024  # 50MB chunks

# Step 1: Initialize multipart upload
init = requests.post(f"{BASE_URL}?action=mpu-create", headers=headers).json()
file_id = init["fileId"]
upload_id = init["uploadId"]

# Step 2: Upload in chunks
parts = []
with open("batch_input.jsonl", "rb") as f:
    part_number = 1
    while chunk := f.read(CHUNK_SIZE):
        part = requests.put(
            f"{BASE_URL}/{file_id}?action=mpu-uploadpart&uploadId={upload_id}&partNumber={part_number}",
            headers={**headers, "Content-Type": "application/octet-stream"},
            data=chunk
        ).json()
        parts.append({"partNumber": part_number, "etag": part["etag"]})
        part_number += 1

# Step 3: Complete upload and start processing
batch = requests.post(
    f"{BASE_URL}/{file_id}?action=mpu-complete&uploadId={upload_id}",
    headers=headers,
    json={"parts": parts}
).json()

print(f"Batch submitted: {batch['id']}")

import fs from 'fs';

const API_KEY = 'YOUR_API_KEY';
const BASE_URL = 'https://api.moondream.ai/v1/batch';
const headers = { 'X-Moondream-Auth': API_KEY };
const CHUNK_SIZE = 50 * 1024 * 1024; // 50MB chunks

// Step 1: Initialize multipart upload
const init = await fetch(`${BASE_URL}?action=mpu-create`, {
  method: 'POST', headers
}).then(r => r.json());

const { fileId, uploadId } = init;

// Step 2: Upload in chunks
const fileData = fs.readFileSync('batch_input.jsonl');
const parts = [];

for (let i = 0; i < fileData.length; i += CHUNK_SIZE) {
  const chunk = fileData.subarray(i, i + CHUNK_SIZE);
  const partNumber = Math.floor(i / CHUNK_SIZE) + 1;

  const part = await fetch(
    `${BASE_URL}/${fileId}?action=mpu-uploadpart&uploadId=${uploadId}&partNumber=${partNumber}`,
    {
      method: 'PUT',
      headers: { ...headers, 'Content-Type': 'application/octet-stream' },
      body: chunk
    }
  ).then(r => r.json());

  parts.push({ partNumber, etag: part.etag });
}

// Step 3: Complete upload and start processing
const batch = await fetch(
  `${BASE_URL}/${fileId}?action=mpu-complete&uploadId=${uploadId}`,
  {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify({ parts })
  }
).then(r => r.json());

console.log(`Batch submitted: ${batch.id}`);

3. Poll for completion

cURL
Python
Node.js

curl -s "https://api.moondream.ai/v1/batch/$BATCH_ID" \
  -H "X-Moondream-Auth: YOUR_API_KEY"

import time

while True:
    status = requests.get(f"{BASE_URL}/{batch['id']}", headers=headers).json()
    print(f"Status: {status['status']}, Progress: {status['progress']}")

    if status["status"] in ["completed", "failed"]:
        break
    time.sleep(30)

while (true) {
  const status = await fetch(`${BASE_URL}/${batch.id}`, { headers })
    .then(r => r.json());

  console.log(`Status: ${status.status}, Progress:`, status.progress);

  if (status.status === 'completed' || status.status === 'failed') {
    break;
  }
  await new Promise(r => setTimeout(r, 30000));
}

Response (processing):

{
  "id": "01JQXYZ9ABCDEF123456",
  "status": "processing",
  "model": "moondream3-preview",
  "progress": { "total": 1000, "completed": 450 },
  "created_at": "2025-01-10T12:00:00Z"
}

Response (completed):

{
  "id": "01JQXYZ9ABCDEF123456",
  "status": "completed",
  "model": "moondream3-preview",
  "progress": { "total": 1000, "completed": 998, "failed": 2 },
  "usage": { "input_tokens": 1500000, "output_tokens": 50000 },
  "outputs": [
    { "index": 0, "url": "https://..." },
    { "index": 1, "url": "https://..." }
  ],
  "created_at": "2025-01-10T12:00:00Z",
  "completed_at": "2025-01-10T12:45:00Z",
  "expires_at": "2025-01-17T12:45:00Z"
}

4. Download results

The outputs array contains presigned URLs to download result files. Each URL points to a JSONL file:

curl -o results_0.jsonl "https://..."

Input format

Input files must be valid JSONL (JSON Lines):

One JSON object per line
No blank lines
UTF-8 encoding

Common fields

Field	Type	Required	Description
`id`	any	No	User-provided identifier, returned in results
`skill`	string	Yes	One of: `caption`, `query`, `detect`, `point`
`image`	string	Yes*	Base64-encoded image
`settings`	object	No	Generation settings (caption/query only)

*For query skill, you can use images (array) instead of image for multi-image queries.

Skill-specific fields

Caption:

{"skill": "caption", "image": "<base64>", "length": "short"}

length: "short", "normal" (default), or "long"

Query:

{"skill": "query", "image": "<base64>", "question": "What is this?", "reasoning": true}

question: Required
image or images: Required (text-only queries are not supported)
images: Array of base64 strings for multi-image queries
reasoning: Include chain-of-thought (default: true)

Detect:

{"skill": "detect", "image": "<base64>", "object": "car"}

object: Required, what to detect

Point:

{"skill": "point", "image": "<base64>", "object": "door handle"}

object: Required, what to locate

Settings object

For caption and query skills:

{
  "settings": {
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 256
  }
}

Output format

Results are returned as JSONL files with one result per line:

Successful result:

{
  "line_index": 0,
  "id": "img_001",
  "status": "ok",
  "result": { "caption": "A golden retriever playing in a park..." },
  "usage": { "input_tokens": 150, "output_tokens": 25 }
}

Failed result:

{
  "line_index": 1,
  "id": "img_002",
  "status": "error",
  "error": { "code": "image_decode_failed", "message": "Invalid image data" }
}

Result fields by skill

Skill	Result field
`caption`	`result.caption`
`query`	`result.answer`, `result.reasoning` (if enabled)
`detect`	`result.objects` (array of bounding boxes)
`point`	`result.points` (array of coordinates)

API reference

Initialize upload

POST /v1/batch?action=mpu-create

Response:

{
  "fileId": "01JQ...",
  "uploadId": "abc123...",
  "key": "requests/org_id/batch_id.jsonl"
}

Upload part

PUT /v1/batch/:fileId?action=mpu-uploadpart&uploadId={uploadId}&partNumber={n}
Content-Type: application/octet-stream

Response:

{
  "etag": "\"abc123...\"",
  "partNumber": 1
}

Important: Each part must be under 100MB. Requests exceeding this limit will receive a 413 Request Entity Too Large error. Split larger files into multiple parts (we recommend 50MB chunks for reliability).

Complete upload

POST /v1/batch/:fileId?action=mpu-complete&uploadId={uploadId}
Content-Type: application/json

{
  "parts": [
    { "partNumber": 1, "etag": "\"abc123...\"" }
  ],
  "model": "moondream3-preview/finetune_id@step"
}

Field	Type	Required	Description
`parts`	array	Yes	Array of uploaded parts with `partNumber` and `etag`
`model`	string	No	Model to use. Default: `moondream3-preview`. For finetunes: `moondream3-preview/finetune_id@step`

Response:

{
  "id": "01JQXYZ9ABCDEF123456",
  "status": "chunking",
  "size": 52428800
}

Abort upload

DELETE /v1/batch/:fileId?action=mpu-abort&uploadId={uploadId}

List batches

GET /v1/batch?limit={n}&cursor={cursor}

Parameter	Default	Max	Description
`limit`	50	100	Results per page
`cursor`	-	-	Pagination cursor

Response:

{
  "batches": [...],
  "next_cursor": "abc...",
  "has_more": true
}

Get batch status

GET /v1/batch/:batchId

Status values:

chunking - File is being validated and split
processing - Requests are being processed
completed - All requests finished
failed - Batch failed (see error field)

Limits

Limit	Value
Max file size	2 GB
Max part size	100 MB (split larger files into multiple parts)
Max lines per batch	100,000
Max line size	10 MB (~7.5 MB base64 image)
Result retention	7 days

Error codes

Batch-level errors

Code	Description
`invalid_model`	Model format invalid (must be `moondream3-preview` or `moondream3-preview/finetune_id@step`)
`validation_error`	Invalid JSONL, blank lines, or bad UTF-8
`limit_exceeded`	File or line limits exceeded
`empty_batch`	No valid lines in file
`processing_failed`	Internal processing error
`timeout`	Batch took too long to process

Per-line errors

Code	Description
`invalid_request`	Missing required fields or invalid values
`image_decode_failed`	Corrupt or invalid image data
`unknown_skill`	Unrecognized skill name
`processing_error`	Failed during inference

Using finetuned models

You can run batch jobs using your finetuned models by specifying the model parameter when completing the upload:

curl -s -X POST "https://api.moondream.ai/v1/batch/$FILE_ID?action=mpu-complete&uploadId=$UPLOAD_ID" \
  -H "X-Moondream-Auth: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "parts": [...],
    "model": "moondream3-preview/YOUR_FINETUNE_ID@STEP"
  }'

The model format is moondream3-preview/finetune_id@step, where:

finetune_id is your finetune's ID (e.g., 01JQXYZ...)
step is the checkpoint step number

All lines in a batch use the same model. The finetune must have a saved checkpoint.

Best practices

Validate locally first: Check your JSONL is valid before uploading
Include IDs: Add an id field to each line for easy result correlation
Handle partial failures: Some lines may fail while others succeed
Download promptly: Results expire after 7 days
Split large jobs: For datasets over 100k images, submit multiple batches

When to use Batch API​

Data privacy​

Workflow overview​

Quickstart​

1. Prepare your input file​

2. Upload and submit the batch​

3. Poll for completion​

4. Download results​

Input format​

Common fields​

Skill-specific fields​

Settings object​

Output format​

Result fields by skill​

API reference​

Initialize upload​

Upload part​

Complete upload​

Abort upload​

List batches​

Get batch status​

Limits​

Error codes​

Batch-level errors​

Per-line errors​

Using finetuned models​

Best practices​

When to use Batch API

Data privacy

Workflow overview

Quickstart

1. Prepare your input file

2. Upload and submit the batch

3. Poll for completion

4. Download results

Input format

Common fields

Skill-specific fields

Settings object

Output format

Result fields by skill

API reference

Initialize upload

Upload part

Complete upload

Abort upload

List batches

Get batch status

Limits

Error codes

Batch-level errors

Per-line errors

Using finetuned models

Best practices