Batch API

The Batch API lets you process large volumes of images asynchronously. Upload a JSONL file with thousands of requests, and download results when processing completes. Batch processing is ideal for offline workloads like dataset annotation, bulk captioning, or large-scale image analysis.

Pricing: Batch API requests are billed at 50% off standard API pricing.

When to use Batch API

  • Processing thousands of images, up to 100,000 per batch
  • Offline workloads where latency isn't critical
  • Dataset annotation and labeling
  • Cost-sensitive bulk processing

For real-time applications, use the standard API instead.

Data privacy

Your data is never used for training. Input files are deleted immediately after processing completes, and results are automatically purged after 7 days.

Workflow overview

  1. Prepare a JSONL file with one request per line
  2. Upload the file using multipart upload
  3. Poll for completion status
  4. Download result files

Quickstart

1. Prepare your input file

Create a JSONL file with one JSON object per line. Each line specifies a skill and its parameters:

batch_input.jsonl
{"id": "img_001", "skill": "caption", "image": "<base64>", "length": "normal"}
{"id": "img_002", "skill": "query", "image": "<base64>", "question": "What color is the car?"}
{"id": "img_003", "skill": "detect", "image": "<base64>", "object": "person"}

The id field is optional but recommended for correlating results with inputs.
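
One way to generate the file is a small shell loop over a directory of images. This is a sketch: it assumes JPEG files under a hypothetical images/ directory and GNU coreutils base64 (on macOS, drop the -w0 flag, since BSD base64 does not wrap output by default):

# Build one caption request per image. jq --arg handles JSON escaping
# of the base64 payload; -c emits one compact object per line.
> batch_input.jsonl
for IMG in images/*.jpg; do
  jq -nc \
    --arg id "$(basename "$IMG" .jpg)" \
    --arg image "$(base64 -w0 "$IMG")" \
    '{id: $id, skill: "caption", image: $image, length: "normal"}' \
    >> batch_input.jsonl
done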

2. Upload and submit the batch

# Step 1: Initialize multipart upload
INIT=$(curl -s -X POST "https://api.moondream.ai/v1/batch?action=mpu-create" \
  -H "X-Moondream-Auth: YOUR_API_KEY")

FILE_ID=$(echo "$INIT" | jq -r '.fileId')
UPLOAD_ID=$(echo "$INIT" | jq -r '.uploadId')

# Step 2: Split file into 50MB chunks and upload each part
CHUNK_SIZE=$((50 * 1024 * 1024))
split -b $CHUNK_SIZE batch_input.jsonl chunk_

PARTS="[]"
PART_NUM=1
for CHUNK in chunk_*; do
  PART=$(curl -s -X PUT "https://api.moondream.ai/v1/batch/$FILE_ID?action=mpu-uploadpart&uploadId=$UPLOAD_ID&partNumber=$PART_NUM" \
    -H "X-Moondream-Auth: YOUR_API_KEY" \
    -H "Content-Type: application/octet-stream" \
    --data-binary "@$CHUNK")
  # The etag value itself contains quote characters (see the upload-part
  # response below); jq --arg escapes it safely when building the parts list.
  ETAG=$(echo "$PART" | jq -r '.etag')
  PARTS=$(echo "$PARTS" | jq --argjson pn "$PART_NUM" --arg etag "$ETAG" \
    '. + [{partNumber: $pn, etag: $etag}]')
  rm "$CHUNK"
  PART_NUM=$((PART_NUM + 1))
done

# Step 3: Complete upload and start processing
BATCH=$(curl -s -X POST "https://api.moondream.ai/v1/batch/$FILE_ID?action=mpu-complete&uploadId=$UPLOAD_ID" \
  -H "X-Moondream-Auth: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{\"parts\": $PARTS}")

BATCH_ID=$(echo "$BATCH" | jq -r '.id')
echo "Batch submitted: $BATCH_ID"

3. Poll for completion

curl -s "https://api.moondream.ai/v1/batch/$BATCH_ID" \
-H "X-Moondream-Auth: YOUR_API_KEY"

Response (processing):

{
  "id": "01JQXYZ9ABCDEF123456",
  "status": "processing",
  "model": "moondream-3-preview",
  "progress": { "total": 1000, "completed": 450 },
  "created_at": "2025-01-10T12:00:00Z"
}

Response (completed):

{
  "id": "01JQXYZ9ABCDEF123456",
  "status": "completed",
  "model": "moondream-3-preview",
  "progress": { "total": 1000, "completed": 998, "failed": 2 },
  "usage": { "input_tokens": 1500000, "output_tokens": 50000 },
  "outputs": [
    { "index": 0, "url": "https://..." },
    { "index": 1, "url": "https://..." }
  ],
  "created_at": "2025-01-10T12:00:00Z",
  "completed_at": "2025-01-10T12:45:00Z",
  "expires_at": "2025-01-17T12:45:00Z"
}
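
Rather than polling by hand, a simple loop can wait for a terminal status. A minimal sketch (the 30-second interval is arbitrary):

# Poll until the batch reaches a terminal status.
while :; do
  STATUS=$(curl -s "https://api.moondream.ai/v1/batch/$BATCH_ID" \
    -H "X-Moondream-Auth: YOUR_API_KEY" | jq -r '.status')
  echo "status: $STATUS"
  case "$STATUS" in
    completed|failed) break ;;
  esac
  sleep 30
done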

4. Download results

The outputs array contains presigned URLs to download result files. Each URL points to a JSONL file:

curl -o results_0.jsonl "https://..."
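
To fetch every output file rather than one URL at a time, iterate over the outputs array. A sketch:

# Download each presigned URL to results_<index>.jsonl.
curl -s "https://api.moondream.ai/v1/batch/$BATCH_ID" \
  -H "X-Moondream-Auth: YOUR_API_KEY" \
  | jq -r '.outputs[] | "\(.index) \(.url)"' \
  | while read -r INDEX URL; do
      curl -s -o "results_${INDEX}.jsonl" "$URL"
    done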

Input format

Input files must be valid JSONL (JSON Lines):

  • One JSON object per line
  • No blank lines
  • UTF-8 encoding
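
A quick local check before uploading, line by line with jq. This is a sketch to run as a script; jq -e exits non-zero for blank lines, invalid JSON, and anything that is not a JSON object:

# Exits on the first invalid line.
LINE_NUM=0
while IFS= read -r LINE; do
  LINE_NUM=$((LINE_NUM + 1))
  echo "$LINE" | jq -e 'type == "object"' > /dev/null 2>&1 \
    || { echo "line $LINE_NUM is invalid"; exit 1; }
done < batch_input.jsonl
echo "batch_input.jsonl looks valid"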

Common fields

Field     Type    Required  Description
id        any     No        User-provided identifier, returned in results
skill     string  Yes       One of: caption, query, detect, point
image     string  Yes*      Base64-encoded image
settings  object  No        Generation settings (caption/query only)

*For the query skill, you can use images (an array) instead of image for multi-image queries.

Skill-specific fields

Caption:

{"skill": "caption", "image": "<base64>", "length": "short"}
  • length: "short", "normal" (default), or "long"

Query:

{"skill": "query", "image": "<base64>", "question": "What is this?", "reasoning": true}
  • question: Required
  • image or images: Required (text-only queries are not supported)
  • images: Array of base64 strings for multi-image queries
  • reasoning: Include chain-of-thought (default: true)

Detect:

{"skill": "detect", "image": "<base64>", "object": "car"}
  • object: Required, what to detect

Point:

{"skill": "point", "image": "<base64>", "object": "door handle"}
  • object: Required, what to locate

Settings object

For caption and query skills:

{
  "settings": {
    "temperature": 0.7,
    "top_p": 0.9,
    "max_tokens": 256
  }
}
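
The settings object nests inside each request line alongside the other fields. An illustrative caption request (values are arbitrary):

{"id": "img_004", "skill": "caption", "image": "<base64>", "length": "long", "settings": {"temperature": 0.3, "max_tokens": 512}}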

Output format

Results are returned as JSONL files with one result per line:

Successful result:

{
  "line_index": 0,
  "id": "img_001",
  "status": "ok",
  "result": { "caption": "A golden retriever playing in a park..." },
  "usage": { "input_tokens": 150, "output_tokens": 25 }
}

Failed result:

{
  "line_index": 1,
  "id": "img_002",
  "status": "error",
  "error": { "code": "image_decode_failed", "message": "Invalid image data" }
}
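
Since a batch can mix successes and failures, a common first pass is to split them with jq (a sketch; the output file names are arbitrary):

# Separate successful results from failed lines across all output files.
jq -c 'select(.status == "ok")' results_*.jsonl > successes.jsonl
jq -c 'select(.status == "error")' results_*.jsonl > failures.jsonl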

Result fields by skill

Skill    Result field
caption  result.caption
query    result.answer, result.reasoning (if enabled)
detect   result.objects (array of bounding boxes)
point    result.points (array of coordinates)

API reference

Initialize upload

POST /v1/batch?action=mpu-create

Response:

{
  "fileId": "01JQ...",
  "uploadId": "abc123...",
  "key": "requests/org_id/batch_id.jsonl"
}

Upload part

PUT /v1/batch/:fileId?action=mpu-uploadpart&uploadId={uploadId}&partNumber={n}
Content-Type: application/octet-stream

Response:

{
  "etag": "\"abc123...\"",
  "partNumber": 1
}

Important: Each part must be under 100MB. Requests exceeding this limit will receive a 413 Request Entity Too Large error. Split larger files into multiple parts (we recommend 50MB chunks for reliability).

Complete upload

POST /v1/batch/:fileId?action=mpu-complete&uploadId={uploadId}
Content-Type: application/json

{
  "parts": [
    { "partNumber": 1, "etag": "\"abc123...\"" }
  ]
}

Response:

{
  "id": "01JQXYZ9ABCDEF123456",
  "status": "chunking",
  "size": 52428800
}

Abort upload

DELETE /v1/batch/:fileId?action=mpu-abort&uploadId={uploadId}
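
For example, to abort an in-progress upload using the identifiers returned by mpu-create:

curl -s -X DELETE "https://api.moondream.ai/v1/batch/$FILE_ID?action=mpu-abort&uploadId=$UPLOAD_ID" \
  -H "X-Moondream-Auth: YOUR_API_KEY"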

List batches

GET /v1/batch?limit={n}&cursor={cursor}

Parameter  Default  Max  Description
limit      50       100  Results per page
cursor     -        -    Pagination cursor

Response:

{
  "batches": [...],
  "next_cursor": "abc...",
  "has_more": true
}
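
To walk every page, follow next_cursor until has_more is false. A minimal sketch, assuming each batch object carries the id field shown elsewhere in these docs:

# Print the id of every batch, 100 per page.
CURSOR=""
while :; do
  RESP=$(curl -s "https://api.moondream.ai/v1/batch?limit=100${CURSOR:+&cursor=$CURSOR}" \
    -H "X-Moondream-Auth: YOUR_API_KEY")
  echo "$RESP" | jq -r '.batches[].id'
  [ "$(echo "$RESP" | jq -r '.has_more')" = "true" ] || break
  CURSOR=$(echo "$RESP" | jq -r '.next_cursor')
done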

Get batch status

GET /v1/batch/:batchId

Status values:

  • chunking - File is being validated and split
  • processing - Requests are being processed
  • completed - All requests finished
  • failed - Batch failed (see error field)

Limits

Limit                Value
Max file size        2 GB
Max part size        100 MB (split larger files into multiple parts)
Max lines per batch  100,000
Max line size        10 MB (~7.5 MB base64 image)
Result retention     7 days
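
A pre-flight check against these limits before uploading (a sketch using standard Unix tools; LC_ALL=C makes awk count bytes rather than characters):

FILE=batch_input.jsonl

# File size must be under 2 GB.
[ "$(wc -c < "$FILE")" -le $((2 * 1024 * 1024 * 1024)) ] || echo "file exceeds 2 GB"

# At most 100,000 lines per batch.
[ "$(wc -l < "$FILE")" -le 100000 ] || echo "too many lines"

# Each line must be under 10 MB.
MAX_LINE=$(LC_ALL=C awk '{ if (length > max) max = length } END { print max }' "$FILE")
[ "$MAX_LINE" -le $((10 * 1024 * 1024)) ] || echo "a line exceeds 10 MB"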

Error codes

Batch-level errors

Code               Description
validation_error   Invalid JSONL, blank lines, or bad UTF-8
limit_exceeded     File or line limits exceeded
empty_batch        No valid lines in file
processing_failed  Internal processing error
timeout            Batch took too long to process

Per-line errors

Code                 Description
invalid_request      Missing required fields or invalid values
image_decode_failed  Corrupt or invalid image data
unknown_skill        Unrecognized skill name
processing_error     Failed during inference
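
Per-line failures can be retried by joining the failed ids back to the original input. A sketch, assuming every input line was given an id and the intermediate file name failed_ids.json is arbitrary:

# Collect the ids of failed lines from all result files.
jq -c 'select(.status == "error") | .id' results_*.jsonl > failed_ids.json

# Keep only the matching requests from the original input.
jq -c --slurpfile failed failed_ids.json \
  'select(.id as $i | $failed | index($i))' batch_input.jsonl > retry_input.jsonl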

Best practices

  • Validate locally first: Check your JSONL is valid before uploading
  • Include IDs: Add an id field to each line for easy result correlation
  • Handle partial failures: Some lines may fail while others succeed
  • Download promptly: Results expire after 7 days
  • Split large jobs: For datasets over 100k images, submit multiple batches