Batch API
The Batch API lets you process large volumes of images asynchronously. Upload a JSONL file with thousands of requests, and download results when processing completes. Batch processing is ideal for offline workloads like dataset annotation, bulk captioning, or large-scale image analysis.
Pricing: Batch API requests are billed at 50% off standard API pricing.
When to use Batch API
- Processing thousands to 100,000 images per batch
- Offline workloads where latency isn't critical
- Dataset annotation and labeling
- Cost-sensitive bulk processing
For real-time applications, use the standard API instead.
Data privacy
Your data is never used for training. Input files are deleted immediately after processing completes, and results are automatically purged after 7 days.
Workflow overview
- Prepare a JSONL file with one request per line
- Upload the file using multipart upload
- Poll for completion status
- Download result files
Quickstart
1. Prepare your input file
Create a JSONL file with one JSON object per line. Each line specifies a skill and its parameters:
{"id": "img_001", "skill": "caption", "image": "<base64>", "length": "normal"}
{"id": "img_002", "skill": "query", "image": "<base64>", "question": "What color is the car?"}
{"id": "img_003", "skill": "detect", "image": "<base64>", "object": "person"}
The id field is optional but recommended for correlating results with inputs.
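If you are generating the input file from a directory of images, a minimal Python sketch like the one below can help; it assumes a local images/ folder and the caption skill, so adjust paths and fields to your use case:
import base64
import json
from pathlib import Path

# Assumption: JPEG images live in a local "images/" directory
with open("batch_input.jsonl", "w", encoding="utf-8") as out:
    for i, path in enumerate(sorted(Path("images").glob("*.jpg")), start=1):
        encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
        request = {"id": f"img_{i:03d}", "skill": "caption", "image": encoded, "length": "normal"}
        out.write(json.dumps(request) + "\n")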
2. Upload and submit the batch
- cURL
- Python
- Node.js
# Step 1: Initialize multipart upload
INIT=$(curl -s -X POST "https://api.moondream.ai/v1/batch?action=mpu-create" \
-H "X-Moondream-Auth: YOUR_API_KEY")
FILE_ID=$(echo $INIT | jq -r '.fileId')
UPLOAD_ID=$(echo $INIT | jq -r '.uploadId')
# Step 2: Split file into 50MB chunks and upload each part
CHUNK_SIZE=$((50 * 1024 * 1024))
split -b $CHUNK_SIZE batch_input.jsonl chunk_
PARTS="[]"
PART_NUM=1
for CHUNK in chunk_*; do
PART=$(curl -s -X PUT "https://api.moondream.ai/v1/batch/$FILE_ID?action=mpu-uploadpart&uploadId=$UPLOAD_ID&partNumber=$PART_NUM" \
-H "X-Moondream-Auth: YOUR_API_KEY" \
-H "Content-Type: application/octet-stream" \
--data-binary @$CHUNK)
ETAG=$(echo $PART | jq -r '.etag')
PARTS=$(echo $PARTS | jq ". + [{\"partNumber\": $PART_NUM, \"etag\": \"$ETAG\"}]")
rm $CHUNK
PART_NUM=$((PART_NUM + 1))
done
# Step 3: Complete upload and start processing
BATCH=$(curl -s -X POST "https://api.moondream.ai/v1/batch/$FILE_ID?action=mpu-complete&uploadId=$UPLOAD_ID" \
-H "X-Moondream-Auth: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d "{\"parts\": $PARTS}")
BATCH_ID=$(echo $BATCH | jq -r '.id')
echo "Batch submitted: $BATCH_ID"
import requests
API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.moondream.ai/v1/batch"
headers = {"X-Moondream-Auth": API_KEY}
CHUNK_SIZE = 50 * 1024 * 1024 # 50MB chunks
# Step 1: Initialize multipart upload
init = requests.post(f"{BASE_URL}?action=mpu-create", headers=headers).json()
file_id = init["fileId"]
upload_id = init["uploadId"]
# Step 2: Upload in chunks
parts = []
with open("batch_input.jsonl", "rb") as f:
part_number = 1
while chunk := f.read(CHUNK_SIZE):
part = requests.put(
f"{BASE_URL}/{file_id}?action=mpu-uploadpart&uploadId={upload_id}&partNumber={part_number}",
headers={**headers, "Content-Type": "application/octet-stream"},
data=chunk
).json()
parts.append({"partNumber": part_number, "etag": part["etag"]})
part_number += 1
# Step 3: Complete upload and start processing
batch = requests.post(
f"{BASE_URL}/{file_id}?action=mpu-complete&uploadId={upload_id}",
headers=headers,
json={"parts": parts}
).json()
print(f"Batch submitted: {batch['id']}")
import fs from 'fs';
const API_KEY = 'YOUR_API_KEY';
const BASE_URL = 'https://api.moondream.ai/v1/batch';
const headers = { 'X-Moondream-Auth': API_KEY };
const CHUNK_SIZE = 50 * 1024 * 1024; // 50MB chunks
// Step 1: Initialize multipart upload
const init = await fetch(`${BASE_URL}?action=mpu-create`, {
method: 'POST', headers
}).then(r => r.json());
const { fileId, uploadId } = init;
// Step 2: Upload in chunks
const fileData = fs.readFileSync('batch_input.jsonl');
const parts = [];
for (let i = 0; i < fileData.length; i += CHUNK_SIZE) {
const chunk = fileData.subarray(i, i + CHUNK_SIZE);
const partNumber = Math.floor(i / CHUNK_SIZE) + 1;
const part = await fetch(
`${BASE_URL}/${fileId}?action=mpu-uploadpart&uploadId=${uploadId}&partNumber=${partNumber}`,
{
method: 'PUT',
headers: { ...headers, 'Content-Type': 'application/octet-stream' },
body: chunk
}
).then(r => r.json());
parts.push({ partNumber, etag: part.etag });
}
// Step 3: Complete upload and start processing
const batch = await fetch(
`${BASE_URL}/${fileId}?action=mpu-complete&uploadId=${uploadId}`,
{
method: 'POST',
headers: { ...headers, 'Content-Type': 'application/json' },
body: JSON.stringify({ parts })
}
).then(r => r.json());
console.log(`Batch submitted: ${batch.id}`);
3. Poll for completion
- cURL
- Python
- Node.js
curl -s "https://api.moondream.ai/v1/batch/$BATCH_ID" \
-H "X-Moondream-Auth: YOUR_API_KEY"
import time
while True:
    status = requests.get(f"{BASE_URL}/{batch['id']}", headers=headers).json()
    print(f"Status: {status['status']}, Progress: {status['progress']}")
    if status["status"] in ["completed", "failed"]:
        break
    time.sleep(30)
while (true) {
const status = await fetch(`${BASE_URL}/${batch.id}`, { headers })
.then(r => r.json());
console.log(`Status: ${status.status}, Progress:`, status.progress);
if (status.status === 'completed' || status.status === 'failed') {
break;
}
await new Promise(r => setTimeout(r, 30000));
}
Response (processing):
{
"id": "01JQXYZ9ABCDEF123456",
"status": "processing",
"model": "moondream-3-preview",
"progress": { "total": 1000, "completed": 450 },
"created_at": "2025-01-10T12:00:00Z"
}
Response (completed):
{
"id": "01JQXYZ9ABCDEF123456",
"status": "completed",
"model": "moondream-3-preview",
"progress": { "total": 1000, "completed": 998, "failed": 2 },
"usage": { "input_tokens": 1500000, "output_tokens": 50000 },
"outputs": [
{ "index": 0, "url": "https://..." },
{ "index": 1, "url": "https://..." }
],
"created_at": "2025-01-10T12:00:00Z",
"completed_at": "2025-01-10T12:45:00Z",
"expires_at": "2025-01-17T12:45:00Z"
}
4. Download results
The outputs array contains presigned URLs to download result files. Each URL points to a JSONL file:
curl -o results_0.jsonl "https://..."
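To fetch every result file in one go, a short Python sketch (reusing the BASE_URL, headers, and batch variables from the quickstart) can iterate over the outputs array; as in the curl example above, presigned URLs are fetched without the auth header:
# Re-fetch the batch to get the outputs array with presigned URLs
status = requests.get(f"{BASE_URL}/{batch['id']}", headers=headers).json()
for output in status["outputs"]:
    data = requests.get(output["url"])  # presigned URL, no auth header needed
    with open(f"results_{output['index']}.jsonl", "wb") as f:
        f.write(data.content)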
Input format
Input files must be valid JSONL (JSON Lines):
- One JSON object per line
- No blank lines
- UTF-8 encoding
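A quick local check catches most of these issues before you upload. This is a sketch rather than exhaustive validation, and it assumes the file is named batch_input.jsonl:
import json

# Opening with encoding="utf-8" raises UnicodeDecodeError on invalid bytes
with open("batch_input.jsonl", "r", encoding="utf-8") as f:
    for line_number, line in enumerate(f, start=1):
        if not line.strip():
            raise ValueError(f"Line {line_number} is blank")
        json.loads(line)  # raises if the line is not valid JSON
print("batch_input.jsonl looks valid")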
Common fields
| Field | Type | Required | Description |
|---|---|---|---|
| id | any | No | User-provided identifier, returned in results |
| skill | string | Yes | One of: caption, query, detect, point |
| image | string | Yes* | Base64-encoded image |
| settings | object | No | Generation settings (caption/query only) |
*For the query skill, you can use images (an array) instead of image for multi-image queries.
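For example (the id and question values here are illustrative):
{"id": "img_004", "skill": "query", "images": ["<base64>", "<base64>"], "question": "Are these two photos of the same building?"}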
Skill-specific fields
Caption:
{"skill": "caption", "image": "<base64>", "length": "short"}
length:"short","normal"(default), or"long"
Query:
{"skill": "query", "image": "<base64>", "question": "What is this?", "reasoning": true}
question: Required
image or images: Required (text-only queries are not supported)
images: Array of base64 strings for multi-image queries
reasoning: Include chain-of-thought (default: true)
Detect:
{"skill": "detect", "image": "<base64>", "object": "car"}
object: Required, what to detect
Point:
{"skill": "point", "image": "<base64>", "object": "door handle"}
object: Required, what to locate
Settings object
For caption and query skills:
{
"settings": {
"temperature": 0.7,
"top_p": 0.9,
"max_tokens": 256
}
}
Output format
Results are returned as JSONL files with one result per line:
Successful result:
{
"line_index": 0,
"id": "img_001",
"status": "ok",
"result": { "caption": "A golden retriever playing in a park..." },
"usage": { "input_tokens": 150, "output_tokens": 25 }
}
Failed result:
{
"line_index": 1,
"id": "img_002",
"status": "error",
"error": { "code": "image_decode_failed", "message": "Invalid image data" }
}
Result fields by skill
| Skill | Result field |
|---|---|
| caption | result.caption |
| query | result.answer, result.reasoning (if enabled) |
| detect | result.objects (array of bounding boxes) |
| point | result.points (array of coordinates) |
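Combining the output format with these fields, a small Python sketch can separate successes from failures and correlate them back to your ids; it assumes a caption batch downloaded to results_0.jsonl as in the step above:
import json

successes, failures = {}, {}
with open("results_0.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        key = record.get("id", record["line_index"])
        if record["status"] == "ok":
            successes[key] = record["result"]
        else:
            failures[key] = record["error"]

for item_id, result in successes.items():
    print(item_id, result.get("caption"))  # caption skill: result.caption
print(f"{len(failures)} lines failed")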
API reference
Initialize upload
POST /v1/batch?action=mpu-create
Response:
{
"fileId": "01JQ...",
"uploadId": "abc123...",
"key": "requests/org_id/batch_id.jsonl"
}
Upload part
PUT /v1/batch/:fileId?action=mpu-uploadpart&uploadId={uploadId}&partNumber={n}
Content-Type: application/octet-stream
Response:
{
"etag": "\"abc123...\"",
"partNumber": 1
}
Important: Each part must be under 100MB. Requests exceeding this limit will receive a 413 Request Entity Too Large error. Split larger files into multiple parts (we recommend 50MB chunks for reliability).
Complete upload
POST /v1/batch/:fileId?action=mpu-complete&uploadId={uploadId}
Content-Type: application/json
{
"parts": [
{ "partNumber": 1, "etag": "\"abc123...\"" }
]
}
Response:
{
"id": "01JQXYZ9ABCDEF123456",
"status": "chunking",
"size": 52428800
}
Abort upload
DELETE /v1/batch/:fileId?action=mpu-abort&uploadId={uploadId}
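A minimal Python sketch for aborting, reusing file_id, upload_id, BASE_URL, and headers from the quickstart:
# Abort an in-progress multipart upload
requests.delete(f"{BASE_URL}/{file_id}?action=mpu-abort&uploadId={upload_id}", headers=headers)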
List batches
GET /v1/batch?limit={n}&cursor={cursor}
| Parameter | Default | Max | Description |
|---|---|---|---|
| limit | 50 | 100 | Results per page |
| cursor | - | - | Pagination cursor |
Response:
{
"batches": [...],
"next_cursor": "abc...",
"has_more": true
}
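To walk your full batch history, follow next_cursor until has_more is false. A sketch using the documented parameters (and the BASE_URL and headers variables from the quickstart):
batches = []
cursor = None
while True:
    params = {"limit": 100}
    if cursor:
        params["cursor"] = cursor
    page = requests.get(BASE_URL, headers=headers, params=params).json()
    batches.extend(page["batches"])
    if not page.get("has_more"):
        break
    cursor = page["next_cursor"]
print(f"Found {len(batches)} batches")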
Get batch status
GET /v1/batch/:batchId
Status values:
- chunking - File is being validated and split
- processing - Requests are being processed
- completed - All requests finished
- failed - Batch failed (see error field)
Limits
| Limit | Value |
|---|---|
| Max file size | 2 GB |
| Max part size | 100 MB (split larger files into multiple parts) |
| Max lines per batch | 100,000 |
| Max line size | 10 MB (~7.5 MB base64 image) |
| Result retention | 7 days |
Error codes
Batch-level errors
| Code | Description |
|---|---|
| validation_error | Invalid JSONL, blank lines, or bad UTF-8 |
| limit_exceeded | File or line limits exceeded |
| empty_batch | No valid lines in file |
| processing_failed | Internal processing error |
| timeout | Batch took too long to process |
Per-line errors
| Code | Description |
|---|---|
| invalid_request | Missing required fields or invalid values |
| image_decode_failed | Corrupt or invalid image data |
| unknown_skill | Unrecognized skill name |
| processing_error | Failed during inference |
Best practices
- Validate locally first: Check your JSONL is valid before uploading
- Include IDs: Add an id field to each line for easy result correlation
- Handle partial failures: Some lines may fail while others succeed
- Download promptly: Results expire after 7 days
- Split large jobs: For datasets over 100k images, submit multiple batches
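For the last point, a simple Python sketch that splits an oversized input file into batches of at most 100,000 lines; the file names are illustrative:
import itertools

MAX_LINES = 100_000  # max lines per batch (see Limits)

with open("full_dataset.jsonl", "r", encoding="utf-8") as f:
    for batch_number in itertools.count(1):
        lines = list(itertools.islice(f, MAX_LINES))
        if not lines:
            break
        with open(f"batch_{batch_number}.jsonl", "w", encoding="utf-8") as out:
            out.writelines(lines)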