Automatic Detection Labeling

By combining the /query and /detect endpoints, we're able to create an auto-labeler detecting all objects in an image.

The /query endpoint can give us a list of either all objects or objects related to a prompt.
Then we call the /detect endpoint for each object.
Finally, we take those results and use them to render our bounding boxes over the image(s).

API Key Setup

export MOONDREAM_API_KEY="your_key_here"
npm start

Two-Step Process

Query (/v1/query) - Find objects
Detect (/v1/detect) - Get coordinates

Step 1: Query Objects

// Configure the prompt based on user input
const objectPrompt = prompt.trim() 
  ? `List all ${prompt.trim()} you can see in this image. Return your answer as a simple comma-separated list of object names.`
  : `List all the objects you can see in this image. Return your answer as a simple comma-separated list of object names.`;

// Call Moondream Query API
const queryResponse = await fetch("https://api.moondream.ai/v1/query", {
  method: "POST",
  headers: {
    "X-Moondream-Auth": process.env.MOONDREAM_API_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    image_url: image,  // Base64 encoded image
    question: objectPrompt,
    reasoning: true    // Enable grounded reasoning for better accuracy
  }),
});

const queryResult = await queryResponse.json();
const objectList = queryResult.answer || '';

// Parse the comma-separated response
const objects = objectList
  .split(',')
  .map(obj => obj.trim())
  .filter(obj => obj.length > 0);

Step 2: Get Bounding Boxes

const detectionResults = [];
let globalIndex = 0; // Track for consistent coloring

for (const objectName of objects) {
  const detectResponse = await fetch("https://api.moondream.ai/v1/detect", {
    method: "POST",
    headers: {
      "X-Moondream-Auth": process.env.MOONDREAM_API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      image_url: image,
      object: objectName,
      reasoning: true  // Enable grounded reasoning
    }),
  });

  if (detectResponse.ok) {
    const detectResult = await detectResponse.json();
    if (detectResult.objects && detectResult.objects.length > 0) {
      detectResult.objects.forEach(obj => {
        detectionResults.push({
          label: objectName,
          x_min: obj.x_min,        // Normalized coordinates (0-1)
          y_min: obj.y_min,
          x_max: obj.x_max,
          y_max: obj.y_max,
          originalIndex: globalIndex // For consistent colors
        });
        globalIndex++;
      });
    }
  }
}

Step 3: Overlay Bounding Boxes

Convert normalized coordinates (0-1) to pixel coordinates and render:

// Convert to pixel coordinates
const x = obj.x_min * imageWidth;
const y = obj.y_min * imageHeight;
const width = (obj.x_max - obj.x_min) * imageWidth;
const height = (obj.y_max - obj.y_min) * imageHeight;

Rendering Options:

SVG overlays: Scalable, precise positioning
Canvas drawing: High performance for many boxes
HTML divs: Simple absolute positioning
CSS transforms: Hardware-accelerated rendering

To take this further you could add

Editing bounding boxes and object labels
Manually adding new objects and bounding boxes
Downloading your outputs as a dataset

Sample App

Here's a JS sample app to get started with / test: Moondream Auto-Labeler Repository

API Key Setup​

Two-Step Process​

Step 1: Query Objects​

Step 2: Get Bounding Boxes​

Step 3: Overlay Bounding Boxes​

To take this further you could add​

Sample App​