Working with Input and Output Data
A guide to the input data formats supported by DeGirumJS for inference, along with a detailed breakdown of the output result object structures for different model types.
Input Data Formats
The DeGirumJS `predict()` and `predict_batch()` methods are designed to be flexible, accepting a wide range of input image formats. Internally, the SDK uses the `ImageBitmap` API for efficient image processing and converts each supported input type into a standardized `ImageBitmap` before sending it to the model for inference.
The following input types are supported:
HTML Elements:
- `HTMLImageElement` (`<img>`)
- `SVGImageElement` (`<image>` within SVG)
- `HTMLVideoElement` (`<video>`) - the current frame will be used
- `HTMLCanvasElement` (`<canvas>`)
Image Data Objects:
- `File` (specifically image files such as `image/jpeg`, `image/png`, etc.)
- `VideoFrame` (if available in the environment)
String Formats:
- Image URL: a standard URL pointing to an image resource. Example: `https://example.com/path/to/image.jpg`
- Data URL: a string representing a Base64-encoded image, prefixed with `data:`. Example: `data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==`
- Base64 String: a raw Base64-encoded string of image data (without the `data:` prefix)
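The raw Base64 form is simply the Data URL with its `data:<mime>;base64,` header removed. A small helper like the hypothetical `toRawBase64` below (not part of the SDK, just an illustration) shows the relationship between the two string forms:

```javascript
// Hypothetical helper: strip the Data URL header to get the raw Base64 payload.
// If the string has no header (already raw Base64), it is returned unchanged.
function toRawBase64(dataUrl) {
    const commaIndex = dataUrl.indexOf(',');
    return commaIndex === -1 ? dataUrl : dataUrl.slice(commaIndex + 1);
}

const dataUrl = 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUg==';
const rawBase64 = toRawBase64(dataUrl);
// rawBase64 is now suitable for the "Base64 String" input form
```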
Array Buffer Types:
- `ArrayBuffer` (e.g., obtained by fetching an image and calling `response.arrayBuffer()`)
- `Uint8Array` (raw image byte data)
Batch Processing: The `predict_batch()` method can accept an async generator that yields pairs of input data and frame identifiers. The input data must be in one of the formats above. This allows efficient processing of multiple frames in real-time applications such as video streams or multi-frame inference. See Advanced Inference: Batch Processing & Callbacks.
```javascript
async function* imageGenerator() {
    yield [image1, 'frame1'];
    yield [image2, 'frame2'];
    // ...
}
```
Web Codecs API: The `predict_batch()` method can also work with `ReadableStream` objects. This enables efficient video processing using the Web Codecs API to handle frames, letting you build video processing pipelines in fewer lines of code.
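As a minimal sketch of the stream side, a `ReadableStream` of input/frame-identifier pairs can be constructed as below. The placeholder strings stand in for real decoded frames (e.g., `VideoFrame` objects produced by a Web Codecs `VideoDecoder`), and `model` is assumed to be an initialized model instance:

```javascript
// Sketch: a ReadableStream that yields [inputData, frameId] pairs.
// Placeholder strings stand in for real frames here so the stream
// mechanics can be seen in isolation.
const frames = [
    ['frameData1', 'frame1'],
    ['frameData2', 'frame2'],
];

const frameStream = new ReadableStream({
    start(controller) {
        // Enqueue each pair, then signal end-of-stream.
        for (const pair of frames) controller.enqueue(pair);
        controller.close();
    },
});

// In a real pipeline, the stream is handed to predict_batch:
// for await (const result of model.predict_batch(frameStream)) { ... }
```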
Usage Examples for Input
You can pass any of the supported input types directly to the `predict()` or `predict_batch()` methods:
```javascript
// Assuming 'model' is an initialized CloudServerModel or AIServerModel instance

// 1. Using an HTMLImageElement
const imgElement = document.getElementById('myImage');
const result1 = await model.predict(imgElement);

// 2. Using a File object (e.g., from an <input type="file">)
const fileInput = document.getElementById('fileUpload');
fileInput.addEventListener('change', async (event) => {
    const file = event.target.files[0];
    if (file && file.type.startsWith('image/')) {
        const result2 = await model.predict(file);
        console.log('Inference result from File:', result2);
    }
});

// 3. Using an Image URL
const imageUrl = 'https://www.degirum.com/images/degirum-logo.png';
const result3 = await model.predict(imageUrl);

// 4. Using a Data URL
const dataUrl = 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAgGBgcGBQgHBwcJCQgKDBQNDAsLDBkSEw8UHRofHh0aHBwgJC4nICIsIxwcKDcpLDAxNDQ0Hyc5PTgyPC4zNDL/2wBDAQkJCQwLDBgNDRgyIRwhMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjL/wAARCAADAAADAREAAhEBAxEB/8QAFQABAQAAAAAAAAAAAAAAAAAAAAb/xAAUEAEAAAAAAAAAAAAAAAAAAAAA/8QAFAEBAAAAAAAAAAAAAAAAAAAAAP/EABQRAQAAAAAAAAAAAAAAAAAAAAD/2gAMAwEAAhEDEQA/AKgAAH//Z';
const result4 = await model.predict(dataUrl);

// 5. Using a Base64 string (without the data: prefix)
const base64String = 'iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==';
const result5 = await model.predict(base64String);

// 6. Using an ArrayBuffer (e.g., from fetching an image as arrayBuffer)
async function fetchImageAsArrayBuffer(url) {
    const response = await fetch(url);
    return await response.arrayBuffer();
}
const arrayBuffer = await fetchImageAsArrayBuffer('https://www.degirum.com/images/degirum-logo.png');
const result6 = await model.predict(arrayBuffer);

// 7. Using a Uint8Array
const uint8Array = new Uint8Array([/* ... image byte data ... */]);
const result7 = await model.predict(uint8Array);

// 8. Using predict_batch with an AsyncGenerator
async function* imageGenerator() {
    yield [imgElement, 'frame1'];
    yield [imageUrl, 'frame2'];
    yield [dataUrl, 'frame3'];
}
for await (const batchResult of model.predict_batch(imageGenerator())) {
    console.log('Batch inference result:', batchResult);
}
```
Output Data Structure
The `predict()` and `predict_batch()` methods of `AIServerModel` and `CloudServerModel` return a comprehensive result object. This object encapsulates the inference output from the model, along with contextual information about the processed frame.
The general structure of the returned object is as follows:
```javascript
{
    "result": [
        [ /* Inference results (array of objects, structure varies by model type) */ ],
        "frame_info_string" // Unique identifier for the frame
    ],
    "imageFrame": ImageBitmap, // The original input image as an ImageBitmap (if not a video element)
    "modelImage": Blob // The preprocessed image blob sent to the model (if `saveModelImage` is true)
}
```
Accessing the Result Data
- Inference Results: Access the main inference results using `someResult.result[0]`. This is an array of objects, where each object represents a detected item, classification, pose, or segmentation mask.
- Frame Info / Number: Retrieve the unique identifier or frame information using `someResult.result[1]`. Use this to correlate results with specific input frames, especially in batch processing.
- Original Input Image: Access the original input image as an `ImageBitmap` via `someResult.imageFrame`. Note that this will be `null` if the input was an `HTMLVideoElement`, to avoid memory issues with continuous video streams.
- Preprocessed Model Image: If the `saveModelImage` model parameter is set to `true`, the `someResult.modelImage` property will contain the preprocessed image that was sent to the model, as a `Blob`. This can be useful for debugging preprocessing steps.
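As a quick illustration of the fields above, the pieces can be pulled apart as follows. The result object here is a hand-built stand-in with placeholder values, mirroring the documented structure rather than actual model output:

```javascript
// Stand-in result object mirroring the documented structure.
const someResult = {
    result: [
        [{ label: 'cat', score: 0.97, bbox: [10, 20, 110, 220] }], // inference results
        'frame1',                                                  // frame info
    ],
    imageFrame: null, // would be an ImageBitmap for non-video inputs
    modelImage: null, // would be a Blob when saveModelImage is true
};

// Destructure the two-element result array into its parts.
const [detections, frameInfo] = someResult.result;
console.log(detections[0].label, detections[0].score, frameInfo);
```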
Inference Result Types
The structure of the objects within `someResult.result[0]` varies depending on the type of AI model and its output. The SDK supports the following common inference result types:
- Detection Results
- Classification Results
- Pose Detection Results
- Segmentation Results
- Multi-Label Classification Results
For detailed examples and explanations of each result type, refer to Result Object Structure + Examples. That document provides comprehensive JSON examples and descriptions for the `bbox`, `landmarks`, `category_id`, `label`, `score`, and `mask` fields.
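For orientation, a single detection-style entry typically combines several of those fields. The values below are illustrative stand-ins only; see the linked document for authoritative examples of each result type:

```javascript
// Illustrative detection entry; all values are made up for demonstration.
const detection = {
    bbox: [50, 100, 200, 300], // bounding box coordinates
    category_id: 0,            // numeric class index
    label: 'person',           // human-readable class name
    score: 0.92,               // confidence score
};
console.log(`${detection.label}: ${(detection.score * 100).toFixed(0)}%`);
```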
Displaying Results on a Canvas
The `displayResultToCanvas()` method handles drawing bounding boxes, labels, keypoints, and segmentation masks based on the model's output.
```javascript
/**
 * Overlay the result onto the image frame and display it on the canvas.
 * @async
 * @param {Object} combinedResult - The result object combined with the original image frame. This is directly received from `predict` or `predict_batch`.
 * @param {string|HTMLCanvasElement|OffscreenCanvas} outputCanvasName - The canvas to draw the image onto. Either the canvas element or the ID of the canvas element.
 * @param {boolean} [justResults=false] - Whether to show only the result overlay without the image frame.
 */
async displayResultToCanvas(combinedResult, outputCanvasName, justResults = false)
```
Parameters:
- `combinedResult`: The result object returned by `predict()` or `predict_batch()`.
- `outputCanvasName`: The ID of the HTML `<canvas>` element (as a string), or a direct reference to an `HTMLCanvasElement` or `OffscreenCanvas` object where the results will be drawn.
- `justResults` (optional): A boolean flag. If `true`, only the inference overlay (e.g., bounding boxes, labels) will be drawn on the canvas, without the original `imageFrame`. This is useful when you want to overlay results on existing canvas content, or when the input was a video stream. Defaults to `false`.
Example:
```javascript
// Assuming 'model' is an initialized CloudServerModel or AIServerModel instance
// and 'myImage' is a valid input image
const outputCanvas = document.getElementById('outputCanvas');

async function runInferenceAndDisplay() {
    try {
        const result = await model.predict(myImage);
        // Display the result on the canvas
        await model.displayResultToCanvas(result, outputCanvas);
        console.log('Inference and display complete!');
    } catch (error) {
        console.error('Error during inference or display:', error);
    }
}

runInferenceAndDisplay();
```