Running AI Model Inference

This walkthrough covers running predictions with PySDK: the input data types models accept, how to interpret inference results, and how to process inputs in batches for efficiency.


Once you have loaded an AI model and obtained a model handle object, you can start running AI inferences with it. The degirum.model.Model class provides two methods for performing AI inference:

  • degirum.model.Model.predict: Runs prediction on a single data frame.

  • degirum.model.Model.predict_batch: Runs prediction on a batch of frames.

Model.predict()

The predict() method takes a single input data frame and returns an inference result object. You can also invoke the model object directly: degirum.model.Model.__call__ is an alias for predict(), so result = model(data) is equivalent to result = model.predict(data). See Single Frame Inference below for a complete example.

# Method Signature: Model.predict()
degirum.model.Model.predict(data)

Example:

import degirum as dg

# Declaring variables
# Set your model, inference host address, model zoo, and token in these variables.
your_model_name = "model-name"
your_host_address = dg.CLOUD # Can be dg.CLOUD, host:port, or dg.LOCAL
your_model_zoo = "degirum/public"
your_token = "<token>"

# Specify the image you will run inference on
your_image = "path/image.jpg"

# Loading a model
model = dg.load_model(
    model_name = your_model_name, 
    inference_host_address = your_host_address, 
    zoo_url = your_model_zoo, 
    token = your_token 
    # optional parameters, such as overlay_show_probabilities = True
)

# Run a prediction and assign it to result
result = model(your_image)

# Print the prediction result
print(result)

Model.predict_batch()

The predict_batch() method takes an iterable of data frames (such as a list or a generator) and returns a generator of inference result objects. Frames are processed in a pipelined manner, which is more efficient than calling predict() repeatedly in a loop; see Batch Inference below.

# Method Signature: Model.predict_batch()
degirum.model.Model.predict_batch(data)

Supported Input Data Types

PySDK models accept two kinds of input data: images and raw tensors.

The input you pass to predict() depends on the number of inputs the model has. If the model has a single input, you pass a single object to predict(). If the model has multiple inputs, you pass a list of objects, one per corresponding input, as sketched below.
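A minimal sketch of the multiple-input case, assuming a hypothetical model with two image inputs already loaded as model:

# Hypothetical two-input model: pass a list with one frame per input,
# in the same order as the model's inputs
result = model.predict(["input0.jpg", "input1.jpg"])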

Check Input Data Type of Your Model

You can check what input type your model expects by inspecting the model.model_info.InputType property.

import degirum as dg

# Declaring variables
# Set your model, inference host address, model zoo, and token in these variables.
your_model_name = "model-name"
your_host_address = dg.CLOUD # Can be dg.CLOUD, host:port, or dg.LOCAL
your_model_zoo = "degirum/public"
your_token = "<token>"

# Loading a model
model = dg.load_model(
    model_name = your_model_name, 
    inference_host_address = your_host_address, 
    zoo_url = your_model_zoo, 
    token = your_token 
    # optional parameters, such as overlay_show_probabilities = True
)

# Print input data supported by your model.
print(model.model_info.InputType)

In this example, we check the input data type of our model.

The InputType property of the ModelParams class returned by degirum.model.Model.model_info is a list of input types, one entry per model input. The length of this list tells you how many separate inputs the model expects. For instance, a model that takes two images will have two entries in this list.
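As a quick sketch, you can use the length of this list to determine the number of inputs (assuming model is loaded as above):

input_types = model.model_info.InputType

# One entry per model input, e.g. ["Image"] or ["Image", "Image"]
print(f"Model expects {len(input_types)} input(s): {input_types}")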

Images

If your model expects image inputs (InputType == "Image"), you can supply the input frame in any of the following formats:

  • Path to an image file.

  • HTTP URL to an image.

  • NumPy array.

  • PIL Image object.

  • Raw bytes of image data.

PySDK automatically converts any of these formats into the input format required by the model’s neural network, according to the preprocessor settings in the model’s JSON configuration; you can read more about the preprocessor parameters in the Model JSON Structure guide.
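For illustration, all of the following calls submit a frame to a single-input image model (the path and URL are placeholders; model is assumed to be loaded as in the examples above):

import cv2
from PIL import Image

result = model("path/image.jpg")                # path to an image file
result = model("https://example.com/image.jpg") # HTTP URL to an image
result = model(cv2.imread("path/image.jpg"))    # NumPy array
result = model(Image.open("path/image.jpg"))    # PIL Image object

with open("path/image.jpg", "rb") as f:
    result = model(f.read())                    # raw bytes of image data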

Tensors

If your model expects raw tensor inputs (InputType == "Tensor"), you should provide a multi-dimensional NumPy array with the appropriate shape and data type.

The array’s dimensions must match the model’s expected input shape, which you can find in the model info (model.model_info.InputShape). The data type of the array’s elements should match the model’s expected raw data type (model.model_info.InputRawDataType).
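A minimal sketch for a tensor-input model, assuming a single input and a float32 element type (check your model’s actual shape and data type as described above):

import numpy as np

# Query the expected shape of the first (assumed only) input
shape = model.model_info.InputShape[0]

# Build a tensor of matching shape; replace the zeros with real data.
# The dtype must match model.model_info.InputRawDataType.
tensor = np.zeros(shape, dtype=np.float32)

result = model(tensor)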

Single Frame Inference

When you want to process one frame, use predict().

import degirum as dg

# Declaring variables
# Set your model, inference host address, model zoo, and token in these variables.
your_model_name = "model-name"
your_host_address = dg.CLOUD # Can be dg.CLOUD, host:port, or dg.LOCAL
your_model_zoo = "degirum/public"
your_token = "<token>"

# Specify the image you will run inference on
your_image = "path/image.jpg"

# Loading a model
model = dg.load_model(
    model_name = your_model_name, 
    inference_host_address = your_host_address, 
    zoo_url = your_model_zoo, 
    token = your_token 
    # optional parameters, such as overlay_show_probabilities = True
)

# Run a prediction and assign it to result
result = model(your_image)

# Print the prediction result
print(result)

Batch Inference

When you have multiple frames to process, use predict_batch(). The predict_batch() method runs predictions on an iterable list of frames. These predictions are run in a pipeline, which maximizes throughput and is more efficient than calling predict() in a loop.

The predict_batch() method accepts a single parameter: an iterable object, for example, a list. You populate the iterable with the same kinds of data you pass to regular predict(): input image path strings, input image URL strings, NumPy arrays, or PIL Image objects (when using the PIL image backend).

predict_batch() returns a generator of results. You can loop over these results just as you would iterate through results from successive predict() calls.

Since predict_batch returns a generator, simply calling the method won’t immediately run the inference. Frames are processed only when you iterate over the returned generator (for example, in a for loop).
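A short sketch of this lazy behavior:

results = model.predict_batch(["image1.jpg", "image2.jpg"])  # no inference yet

first = next(results)  # the inference pipeline starts on the first iteration
print(first)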

Example: Iterating over predict_batch results

for result in model.predict_batch(['image1.jpg','image2.jpg']):
    print(result)

Example: Using predict_batch() on a video file

In this example, we use predict_batch() to process a video file. The frame_source generator yields frames from the video, and the model’s prediction for each frame is printed. (The next example shows how to display the results overlaid on the frames with OpenCV.)

import degirum as dg
import cv2

# Declaring variables
# Set your model, inference host address, model zoo, and token in these variables.
your_model_name = "model-name"
your_host_address = dg.CLOUD # Can be dg.CLOUD, host:port, or dg.LOCAL
your_model_zoo = "degirum/public"
your_token = "<token>"

# Specify the video you will run inference on
your_video = "path/video.mp4"

# Loading a model
model = dg.load_model(
    model_name = your_model_name, 
    inference_host_address = your_host_address, 
    zoo_url = your_model_zoo, 
    token = your_token 
    # optional parameters, such as overlay_show_probabilities = True
)

# Open your video file
stream = cv2.VideoCapture(your_video) 

# Define generator function to produce video frames
def frame_source(stream):
    while True:
        ret, frame = stream.read()
        if not ret:
            break  # end of file
        yield frame

# Run predict_batch on stream of frames from video file
for result in model.predict_batch(frame_source(stream)):
    # Print raw results for each frame
    print(result)

# Release stream
stream.release()

Inference Results

When you call predict(), you get back an inference result object derived from the degirum.postprocessor.InferenceResults class; predict_batch() yields the same kind of objects. These result classes are called postprocessors. The particular postprocessor class type depends on the AI model type (classification, object detection, pose detection, segmentation, and so on), but from your point of view they all deliver identical functionality.

InferenceResults objects contain the following data:

  • degirum.postprocessor.InferenceResults.image: Original input image as a NumPy array or PIL image.

  • degirum.postprocessor.InferenceResults.image_overlay: Original image with inference results drawn on top. The drawing is model-dependent:

    • Classification models: class labels with probabilities are printed below the original image.

    • Object detection models: bounding boxes are drawn on the original image.

    • Hand and pose detection models: keypoints and keypoint connections are drawn on the original image.

    • Segmentation models: segments are drawn over the original image.

  • degirum.postprocessor.InferenceResults.results: A list of inference results in dictionary form. See the property documentation for a detailed explanation of all result formats.

  • degirum.postprocessor.InferenceResults.image_model: Preprocessed image tensor that was fed into the model (in binary form). Populated only if you enable Model.save_model_image before performing predictions.

The results property is what you will typically use in your code. This property contains the core prediction data. Note that if the model outputs coordinates (e.g., bounding boxes), these have been converted back to the coordinates of the original image for your convenience.
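For example, a sketch of accessing these properties for a typical object detection model (the exact dictionary keys, such as label, score, and bbox, vary with the model type):

result = model(your_image)

# Core prediction data: a list of dictionaries, one per detected object
for obj in result.results:
    print(obj.get("label"), obj.get("score"), obj.get("bbox"))

original = result.image           # original input frame
annotated = result.image_overlay  # frame with results drawn on top

# image_model is populated only if enabled before predicting:
# model.save_model_image = True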

Example: Combine predict_batch() with image_overlay to show prediction results on the original video

import degirum as dg
import cv2

# Declaring variables
# Set your model, inference host address, model zoo, and token in these variables.
your_model_name = "model-name"
your_host_address = dg.CLOUD # Can be dg.CLOUD, host:port, or dg.LOCAL
your_model_zoo = "degirum/public"
your_token = "<token>"

# Specify the video you will run inference on
your_video = "path/video.mp4"

# Loading a model
model = dg.load_model(
    model_name = your_model_name, 
    inference_host_address = your_host_address, 
    zoo_url = your_model_zoo, 
    token = your_token 
    # optional parameters, such as overlay_show_probabilities = True
)

# Open your video file
stream = cv2.VideoCapture(your_video) 

# Define generator function to produce video frames
def frame_source(stream):
    while True:
        ret, frame = stream.read()
        if not ret:
            break # end of file
        yield frame

# Process the video frames in a batch and display the overlay
for result in model.predict_batch(frame_source(stream)):
    # Display the annotated frame (image_overlay is a property, not a method)
    cv2.imshow("Inference Overlay", result.image_overlay)

    # Wait 1ms; press 'q' to quit early
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the video stream and close all OpenCV windows
stream.release()
cv2.destroyAllWindows()
