Running AI Model Inference
This is a walkthrough for running predictions. You'll learn about input data types, how to interpret results, and how to process inputs in batches for efficiency.
Once you have loaded an AI model and obtained a model handle, you can start running inferences. The degirum.model.Model class provides two methods for performing AI inference:
predict(): runs prediction on a single data frame.
predict_batch(): runs prediction on a batch of frames.
The predict() method takes a single input data frame and returns an inference result object. You can also invoke prediction through degirum.model.Model.__call__, which is an alias for predict(): calling model(frame) is equivalent to calling model.predict(frame).
Example:
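A minimal sketch of the single-frame call pattern. A stub stands in for a loaded model so the snippet runs anywhere; with real PySDK, model comes from a model zoo's load_model(), and the file name below is illustrative:

```python
# Stub standing in for a loaded PySDK model, to show the call pattern
# without hardware; a real `model` comes from zoo.load_model(...).
class StubModel:
    def predict(self, frame):
        # A real model returns an inference result object.
        return f"inference result for {frame}"

    __call__ = predict  # model(frame) is an alias for model.predict(frame)

model = StubModel()
result = model.predict("images/cat.jpg")
print(result)                             # inference result for images/cat.jpg
print(model("images/cat.jpg") == result)  # True: __call__ delegates to predict()
```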
PySDK models can handle images and raw tensors as data types.
The input you pass to predict() depends on the number of inputs the model has. If the model has a single input, pass a single object to predict(). If the model has multiple inputs, pass a list of objects: one object per corresponding input.
You can check what input type your model expects by inspecting the model.model_info.InputType property.
In this example, we check the input data type of our model.
The model_info.InputType property returns a list of input types (one entry per model input). The length of this list tells you how many separate inputs the model expects. For instance, a model that takes two images will have two entries in this list.
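As a sketch, the check boils down to inspecting the list's length and entries. The values below are hypothetical stand-ins for what model.model_info.InputType might return, and describe_inputs is an illustrative helper, not part of PySDK:

```python
def describe_inputs(input_types):
    """Summarize a model's inputs given its InputType list (illustrative helper)."""
    return [f"input {i}: {t}" for i, t in enumerate(input_types)]

# Hypothetical values standing in for model.model_info.InputType:
print(describe_inputs(["Image"]))            # single-input image model
print(describe_inputs(["Image", "Tensor"]))  # two inputs: pass a 2-element list to predict()
```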
If your model expects image inputs (InputType == "Image"), you can supply the input frame in any of the following formats:
Path to an image file.
HTTP URL to an image.
NumPy array.
PIL Image object.
Raw bytes of image data.
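The accepted formats can be sketched as follows. The path, URL, and byte values are placeholders; the PIL case is omitted to keep the snippet dependency-light (only NumPy is assumed):

```python
import numpy as np

# Each of these is a valid single frame for an image-input model (illustrative values):
frame_path  = "images/cat.jpg"                         # path to an image file
frame_url   = "https://example.com/cat.jpg"            # HTTP URL to an image
frame_array = np.zeros((300, 300, 3), dtype=np.uint8)  # NumPy array (height, width, channels)
frame_bytes = b"\x89PNG..."                            # raw bytes of an encoded image

# Any of them could be passed directly, e.g. model.predict(frame_path).
print(type(frame_array), frame_array.shape)
```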
If your model expects raw tensor inputs (InputType == "Tensor"), you should provide a multi-dimensional NumPy array with the appropriate shape and data type. The array's dimensions must match the model's expected input shape, which you can find in the model info (model.model_info.InputShape). The data type of the array's elements should match the model's expected raw data type (model.model_info.InputRawDataType).
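A sketch of building a matching tensor with NumPy. The shape, the raw-type string, and the dtype mapping below are illustrative assumptions, not values read from a real model:

```python
import numpy as np

# Hypothetical values standing in for model.model_info.InputShape and
# model.model_info.InputRawDataType:
input_shape = [[1, 224, 224, 3]]
input_raw_type = ["DG_UINT8"]

# Assumed mapping from raw data type names to NumPy dtypes:
dtype_map = {"DG_UINT8": np.uint8, "DG_FLT": np.float32}

tensor = np.zeros(input_shape[0], dtype=dtype_map[input_raw_type[0]])
print(tensor.shape, tensor.dtype)  # (1, 224, 224, 3) uint8
```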
When you want to process one frame, use predict(). When you have multiple frames to process, use predict_batch(). The predict_batch() method runs predictions on an iterable of frames. The predictions run in a pipeline to maximize throughput, making it more efficient than calling predict() in a loop.
The predict_batch() method accepts a single parameter: an iterator object, for example, a list. Populate this iterator with the same types of data you pass to predict(), such as image paths, image URLs, NumPy arrays, or PIL Image objects.
predict_batch() returns a generator of results. You can loop over these results just as you would iterate through successive predict() calls.
Because predict_batch() returns a generator, simply calling the method does not immediately run inference. Frames are processed only when you iterate over the returned generator (for example, in a for loop).
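The lazy behavior can be demonstrated with a stub whose predict_batch() is a generator, mirroring the PySDK call pattern (the stub and its result strings are illustrative):

```python
class StubModel:
    """Stand-in for a PySDK model; predict_batch() yields one result per frame."""
    def predict_batch(self, frames):
        for frame in frames:  # runs only while the generator is being consumed
            yield f"result for {frame}"

model = StubModel()
results = model.predict_batch(["a.jpg", "b.jpg"])  # no inference has happened yet
print(list(results))  # iterating here is what actually drives the processing
```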
predict_batch() on a video file
This example uses predict_batch() to process a video file. The frame_source generator yields frames from the video, and the model produces predictions for each frame. The results are overlaid on the frame (result.image_overlay) and displayed with OpenCV.
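Here is a runnable sketch of that pipeline using stand-ins, so the structure is visible without OpenCV or a device. With real PySDK, frame_source would read frames via cv2.VideoCapture and the display step would be cv2.imshow; the stub classes and frame strings below are illustrative:

```python
class StubResult:
    """Stand-in for an inference result object."""
    def __init__(self, frame):
        self.image_overlay = f"overlay({frame})"  # real PySDK: annotated image

class StubModel:
    """Stand-in for a PySDK model with a pipelined predict_batch()."""
    def predict_batch(self, frames):
        for frame in frames:
            yield StubResult(frame)

def frame_source(frames):
    """Yield frames one at a time, as a cv2.VideoCapture read loop would."""
    for frame in frames:
        yield frame

model = StubModel()
overlays = []
for result in model.predict_batch(frame_source(["frame0", "frame1"])):
    overlays.append(result.image_overlay)  # with OpenCV: cv2.imshow("win", result.image_overlay)
print(overlays)
```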
InferenceResults objects contain the original image, an overlay image with results drawn on it, and the prediction data itself. What is drawn on the overlay depends on the model type:
Classification models: class labels with probabilities are printed below the original image.
Object detection models: bounding boxes are drawn on the original image.
Hand and pose detection models: keypoints and keypoint connections are drawn on the original image.
Segmentation models: segments are drawn on the original image.
The results property is what you will typically use in your code. It contains the core prediction data. Note that if the model outputs coordinates (e.g., bounding boxes), these are converted back to the coordinate space of the original image for your convenience.
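For illustration, here is a typical shape of the results list for an object detection model. The exact keys depend on the model's postprocessor, so treat the key names below as assumptions:

```python
# Illustrative contents of result.results for a detection model;
# key names ("bbox", "label", "score") are assumed, not guaranteed.
detections = [
    {"bbox": [10.0, 20.0, 110.0, 220.0], "label": "cat", "score": 0.92},
    {"bbox": [150.0, 40.0, 260.0, 200.0], "label": "dog", "score": 0.87},
]

for det in detections:
    x1, y1, x2, y2 = det["bbox"]  # coordinates already in original-image space
    print(f"{det['label']} ({det['score']:.2f}): ({x1}, {y1}) - ({x2}, {y2})")
```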
The predict_batch() method accepts an iterator of data frames, such as a list or a stream, and returns a generator. It processes the iterator in a pipeline to maximize throughput, making it more efficient than calling predict() repeatedly in a loop. This approach is ideal for processing a list of images or a video stream.
The InputType property of the ModelParams class returned by the model_info property describes the number and type of the model's inputs (see the section on model info properties for details).
PySDK automatically converts these inputs into the format required by the model's neural network according to the model's preprocessor settings in its JSON configuration.
When you call predict(), you receive an inference result object derived from the InferenceResults class. Likewise, predict_batch() returns a generator that yields inference result objects. These result classes, known as postprocessors, vary by AI model type (classification, object detection, pose detection, segmentation, and so on), but from your perspective they all provide the same functionality.
image: Original input image as a NumPy array or PIL image.
image_overlay: Original image with inference results drawn on top. The overlay is model-dependent.
results: A list of inference results in dictionary form. Follow the property link for a detailed explanation of all result formats.
image_model: Preprocessed image tensor that was fed into the model (in binary form). Populated only if you enable the save_model_image model property before performing predictions.