
Core Concepts

Explore the core components of PySDK, including the AI inference engine, AI model, and model zoo, to understand how they power modern edge AI applications.


Main Concepts in PySDK

There are three main components in PySDK: the AI inference engine, the AI model, and the AI model zoo. Together, they form the core of the PySDK ecosystem, making it easy to add AI capabilities to any application.

1. AI Inference Engine: The component responsible for performing predictions by running AI models on hardware.

2. AI Model: The actual trained model used to make predictions, such as detecting objects or recognizing faces.

3. AI Model Zoo: A storage location where a collection of AI models is kept and accessed by the inference engine.

What is a Client Application?

Any program you create or use that leverages PySDK to perform AI tasks is a client application. This application could be a Python script, a web service, or any other software that communicates with PySDK to send inputs (like images or video frames) to the AI inference engine and receive predictions (such as detected objects or classifications).
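As a concrete sketch, a minimal client application using PySDK might look like the following. It assumes AI Hub inference (described below); the zoo URL, access token, and model name are placeholders rather than real values.

```python
import degirum as dg  # DeGirum PySDK

# Connect to a model zoo on the AI Hub; the zoo URL and token are placeholders.
zoo = dg.connect(dg.CLOUD, "<workspace>/<zoo_name>", token="<your AI Hub token>")

# Load a model by name (placeholder) and run inference on an image.
model = zoo.load_model("<model_name>")
result = model("path/to/image.jpg")

# Inference results are available as a list of dictionaries.
for prediction in result.results:
    print(prediction)
```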

How the AI Inference Engine Operates Across Environments

The AI inference engine is responsible for running AI models on hardware, but it can be deployed and accessed in various environments to suit different application needs. PySDK supports three key types of inference setups:

  • AI Hub Inference: When the AI inference engine runs on hardware hosted and managed by the DeGirum AI Hub.

  • AI Server Inference: When the AI inference engine is controlled by a local or networked AI server.

  • Local Inference: When the AI inference engine directly communicates with local AI hardware on the same machine as the client application.

Types of AI Inference Supported by PySDK

Let’s explore the different types of inference setups supported by PySDK in more detail:

1. AI Hub Inference

In this setup, the DeGirum AI Hub Application Server manages the inference engine and connects to the DeGirum Device Farm, a collection of cloud-hosted computing nodes with diverse hardware configurations like CPUs, NPUs, and AI accelerators. The client application communicates with the server over a network using HTTP or SocketIO protocols.

When to Use AI Hub Inference

  • Ideal for rapid prototyping and development when you want to test different models and hardware configurations.

  • Useful when local hardware is unavailable or when you need flexibility in exploring various options without purchasing or configuring hardware.

Key Advantage

Similar to services like AWS Rekognition or Clarifai, but with greater flexibility, allowing users to choose hardware configurations and deploy custom models optimized for specific applications.
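As a quick illustration, once connected to the AI Hub you can browse the models available to run on the Device Farm; model names in DeGirum zoos typically encode the target runtime and device. The zoo URL and token below are placeholders.

```python
import degirum as dg

zoo = dg.connect(dg.CLOUD, "<workspace>/<zoo_name>", token="<your AI Hub token>")

# List the models in the zoo; names typically encode the target runtime and
# device, so the same network may appear once per hardware configuration.
for name in zoo.list_models():
    print(name)
```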

2. AI Server Inference

The DeGirum AI Server manages inference locally or across a network, acting as an intermediary between the client application and the AI hardware. The client application communicates with the AI server using HTTP or ASIO protocols, allowing multiple applications or machines to access shared AI hardware.

When to Use AI Server Inference

  • Suitable for distributed environments where multiple applications or machines need shared access to the same AI hardware.

  • Ideal when you want to separate application logic from hardware management and maintain centralized control over resources.

Key Advantage

Similar to NVIDIA Triton or OpenVINO Model Server, but more flexible, enabling easy scaling by allowing multiple applications to share hardware. Ideal for data centers, edge deployments, or lab setups.
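To use AI server inference from a client, you pass the server's hostname or IP address to the connection call instead of a cloud or local designator. A minimal sketch; the server address and model name are placeholders:

```python
import degirum as dg

# Connect to an AI server on the local network (placeholder address).
# With no zoo URL given, the server serves models from its own local zoo.
zoo = dg.connect("192.168.1.42")

model = zoo.load_model("<model_name>")
result = model("path/to/image.jpg")
print(result)
```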

3. Local Inference

In a local inference setup, both the client application and the AI inference engine operate on the same machine, eliminating the need for network communication between separate client and server components. The client directly interacts with the hardware using PySDK through efficient, low-latency function calls.

When to Use Local Inference

  • Ideal for edge devices and standalone AI systems where minimizing latency and reducing external dependencies is critical.

  • Simplifies deployment and maintenance when the application and AI hardware are on the same system.

Key Advantage

Similar to runtimes like ONNX Runtime and TensorFlow Lite, but PySDK’s hardware-agnostic interface enables seamless deployment across different hardware configurations (CPUs, NPUs, AI accelerators) without modifying application code.

While local inference supports multiple applications sharing the AI hardware on the same machine for some hardware options, AI server inference is recommended for distributed setups where multiple machines or applications need shared access to the same hardware over a network.
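In PySDK, local inference corresponds to passing the local designator as the host address. A minimal sketch, assuming a local model zoo directory; the path and model name are placeholders:

```python
import degirum as dg

# dg.LOCAL selects local inference: the client talks to the AI hardware
# directly, with no client-server communication. The zoo path is a placeholder.
zoo = dg.connect(dg.LOCAL, "/path/to/local/model/zoo")

model = zoo.load_model("<model_name>")
result = model("path/to/image.jpg")
print(result)
```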

Summary of Communication Protocols

| Inference Type | Server-Client Protocol | Description |
| --- | --- | --- |
| AI Hub | Yes (HTTP+SocketIO) | Communication over the Internet with the Application Server. Best for rapid prototyping without hardware setup. |
| AI Server | Yes (HTTP/ASIO) | Communication between the client and the AI server using HTTP or ASIO. The server can run on the same machine (localhost) or a different machine on the local network. Ideal for multiple machines sharing the same AI hardware. |
| Local | No | Direct function calls to the hardware using PySDK. Supports multiple applications sharing the same hardware on the same machine. |

AI Models

An AI model is the core component responsible for making predictions, such as detecting objects, recognizing faces, or performing classifications. In PySDK, an AI model is defined by a set of files that include model configurations, binaries, and optional supporting files for labels and postprocessing.

Key Components of an AI Model

1. Model JSON File (<model_name>.json)

Contains the configuration and metadata required for loading and running the model.

Specifies key parameters such as the postprocessing logic needed for interpreting inference results; an abridged example appears after this list.

2. Model Binary Files

Stores the model's weights and architecture in a format optimized for the target hardware and runtime. Required files depend on the model type and the hardware backend.

3. Optional Label File (<label_file>.json)

Maps output class indices to human-readable labels, as specified in the model JSON file.

4. Optional Python Postprocessing File (postprocess.py)

Defines custom postprocessing logic, if needed, as specified in the model JSON file.

For example, object detection models may require additional steps such as decoding bounding boxes or applying Non-Maximum Suppression (NMS).
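As an illustration, below is an abridged, hypothetical model JSON. The exact section names and keys vary by model and hardware backend, so treat this as a sketch rather than a complete schema; see the Model JSON Structure page for the authoritative reference.

```json
{
    "ConfigVersion": 10,
    "DEVICE": [
        { "DeviceType": "CPU", "RuntimeAgent": "ONNX" }
    ],
    "MODEL_PARAMETERS": [
        { "ModelPath": "mobilenet_v2.onnx" }
    ],
    "PRE_PROCESS": [
        { "InputType": "Image", "InputN": 1, "InputH": 224, "InputW": 224, "InputC": 3 }
    ],
    "POST_PROCESS": [
        { "OutputPostprocessType": "Classification", "LabelsPath": "labels_mobilenet.json" }
    ]
}
```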

AI Model Zoos Supported by PySDK

An AI model zoo is a collection of AI models. Model zoos simplify model management, centralize storage, and provide a consistent way to load models.

PySDK supports both local and AI Hub-based model zoos to accommodate diverse development and deployment needs.

1. Local Model Zoos

  • Models are stored in a directory on the local computer.

  • For local inference, models are located on the same computer as the client application.

  • For AI server inference, the models are stored on the computer running the AI server software.

2. AI Hub Model Zoos

  • Models are stored on the AI Hub.

  • AI Hub provides a centralized location where models can be uploaded, organized, and shared across different environments, similar to platforms like Hugging Face model repositories.

  • AI Hub model zoos act solely as storage and retrieval systems, ensuring that models are easily available without requiring manual downloads or local management. This allows for quick updates and streamlines deployment across various setups.

  • While the AI Hub Model Zoo offers a web-based GUI for browsing and managing models, it also allows programmatic access through PySDK (as shown below), making it easy for developers to load models directly into their applications.
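For example, a client can pull a model from an AI Hub model zoo while running inference locally; the zoo URL, token, and model name below are placeholders (see the supported combinations in the next section):

```python
import degirum as dg

# Local inference with an AI Hub model zoo: the model is downloaded from
# the AI Hub, while inference runs on local hardware.
zoo = dg.connect(dg.LOCAL, "<workspace>/<zoo_name>", token="<your AI Hub token>")
model = zoo.load_model("<model_name>")
```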

Supported Combinations of AI Inference and Model Zoo Types

PySDK provides flexible options for combining AI inference types and model zoos, supporting almost all combinations to suit various deployment needs. Below are the supported configurations:

| Inference Type | Model Zoo Type | Description |
| --- | --- | --- |
| AI Hub | AI Hub | The client application connects to the Application Server, which accesses models stored in the AI Hub model zoo. |
| AI Server | AI Hub | The AI server downloads models from the AI Hub model zoo and provides inference to the client application. |
| AI Server | Local Folder | The AI server accesses models stored locally and provides inference to the client application over a network. |
| Local | AI Hub | The client application, running on the same machine as the AI hardware, downloads models from a model zoo on the AI Hub. |
| Local | Local Folder | The client application directly accesses models stored locally on the machine. |
| Local | Local File | The client application loads a specific model directly from its .json configuration file. |

AI Hub inference with a local model zoo is not supported. The AI Hub Server requires models to be hosted on the AI Hub for remote access.
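In code, the combination is selected by the host address and zoo URL passed to the connection call. A sketch with placeholder values covering the variants not shown above:

```python
import degirum as dg

# Local inference + local folder zoo: the zoo URL is a directory path.
folder_zoo = dg.connect(dg.LOCAL, "/path/to/zoo_folder")

# Local inference + local file zoo: the zoo URL points at a single model's
# .json configuration file (placeholder path).
file_zoo = dg.connect(dg.LOCAL, "/path/to/zoo_folder/some_model/some_model.json")

# AI server inference + AI Hub zoo: server address plus AI Hub zoo URL and token.
server_zoo = dg.connect("192.168.1.42", "<workspace>/<zoo_name>", token="<your AI Hub token>")
```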

For more details, see our page on Model JSON Structure.

For more details, see Supported Hardware.

If you'll host an AI server or perform inference with a local server, read the AI Server Setup page. You'll learn how to set up a local model zoo in preparation for running inferences.

If you plan to use the AI Hub for inference, go to Loading an AI Model to learn how to load models. The models are already managed for you, so you can go straight to running inferences.