Device Management for Inference

Configure and switch between device types when running inference with DeGirumJS.

AI models can run on a variety of hardware configurations, and DeGirumJS provides a flexible way to manage the device type used for inference. This is particularly useful when you want to switch between hardware accelerators or runtimes without significantly changing your code.

Both the AIServerModel and CloudServerModel classes offer the same device-management interface, allowing you to configure and switch between devices dynamically.

Supported Device Types

Each model has a set of SupportedDeviceTypes, which indicates the runtime/device combinations that are compatible for inference. The format for device types is "RUNTIME/DEVICE", where:

  • RUNTIME refers to the AI engine or runtime used for inference (e.g., TENSORRT, OPENVINO).

  • DEVICE refers to the hardware type (e.g., CPU, GPU, NPU).
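As a plain JavaScript sketch of this naming convention, a device type string can be split into its two parts. Note that the helper name parseDeviceType is hypothetical for illustration and is not part of the DeGirumJS API:

```javascript
// Hypothetical helper illustrating the "RUNTIME/DEVICE" naming convention;
// it is not part of the DeGirumJS API.
function parseDeviceType(deviceType) {
    const [runtime, device] = deviceType.split('/');
    if (!runtime || !device) {
        throw new Error(`Invalid device type string: "${deviceType}"`);
    }
    return { runtime, device };
}

console.log(parseDeviceType('TENSORRT/GPU')); // { runtime: 'TENSORRT', device: 'GPU' }
```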

AIServerModel / CloudServerModel Device Management

In the AIServerModel and CloudServerModel classes, device management is integrated into both the initialization and runtime phases of the model lifecycle. Below are key scenarios and examples:

Default Device Type Selection

When you load a model without specifying a device type, the default device type specified in the model parameters is selected.

let model = await zoo.loadModel('your_model_name');
console.log(model.deviceType); // Outputs: "DefaultRuntime/DefaultAgent"

Switching Device Types After Initialization

You can change the device type even after the model has been initialized. The model will validate the requested device type against the system’s supported device types.

model.deviceType = 'RUNTIME2/CPU';
console.log(model.deviceType); // Outputs: "RUNTIME2/CPU"

If the requested device type is not valid, an error will be thrown.

Specifying a Device Type During Initialization

You can specify a device type when loading the model. The model will start with the specified device type if it’s available.

let model = await zoo.loadModel('your_model_name', { deviceType: 'RUNTIME2/CPU' });
console.log(model.deviceType); // Outputs: "RUNTIME2/CPU"

Handling Multiple Device Types

The SDK allows you to provide a list of device types. The first available option in the list will be selected.

model.deviceType = ['RUNTIME3/CPU', 'RUNTIME1/CPU'];
console.log(model.deviceType); // Outputs: "RUNTIME3/CPU" if available, otherwise "RUNTIME1/CPU"
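The first-available selection behavior can be sketched in plain JavaScript: walk the requested list in order and pick the first entry present in the supported set, throwing if none match. The helper below (pickDeviceType is a hypothetical name, not an SDK function) mirrors that logic against a list such as the one reported by model.supportedDeviceTypes:

```javascript
// Hypothetical sketch of first-available device selection; not part of the DeGirumJS API.
// `requested` is a device type string or an array of them;
// `supported` is a list of "RUNTIME/DEVICE" strings.
function pickDeviceType(requested, supported) {
    const candidates = Array.isArray(requested) ? requested : [requested];
    const match = candidates.find((dt) => supported.includes(dt));
    if (match === undefined) {
        throw new Error(`None of [${candidates.join(', ')}] are supported`);
    }
    return match;
}

const supported = ['RUNTIME1/CPU', 'RUNTIME3/CPU'];
console.log(pickDeviceType(['RUNTIME3/CPU', 'RUNTIME1/CPU'], supported)); // 'RUNTIME3/CPU'
```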

Fallback and Error Handling

If none of the specified device types are supported, the model will throw an error, ensuring that only valid configurations are used.

try {
    model.deviceType = ['INVALID/DEVICE', 'ANOTHER_INVALID/DEVICE'];
} catch (e) {
    console.error('Invalid device type selection:', e.message);
}

Checking a Model's Supported Device Types

You can check the supported device types for a model using the supportedDeviceTypes property.

console.log(model.supportedDeviceTypes); // Outputs: ["RUNTIME1/CPU", "RUNTIME2/CPU"]

System Supported Device Types

You can check the system’s list of supported devices for inference using the getSupportedDevices() method of the dg_sdk class.

let dg = new dg_sdk();
let aiserverDevices = dg.getSupportedDevices('targetAIServerIp');
console.log(aiserverDevices); // Outputs: ["RUNTIME1/CPU", "RUNTIME2/CPU", "RUNTIME3/CPU"]
let cloudDevices = dg.getSupportedDevices('cloud');
console.log(cloudDevices); // Outputs: ["RUNTIME1/CPU", "RUNTIME2/CPU", "RUNTIME3/CPU"]

Device management in both AIServerModel and CloudServerModel is designed to be flexible, allowing you to fine-tune the inference environment. You can easily switch between device types, handle fallbacks, and ensure that your models are always running on supported configurations.
