Inference setup

PySDK gives you flexibility in where models are stored and where inferences run. This page walks through common setups (cloud, local, and hybrid) so you can choose what fits your workflow.

Estimated read time: 2 minutes

Inference setup—pick your scenario

This page focuses on two ModelSpec parameters:

zoo_url: where the model is stored (cloud vs. local)
inference_host_address: where the model runs ("@cloud" vs. "@local")

Pick the setup that matches where you want inference to run and where your models live.
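
The three setups on this page differ only in those two choices. As a rough summary, the sketch below uses a hypothetical `Scenario` record and `pick_scenario` helper; neither is part of PySDK or degirum_tools, and both exist only to make the mapping explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """Hypothetical summary record; not part of PySDK or degirum_tools."""

    model_storage: str            # where the model is stored: "cloud" or "local"
    inference_host_address: str   # where the model runs: "@cloud" or "@local"


# The three setups described on this page.
SCENARIOS = {
    "cloud inference": Scenario("cloud", "@cloud"),
    "local inference with cloud zoo": Scenario("cloud", "@local"),
    "local inference with local zoo": Scenario("local", "@local"),
}


def pick_scenario(run_locally: bool, store_locally: bool) -> str:
    """Return the name of the setup matching the two choices."""
    if not run_locally:
        if store_locally:
            # This page does not describe a local zoo with a cloud runtime.
            raise ValueError("combination not covered on this page")
        return "cloud inference"
    if store_locally:
        return "local inference with local zoo"
    return "local inference with cloud zoo"


name = pick_scenario(run_locally=True, store_locally=False)
print(name, SCENARIOS[name])
```

The returned name matches the section headings that follow, so it can be used to find the corresponding ModelSpec example.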

AI Server setups—like custom endpoints and multi-model hosting—are covered in the Advanced topics section.

Cloud inference

Cloud zoo → cloud runtime

Use when: You want zero local setup—everything runs in the cloud.

from degirum_tools import ModelSpec

spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_hailort_multidevice_1",
    zoo_url="degirum/hailo",  # cloud model zoo
    inference_host_address="@cloud",  # inference executes in the cloud
    model_properties={"device_type": ["HAILORT/HAILO8L", "HAILORT/HAILO8"]},
)
model = spec.load_model()

Local inference with cloud zoo

Cloud zoo → local runtime

Use when: You have local hardware but prefer to fetch models from the cloud.

from degirum_tools import ModelSpec

spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_hailort_multidevice_1",
    zoo_url="degirum/hailo",  # fetch artifacts from cloud
    inference_host_address="@local",  # inference executes on your machine
    model_properties={"device_type": ["HAILORT/HAILO8L", "HAILORT/HAILO8"]},
)
model = spec.load_model()

Local inference with local zoo

Local zoo → local runtime

Use when: You want offline operation and predictable model behavior. Call ModelSpec.ensure_local() once while online to download the model and point the ModelSpec at the local copy, then load the model as usual.

from degirum_tools import ModelSpec

spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_hailort_multidevice_1",
    zoo_url="degirum/hailo",  # start from cloud reference
    inference_host_address="@local",  # inference executes locally
    model_properties={"device_type": ["HAILORT/HAILO8L", "HAILORT/HAILO8"]},
)

# One-time (while online): download/verify artifacts and update spec to local zoo.
spec.ensure_local()

# After this, loading uses the local zoo (offline-friendly).
model = spec.load_model()

Once loaded, model objects are callable: model(x) ≡ model.predict(x)
Public model zoos generally don't require a token; private ones do.
For advanced setups (e.g., using AI Server or hosting your own model zoo), see Advanced topics.
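
To illustrate what "callable" means in the first note, here is a minimal sketch with a toy stand-in class. `ToyModel` is hypothetical and is not the PySDK model class; it only shows the general Python pattern of `__call__` forwarding to `predict`.

```python
class ToyModel:
    """Hypothetical stand-in for a loaded model; not the PySDK class."""

    def predict(self, x):
        # Placeholder "inference": wrap the input in a result dictionary.
        return {"input": x, "label": "toy"}

    def __call__(self, x):
        # Calling the object forwards to predict().
        return self.predict(x)


toy = ToyModel()
# Both spellings produce the same result for the same input.
assert toy("frame.jpg") == toy.predict("frame.jpg")
```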

Last updated 4 months ago
