# Inference with local models

*Estimated read time: 3 minutes*

This setup runs inference on local hardware using DeGirum AI Server. The model is stored in a local folder on the machine where the server runs.

Point the client at the server with `inference_host_address="localhost"` when both run on the same machine, or with `inference_host_address="<host_ip>:<port>"` when the server runs on a different host.
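
As a minimal sketch of how the two address forms relate, the helper below splits an address string into a host and a port, falling back to the default AI Server port 8778 when no port is given. The name `split_host_address` is hypothetical and used only for illustration; it is not part of PySDK, and it handles only hostnames and IPv4 addresses.

```python
DEFAULT_PORT = 8778  # default AI Server TCP port


def split_host_address(address: str) -> tuple[str, int]:
    """Split "<host>" or "<host>:<port>" into (host, port)."""
    host, sep, port = address.partition(":")
    if not host:
        raise ValueError(f"missing host in {address!r}")
    # No ":" in the string means no explicit port, so use the default.
    return host, int(port) if sep else DEFAULT_PORT


print(split_host_address("localhost"))
print(split_host_address("192.168.0.10:9000"))
```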

## Client and AI Server on the same host

In this case, both the AI Server and your client application run on the same machine.

First, download the model to a local folder using the `degirum download-zoo` command:

{% code overflow="wrap" %}

```bash
ZOO="$HOME/degirum_model_zoo"
mkdir -p "$ZOO"

degirum download-zoo \
  --path "$ZOO" \
  --url https://hub.degirum.com/degirum/hailo \
  --model_family yolov8n_coco--640x640_quant_hailort_multidevice_1
```

{% endcode %}

Then launch the AI Server with the following command:

{% code overflow="wrap" %}

```bash
degirum server --zoo "$ZOO"
```

{% endcode %}

You should see output like:

{% code overflow="wrap" %}

```text
DeGirum asio server is started at TCP port 8778
Local model zoo is served from '/home/degirum/degirum_model_zoo' directory.
Press Enter to stop the server
```

{% endcode %}

{% hint style="info" %}
The zoo path shown in your output may differ.
{% endhint %}

The server runs until you press `ENTER`. By default, it listens on TCP port 8778. To change the port, use the `--port` argument:

{% code overflow="wrap" %}

```bash
degirum server --port <your_port> --zoo "$ZOO"
```

{% endcode %}
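
Before pointing a client at the server, it can help to confirm that something is accepting connections on the chosen port. The following is a minimal sketch using only the Python standard library; `server_is_listening` is a hypothetical helper name, and a successful connection shows only that the TCP port is open, not that the listener is the AI Server.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 8778,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


print("Port reachable:", server_is_listening())
```

If you started the server with `--port <your_port>`, pass the same value as the `port` argument.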

## Example ModelSpec

This example configures `ModelSpec` to use the AI Server and load the model from the local zoo:

<pre class="language-python" data-overflow="wrap"><code class="lang-python"># Example ModelSpec
model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_hailort_multidevice_1",
    <a data-footnote-ref href="#user-content-fn-1">zoo_url="aiserver://"</a>,
    <a data-footnote-ref href="#user-content-fn-2">inference_host_address="localhost"</a>,
    model_properties={"device_type": ["HAILORT/HAILO8L", "HAILORT/HAILO8"]}
)
</code></pre>

* `zoo_url="aiserver://"`: loads the model from the zoo path passed to `--zoo` when the AI Server was launched.
* `inference_host_address="localhost"`: runs inference using the Hailo device managed by the local AI Server.
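
A minimal sketch of how a client might use this spec follows. It assumes that `ModelSpec` is imported from the `degirum_tools` package and exposes a `load_model()` method, and that the loaded model can be called on an image path like other PySDK models. The helper name `run_local_inference` is hypothetical.

```python
# Keyword arguments matching the ModelSpec example above.
SPEC_KWARGS = {
    "model_name": "yolov8n_coco--640x640_quant_hailort_multidevice_1",
    "zoo_url": "aiserver://",               # model comes from the server's --zoo path
    "inference_host_address": "localhost",  # AI Server on this machine, default port
    "model_properties": {"device_type": ["HAILORT/HAILO8L", "HAILORT/HAILO8"]},
}


def run_local_inference(image_path: str):
    """Load the model through the local AI Server and run it on one image."""
    # Assumption: degirum_tools provides ModelSpec with a load_model() method.
    # The import is deferred so this file can be loaded without the package.
    from degirum_tools import ModelSpec

    model = ModelSpec(**SPEC_KWARGS).load_model()
    return model(image_path)  # run inference on a single image
```

With the AI Server running, calling `run_local_inference("path/to/image.jpg")` (a placeholder path) should return the inference results for that image.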

## Client and AI Server on different hosts

To run inference from a separate client machine, first use `degirum download-zoo` to download the model to the host that will run the AI Server:

{% code overflow="wrap" %}

```bash
ZOO="$HOME/degirum_model_zoo"
mkdir -p "$ZOO"

degirum download-zoo \
  --path "$ZOO" \
  --url https://hub.degirum.com/degirum/hailo \
  --model_family yolov8n_coco--640x640_quant_hailort_multidevice_1
```

{% endcode %}

Start the AI Server on the remote host:

{% code overflow="wrap" %}

```bash
degirum server --zoo "$ZOO"
```

{% endcode %}

You should see output like:

{% code overflow="wrap" %}

```text
DeGirum asio server is started at TCP port 8778
Local model zoo is served from '/home/degirum/degirum_model_zoo' directory.
Press Enter to stop the server
```

{% endcode %}

As before, the server runs until you press `ENTER`. By default, it listens on TCP port 8778. To change the port, use the `--port` argument:

{% code overflow="wrap" %}

```bash
degirum server --port <your_port> --zoo "$ZOO"
```

{% endcode %}

## Example ModelSpec

This time, the client points to a remote host:

<pre class="language-python" data-overflow="wrap"><code class="lang-python"># Example ModelSpec
model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_hailort_multidevice_1",
    <a data-footnote-ref href="#user-content-fn-1">zoo_url="aiserver://"</a>,
    <a data-footnote-ref href="#user-content-fn-3">inference_host_address="&#x3C;host_ip>:&#x3C;port>"</a>,
    model_properties={"device_type": ["HAILORT/HAILO8L", "HAILORT/HAILO8"]}
)
</code></pre>

* `zoo_url="aiserver://"`: still points to the AI Server zoo.
* `inference_host_address="<host_ip>:<port>"`: runs inference using the Hailo device on the remote AI Server.
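
Because only `inference_host_address` differs between the same-host and different-host setups, one option is to read it from the environment so the same client code works in both. This is a sketch under stated assumptions: `AI_SERVER_ADDRESS` is a hypothetical variable name and `resolve_host_address` a hypothetical helper; neither is defined by PySDK.

```python
import os


def resolve_host_address(env=os.environ) -> str:
    """Return the AI Server address from AI_SERVER_ADDRESS, else "localhost"."""
    # AI_SERVER_ADDRESS is a hypothetical variable name chosen for this sketch.
    return env.get("AI_SERVER_ADDRESS", "").strip() or "localhost"


# Same fields as the ModelSpec example; only the host address varies.
spec_kwargs = {
    "model_name": "yolov8n_coco--640x640_quant_hailort_multidevice_1",
    "zoo_url": "aiserver://",
    "inference_host_address": resolve_host_address(),
    "model_properties": {"device_type": ["HAILORT/HAILO8L", "HAILORT/HAILO8"]},
}
print(spec_kwargs["inference_host_address"])
```

Setting `AI_SERVER_ADDRESS` to a value of the form `<host_ip>:<port>` on the client switches it to the remote server; leaving it unset keeps the client on `localhost`.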

[^1]: Indicates that the model will be loaded from a model zoo path specified when the AI Server is run.

[^2]: Indicates that the model will be run using Hailo devices managed by the AI Server listening on `"localhost:8778"`.

[^3]: Indicates that the model will be run using Hailo devices managed by the AI Server available at `"<host_ip>:<port>"`.

