# Inference with cloud models

*Estimated read time: 2 minutes*

This setup runs inference on hardware managed by your own AI Server, either on the same machine (`inference_host_address="localhost"`) or on a remote host (`inference_host_address="<host_ip>:<port>"`), while fetching models from the cloud zoo (`zoo_url="degirum/axelera"`). It's useful when compute stays on your own hardware, but you want to pull models directly from DeGirum's public AI Hub.

## Client and AI Server on the same host

In this case, the AI Server and client application run on the same machine.

To start the server, run:

{% code overflow="wrap" %}

```bash
degirum server
```

{% endcode %}

You should see output similar to:

{% code overflow="wrap" %}

```bash
DeGirum asio server is started at TCP port 8778
Local model zoo is served from '.' directory.  
Press Enter to stop the server
```

{% endcode %}

The server runs until you press `ENTER` in the terminal. By default, it listens on TCP port 8778. To specify a different port, use the `--port` argument:

{% code overflow="wrap" %}

```bash
degirum server --port <your_port>
```

{% endcode %}

### Example ModelSpec

This `ModelSpec` runs inference on a local AI Server using a model fetched from DeGirum's public Axelera model zoo:

<pre class="language-python" data-overflow="wrap"><code class="lang-python"># Example ModelSpec
model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    <a data-footnote-ref href="#user-content-fn-1">zoo_url="degirum/axelera"</a>,
    <a data-footnote-ref href="#user-content-fn-2">inference_host_address="localhost:8778"</a>,
    model_properties={"device_type": ["AXELERA/METIS"]}
)
</code></pre>

* `zoo_url="degirum/axelera"`: loads the model from [DeGirum's public Axelera model zoo](https://hub.degirum.com/public-models/degirum/axelera?utm_source=docs.degirum.com\&utm_medium=site\&utm_campaign=cookbooks-axelera-cookbook-ai-server-inference-with-cloud-models).
* `inference_host_address="localhost:8778"`: runs inference using Axelera devices managed by a local AI Server.
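Putting the spec's parameters to work might look like the following sketch. It assumes the DeGirum PySDK (`pip install degirum`) and its `degirum.load_model()` entry point; the image path and token placeholder are illustrative, and the AI Server must already be running on this machine:

```python
import degirum as dg

# Load the model from the public cloud zoo through the local AI Server.
# Extra keyword arguments (e.g. device_type) are passed through as
# model properties.
model = dg.load_model(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    inference_host_address="localhost:8778",  # local AI Server
    zoo_url="degirum/axelera",                # public Axelera cloud zoo
    token="<your_token>",                     # AI Hub token, if required
    device_type=["AXELERA/METIS"],
)

# Run inference on a single image (file path, URL, or numpy array).
result = model("example.jpg")
print(result)  # detections: boxes, labels, confidence scores
```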

## Client and AI Server on different hosts

You can also run the AI Server on a remote host. On that remote machine, launch the server:

{% code overflow="wrap" %}

```bash
degirum server
```

{% endcode %}

Expected output:

{% code overflow="wrap" %}

```bash
DeGirum asio server is started at TCP port 8778
Local model zoo is served from '.' directory.  
Press Enter to stop the server
```

{% endcode %}

The server runs until you press `ENTER` in the terminal. By default, it listens on TCP port 8778. To specify a different port, use the `--port` argument:

{% code overflow="wrap" %}

```bash
degirum server --port <your_port>
```

{% endcode %}

### Example ModelSpec

This `ModelSpec` runs inference on a remote AI Server using a model fetched from DeGirum's public Axelera model zoo:

<pre class="language-python" data-overflow="wrap"><code class="lang-python"># Example ModelSpec
model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    <a data-footnote-ref href="#user-content-fn-1">zoo_url="degirum/axelera"</a>,
    <a data-footnote-ref href="#user-content-fn-3">inference_host_address="&#x3C;host_ip>:&#x3C;port>"</a>,
    model_properties={"device_type": ["AXELERA/METIS"]}
)
</code></pre>

* `zoo_url="degirum/axelera"`: fetches the model from [DeGirum's public Axelera model zoo](https://hub.degirum.com/public-models/degirum/axelera?utm_source=docs.degirum.com\&utm_medium=site\&utm_campaign=cookbooks-axelera-cookbook-ai-server-inference-with-cloud-models).
* `inference_host_address="<host_ip>:<port>"`: runs inference using Axelera devices managed by the remote AI Server.
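Before loading a model, it can help to confirm that the remote AI Server is reachable and can see the cloud zoo. One way, sketched below under the assumption that the PySDK's `degirum.connect()` is available, is to list the models the server can serve (the host address and token are placeholders):

```python
import degirum as dg

# Connect to the remote AI Server, pointing it at the public cloud zoo.
zoo = dg.connect(
    "<host_ip>:<port>",    # address of the remote AI Server
    "degirum/axelera",     # public Axelera cloud zoo
    token="<your_token>",  # AI Hub token, if required
)

# List model names available through this server/zoo combination.
print(zoo.list_models())
```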

[^1]: Indicates that the model will be loaded from [DeGirum's Axelera public model zoo](https://hub.degirum.com/public-models/degirum/axelera?utm_source=docs.degirum.com\&utm_medium=site\&utm_campaign=cookbooks-axelera-cookbook-ai-server-inference-with-cloud-models).

[^2]: Indicates that the model will be run using Axelera devices managed by the AI Server listening on `"localhost:8778"`.

[^3]: Indicates that the model will be run using Axelera devices managed by the AI Server available at `"<host_ip>:<port>"`.
