# Measuring performance

*Estimated read time: 3 minutes*

Performance work starts with trusted measurements. This guide shows how to enable timing collection, gather per-frame metrics, and compute throughput for repeatable regression tracking. Each code sample includes inline setup so you can copy and run sections independently.

## Baseline checklist

Use this repeatable sequence whenever you capture new numbers:

{% stepper %}
{% step %}
**Reset state**

Set `model.measure_time = True`, then call `model.reset_time_stats()` before the first measured run.
{% endstep %}

{% step %}
**Collect baselines**

Record system info (`degirum sys-info`), model version, and any custom parameters.
{% endstep %}

{% step %}
**Execute a fixed workload**

Run a deterministic loop (same inputs, same batch size) using `predict` or `predict_batch`.
{% endstep %}

{% step %}
**Capture outputs**

Log `result.timing`, `model.time_stats()`, and the wall-clock throughput you measure yourself.
{% endstep %}

{% step %}
**Publish context**

Store the script, command, environment details, and raw logs so others can reproduce the run.
{% endstep %}
{% endstepper %}
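
The "Publish context" step can be sketched as a small record written next to the raw logs. This is a minimal sketch using only the Python standard library; `build_run_context`, its field names, and the `measure.py` script name are hypothetical and not part of the DeGirum API. Output from `degirum sys-info` can be stored alongside it.

{% code overflow="wrap" %}

```python
import json
import platform
import sys
from datetime import datetime, timezone


def build_run_context(model_name, parameters, command):
    """Collect environment details to store next to the raw timing logs."""
    return {
        "captured_at": datetime.now(timezone.utc).isoformat(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "machine": platform.machine(),
        "model_name": model_name,
        "parameters": parameters,
        "command": command,
    }


context = build_run_context(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    parameters={"samples": 50, "input": "three_persons"},
    command="python measure.py",  # hypothetical script name
)
print(json.dumps(context, indent=2))
```

{% endcode %}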

## Inspect per-frame timing

`InferenceResults.timing` exposes millisecond breakdowns for preprocessing, inference, and postprocessing when `model.measure_time` is `True`.

### Example

{% code overflow="wrap" %}

```python
from degirum_tools import ModelSpec, remote_assets

model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    zoo_url="degirum/axelera",
    inference_host_address="@local",
    model_properties={"device_type": ["AXELERA/METIS"]},
)
model = model_spec.load_model()
model.measure_time = True

# Optional warm-up inference
_ = model(remote_assets.three_persons)

result = model(remote_assets.three_persons)
for stage, value in result.timing.items():
    print(f"{stage}: {value:.2f} ms")
```

{% endcode %}

Example output:

{% code overflow="wrap" %}

```text
preprocess: 2.87 ms
inference: 34.68 ms
postprocess: 4.09 ms
```

{% endcode %}

{% hint style="info" %}
To persist values, append them to a JSON or CSV log alongside any metadata from `result.info`.
{% endhint %}
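
One way to persist per-frame values is a JSON Lines log, with one record per inference. The sketch below uses only the standard library; `append_timing` is a hypothetical helper, and the sample dictionary stands in for `result.timing` so the snippet runs without hardware.

{% code overflow="wrap" %}

```python
import json
import tempfile
from pathlib import Path


def append_timing(log_path, timing, metadata=None):
    """Append one per-frame timing record as a single JSON line."""
    record = {"timing": dict(timing), "metadata": metadata or {}}
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")


# Stand-in values copied from the example output above;
# in a real run, pass result.timing instead.
sample_timing = {"preprocess": 2.87, "inference": 34.68, "postprocess": 4.09}

log_path = Path(tempfile.mkdtemp()) / "timing_log.jsonl"
append_timing(log_path, sample_timing, {"frame": 0})
append_timing(log_path, sample_timing, {"frame": 1})

# Read the log back to confirm every line is valid JSON
records = [
    json.loads(line)
    for line in log_path.read_text(encoding="utf-8").splitlines()
]
print(len(records), "records written to", log_path)
```

{% endcode %}

Appending one line per frame keeps the log valid even if a run is interrupted partway through.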

## Aggregate with time stats

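`model.time_stats()` returns the minimum, average, maximum, and sample count for each stage, accumulated since the last `model.reset_time_stats()` call. Statistics are collected only while `model.measure_time` is `True`, so enable it before running the workload loop.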

### Example

{% code overflow="wrap" %}

```python
from degirum_tools import ModelSpec, remote_assets

model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    zoo_url="degirum/axelera",
    inference_host_address="@local",
    model_properties={"device_type": ["AXELERA/METIS"]},
)
model = model_spec.load_model()
model.measure_time = True

model.reset_time_stats()
for _ in range(20):
    _ = model(remote_assets.three_persons)

for stage, stats in model.time_stats().items():
    print(
        f"{stage:>30}: min={stats.min:.2f} ms avg={stats.avg:.2f} ms "
        f"max={stats.max:.2f} ms count={stats.cnt}"
    )
```

{% endcode %}

Example output:

{% code overflow="wrap" %}

```text
                    preprocess: min=2.81 ms avg=2.89 ms max=2.95 ms count=20
                     inference: min=34.51 ms avg=34.73 ms max=35.12 ms count=20
                   postprocess: min=3.98 ms avg=4.07 ms max=4.22 ms count=20
```

{% endcode %}

{% hint style="warning" %}
Keep inputs consistent. Mixing URLs, file paths, or arrays with different sizes can skew timings.
{% endhint %}

## Compute throughput

For end-to-end throughput, measure a block of inferences with `time.perf_counter`.

### Example

{% code overflow="wrap" %}

```python
import time
from degirum_tools import ModelSpec, remote_assets

model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    zoo_url="degirum/axelera",
    inference_host_address="@local",
    model_properties={"device_type": ["AXELERA/METIS"]},
)
model = model_spec.load_model()

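samples = 50

# Warm-up run so one-time initialization cost is excluded from the measurement
_ = model(remote_assets.three_persons)

# Clears accumulated per-stage stats; relevant only when model.measure_time is enabled
model.reset_time_stats()

# Time a fixed block of sequential inferences
start = time.perf_counter()
for _ in range(samples):
    _ = model(remote_assets.three_persons)
wall_clock = time.perf_counter() - start

print(f"Throughput: {samples / wall_clock:.2f} FPS")
print(f"Average latency: {(wall_clock / samples) * 1000:.2f} ms")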
```

{% endcode %}

Example output:

{% code overflow="wrap" %}

```text
Throughput: 26.41 FPS
Average latency: 37.88 ms
```

{% endcode %}

{% hint style="success" %}
Share both wall-clock and per-stage stats in your report so teammates can spot preprocessing or transfer bottlenecks.
{% endhint %}
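
To reuse the same wall-clock measurement across scripts, the timing loop can be wrapped in a helper. This is a minimal sketch: `measure_throughput` and `fake_inference` are hypothetical names, and the sleep-based stand-in replaces a real call such as `model(frame)` so the snippet runs anywhere.

{% code overflow="wrap" %}

```python
import time


def measure_throughput(run_once, samples=50, warmup=1):
    """Time `samples` calls of `run_once`; return (fps, avg_latency_ms)."""
    for _ in range(warmup):
        run_once()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(samples):
        run_once()
    wall_clock = time.perf_counter() - start
    return samples / wall_clock, (wall_clock / samples) * 1000


def fake_inference():
    """Stand-in workload; replace with a call such as model(frame)."""
    time.sleep(0.002)


fps, latency_ms = measure_throughput(fake_inference, samples=20)
print(f"Throughput: {fps:.2f} FPS")
print(f"Average latency: {latency_ms:.2f} ms")
```

{% endcode %}

Because the calls run one after another, the reported FPS is the inverse of the average latency; a pipelined workload would need a different measurement.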

## Track regressions over time

* Store raw timing logs under version control or an observability system.
* Re-run the exact script when you change firmware, runtimes, or postprocessing parameters.
* Compare against previous baselines before deploying to production. If a regression appears, bisect by toggling one change at a time.
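
The baseline comparison can be automated with a simple threshold check. The sketch below assumes per-stage averages are stored as plain dictionaries; `find_regressions`, the 5% tolerance, and the `current` numbers are illustrative assumptions, not measured results.

{% code overflow="wrap" %}

```python
def find_regressions(baseline, current, tolerance=0.05):
    """Return stages whose average time grew by more than `tolerance`.

    `tolerance` is a fraction: 0.05 means a 5% slowdown is allowed.
    """
    regressions = {}
    for stage, base_ms in baseline.items():
        new_ms = current.get(stage)
        if new_ms is None or base_ms <= 0:
            continue  # stage missing in the new run or baseline unusable
        change = (new_ms - base_ms) / base_ms
        if change > tolerance:
            regressions[stage] = change
    return regressions


# The baseline reuses the averages from the example output earlier on
# this page; the `current` values are made up for illustration.
baseline = {"preprocess": 2.89, "inference": 34.73, "postprocess": 4.07}
current = {"preprocess": 2.91, "inference": 38.20, "postprocess": 4.05}

for stage, change in find_regressions(baseline, current).items():
    print(f"{stage}: +{change:.1%} vs. baseline")
```

{% endcode %}

Pick a tolerance wider than the run-to-run variation you observe, otherwise normal noise will be flagged as a regression.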

