# Measuring performance

*Estimated read time: 3 minutes*

Performance work starts with trusted measurements. This guide shows how to enable timing collection, gather per-frame metrics, and compute throughput for repeatable regression tracking. Each code sample includes inline setup so you can copy and run sections independently.

## Baseline checklist

Use this repeatable sequence whenever you capture new numbers:

{% stepper %}
{% step %}
**Reset state**

Call `model.reset_time_stats()` before the first measured run.
{% endstep %}

{% step %}
**Collect baselines**

Record system info (`degirum sys-info`), model version, and any custom parameters.
{% endstep %}

{% step %}
**Execute a fixed workload**

Run a deterministic loop (same inputs, same batch size) using `predict` or `predict_batch`.
{% endstep %}

{% step %}
**Capture outputs**

Log `result.timing`, `model.time_stats()`, and manual throughput numbers.
{% endstep %}

{% step %}
**Publish context**

Store the script, command, environment details, and raw logs so others can reproduce the run.
{% endstep %}
{% endstepper %}
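The checklist above can be sketched as a small harness. This is a minimal, hypothetical sketch: `run_benchmark`, its return shape, and the `inputs` list are illustrative conveniences, not part of the DeGirum API; only `reset_time_stats()` and the call syntax mirror the samples below.

{% code overflow="wrap" %}

```python
import platform
import time


def run_benchmark(model, inputs, label="baseline"):
    """Reset stats, run a fixed workload, and return a reproducible record."""
    # Step 1: reset accumulated timing state before measuring.
    model.reset_time_stats()

    # Step 2: collect environment context to store alongside the numbers.
    context = {"label": label, "python": platform.python_version()}

    # Step 3: execute a deterministic loop over fixed inputs.
    start = time.perf_counter()
    for item in inputs:
        _ = model(item)
    wall_clock = time.perf_counter() - start

    # Steps 4-5: capture outputs and publish enough detail to reproduce the run.
    return {
        "context": context,
        "samples": len(inputs),
        "wall_clock_s": wall_clock,
        "fps": len(inputs) / wall_clock,
    }
```

{% endcode %}

Storing the returned record next to the raw logs gives teammates everything they need to rerun the same workload later.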

## Inspect per-frame timing

`InferenceResults.timing` exposes millisecond breakdowns for preprocessing, inference, and postprocessing when `model.measure_time` is `True`.

### Example

{% code overflow="wrap" %}

```python
from degirum_tools import ModelSpec, remote_assets

model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    zoo_url="degirum/axelera",
    inference_host_address="@local",
    model_properties={"device_type": ["AXELERA/METIS"]},
)
model = model_spec.load_model()
model.measure_time = True

# Optional warm-up inference
_ = model(remote_assets.three_persons)

result = model(remote_assets.three_persons)
for stage, value in result.timing.items():
    print(f"{stage}: {value:.2f} ms")
```

{% endcode %}

Example output:

{% code overflow="wrap" %}

```
preprocess: 2.87 ms
inference: 34.68 ms
postprocess: 4.09 ms
```

{% endcode %}

{% hint style="info" %}
To persist values, append them to a JSON or CSV log alongside any metadata from `result.info`.
{% endhint %}
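One hypothetical way to persist these values, assuming `result.timing` is a flat stage-to-milliseconds dict as shown above (the helper name and file path are illustrative):

{% code overflow="wrap" %}

```python
import csv
from pathlib import Path


def append_timing_log(timing, path="timing_log.csv"):
    """Append one row of stage timings (ms) to a CSV log, writing a header on first use."""
    path = Path(path)
    write_header = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=sorted(timing))
        if write_header:
            writer.writeheader()
        writer.writerow(timing)


# Example usage with a dict shaped like result.timing:
append_timing_log({"preprocess": 2.87, "inference": 34.68, "postprocess": 4.09})
```

{% endcode %}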

## Aggregate with time stats

`model.time_stats()` accumulates min, average, max, and count since the last reset. Use it after running a workload loop.

### Example

{% code overflow="wrap" %}

```python
from degirum_tools import ModelSpec, remote_assets

model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    zoo_url="degirum/axelera",
    inference_host_address="@local",
    model_properties={"device_type": ["AXELERA/METIS"]},
)
model = model_spec.load_model()
model.measure_time = True

model.reset_time_stats()
for _ in range(20):
    _ = model(remote_assets.three_persons)

for stage, stats in model.time_stats().items():
    print(
        f"{stage:>30}: min={stats.min:.2f} ms avg={stats.avg:.2f} ms "
        f"max={stats.max:.2f} ms count={stats.cnt}"
    )
```

{% endcode %}

Example output:

{% code overflow="wrap" %}

```
                    preprocess: min=2.81 ms avg=2.89 ms max=2.95 ms count=20
                     inference: min=34.51 ms avg=34.73 ms max=35.12 ms count=20
                   postprocess: min=3.98 ms avg=4.07 ms max=4.22 ms count=20
```

{% endcode %}

{% hint style="warning" %}
Keep inputs consistent. Mixing URLs, file paths, or arrays with different sizes can skew timings.
{% endhint %}

## Compute throughput

For end-to-end throughput, time a fixed block of inferences with `time.perf_counter` and divide the sample count by the elapsed wall-clock time.

### Example

{% code overflow="wrap" %}

```python
import time
from degirum_tools import ModelSpec, remote_assets

model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    zoo_url="degirum/axelera",
    inference_host_address="@local",
    model_properties={"device_type": ["AXELERA/METIS"]},
)
model = model_spec.load_model()

samples = 50
model.reset_time_stats()
start = time.perf_counter()
for _ in range(samples):
    _ = model(remote_assets.three_persons)
wall_clock = time.perf_counter() - start

print(f"Throughput: {samples / wall_clock:.2f} FPS")
print(f"Average latency: {(wall_clock / samples) * 1000:.2f} ms")
```

{% endcode %}

Example output:

{% code overflow="wrap" %}

```
Throughput: 26.41 FPS
Average latency: 37.88 ms
```

{% endcode %}

{% hint style="success" %}
Share both wall-clock and per-stage stats in your report so teammates can spot preprocessing or transfer bottlenecks.
{% endhint %}

## Track regressions over time

* Store raw timing logs under version control or an observability system.
* Re-run the exact script when you change firmware, runtimes, or postprocessing parameters.
* Compare against previous baselines before deploying to production. If a regression appears, bisect by toggling one change at a time.
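A hypothetical baseline comparison, assuming per-stage average latencies are stored as flat stage-to-milliseconds dicts (the function name, sample values, and 10% tolerance are illustrative, not a DeGirum API):

{% code overflow="wrap" %}

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return stages whose average latency grew more than `tolerance` vs the baseline."""
    regressions = {}
    for stage, base_ms in baseline.items():
        cur_ms = current.get(stage)
        if cur_ms is not None and cur_ms > base_ms * (1 + tolerance):
            regressions[stage] = (base_ms, cur_ms)
    return regressions


# Compare a new run against a stored baseline:
baseline = {"preprocess": 2.89, "inference": 34.73, "postprocess": 4.07}
current = {"preprocess": 2.91, "inference": 39.80, "postprocess": 4.10}

for stage, (base_ms, cur_ms) in find_regressions(baseline, current).items():
    print(f"REGRESSION {stage}: {base_ms:.2f} ms -> {cur_ms:.2f} ms")
```

{% endcode %}

Flagged stages point you at which single change to bisect first.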
