# Custom video source

*Estimated read time: 6 minutes*

This guide shows how to plug non-standard video sources into DeGirum PySDK—things like PiCamera2, GStreamer appsink, proprietary SDKs, image sequences, screen captures, or preprocessed frames.

For common sources (webcam, file path, RTSP URL), use `degirum_tools.predict_stream(model, source)`. When you need custom capture, inline processing, or per-frame metadata, use `predict_batch` with your own generator.

## How predict\_batch works

`model.predict_batch(source_iterable)` accepts any Python iterator or generator that yields either:

* `frame`: a NumPy array shaped H×W×3, dtype=uint8, BGR format.
* `(frame, frame_info)`: same `frame` plus a free-form `dict` (`frame_info`) that is returned as `result.info`.

**This allows you to**:

* **Attach metadata**: for syncing, routing, or auditing (e.g., `{"camera_id":"dock-3","frame_index":42,"ts_ms":1712345678901}`).
* **Preprocess inline**: rotate, resize, crop, denoise, or convert colors before yielding.
* **Use any source**: PiCamera2, GStreamer, PyAV/FFmpeg, SDK callbacks, image folders, or synthetic frames.
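
As a minimal sketch of the contract above, the generator below yields synthetic BGR frames together with a `frame_info` dict. The function name `synthetic_frames`, its parameters, and the metadata keys are illustrative assumptions, not part of PySDK; only the frame layout (H×W×3, `uint8`, BGR) and the `(frame, frame_info)` tuple come from the description above.

```python
import time

import numpy as np


def synthetic_frames(count=5, height=480, width=640, camera_id="synthetic-0"):
    """Yield (frame, frame_info) tuples in the layout described above."""
    for idx in range(count):
        # H x W x 3, uint8, channel order B, G, R
        frame = np.zeros((height, width, 3), dtype=np.uint8)
        # Paint a moving green vertical bar so consecutive frames differ
        x = (idx * 40) % width
        frame[:, x:x + 20] = (0, 255, 0)
        info = {
            "camera_id": camera_id,
            "frame_index": idx,
            "ts_ms": int(time.time() * 1000),
        }
        yield frame, info


# Hypothetical usage, assuming `model` is an already loaded PySDK model:
# for result in model.predict_batch(synthetic_frames()):
#     print(result.info["frame_index"])
```

A generator like this is also handy for checking a pipeline end to end before a real camera is attached.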

## Lifecycle & flow

* **One result per frame**: results are returned in the same order as the frames you yield.
* **Analyzers** (if used): tile, track, and zone analyzers return a single merged result per frame.
* **Back-pressure aware**: `predict_batch` pulls frames at the model’s pace. Don’t busy-wait; if you capture asynchronously, use a bounded queue and `yield` from it.
* **Termination**: stop iteration to end the stream. Always release devices and pipelines in a `finally` block inside your generator.
* **Error handling**: skip bad frames, log, and continue. For a stuck backend, signal failure in `frame_info` or break and restart the pipeline.
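
The bounded-queue and termination points above can be sketched with the standard library alone. In this sketch `queued_frames`, `read_frame`, and `maxsize` are hypothetical names introduced for illustration: `read_frame` stands in for whatever call your capture backend provides and is assumed to return `None` when the source ends.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of the stream


def queued_frames(read_frame, maxsize=4):
    """Yield frames captured on a background thread through a bounded queue."""
    frames = queue.Queue(maxsize=maxsize)
    stop = threading.Event()

    def put(item):
        # Blocking put with a timeout so the worker can notice `stop`
        while not stop.is_set():
            try:
                frames.put(item, timeout=0.1)
                return True
            except queue.Full:
                continue
        return False

    def capture():
        try:
            while not stop.is_set():
                frame = read_frame()
                if frame is None or not put(frame):
                    break
        finally:
            put(_STOP)  # always tell the consumer the stream is over

    worker = threading.Thread(target=capture, daemon=True)
    worker.start()
    try:
        while True:
            item = frames.get()  # blocks instead of busy-waiting
            if item is _STOP:
                break
            yield item
    finally:
        # Runs on normal exhaustion and when the consumer closes the generator
        stop.set()
        worker.join(timeout=1.0)
```

This variant blocks the capture thread when the queue is full, so no frames are lost. For live cameras where latency matters more than completeness, dropping the oldest queued frame is a common alternative.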

## Common setup

{% code overflow="wrap" %}

```python
from degirum_tools import ModelSpec, Display, remote_assets
import degirum_tools
import cv2

# Describe & load the model once
model_spec = ModelSpec(
    model_name="yolov8n_coco--640x640_quant_axelera_metis_1",
    zoo_url="degirum/axelera",
    inference_host_address="@local",
    model_properties={
        "device_type": ["AXELERA/METIS"],
        "overlay_color": [(0, 255, 0)],
        # Optional: "output_class_set": {"car", "person"}  # filter labels
    },
)
model = model_spec.load_model()

# Sample asset (or use your own path, or 0 for webcam)
video_path = remote_assets("Traffic.mp4")
```

{% endcode %}

### Example: OpenCV source (with frame\_info)

{% code overflow="wrap" %}

```python
import time

# Any OpenCV source: file path, RTSP URL, or 0 for webcam
cap = cv2.VideoCapture(video_path)

def frames_with_info(cap, camera_id="front-entrance"):
    idx = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # Add metadata for sync/ID/routing
            info = {
                "camera_id": camera_id,
                "frame_index": idx,
                "ts_ms": int(time.time() * 1000),  # or device timestamp if available
            }
            idx += 1
            yield frame, info
    finally:
        cap.release()

with Display("OpenCV Source") as output_display:
    for result in model.predict_batch(frames_with_info(cap)):
        # Access your metadata
        _meta = result.info  # {"camera_id": "...", "frame_index": ..., "ts_ms": ...}
        output_display.show(result)
```

{% endcode %}
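
`frames_with_info` ends the stream on the first failed read, which suits video files. For live sources, the lifecycle notes suggest skipping bad frames and continuing. A minimal sketch of that idea follows; `resilient_frames` and `max_failures` are hypothetical names, and the wrapper assumes only that `cap` exposes OpenCV-style `read()` and `release()` methods.

```python
import logging

log = logging.getLogger("video_source")


def resilient_frames(cap, camera_id="front-entrance", max_failures=30):
    """Yield (frame, info), skipping failed reads.

    The stream ends only after `max_failures` consecutive failed reads.
    """
    idx = 0
    failures = 0
    try:
        while failures < max_failures:
            ok, frame = cap.read()
            if not ok or frame is None:
                failures += 1
                log.warning("%s: read failed (%d/%d)", camera_id, failures, max_failures)
                continue
            failures = 0  # a good frame resets the failure counter
            yield frame, {"camera_id": camera_id, "frame_index": idx}
            idx += 1
    finally:
        cap.release()
```

With a file source this still terminates at end of file, just after `max_failures` extra read attempts.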

### Example: Raspberry Pi Camera (PiCamera2)

{% code overflow="wrap" %}

```python
# Requires: sudo apt install -y python3-picamera2 (libcamera enabled)
from picamera2 import Picamera2
import time

def picamera2_frames(camera_id="pi-cam"):
    picam2 = Picamera2()
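    # create_preview_configuration() builds the config. Picamera2's "RGB888"
    # stores pixels in B, G, R order, which is the layout OpenCV expects.
    picam2.configure(picam2.create_preview_configuration(main={"format": "RGB888"}))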
    picam2.start()
    idx = 0
    try:
        while True:
            frame = picam2.capture_array()
            info = {"camera_id": camera_id, "frame_index": idx, "ts_ms": int(time.time() * 1000)}
            idx += 1
            yield frame, info
    finally:
        picam2.stop()
        picam2.close()  # release the camera so it can be reopened later

with Display("PiCamera2") as output_display:
    for result in model.predict_batch(picamera2_frames("north-lane")):
        output_display.show(result)
```

{% endcode %}

### Example: GStreamer via OpenCV (simple)

Make sure your OpenCV build has GStreamer support.

{% code overflow="wrap" %}

```python
# V4L2 webcam example (Linux)
gst_pipeline = (
    "v4l2src device=/dev/video0 ! "
    "videoconvert ! video/x-raw,format=BGR ! "
    "appsink"
)
cap = cv2.VideoCapture(gst_pipeline, cv2.CAP_GSTREAMER)

def gst_cv_frames(cap, camera_id="gst-cam"):
    idx = 0
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            yield frame, {"camera_id": camera_id, "frame_index": idx}
            idx += 1
    finally:
        cap.release()

with Display("GStreamer (OpenCV)") as output_display:
    for result in model.predict_batch(gst_cv_frames(cap, camera_id="usb-0")):
        output_display.show(result)
```

{% endcode %}

**RTSP variant (OpenCV)**:

{% code overflow="wrap" %}

```python
rtsp_url = "rtsp://username:password@camera_host/stream"
cap = cv2.VideoCapture(rtsp_url)  # or a full gst pipeline with CAP_GSTREAMER
with Display("RTSP (OpenCV)") as output_display:
    for result in model.predict_batch(gst_cv_frames(cap, camera_id="rtsp-1")):
        output_display.show(result)
```

{% endcode %}

### Example: GStreamer via PyGObject (appsink, fine control)

{% code overflow="wrap" %}

```python
# Requires:
#   sudo apt install python3-gi python3-gi-cairo gstreamer1.0-tools \
#       gstreamer1.0-plugins-{base,good,bad,ugly}
#   pip install PyGObject
import gi, numpy as np
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)

def gst_pygobject_frames(camera_id="gst-raw"):
    pipeline_str = (
        "v4l2src device=/dev/video0 ! "
        "videoconvert ! video/x-raw,format=BGR ! "
        "appsink name=sink emit-signals=true max-buffers=1 drop=true"
    )
    pipeline = Gst.parse_launch(pipeline_str)
    sink = pipeline.get_by_name("sink")
    pipeline.set_state(Gst.State.PLAYING)

    idx = 0
    try:
        while True:
            sample = sink.emit("pull-sample")
            if not sample:
                break
            buf = sample.get_buffer()
            caps = sample.get_caps()
            s = caps.get_structure(0)
            w, h = s.get_value("width"), s.get_value("height")

            success, map_info = buf.map(Gst.MapFlags.READ)
            if not success:
                continue
            try:
                # Copy before unmapping: the mapped memory is only valid until unmap()
                frame = np.frombuffer(map_info.data, np.uint8).reshape((h, w, 3)).copy()
            finally:
                buf.unmap(map_info)

            yield frame, {"camera_id": camera_id, "frame_index": idx}
            idx += 1
    finally:
        pipeline.set_state(Gst.State.NULL)

with Display("GStreamer (PyGObject)") as output_display:
    for result in model.predict_batch(gst_pygobject_frames("usb-raw")):
        output_display.show(result)
```

{% endcode %}
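
Image sequences follow the same pattern as the camera examples. The sketch below is one possible way to write such a generator; `image_folder_frames`, `read_image`, and `patterns` are hypothetical names chosen for illustration. The reader defaults to `cv2.imread`, which returns `None` for files it cannot decode.

```python
from pathlib import Path


def image_folder_frames(folder, read_image=None, patterns=("*.jpg", "*.png")):
    """Yield (frame, info) for each readable image in `folder`, sorted by path."""
    if read_image is None:
        import cv2  # imported lazily so a custom reader works without OpenCV

        read_image = cv2.imread
    paths = sorted(p for pattern in patterns for p in Path(folder).glob(pattern))
    for idx, path in enumerate(paths):
        frame = read_image(str(path))
        if frame is None:
            continue  # unreadable file: skip it and keep going
        yield frame, {"source_path": str(path), "frame_index": idx}


# Hypothetical usage, assuming `model` is an already loaded PySDK model:
# for result in model.predict_batch(image_folder_frames("frames/")):
#     print(result.info["source_path"])
```

Carrying the path in `frame_info` makes it possible to match each result back to its file.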

{% hint style="info" %}
* **When to use** `predict_stream`: for webcam, video files, or RTSP, `degirum_tools.predict_stream(model, source)` already performs capture, iteration, and drawing. Use it unless you need custom sources, extra processing, or `frame_info` metadata.
* `frame_info` **round-trip**: yield `(frame, info)` → read back as `result.info` (e.g., timestamps, camera IDs, sequence numbers, shard IDs).
* **Inline transforms**: rotate, resize, or crop frames inside your generator before yielding:

{% code overflow="wrap" %}

```python
rotated = cv2.rotate(frame, cv2.ROTATE_90_CLOCKWISE)
yield rotated, info
```

{% endcode %}

* **Resource safety**: release `VideoCapture` or pipelines in a `finally` block inside your generator.
* **Performance**: keep per-frame Python work minimal. Push heavy decoding to capture backends. To reduce clutter and computation, set `output_class_set` in `ModelSpec.model_properties` if you only need specific labels.
  {% endhint %}
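
If the same transform has to be applied to several sources, one option is a small wrapper generator rather than editing each source. This is only a sketch: `with_transform` is a hypothetical helper, not a PySDK function, and it assumes the source yields either bare frames or `(frame, info)` tuples.

```python
def with_transform(source, transform):
    """Apply transform(frame) to every item of a frame source.

    Preserves whichever form the source uses: bare frames stay bare,
    (frame, info) tuples keep their info dict untouched.
    """
    for item in source:
        if isinstance(item, tuple):
            frame, info = item
            yield transform(frame), info
        else:
            yield transform(item)


# Hypothetical usage, assuming `model`, `cap`, and `frames_with_info` from the examples above:
# rotate = lambda f: cv2.rotate(f, cv2.ROTATE_90_CLOCKWISE)
# for result in model.predict_batch(with_transform(frames_with_info(cap), rotate)):
#     ...
```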

