# WebCodecs Example

If your browser [supports the WebCodecs API](https://caniuse.com/webcodecs), you can create efficient video processing pipelines with DeGirumJS.

The WebCodecs API provides low-level access to the individual frames of a video stream. This allows for highly efficient and flexible video processing pipelines directly in the browser. When combined with DeGirumJS's `predict_batch()` method, you can perform real-time AI inference on a live webcam stream with minimal latency.

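Note that `MediaStreamTrackProcessor` and `MediaStreamTrackGenerator` are defined in the related Insertable Streams for `MediaStreamTrack` specification rather than in WebCodecs itself, so a browser that supports WebCodecs does not necessarily provide them. The core components of this pipeline are: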

1. **`MediaStreamTrackProcessor`**: Takes a `MediaStreamTrack` (like from a webcam) and exposes its frames as a `ReadableStream` of `VideoFrame` objects.
2. **`predict_batch()`**: The DeGirumJS method that can directly consume a `ReadableStream` of `VideoFrame` objects and efficiently process them for inference.
3. **`MediaStreamTrackGenerator`**: Takes a stream of processed `VideoFrame` objects and exposes them as a new `MediaStreamTrack`, which can be displayed in a `<video>` element.
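
Because these classes are not available in every browser, a page may want to check for them before building a pipeline. The following is a minimal sketch of such a check. `missingPipelineFeatures` and `REQUIRED_GLOBALS` are hypothetical names, not part of DeGirumJS, and the list of required globals is an assumption based on the components named above.

```javascript
// Hypothetical helper (not part of DeGirumJS): reports which of the globals
// used by the pipelines on this page are missing from a given scope.
const REQUIRED_GLOBALS = [
    'VideoFrame',
    'MediaStreamTrackProcessor',
    'MediaStreamTrackGenerator',
];

function missingPipelineFeatures(scope = globalThis) {
    // Constructors are functions, so anything else counts as missing.
    return REQUIRED_GLOBALS.filter((name) => typeof scope[name] !== 'function');
}

// Usage: fall back (or show a message) when anything is missing.
const missing = missingPipelineFeatures();
if (missing.length > 0) {
    console.warn(`Stream pipeline unavailable, missing: ${missing.join(', ')}`);
}
```

A pipeline that only renders to a `<canvas>` does not need `MediaStreamTrackGenerator`, so the list could be trimmed for that case.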

Here are some examples demonstrating how to build pipelines using these components:

## Example 1: ReadableStream as Input

This example demonstrates the most direct way to perform inference on a video stream. We will take the `ReadableStream` provided by the `MediaStreamTrackProcessor` and feed it *directly* into `model.predict_batch()`.

**How it works:**

* Get a `videoTrack` from the webcam using `navigator.mediaDevices.getUserMedia`.
* Create a `MediaStreamTrackProcessor` to get a `ReadableStream` of `VideoFrame` objects.
* Pass this `readableStream` directly as the data source to `model.predict_batch()`.
* Display the results in a `<canvas>`.

{% code overflow="wrap" %}

```html
<p>Inference results from a direct video stream:</p>
<canvas id="outputCanvas"></canvas>

<script src="https://assets.degirum.com/degirumjs/0.1.5/degirum-js.min.obf.js"></script>
<script type="module">
    // --- Model Setup ---
    const dg = new dg_sdk();
    const secretToken = localStorage.getItem('secretToken') || prompt('Enter secret token:');
    localStorage.setItem('secretToken', secretToken);
    const MODEL_NAME = 'yolov8n_relu6_coco--640x640_quant_n2x_orca1_1';
    const ZOO_IP = 'https://cs.degirum.com/degirum/public';
    const zoo = await dg.connect('cloud', ZOO_IP, secretToken);
    const model = await zoo.loadModel(MODEL_NAME);

    // 1. Get video stream from webcam
    const mediaStream = await navigator.mediaDevices.getUserMedia({ video: true });
    const videoTrack = mediaStream.getVideoTracks()[0];

    // 2. Create a processor to get a readable stream of frames
    const processor = new MediaStreamTrackProcessor({ track: videoTrack });
    const readableStream = processor.readable;

    // 3. Feed the stream to predict_batch and loop through results
    for await (const result of model.predict_batch(readableStream)) {
        // Display the result on the canvas
        await model.displayResultToCanvas(result, 'outputCanvas');

        // IMPORTANT: Close the frame to release memory.
        // The SDK does not close frames when you provide a raw stream.
        result.imageFrame.close();
    }
</script>
```

{% endcode %}
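
If drawing throws partway through the loop, the frame for that result is never closed. One way to guard against this is to wrap the per-result work in `try`/`finally`. The sketch below assumes only what the example above shows, namely that each result exposes its frame as `result.imageFrame`. The names `forEachResult` and `render` are hypothetical.

```javascript
// Hypothetical helper: runs `render` for every result and closes the
// associated frame even when `render` throws.
async function forEachResult(results, render) {
    let rendered = 0;
    for await (const result of results) {
        try {
            await render(result);
            rendered++;
        } finally {
            // Assumes the frame is exposed as result.imageFrame,
            // as in the example above.
            result.imageFrame?.close();
        }
    }
    return rendered; // number of results rendered without error
}

// Possible use with the example above (not executed here):
// await forEachResult(
//     model.predict_batch(readableStream),
//     (result) => model.displayResultToCanvas(result, 'outputCanvas')
// );
```

An error from `render` still propagates to the caller and ends the loop. The helper only guarantees that the frame being processed at that moment is released first.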

## Example 2: Real-Time Inference with Display in a `<video>` Element

While the first example is simple, you may want the processed video (with results drawn) in a `<video>` element, for example for further processing or for use by other libraries in your code. This example wraps each rendered canvas in a new `VideoFrame` and feeds those frames into a new video track.

We use a `TransformStream` to orchestrate the work and a `MediaStreamTrackGenerator` to create the final output video track. This pattern is more robust and flexible for building complex applications.

**How it works:**

* A `MediaStreamTrackProcessor` creates a `ReadableStream` from the webcam.
* This stream is piped through a `TransformStream`. Inside the `transform` function, for each `frame`:
  1. We run inference on the frame using `model.predict()`.
  2. If inference returned a result, we use `model.displayResultToCanvas()` to render it onto an `OffscreenCanvas`.
  3. Otherwise, we draw the original frame onto that same canvas.
  4. We enqueue a *new* `VideoFrame` created from the canvas to the stream's controller.
  5. We close the original `frame` to free up memory.
* The output of the `TransformStream` is piped to the `writable` side of a `MediaStreamTrackGenerator`.
* The `MediaStreamTrackGenerator`'s track is then attached to a `<video>` element's `srcObject`.
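
The frame lifecycle in the steps above (emit a new frame, close the original) can be tried without a camera. The following sketch substitutes plain objects for `VideoFrame`s and a hypothetical `annotate` callback for the inference-and-draw step. Only `ReadableStream`, `TransformStream`, and `WritableStream` are real APIs here. `makeFakeFrame`, `createFrameTransform`, and `runDemo` are illustrative names.

```javascript
// Stand-in for a VideoFrame: records whether close() was called.
function makeFakeFrame(timestamp) {
    return { timestamp, closed: false, close() { this.closed = true; } };
}

// Builds a TransformStream that emits one new frame per input frame and
// always closes the input frame. `annotate` is a hypothetical callback that
// stands in for inference plus drawing.
function createFrameTransform(annotate) {
    return new TransformStream({
        async transform(frame, controller) {
            try {
                controller.enqueue(await annotate(frame));
            } finally {
                frame.close(); // release the original frame
            }
        },
    });
}

// Runs a few fake frames through the transform and collects the output.
async function runDemo() {
    const inputs = [0, 33333, 66666].map(makeFakeFrame);
    const source = new ReadableStream({
        start(controller) {
            inputs.forEach((frame) => controller.enqueue(frame));
            controller.close();
        },
    });
    const outputs = [];
    const sink = new WritableStream({ write(frame) { outputs.push(frame); } });
    const annotate = async (frame) => ({ ...makeFakeFrame(frame.timestamp), annotated: true });
    await source.pipeThrough(createFrameTransform(annotate)).pipeTo(sink);
    return { inputs, outputs };
}
```

After `runDemo()` resolves, every input frame has been closed and every output frame is still open, which mirrors how ownership passes from the processor to the generator in the real pipeline.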

{% code overflow="wrap" %}

```html
<p>Inference results inside a video element</p>
<video id="outputVideo" width="640" height="480" autoplay muted></video>

<script src="https://assets.degirum.com/degirumjs/0.1.5/degirum-js.min.obf.js"></script>
<script type="module">
    const outputVideo = document.getElementById('outputVideo');

    // --- Model Setup ---
    const dg = new dg_sdk();
    const secretToken = localStorage.getItem('secretToken') || prompt('Enter secret token:');
    localStorage.setItem('secretToken', secretToken);
    const MODEL_NAME = 'yolov8n_relu6_coco--640x640_quant_n2x_orca1_1';
    const ZOO_IP = 'https://cs.degirum.com/degirum/public';
    const zoo = await dg.connect('cloud', ZOO_IP, secretToken);
    const model = await zoo.loadModel(MODEL_NAME);

    // Use an OffscreenCanvas for efficient background rendering
    const canvas = new OffscreenCanvas(640, 480);
    const ctx = canvas.getContext('2d');

    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    const videoTrack = stream.getVideoTracks()[0];

    const trackProcessor = new MediaStreamTrackProcessor({ track: videoTrack });
    const trackGenerator = new MediaStreamTrackGenerator({ kind: "video" });

    outputVideo.srcObject = new MediaStream([trackGenerator]);

    // Define the transformation logic
    const transform = async (frame, controller) => {
        // Run inference on the current frame.
        // Note: We use predict() here, not predict_batch(), as we process one frame at a time.
        const result = await model.predict(frame);

        // If we have a valid result, render it to the offscreen canvas
        if (result.result) {
            await model.displayResultToCanvas(result, canvas);
        } else {
            // Otherwise, draw the original frame, scaled to fit the canvas
            ctx.drawImage(frame, 0, 0, canvas.width, canvas.height);
        }

        // Create a new frame from the canvas and pass it down the pipeline
        controller.enqueue(new VideoFrame(canvas, { timestamp: frame.timestamp }));

        // IMPORTANT: Close the original frame to release its resources.
        frame.close();
    };

    // Construct the full pipeline!
    trackProcessor.readable
        .pipeThrough(new TransformStream({ transform }))
        .pipeTo(trackGenerator.writable);
</script>
```

{% endcode %}
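
The pipeline above runs until the page is closed. If an application needs to stop it earlier, one option is to pass an `AbortSignal` to `pipeTo()`, which is part of the standard Streams API. The sketch below is an assumption about how this could be wired up, not something DeGirumJS provides. `startPipeline` is a hypothetical helper name.

```javascript
// Hypothetical helper: connects source -> transform -> sink and returns a
// handle with a stop() function. Aborting the signal makes pipeTo() cancel
// its readable side and abort its writable side.
function startPipeline(readable, transformStream, writable) {
    const abortController = new AbortController();
    const done = readable
        .pipeThrough(transformStream)
        .pipeTo(writable, { signal: abortController.signal })
        .catch((err) => {
            // An abort is the expected way to stop; rethrow anything else.
            if (err?.name !== 'AbortError') throw err;
        });
    return {
        done, // settles when the pipeline has ended or been stopped
        stop() {
            abortController.abort();
            return done;
        },
    };
}
```

With the names from the example above, this could be called as `startPipeline(trackProcessor.readable, new TransformStream({ transform }), trackGenerator.writable)`. Calling `videoTrack.stop()` afterwards would also release the camera.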

## Example 3: Parallel Inference on Four Video Streams

The WebCodecs API and DeGirumJS can handle multiple independent video pipelines at once. This example demonstrates four processed video streams displayed in a 2x2 grid.

Each pipeline is self-contained, with its own track, model, and canvas, so adding another stream means starting another pipeline. While we use a cloned track here, you could just as easily use four different video sources (e.g., multiple cameras or video files).

**How it works:**

* Grab a single webcam track (`mainVideoTrack`).
* Clone the track four times so each pipeline gets its own independent `MediaStreamTrack`.
* For each pipeline:
  1. Load a separate model (this example uses a different model for each pipeline).
  2. Create a `MediaStreamTrackProcessor` for the cloned track to get a `ReadableStream` of `VideoFrame`s.
  3. Pass the stream directly to `model.predict_batch()`.
  4. For each inference result, render detections onto the assigned `<canvas>` element using `model.displayResultToCanvas()`.
  5. Close the frame after processing to release memory.
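
When several pipelines run side by side, a failure in one should not go unnoticed or take the others down. The sketch below shows one way to launch them concurrently and collect each outcome with `Promise.allSettled()`. `runAllPipelines` is a hypothetical helper, and `runPipeline(index, source)` stands in for an async per-stream function such as the one described in the steps above.

```javascript
// Hypothetical helper: starts one pipeline per source and reports how each
// one finished. `runPipeline` must be an async function (index, source).
async function runAllPipelines(sources, runPipeline) {
    const outcomes = await Promise.allSettled(
        sources.map((source, index) => runPipeline(index, source))
    );
    return outcomes.map((outcome, index) => ({
        index,
        ok: outcome.status === 'fulfilled',
        error: outcome.status === 'rejected' ? outcome.reason : null,
    }));
}
```

Note that the returned promise settles only after every pipeline has ended, so with live camera streams it is mainly useful for reporting errors once the tracks stop.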

{% code overflow="wrap" %}

```html
<!DOCTYPE html>
<html>

<head>
    <title>DeGirumJS four-canvas parallel demo</title>
    <style>
        html,
        body {
            margin: 0;
            height: 100%;
        }

        #canvas-grid {
            display: grid;
            grid-template-columns: repeat(2, 1fr);
            grid-template-rows: repeat(2, 1fr);
            width: 100vw;
            height: 100vh;
        }

        canvas {
            width: 100%;
            height: 100%;
            background: #000;
            display: block
        }
    </style>
</head>

<body>
    <div id="canvas-grid">
        <canvas id="canvas_0" width="640" height="480"></canvas>
        <canvas id="canvas_1" width="640" height="480"></canvas>
        <canvas id="canvas_2" width="640" height="480"></canvas>
        <canvas id="canvas_3" width="640" height="480"></canvas>
    </div>

    <script src="https://assets.degirum.com/degirumjs/0.1.5/degirum-js.min.obf.js"></script>
    <script type="module">
        // ----- Model setup -----
        const dg = new dg_sdk();
        const secretToken = localStorage.getItem('secretToken') || prompt('Enter secret token:');
        localStorage.setItem('secretToken', secretToken);
        const MODEL_NAMES = [
            'yolov8n_relu6_coco--640x640_quant_n2x_orca1_1',
            'yolov8n_relu6_face--640x640_quant_n2x_orca1_1',
            'yolov8n_relu6_hand--640x640_quant_n2x_orca1_1',
            'yolov8n_relu6_widerface_kpts--640x640_quant_n2x_orca1_1'
        ];
        const NUM_PIPELINES = MODEL_NAMES.length;
        const ZOO_IP = 'https://cs.degirum.com/degirum/public';
        const zoo = await dg.connect('cloud', ZOO_IP, secretToken);

        // Grab the webcam once and clone the track
        const stream = await navigator.mediaDevices.getUserMedia({ video: true });
        const mainVideoTrack = stream.getVideoTracks()[0];

        async function setupPipeline(index, videoTrack) {
            // Load a separate model instance for each stream
            const model = await zoo.loadModel(MODEL_NAMES[index]);

            // Processor gives us a ReadableStream<VideoFrame>
            const processor = new MediaStreamTrackProcessor({ track: videoTrack });
            const readable = processor.readable;

            // Iterate over batched predictions
            for await (const result of model.predict_batch(readable)) {
                // Draw detections to the right canvas
                await model.displayResultToCanvas(result, `canvas_${index}`);

                // IMPORTANT: Always close frames when supplying a raw stream
                result.imageFrame.close();
            }
        }

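        // Create and launch four independent pipelines.
        // The calls are not awaited so that the pipelines run concurrently.
        for (let i = 0; i < NUM_PIPELINES; i++) {
            setupPipeline(i, mainVideoTrack.clone())
                .catch((err) => console.error(`Pipeline ${i} failed:`, err));
        }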
    </script>
</body>

</html>
```

{% endcode %}


