# WebCodecs Example

If your browser [supports the WebCodecs API](https://caniuse.com/webcodecs), you can create efficient video processing pipelines with DeGirumJS.

The WebCodecs API provides low-level access to the individual frames of a video stream. This allows for highly efficient and flexible video processing pipelines directly in the browser. When combined with DeGirumJS's `predict_batch()` method, you can perform real-time AI inference on a live webcam stream with minimal latency.

The core components of this pipeline are:

1. **`MediaStreamTrackProcessor`**: Takes a `MediaStreamTrack` (such as a webcam feed) and exposes its frames as a `ReadableStream` of `VideoFrame` objects.
2. **`predict_batch()`**: The DeGirumJS method that can directly consume a `ReadableStream` of `VideoFrame` objects and efficiently process them for inference.
3. **`MediaStreamTrackGenerator`**: Takes a stream of processed `VideoFrame` objects and exposes them as a new `MediaStreamTrack`, which can be displayed in a `<video>` element.
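
Wired together without any inference step, these pieces already form a complete pass-through pipeline. Below is a minimal sketch using only the standard browser APIs, with no DeGirumJS involved yet; it is a starting point, not a full application:

{% code overflow="wrap" %}

```html
<video id="passthrough" autoplay muted></video>

<script type="module">
    // Grab the webcam and extract its video track
    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    const [track] = stream.getVideoTracks();

    // Processor: MediaStreamTrack -> ReadableStream of VideoFrame objects
    const processor = new MediaStreamTrackProcessor({ track });

    // Generator: WritableStream of VideoFrame objects -> MediaStreamTrack
    const generator = new MediaStreamTrackGenerator({ kind: 'video' });

    // Pipe frames straight through; a TransformStream between the two
    // is where per-frame processing would go
    processor.readable.pipeTo(generator.writable);

    // Display the generated track in the <video> element
    document.getElementById('passthrough').srcObject = new MediaStream([generator]);
</script>
```

{% endcode %}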

Here are some examples demonstrating how to build pipelines using these components:

## Example 1: ReadableStream as Input

This example demonstrates the most direct way to perform inference on a video stream. We will take the `ReadableStream` provided by the `MediaStreamTrackProcessor` and feed it *directly* into `model.predict_batch()`.

**How it works:**

* Get a `videoTrack` from the webcam using `navigator.mediaDevices.getUserMedia`.
* Create a `MediaStreamTrackProcessor` to get a `ReadableStream` of `VideoFrame` objects.
* Pass this `readableStream` directly as the data source to `model.predict_batch()`.
* Display the results in a `<canvas>`.

{% code overflow="wrap" %}

```html
<p>Inference results from a direct video stream:</p>
<canvas id="outputCanvas"></canvas>

<script src="https://assets.degirum.com/degirumjs/0.1.5/degirum-js.min.obf.js"></script>
<script type="module">
    // --- Model Setup ---
    const dg = new dg_sdk();
    const secretToken = localStorage.getItem('secretToken') || prompt('Enter secret token:');
    localStorage.setItem('secretToken', secretToken);
    const MODEL_NAME = 'yolov8n_relu6_coco--640x640_quant_n2x_orca1_1';
    const ZOO_IP = 'https://cs.degirum.com/degirum/public';
    const zoo = await dg.connect('cloud', ZOO_IP, secretToken);
    const model = await zoo.loadModel(MODEL_NAME);

    // 1. Get video stream from webcam
    const mediaStream = await navigator.mediaDevices.getUserMedia({ video: true });
    const videoTrack = mediaStream.getVideoTracks()[0];

    // 2. Create a processor to get a readable stream of frames
    const processor = new MediaStreamTrackProcessor({ track: videoTrack });
    const readableStream = processor.readable;

    // 3. Feed the stream to predict_batch and loop through results
    for await (const result of model.predict_batch(readableStream)) {
        // Display the result on the canvas
        await model.displayResultToCanvas(result, 'outputCanvas');

        // IMPORTANT: Close the frame to release memory.
        // The SDK does not close frames when you provide a raw stream.
        result.imageFrame.close();
    }
</script>
```

{% endcode %}
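
To shut this pipeline down, stop the camera track: once the track ends, the processor's `ReadableStream` closes and the `predict_batch()` loop finishes on its own. A minimal sketch, assuming a `stopButton` element and the `videoTrack` from the example above:

{% code overflow="wrap" %}

```javascript
// Stopping the webcam track ends the stream; the processor's
// ReadableStream then closes, and the for-await loop over
// predict_batch() completes normally.
document.getElementById('stopButton').addEventListener('click', () => {
    videoTrack.stop();
});
```

{% endcode %}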

## Example 2: Real-Time Inference with Display in a `<video>` Element

While the first example is simple, you might want to output the processed video (with results drawn) into a `<video>` element, whether for further processing, for use by other libraries in your code, or simply for display. This example uses a `MediaStreamTrackGenerator` to turn the processed frames back into a live video track.

We use a `TransformStream` to orchestrate the work and a `MediaStreamTrackGenerator` to create the final output video track. This pattern is more robust and flexible for building complex applications.

**How it works:**

* A `MediaStreamTrackProcessor` creates a `ReadableStream` from the webcam.
* This stream is piped through a `TransformStream`. Inside the `transform` function, for each `frame`:
  1. We run inference on the frame using `model.predict()`.
  2. We draw the original frame onto an `OffscreenCanvas`.
  3. We use `model.displayResultToCanvas()` to overlay the inference results on that same canvas.
  4. We enqueue a *new* `VideoFrame` created from the canvas to the stream's controller.
  5. We close the original `frame` to free up memory.
* The output of the `TransformStream` is piped to the `writable` side of a `MediaStreamTrackGenerator`.
* The `MediaStreamTrackGenerator`'s track is then attached to a `<video>` element's `srcObject`.

{% code overflow="wrap" %}

```html
<p>Inference results inside a video element:</p>
<video id="outputVideo" width="640" height="480" autoplay muted></video>

<script src="https://assets.degirum.com/degirumjs/0.1.5/degirum-js.min.obf.js"></script>
<script type="module">
    const outputVideo = document.getElementById('outputVideo');

    // --- Model Setup ---
    const dg = new dg_sdk();
    const secretToken = localStorage.getItem('secretToken') || prompt('Enter secret token:');
    localStorage.setItem('secretToken', secretToken);
    const MODEL_NAME = 'yolov8n_relu6_coco--640x640_quant_n2x_orca1_1';
    const ZOO_IP = 'https://cs.degirum.com/degirum/public';
    const zoo = await dg.connect('cloud', ZOO_IP, secretToken);
    const model = await zoo.loadModel(MODEL_NAME);

    // Use an OffscreenCanvas for efficient background rendering
    const canvas = new OffscreenCanvas(640, 480);
    const ctx = canvas.getContext('2d');

    const stream = await navigator.mediaDevices.getUserMedia({ video: true });
    const videoTrack = stream.getVideoTracks()[0];

    const trackProcessor = new MediaStreamTrackProcessor({ track: videoTrack });
    const trackGenerator = new MediaStreamTrackGenerator({ kind: "video" });

    outputVideo.srcObject = new MediaStream([trackGenerator]);

    // Define the transformation logic
    const transform = async (frame, controller) => {
        // Run inference on the current frame.
        // Note: We use predict() here, not predict_batch(), as we process one frame at a time.
        const result = await model.predict(frame);

        // If inference produced a result, let the SDK render it
        // (frame plus overlays) onto the offscreen canvas
        if (result.result) {
            await model.displayResultToCanvas(result, canvas);
        } else {
            // Otherwise, just draw the original frame unmodified
            ctx.drawImage(frame, 0, 0);
        }

        // Create a new frame from the canvas and pass it down the pipeline
        controller.enqueue(new VideoFrame(canvas, { timestamp: frame.timestamp }));

        // IMPORTANT: Close the original frame to release its resources.
        frame.close();
    };

    // Construct the full pipeline!
    trackProcessor.readable
        .pipeThrough(new TransformStream({ transform }))
        .pipeTo(trackGenerator.writable);
</script>
```

{% endcode %}
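
Because the generator's output is an ordinary `MediaStreamTrack`, the processed video can be handed to any API that consumes media streams. As a sketch, here is how the annotated video could be captured with the standard `MediaRecorder` API, reusing `trackGenerator` from the example above (container and codec support varies by browser):

{% code overflow="wrap" %}

```javascript
// Record the processed track with the standard MediaRecorder API
const recordedChunks = [];
const recorder = new MediaRecorder(new MediaStream([trackGenerator]));

recorder.ondataavailable = (event) => recordedChunks.push(event.data);
recorder.onstop = () => {
    // Assemble the chunks into a single Blob for download or upload
    const blob = new Blob(recordedChunks, { type: recorder.mimeType });
    console.log(`Recorded ${blob.size} bytes of processed video`);
};

recorder.start(1000); // emit a chunk every second
// Later, call recorder.stop() to finalize the recording
```

{% endcode %}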

## Example 3: Parallel Inference on Four Video Streams

The WebCodecs API and DeGirumJS can handle multiple independent video pipelines at once. This example demonstrates four processed video streams displayed in a 2x2 grid.

This pattern scales beyond what is shown here: while we use a cloned track for simplicity, you could just as easily use four different video sources (e.g., multiple cameras or video files).

**How it works:**

* Grab a single webcam track (`mainVideoTrack`).
* Clone the track four times so each pipeline gets its own independent `MediaStreamTrack`.
* For each pipeline:
  1. Load a separate model instance.
  2. Create a `MediaStreamTrackProcessor` for the cloned track to get a `ReadableStream` of `VideoFrame`s.
  3. Pass the stream directly to `model.predict_batch()`.
  4. For each inference result, render detections onto the assigned `<canvas>` element using `model.displayResultToCanvas()`.
  5. Close the frame after processing to release memory.

{% code overflow="wrap" %}

```html
<!DOCTYPE html>
<html>

<head>
    <title>DeGirumJS four-canvas parallel demo</title>
    <style>
        html,
        body {
            margin: 0;
            height: 100%;
        }

        #canvas-grid {
            display: grid;
            grid-template-columns: repeat(2, 1fr);
            grid-template-rows: repeat(2, 1fr);
            width: 100vw;
            height: 100vh;
        }

        canvas {
            width: 100%;
            height: 100%;
            background: #000;
            display: block;
        }
    </style>
</head>

<body>
    <div id="canvas-grid">
        <canvas id="canvas_0" width="640" height="480"></canvas>
        <canvas id="canvas_1" width="640" height="480"></canvas>
        <canvas id="canvas_2" width="640" height="480"></canvas>
        <canvas id="canvas_3" width="640" height="480"></canvas>
    </div>

    <script src="https://assets.degirum.com/degirumjs/0.1.5/degirum-js.min.obf.js"></script>
    <script type="module">
        // ----- Model setup -----
        const dg = new dg_sdk();
        const secretToken = localStorage.getItem('secretToken') || prompt('Enter secret token:');
        localStorage.setItem('secretToken', secretToken);
        const MODEL_NAMES = [
            'yolov8n_relu6_coco--640x640_quant_n2x_orca1_1',
            'yolov8n_relu6_face--640x640_quant_n2x_orca1_1',
            'yolov8n_relu6_hand--640x640_quant_n2x_orca1_1',
            'yolov8n_relu6_widerface_kpts--640x640_quant_n2x_orca1_1'
        ];
        const NUM_PIPELINES = MODEL_NAMES.length;
        const ZOO_IP = 'https://cs.degirum.com/degirum/public';
        const zoo = await dg.connect('cloud', ZOO_IP, secretToken);

        // Grab the webcam once and clone the track
        const stream = await navigator.mediaDevices.getUserMedia({ video: true });
        const mainVideoTrack = stream.getVideoTracks()[0];

        async function setupPipeline(index, videoTrack) {
            // Load a separate model instance for each stream
            const model = await zoo.loadModel(MODEL_NAMES[index]);

            // Processor gives us a ReadableStream<VideoFrame>
            const processor = new MediaStreamTrackProcessor({ track: videoTrack });
            const readable = processor.readable;

            // Iterate over batched predictions
            for await (const result of model.predict_batch(readable)) {
                // Draw detections to the right canvas
                await model.displayResultToCanvas(result, `canvas_${index}`);

                // IMPORTANT: Always close frames when supplying a raw stream
                result.imageFrame.close();
            }
        }

        // Create and launch four independent pipelines
        for (let i = 0; i < NUM_PIPELINES; i++) {
            setupPipeline(i, mainVideoTrack.clone());
        }
    </script>
</body>

</html>
```

{% endcode %}
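
One caveat with this fire-and-forget launch: if any pipeline throws (for example, a model fails to load), the rejection goes unhandled. A small variation of the launch loop above that surfaces per-pipeline errors without stopping the others:

{% code overflow="wrap" %}

```javascript
// Launch all pipelines, logging any per-pipeline failure.
// setupPipeline() and mainVideoTrack come from the example above.
for (let i = 0; i < NUM_PIPELINES; i++) {
    setupPipeline(i, mainVideoTrack.clone()).catch((err) => {
        console.error(`Pipeline ${i} failed:`, err);
    });
}
```

{% endcode %}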
