# Overview

DeGirum® Orca is a flexible, efficient, and cost-effective AI accelerator. It helps developers build feature-rich edge solutions while staying within power and cost constraints.

## High Performance

Orca's efficient architecture delivers strong real-world performance. A single Orca can handle multiple input streams and several ML models. See our [Orca Performance Benchmarks](https://docs.degirum.com/orca/benchmarks) for performance details.

## Support for Pruned Models

Orca processes pruned (sparse) models natively, skipping zero-valued weights. This effectively boosts available compute and memory bandwidth, letting you run larger, more accurate models in real time at the edge.
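To see why pruning frees compute, consider the multiply-accumulate (MAC) work per layer. The sketch below is illustrative, not Orca's actual execution model: the layer, its sparsity level, and the counting functions are hypothetical.

```python
# Illustration: how weight pruning reduces multiply-accumulate (MAC) work
# for a sparsity-aware engine. All numbers here are hypothetical.

def dense_macs(weights):
    """MACs a dense engine performs: one per weight, zero or not."""
    return len(weights)

def sparse_macs(weights):
    """MACs a sparsity-aware engine performs: only for nonzero weights."""
    return sum(1 for w in weights if w != 0.0)

# A toy layer pruned to 75% sparsity: three of every four weights are zero.
layer = [0.0, 0.0, 0.0, 0.42] * 256

print(dense_macs(layer))   # 1024
print(sparse_macs(layer))  # 256
```

At 75% sparsity, the sparsity-aware path performs a quarter of the MACs, which is the headroom you can spend on a larger model.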

## Dedicated DRAM

Dedicated DRAM helps applications quickly switch between ML models without lengthy transfers from the host. This reduces model-switching delays and is especially helpful when your application needs to change models often, such as in image or speech recognition scenarios.

## Flexible Architecture

Orca's flexible architecture supports both int8 and float32 precision, so you can choose the format that best fits your use case and optimize performance, accuracy, and power consumption.
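The int8/float32 trade-off comes down to quantization error versus compute cost. The following is a minimal sketch of symmetric int8 quantization in pure Python; the scale choice and sample weights are illustrative and do not reflect Orca's internal quantization scheme.

```python
# Sketch of symmetric int8 quantization, the kind of precision trade-off
# an int8 mode involves. Scale choice and weights are illustrative.

def quantize_int8(values):
    """Map floats to int8 range [-127, 127] with one symmetric scale."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.031, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored value is within half a quantization step of the original.
assert all(abs(a - b) <= scale / 2 for a, b in zip(weights, restored))
```

int8 halves or quarters the memory footprint relative to float32 and maps to cheaper arithmetic, at the cost of bounded rounding error per weight.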

{% embed url="https://assets.degirum.com/files/datasheets/Orca%20AI%20Hardware%20Accelerator%20ASIC%20Flyer.pdf" %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.degirum.com/orca/readme.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response contains a direct answer to the question, along with relevant excerpts and sources from the documentation.
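The request above can be issued with nothing but the Python standard library. This is a minimal sketch; the helper names and the sample question are our own, and the question must be percent-encoded before it goes into the `ask` parameter.

```python
# Minimal sketch of the documentation-query mechanism, using only the
# Python standard library. Helper names and the question are examples.
from urllib.parse import quote
from urllib.request import urlopen

DOCS_PAGE = "https://docs.degirum.com/orca/readme.md"

def ask_docs_url(question: str) -> str:
    """Build the GET URL with the question percent-encoded into `ask`."""
    return f"{DOCS_PAGE}?ask={quote(question)}"

def ask_docs(question: str) -> str:
    """Perform the GET request and return the response body as text."""
    with urlopen(ask_docs_url(question)) as resp:
        return resp.read().decode("utf-8")

print(ask_docs_url("What precisions does Orca support?"))
# https://docs.degirum.com/orca/readme.md?ask=What%20precisions%20does%20Orca%20support%3F
```

Note that `quote` encodes spaces and the trailing `?` so the question survives intact as a single query-parameter value.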

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
