Performance and Timing Statistics
Interpret performance and latency metrics collected during inference.
Once you enable measureTime on a Model instance, every predict and predict_batch result contains timing statistics. These are accumulated by the Model instance inside a timeStats object.
Available methods
getTimeStats(): Returns a formatted string of all the statistics collected so far.
resetTimeStats(): Deletes all accumulated statistics and creates a fresh timeStats object for further collection.
printLatencyInfo(): Logs a brief, human-readable summary of average timings to the console.
To access the timeStats object directly, use modelName.timeStats.stats["statName"], where statName is one of the operations tracked.
Example usage
```javascript
let model = await zoo.loadModel('your_model_name', { measureTime: true });
let result = await model.predict(image);

console.log(model.getTimeStats()); // Pretty-print time stats

// Access client-side and server-side timing stats
let preprocessDuration = model.timeStats.stats["ImagePreprocessDuration_ms"]; // Image preprocess duration (min, avg, max, count)
let preprocessMin = model.timeStats.stats["ImagePreprocessDuration_ms"].min; // Min image preprocess duration
let inferenceDuration = model.timeStats.stats["CoreInferenceDuration_ms"]; // Core inference duration (min, avg, max, count)
let inferenceMax = model.timeStats.stats["CoreInferenceDuration_ms"].max; // Max core inference duration
let frameTotalDuration = model.timeStats.stats["FrameTotalDuration_ms"]; // Total time for entire frame processing
let deviceTemp = model.timeStats.stats["DeviceTemperature_C"]; // Device temperature, if available

model.resetTimeStats(); // Reset time stats
```
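The printLatencyInfo() helper is not shown above. A minimal measurement cycle using it might look like the following sketch, which assumes the model from the example and an images array of inputs you already have:

```javascript
model.resetTimeStats(); // Discard any warm-up timings

for (const image of images) {
  await model.predict(image); // Each call accumulates into timeStats
}

model.printLatencyInfo(); // Log a brief summary of average timings to the console
```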
Client-Side Timings
These metrics are measured within the JavaScript SDK running in the user's browser.
FrameTotalDuration_ms
(End-to-End) The total wall-clock time from the moment predict or predict_batch is called until the final processed result is ready for the user. This is the most comprehensive client-side metric.
MutexWait_ms
The time spent waiting to acquire a lock before starting to process a new frame. Only relevant for synchronous predict() calls. This value will be high if you are calling predict() faster than the model can process frames, indicating contention.
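The sketch below illustrates the difference, assuming an images array and that concurrent predict() calls contend for the model's internal lock as described:

```javascript
// Sequential calls: each predict() completes before the next begins,
// so MutexWait_ms stays near zero.
for (const image of images) {
  await model.predict(image);
}
console.log(model.timeStats.stats["MutexWait_ms"].max);

model.resetTimeStats(); // Start a fresh measurement

// Overlapped calls: all predict() calls start at once and queue up
// behind the lock, so MutexWait_ms grows for the later calls.
await Promise.all(images.map((image) => model.predict(image)));
console.log(model.timeStats.stats["MutexWait_ms"].max);
```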
InputFrameConvert_ms
The time taken to validate and convert the user's input (e.g., a URL, base64 string, or HTMLImageElement) into a standardized format ready for preprocessing.
ImagePreprocessDuration_ms
The time spent on client-side image manipulation. This primarily consists of resizing the image to the model's required input dimensions and applying padding/cropping methods.
EncodeEmit_ms
The time taken to encode the image data and send it over the network.
- For AIServerModel, this is just socket.send(blob).
- For CloudServerModel, this involves encoding the data with msgpack and then socket.emit().
ResultProcessing_ms
The time spent processing a result after it has been received from the server. This includes matching it with the original frame info, applying label filters, and pushing it into the result queue.
ResultQueueWaitingTime_ms
The time a processed result sits in the output queue (resultQ) before being returned to the user's code. A high value indicates back-pressure: your code is consuming results more slowly than the model produces them, as the sketch below illustrates.
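A deliberately slow consumer loop drives this metric up. This sketch assumes predict_batch accepts an iterable of frames and yields results; the sleep helper and render step are hypothetical:

```javascript
// Hypothetical helper that simulates slow downstream work.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

for await (const result of model.predict_batch(frames)) {
  await sleep(100); // Consume results slower than the model produces them
  render(result);   // Hypothetical rendering step
}

// The average queue wait grows as results sit in resultQ.
console.log(model.timeStats.stats["ResultQueueWaitingTime_ms"].avg);
```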
SocketConnectWait_ms
A one-time (or per-reconnect) cost of establishing the network connection to the server. This will not appear for every frame.
Server-Side Timings
These metrics are measured on the AI Server or Cloud Server and are included in the result payload sent back to the client. The SDK simply extracts and records them.
PythonPreprocessDuration_ms
Duration of the Python pre-processing step, including data loading and data conversion time
CorePreprocessDuration_ms
Duration of the server-side pre-processing step
CoreInferenceDuration_ms
Duration of the server-side AI inference step
CorePostprocessDuration_ms
Duration of the server-side post-processing step
CoreInputFrameSize_bytes
Size of the received input frame, in bytes
DeviceInferenceDuration_ms
(DeGirum ORCA models only) Duration of AI inference computations on the AI accelerator IC, excluding data transfers
DeviceTemperature_C
(DeGirum ORCA models only) Internal temperature of the AI accelerator IC, in °C
DeviceFrequency_MHz
(DeGirum ORCA models only) Working frequency of the AI accelerator IC, in MHz
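Because client-side and server-side statistics land in the same timeStats object, you can combine them to see where time is spent. A minimal sketch (stat names taken from the lists above; the avg field follows the earlier example):

```javascript
const stats = model.timeStats.stats;

// Average end-to-end time seen by the browser vs. time spent in core inference.
const totalAvg = stats["FrameTotalDuration_ms"].avg;
const inferAvg = stats["CoreInferenceDuration_ms"].avg;

// Everything that is not inference: conversion, preprocessing, network, queuing.
const overheadAvg = totalAvg - inferAvg;
console.log(
  `avg frame: ${totalAvg.toFixed(1)} ms, ` +
  `inference: ${inferAvg.toFixed(1)} ms, ` +
  `overhead: ${overheadAvg.toFixed(1)} ms`
);
```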