Release Date: 04/26/2023
IMPORTANT: This release has changes in PySDK and C++ SDK APIs.
New Features and Modifications
Starting from ver. 0.7.0, PySDK releases are published to PyPI.org. Now, to install PySDK using pip it is enough to invoke `pip install degirum` without specifying any extra index URL.
Previous PySDK versions are still available from the DeGirum index site by specifying it as an extra index URL.
Starting from ver. 0.7.0, PySDK can be installed on Ubuntu Linux 22.04 LTS for x86-64 and ARM AArch64 architectures.
Inference timeouts are implemented for all three inference types: cloud inferences, AI server inferences, and local inferences. Now, in case of inference hangs, disconnections, and other failures, the PySDK inference APIs will not hang indefinitely, but will raise inference timeout exceptions.
To control the duration of the inference timeout, the `inference_timeout_s` property is added to the `degirum.model.Model` class. It specifies the maximum time in seconds to wait for the model inference result before raising an exception.
The default value of `inference_timeout_s` depends on the AI hardware used for inference: for inferences on AI accelerators (like ORCA) the timeout is set to 10 sec; for pure CPU inferences it is set to 100 sec.
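Conceptually, the timeout is a bounded wait on the inference result. The following is a minimal sketch of that idea, not PySDK's actual implementation; the helper name `run_with_timeout` and the stub inference callable are assumptions for illustration, built on Python's standard `concurrent.futures`:

```python
import concurrent.futures

def run_with_timeout(infer_fn, frame, timeout_s=10.0):
    """Wait at most timeout_s seconds for infer_fn(frame) to complete.

    Conceptually similar to what inference_timeout_s does for model
    inference calls: a hung or disconnected backend raises a timeout
    exception instead of blocking the caller forever.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(infer_fn, frame)
        # Raises concurrent.futures.TimeoutError if no result arrives in time
        return future.result(timeout=timeout_s)
```

With this sketch, a fast inference callable returns its result normally, while a backend that never responds raises a timeout exception after `timeout_s` seconds, mirroring the PySDK behavior described above.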
C++ SDK: a new argument, `inference_timeout_ms`, is added to the `AIModel` class. It specifies the maximum time in milliseconds to wait for the inference result when running model inference on an AI server.
Error reporting is improved:
- More meaningful error messages are now produced in case of cloud model loading failures.
- Extended model name is added to all inference-related error messages.
Bug fix: when a class label dictionary was updated for a model in a cloud zoo, and that model was then requested for inference on an AI server that had already run inference with it some time ago, the class label information reported by that AI server did not include the recent changes made in the cloud zoo. This happened because the AI server label dictionary cache was not properly updated.
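The stale-label failure mode can be illustrated with a tiny cache sketch. The `LabelCache` class and the `fetch_labels` callable below are hypothetical, purely to show why a per-model cache must be refreshed when the zoo changes:

```python
class LabelCache:
    """Illustrative per-model cache of class label dictionaries."""

    def __init__(self, fetch_labels):
        self._fetch_labels = fetch_labels  # callable that queries the zoo
        self._cache = {}

    def labels(self, model_name, refresh=False):
        # Without a refresh, an entry cached before a zoo update keeps
        # serving the old labels -- the failure mode described above.
        if refresh or model_name not in self._cache:
            self._cache[model_name] = self._fetch_labels(model_name)
        return self._cache[model_name]
```

Once a model's labels are cached, later zoo updates are invisible until the cached entry is invalidated or refreshed, which is exactly the stale-label symptom described above.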
The `Model.EagerBatchSize` parameter is now fixed to 8 for all cloud inferences to avoid scheduling favoritism toward models with smaller batch sizes.