PVA performance worse than CPU

According to https://docs.nvidia.com/vpi/algo_gaussian_image_filter.html#autotoc_md27 and other benchmarks there, the PVA chip is consistently worse than the CPU at all kinds of vision tasks.

Doesn’t this defeat the very name of PVA (Programmable Vision Accelerator), if it is not even faster than the CPU?

Hi,

Thanks for the feedback.
Let me check this with our internal team and share more information with you later.

Thanks.

Hi,

The main purpose of PVA is to offload work from the GPU or CPU, leaving them free for other tasks.
Also, PVA is far more energy efficient than the other two backends.

Whether to use PVA depends on your processing pipeline and the available compute budget.
The performance tables are there to help you pick an appropriate backend and/or algorithm parameters for the pipeline.

PVA shows lower performance than the CPU in these benchmarks for two reasons:
All 8 Xavier CPU cores are used for one algorithm call.
Although there are 4 independent PVA hardware units that can run in parallel, only one is used for a particular call.

To make better use of the hardware resources, you can try running up to 4 algorithm instances on PVA at the same time.
In this case, the average processing time per call is divided by 4, making it faster than CPU in all cases.
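
As a rough illustration of that arithmetic, here is a minimal Python sketch. It assumes ideal scaling across the 4 PVA units (no contention for memory bandwidth or scheduling). The function name and the timing values are hypothetical placeholders, not numbers from the VPI benchmark tables, so please substitute the figures for your own algorithm and image size.

```python
def effective_ms_per_call(single_call_ms: float, parallel_instances: int) -> float:
    """Average time per call when `parallel_instances` calls run concurrently.

    Assumes ideal scaling: each instance runs on its own hardware unit
    and none of them slows the others down.
    """
    if parallel_instances < 1:
        raise ValueError("parallel_instances must be at least 1")
    return single_call_ms / parallel_instances


# Hypothetical timings for illustration only, NOT taken from the benchmark tables.
pva_single_ms = 2.0  # one call on one PVA unit
cpu_ms = 0.8         # one call using all 8 CPU cores

pva_effective_ms = effective_ms_per_call(pva_single_ms, 4)

print(f"PVA, 1 instance : {pva_single_ms:.2f} ms per call")
print(f"PVA, 4 instances: {pva_effective_ms:.2f} ms per call on average")
print(f"CPU, 8 cores    : {cpu_ms:.2f} ms per call")
```

Note that each individual call still takes `pva_single_ms` to finish. What improves is throughput, so this helps pipelines that have several independent images or streams to process at once.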

Thanks.