On a Drive PX2, we are aiming to adapt some of the DriveWorks APIs so that they can run on up to 8 cameras (DriveNet on all 8, Freespace on 3 cameras, Lane Detection on 2 to 5 cameras).
We are already running into performance limitations with that many cameras.
Regarding the following functions:
The names and documentation of these functions say they are intended for asynchronous operation. Does this mean that, if they are called from separate threads, they can run in parallel on different CUDA streams on the GPU and so reduce total execution time? If not, what is the correct way to understand and use these functions?
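
To make the question concrete, here is a minimal sketch of the pattern I have in mind. It uses plain C++ `std::async` rather than any DriveWorks call: `processCameraFrame` and `runAllCameras` are hypothetical stand-ins, and the sleep only simulates per-camera work. Whether the real functions overlap on the GPU in this way is exactly what I am asking.

```cpp
#include <chrono>
#include <future>
#include <thread>
#include <vector>

// Hypothetical stand-in for one camera's detector pass; NOT a DriveWorks call.
// It sleeps to simulate work and returns the camera index it was given.
int processCameraFrame(int cameraIndex)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    return cameraIndex;
}

// Submit the work for every camera first, then collect the results,
// so that the per-camera passes have a chance to overlap.
std::vector<int> runAllCameras(int cameraCount)
{
    std::vector<std::future<int>> pending;
    pending.reserve(cameraCount);
    for (int cam = 0; cam < cameraCount; ++cam) {
        // std::launch::async forces each pass onto its own thread.
        pending.push_back(std::async(std::launch::async, processCameraFrame, cam));
    }

    std::vector<int> results;
    results.reserve(cameraCount);
    for (auto& f : pending) {
        results.push_back(f.get()); // blocks until that camera's pass is done
    }
    return results;
}
```

The alternative would be to call each pass and wait for it before starting the next camera, in which case the per-camera times simply add up.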