Driveworks Asynchronous Functions

george.dibben · November 21, 2017, 1:28pm

On a Drive PX2, we are aiming to adapt some of the Driveworks APIs so that they can run on up to 8 cameras (Drivenet on all, Freespace on 3 cameras, Lane Detection on 2 to 5 cameras).

We are already running into performance limitations with that many cameras.

Regarding the following functions:
dwDriveNet_inferDeviceAsync
dwObjectTracker_featureTrackDeviceAsync
dwLaneDetector_processDeviceAsync
dwFreeSpaceDetector_processDeviceAsync

The names and documentation of these functions say they are intended for asynchronous operation. Does this mean that, if they are called from separate threads, they can run in parallel on different CUDA streams on the GPU and so reduce overall execution time? If not, what is the correct way to understand and use these functions?

SteveNV · November 24, 2017, 5:07am

Dear george.dibben,

The word “asynchronous” in the names of these API calls means that the call only submits the inference job to the GPU and does not block waiting for the results. While the GPU is doing the job, the CPU thread can do something useful. Blocking happens only when the corresponding function that retrieves the results is called, and only if the GPU has not yet finished. Thanks.
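
To illustrate the submit-then-retrieve pattern described above, here is a minimal sketch in plain C++. It is only an analogy: it uses std::async on a CPU thread in place of the GPU, and fakeInference, submitInference and processFrame are hypothetical names made up for this example, not DriveWorks functions. The real DriveWorks calls and their signatures should be taken from the SDK documentation.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for the GPU inference job (NOT a DriveWorks call).
// It sleeps to simulate work being done elsewhere, then returns a result.
int fakeInference(int frameId) {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return frameId * 2;
}

// "Submit" step: starts the job and returns immediately with a handle
// to the pending result. The caller is not blocked here.
std::future<int> submitInference(int frameId) {
    return std::async(std::launch::async, fakeInference, frameId);
}

// Submit the job, do unrelated CPU work while it runs, then retrieve.
int processFrame(int frameId) {
    std::future<int> pending = submitInference(frameId);

    // Useful CPU-side work that overlaps with the pending job.
    int cpuWork = 0;
    for (int i = 0; i < 1000; ++i) {
        cpuWork += i;
    }
    assert(cpuWork == 499500);

    // "Retrieve" step: blocks only if the job has not finished yet.
    return pending.get();
}
```

The point of the sketch is where the wait happens: the submit step returns right away, and the only place the calling thread can stall is the retrieve step. Whether submissions from several threads actually overlap on the GPU is a separate question that this example does not answer.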