We are working with DriveWorks in a multi-threaded environment.
We are currently facing a strange issue and we would like to know if the following approach is supported/allowed:
Frame grabbing from GMSL cameras is handled in a first thread.
After the initialization step, this thread iterates over the dwSensor handles (one for each CSI port) and over each camera (sibling) within each CSI port.
For each camera, one loop iteration is roughly:
1. dwSensorCamera_readFrame
2. dwSensorCamera_getImageNvMedia
3. dwSensorCamera_getDataLines
4. dwImageStreamer_postNvMedia
5. dwImageStreamer_receiveCUDA
6. dwSensorCamera_getImageROI
7. dwImageCUDA_mapToROI
8. dwRawPipeline_convertRawToDemosaic
9. dwImageStreamer_returnReceivedCUDA
10. dwImageStreamer_waitPostedNvMedia
11. dwSensorCamera_returnFrame
The dwImageCUDA output by dwRawPipeline_convertRawToDemosaic is a pointer into a circular buffer (one buffer per camera feed).
A second thread reads the dwImageCUDA entries from the circular buffer filled by dwRawPipeline_convertRawToDemosaic and, for each image, calls dwImageFormatConverter_copyConvertCUDA followed by cudaMemcpy2D (i.e. it just converts to RGBA and copies the result to host memory).
Everything works fine when the first thread reads two cameras and a single processing thread consumes one of the circular buffers.
Adding a third thread (identical to the second) to process the second camera feed (the second circular buffer) makes the application crash.
As a side note, the circular buffers themselves are thread safe, and each dwImageCUDA within them is accessed by only one thread at a time (no concurrent access).
Would you have any clue or advice?
Is there any way to trace GPU processing when the host application crashes?
(We would like to use the NVIDIA Visual Profiler to get a timeline leading up to the crash.)
Thanks and best regards,