Streaming dwImages cross-process

Dear nvidia-team,

Following the DW sensor manager, I want to implement my own cross-process camera streaming application. I use EGLStreams as provided by the CUDA 10.2 samples. The DW samples for image streaming suggest that streaming a frame handle is valid, but their black-box nature does not let me confirm that. In any case, I can stream data between producer and consumer, and I can also stream a dwImageHandle_t such that it has the same value on both sides, but I cannot then get the corresponding image from it in the consumer (via dwImage_getCUDA). The same call works fine on the producer side.
For reasons of efficiency and compatibility within our pipeline it would be ideal to have the image handle on the consumer side. If streaming the handle does not suffice for that to work, what exactly do I need to stream instead of the handle? Could I assign a streamed dwImageCUDA to a new handle on the consumer side, and if yes, how so?

Hi, @josua.zscheile
Please let us know which is this sample and where you saw this. Thanks.

Hey @VickNV,

Unsurprisingly, it is the image_streamer_cross_process sample. There, the corresponding functions are dwImageStreamerGL_producerSend and dwImageStreamerGL_consumerReceive, where a dwImageHandle_t is sent (called m_image) and received (imageOut).

Since I failed to use the streamed handle when implementing this with EGLStreams, I suppose these functions do more than stream the handle itself, but as I said, I cannot confirm that.

Dear @josua.zscheile,
Just to clarify, the DW Image Streamer is a wrapper over the EGLStream implementation.

Could you clarify what you are asking?

Dear @SivaRamaKrishnaNV,

I want to stream dwImages from a producer (distributing frames from cameras) to consumers. The streaming of data as such works (e.g. I can get the image from the handle in the producer process and stream the image contents), but I ideally would like to stream the dwImageHandle_t’s as such, so that the consumer can access the dwImageCUDA.
I can stream the handle correctly (or at least, the printed content of the handle in consumer and producer is the same), but get a segmentation fault error if I try to access the image or the meta data from the handle.
In the aforementioned sample, the dw functions imply that what is streamed there is the handle, not the image (but that depends on the implementation of these functions). So my question was how, if at all, that works.
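To make the distinction between streaming the handle and streaming the image contents concrete, here is a minimal host-side sketch of streaming a frame by value. It is not how DriveWorks or EGLStream works internally, and unlike an EGLStream it copies the pixel data. `FrameHeader`, `sendFrame` and `receiveFrame` are hypothetical names invented for this illustration, and a POSIX file descriptor (for example a pipe or socket) stands in for the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>
#include <unistd.h>

// Hypothetical plain-data description of a frame; not a DriveWorks type.
struct FrameHeader {
    std::uint32_t width;
    std::uint32_t height;
    std::uint32_t bytesPerPixel;
};
static_assert(std::is_trivially_copyable<FrameHeader>::value,
              "the header is sent as raw bytes, so it must not contain pointers");

// Write exactly `size` bytes, retrying on short writes.
bool writeAll(int fd, const void* data, std::size_t size)
{
    const char* p = static_cast<const char*>(data);
    while (size > 0) {
        ssize_t n = write(fd, p, size);
        if (n <= 0) return false;
        p += n;
        size -= static_cast<std::size_t>(n);
    }
    return true;
}

// Read exactly `size` bytes, retrying on short reads.
bool readAll(int fd, void* data, std::size_t size)
{
    char* p = static_cast<char*>(data);
    while (size > 0) {
        ssize_t n = read(fd, p, size);
        if (n <= 0) return false;
        p += n;
        size -= static_cast<std::size_t>(n);
    }
    return true;
}

// Producer side: send the description and the pixels themselves, not an address.
bool sendFrame(int fd, const FrameHeader& header, const std::vector<std::uint8_t>& pixels)
{
    const std::size_t expected =
        static_cast<std::size_t>(header.width) * header.height * header.bytesPerPixel;
    if (pixels.size() != expected) return false;
    return writeAll(fd, &header, sizeof(header)) &&
           writeAll(fd, pixels.data(), pixels.size());
}

// Consumer side: rebuild a local buffer from the received bytes.
bool receiveFrame(int fd, FrameHeader& header, std::vector<std::uint8_t>& pixels)
{
    if (!readAll(fd, &header, sizeof(header))) return false;
    pixels.resize(static_cast<std::size_t>(header.width) * header.height * header.bytesPerPixel);
    return readAll(fd, pixels.data(), pixels.size());
}
```

Everything the consumer needs arrives as bytes it owns, so nothing on the receiving side depends on the producer's address space.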

Dear @josua.zscheile,
the dw functions imply that what is streamed there is the handle, not the image

Yes. No image data is transferred there. The Image Streamer allows the image type to be changed without much effort and without additional data transfer.

I can stream the handle correctly (or at least, the printed content of the handle in consumer and producer is the same), but get a segmentation fault error if I try to access the image or the meta data from the handle

You can read and use the image data on the consumer side, but you are not expected to change the image contents.

Dear @SivaRamaKrishnaNV,

thank you for your answer. I do not need the image type to change (dwImageCUDA to dwImageCUDA), I just need it in another process. Accessing the image from the handle in the producer process works fine, but in the consumer, using the streamed handle leads to a segmentation fault.
Is it correct to just stream the 8 bytes of the handle then, or am I missing something?

Please share your patch for sample_image_streamer_cross so we can reproduce the segfault on our side.

I do not use the dw sample as base for my implementation, but the nvm_eglstream sample of the cuda 10.2 samples.

I think the underlying problem is the following:
dwImageHandle_t is declared as typedef struct dwImageObject*, so it is essentially a pointer to the (incomplete) type dwImageObject. When I stream the handle from one process to the next, I am therefore not streaming content but a pointer to content, and that pointer does not refer to valid memory in the consumer process. Since the type dwImageObject is not visible to me, I cannot stream the content the handle points to.
On the other hand, I do not understand how the dw cross process streaming sample then streams the handle in a valid fashion, if it is just a wrapper for EGLStream.
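The pointer argument can be illustrated without DriveWorks at all. The following is a minimal sketch, assuming a POSIX system; `FakeImageObject`, `FakeImageHandle` and `widthSeenByChild` are hypothetical stand-ins and not part of any NVIDIA API. A pointer value is sent through a pipe to a forked child. The child can dereference it only because fork() copied the parent's address space, and it still sees the old contents, because the same address refers to different memory in each process.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the opaque object behind a handle.
struct FakeImageObject {
    int width;
};
using FakeImageHandle = FakeImageObject*;

// Forks a child, changes the object in the parent afterwards, then sends the
// handle (a raw pointer value) to the child. Returns the width the child reads
// through that pointer, or -1 on any error.
int widthSeenByChild(int initialWidth, int updatedWidth)
{
    static FakeImageObject object;
    object.width = initialWidth;

    int toChild[2];
    int toParent[2];
    if (pipe(toChild) != 0) return -1;
    if (pipe(toParent) != 0) {
        close(toChild[0]);
        close(toChild[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        // Child: receive the pointer value and dereference it locally.
        close(toChild[1]);
        close(toParent[0]);
        FakeImageHandle received = nullptr;
        int seen = -1;
        if (read(toChild[0], &received, sizeof(received)) ==
            static_cast<ssize_t>(sizeof(received))) {
            // Safe here only because fork() duplicated the address space.
            seen = received->width;
        }
        if (write(toParent[1], &seen, sizeof(seen)) !=
            static_cast<ssize_t>(sizeof(seen))) {
            _exit(1);
        }
        _exit(0);
    }

    // Parent: modify the object after the fork, then send its address.
    close(toChild[0]);
    close(toParent[1]);
    object.width = updatedWidth;
    FakeImageHandle handle = &object;
    int seen = -1;
    if (write(toChild[1], &handle, sizeof(handle)) ==
        static_cast<ssize_t>(sizeof(handle))) {
        if (read(toParent[0], &seen, sizeof(seen)) !=
            static_cast<ssize_t>(sizeof(seen))) {
            seen = -1;
        }
    }
    close(toChild[1]);
    close(toParent[0]);
    waitpid(pid, nullptr, 0);
    return seen;
}
```

Calling `widthSeenByChild(640, 1920)` yields 640: the child reads its own stale copy, not the parent's updated object. Between two unrelated processes the address would typically not be mapped at all, which would be consistent with the segmentation fault I see in the consumer.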

Another possible cause of the problem is that my initial implementation, since it is meant to stream cuda images, uses the cuda wrapper for EGL streams (cuEGLStreamProducerPresentFrame etc.). For a cuda image, that works fine, but for that to work with dwImageHandle_t, I need to copy the image handle to device memory (and back on the consumer side). As I said, the data transfer as such works well, I just guess that the handle on the consumer side cannot point to the same dwImageObject any more.

Dear @josua.zscheile,
I do not use the dw sample as base for my implementation, but the nvm_eglstream sample of the cuda 10.2 samples

Is there any reason for sticking to the nvm_eglstream CUDA sample? If so, why use DW data structures?

Dear @SivaRamaKrishnaNV,

We are using the DriveWorks NN functions for the detection of objects of interest. Since the provided camera streamer tool has performance problems, crashes with multiple streams, and is a black box (i.e. we have no way to tweak its performance), we need to implement it ourselves so that we at least have a way to handle these issues.
I went through the CUDA sample because that way I at least have a low-level method to stream the frame in device memory, which was my starting point. Once that worked, I had hoped to be able to stream the frame handle in the same way.

Please create another topic for the performance and crash problems you face.

Because of the convoluted conversion between dwImageCUDA, CUeglFrame and other types, I would suggest you use the dwImageStreamer API.