Query regarding the NVIDIA Driveworks Tensor Operations, and conversion of tensor handle to image handle

I have been working on an inference script for a segmentation model on the NVIDIA DriveWorks 3.5 platform. So far I have converted the segmentation model to the TensorRT format, loaded the model, and run inference using the line:
CHECK_DW_ERROR(dwDNN_inferRaw(m_dnnOutputsDevice, &m_dnnInputDevice, 1U, m_dnn));

I now need to display only the second channel of this output tensor. My queries are:

  1. Do I need to stream the output tensor to the CPU to perform any tensor operations?
  2. Is there a way to convert a tensor to an image handle that would allow me to stream it to GL directly?
  3. What would be the best approach to retrieve only a particular channel of the output tensor and display it on the GL renderer?

Dear @vineeth.subramanyam ,
I am moving the query to DRIVE AGX General - NVIDIA Developer Forums for attention.

  1. The output of the DW DNN module is a GPU CUDA buffer. To perform any operations on the data on the CPU side, you need to either copy it or use Image Streamer to send it to the CPU.
  2. As said above, the output is a GPU CUDA buffer. You may use Image Streamer to stream CUDA->GL for rendering.
  3. There are no APIs to extract a specific channel from a buffer. You may have to implement your own function to access a specific channel.
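
As a rough illustration of point 3, a custom channel-extraction function could look like the sketch below. It is plain C++ operating on host memory, so it assumes the DNN output has already been copied from the GPU to the CPU (or that the same indexing is ported to a CUDA kernel). The names extractChannel and Layout are hypothetical and not part of DriveWorks, and whether the tensor is planar (CHW) or interleaved (HWC) depends on the model and must be checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Memory layout of the tensor. Which one applies depends on the exported model.
enum class Layout { CHW, HWC };

// Copy one channel of a (height x width x channels) float tensor into a
// contiguous buffer of height * width elements.
// Hypothetical helper for illustration; not a DriveWorks API.
std::vector<float> extractChannel(const float* data, std::size_t height, std::size_t width,
                                  std::size_t channels, std::size_t channel, Layout layout)
{
    assert(data != nullptr && channel < channels);
    const std::size_t plane = height * width;
    std::vector<float> out(plane);
    for (std::size_t i = 0; i < plane; ++i) {
        // CHW: channels are stored one full plane after another.
        // HWC: channel values of a pixel are stored next to each other.
        out[i] = (layout == Layout::CHW) ? data[channel * plane + i]
                                         : data[i * channels + channel];
    }
    return out;
}
```

For a 224 x 224 x 2 output, the second channel would be requested with channel index 1 and the layout that matches the model.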

I'm relatively new to DriveWorks, so I need a little more clarification. The output of the DNN module is of type float32_t and lives in a CUDA buffer. Is it possible to use the image streamer to stream this float32_t output directly, or do I need to convert it to a dwImageCUDA type before streaming it?

Dear @vineeth.subramanyam ,
May I know what the output of your DNN is? Generally, for inference use cases, the output of the DNN module (a GPU CUDA buffer) contains object detection indices and not a complete image to render on GL. Could you check our object detector tracker sample?

I am working on a segmentation model and want to render the mask taken from the output of the DNN model. I have gone through the object tracker samples; they stream the output of the object detection model to the CPU and make use of its values. In my case I would like to render the model output itself, since it is a segmented version of the image.

OK. In that case, the output would be a complete image mask. You need to create a dwImageCUDA from the CUDA buffer and then try ImageStreamer. Let us know if you see any issues.
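
Since the DNN output is float32_t while displayable image formats are typically 8-bit, one step that may be needed before wrapping the mask in an image is a float-to-uint8 conversion. The sketch below shows that conversion in plain C++ on host memory. It assumes the mask values are probabilities in [0, 1] (values outside that range are clamped), and toGray8 is a hypothetical helper, not a DriveWorks function. On the device, the same arithmetic could be done in a CUDA kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a float mask to 8-bit grayscale.
// Assumption: values are probabilities in [0, 1]; anything outside is clamped.
// Hypothetical helper for illustration; not a DriveWorks API.
std::vector<std::uint8_t> toGray8(const float* src, std::size_t count)
{
    assert(src != nullptr || count == 0);
    std::vector<std::uint8_t> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        const float clamped = std::min(1.0f, std::max(0.0f, src[i]));
        // Scale to 0..255 and round to the nearest integer.
        out[i] = static_cast<std::uint8_t>(std::lround(clamped * 255.0f));
    }
    return out;
}
```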

ok. Thanks a lot!

Would I use the dwImage_getCUDA function to convert the CUDA buffer to a dwImageCUDA?

This is where I am currently stuck.

Dear @vineeth.subramanyam ,
You need to use dwImage_create() function to create dwImage. Please check Image Streamer sample to understand the flow.

I have gone through the image streamer code. What I understand from it is that I would first have to convert the model output to the DW_IMAGE_CUDA format, and then use the image streamer either to stream it to the GL renderer or to perform image-based operations on it. However, I'm unsure how to convert the original model output to the DW_IMAGE_CUDA format.

Dear @vineeth.subramanyam ,
Yes. You are right.

"I'm unsure about how we would convert the original model output to the DW_IMAGE_CUDA format."

Please try dwImage_createAndBindBuffer(). You can pass the output buffers of the DNN module as input to it. Please see DriveWorks SDK Reference: Image for more details about dwImage buffers.

I have attached my block of code below. I would like to know whether the error occurs because my model produces a 2-channel output rather than the 3 channels expected by the RGB type. I would also like to know whether I am using the dwImage_createAndBindBuffer function correctly, as I couldn't find any sample code that uses it.

This is the current error I get in terminal

Dear @vineeth.subramanyam ,
Yes. The error is related to wrong API usage. dwImage_createAndBindCUDAArray() requires a CUDAArray buffer as input, whereas your buffer is a normal CUDA buffer. Please use the dwImage_createAndBindBuffer() function instead. Please see DriveWorks SDK Reference: Image Interface. Try passing a null pointer for the pitches parameter.

Could you please point me towards any example code or sample that uses this function? I am having difficulty using it.

Could you help me with the correct syntax for the dwImage_createAndBindBuffer function?

Dear @vineeth.subramanyam ,
What is the size of m_dnnOutputsDevice? Does it contain only RG channels, given that you are using DW_IMAGE_FORMAT_RG_UINT8? It looks like a single buffer. Also, could you print cudaProp. dwImageMemoryType?

I have not verified the snippet. Let me know if it works.

dwImage_createAndBindBuffer(&inference_img, cudaProp, m_dnnOutputsDevice[0], nullptr, 1, m_context);

I was using only the RG channels since the model output shape is (224, 224, 2). The cudaProp memory type prints 0, which should be the default according to the documentation. The error below is what I get when using the code snippet you sent above.


Dear @vineeth.subramanyam,
Please check using reinterpret_cast to convert float* to void*
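
A minimal sketch of the pointer conversion suggested above, in plain C++. It assumes the binding function expects an array of untyped (void*) plane pointers rather than a single typed float pointer, which should be confirmed against the SDK reference. firstPlane is a hypothetical stand-in used only to make the example self-contained and is not a DriveWorks function. In standard C++ a float* converts to void* implicitly, so static_cast is sufficient here, and reinterpret_cast also compiles.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an API that takes an array of untyped plane
// pointers. It is NOT a DriveWorks function; it only returns the first plane
// so the result can be inspected.
void* firstPlane(void* planes[], std::size_t count)
{
    assert(planes != nullptr && count > 0);
    return planes[0];
}

// Wrap a typed float buffer in a one-element array of void* and pass it on.
void* bindFloatBuffer(float* buffer)
{
    // The explicit cast documents intent; float* -> void* is also implicit.
    void* planes[1] = {static_cast<void*>(buffer)};
    return firstPlane(planes, 1);
}
```

The point of the intermediate array is that a float* cannot be passed where an array of void* is expected, even though each individual pointer converts without trouble.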