Now, with the help of NVIDIA's developer forum, I used the NvBufSurface APIs + CUDA APIs to map an NvBufSurface to a cv::cuda::GpuMat.
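Roughly, the mapping looks like this (just a sketch assuming an RGBA surface in CUDA/unified memory so that dataPtr is a device pointer; the helper name is mine and error checks are omitted):

#include <nvbufsurface.h>
#include <opencv2/core/cuda.hpp>

// Wrap one NvBufSurface frame in a GpuMat without any copy.
// Assumes CUDA/unified memory and an RGBA (CV_8UC4) surface.
cv::cuda::GpuMat wrapSurface(NvBufSurface *surf)
{
    NvBufSurfaceParams &p = surf->surfaceList[0];
    return cv::cuda::GpuMat(p.height, p.width, CV_8UC4, p.dataPtr, p.pitch);
}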
My question is: how do I use this buffer or image to perform inference directly with a TensorRT engine?
While DeepStream does come to mind, the model I am using is Detectron2.
To even create a TensorRT engine for Detectron2, the ONNX file needs to be exported twice, which is frankly quite messy.
I read somewhere that nvvidconv already prepares the GstBuffer for GstNvinfer, so I was hoping to do something along similar lines.
Since I do have a TensorRT engine now, any advice or suggestions on how I can use this GstBuffer (or GpuMat) for inference directly? (GPU → GPU)
Perhaps some links or resources I can read?
Since I am new to all this, there is a lot I don't know yet, but I'd appreciate any help, so thanks in advance.
Hi @AastaLLL
Thanks for the reply!
The link seems helpful, so I'll explore it.
I want to confirm one thing you mentioned, though:
“If you have the CUDA buffer pointer (from GpuMat?)”
What I wanted was to get the buffer from memory:NVMM (GPU) in my GStreamer pipeline and pass it to TensorRT (GPU) directly, without any CPU involvement.
So I posted a question here yesterday and got this response, using which I currently have a GpuMat.
I wasn't sure how to pass that GpuMat to TensorRT.
One last thing: do you think NVIVAFILTER would be useful?
From what I read in other posts, NVIVAFILTER is used to wrap NVMM memory with an EGLImage so that it can be accessed by CUDA.
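If I understand the nvsample_cudaprocess example correctly, the mapping it does looks roughly like this (a sketch based on my reading; eglImage comes from the filter callback and error checks are omitted):

#include <cuda.h>
#include <cudaEGL.h>

// register the EGLImage with CUDA and get a device-accessible frame
CUgraphicsResource resource;
CUeglFrame eglFrame;
cuGraphicsEGLRegisterImage(&resource, eglImage, CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);
cuGraphicsResourceGetMappedEglFrame(&eglFrame, resource, 0, 0);

// eglFrame.frame.pPitch[0] is now a CUDA device pointer to the frame data
void *framePtr = eglFrame.frame.pPitch[0];

// ... CUDA / TensorRT work on framePtr ...

cuGraphicsUnregisterResource(resource);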
Yes, I have, and I generated the engine using that documentation itself, but the documentation says that for good performance you should do inference in C++ and/or use DeepStream.
With Python, I am not sure how to get the buffer from memory:NVMM and keep it in GPU memory so that I can later pass it for inference.
Right now I am referring to infer.py and the other sample C++ programs and trying to create C++ inference code for Detectron2.
If you ever have any suggestions/advice, please let me know.
Thanks.
DeepStream is my end goal, to be honest, but I had no idea how to use it with Detectron2.
Thanks for the link, I will look into it.
For now I created a custom C++ file which takes a cv::Mat image, converts it into a GpuMat, and then does inference, and it is working.
So now I know that I am able to do inference with Detectron2 using C++.
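The core of it is just an upload plus handing the device pointer to the engine, roughly like this (a sketch; cpuMat and the preprocessing are placeholders for my actual code):

#include <opencv2/core/cuda.hpp>

// copy the CPU Mat into GPU memory; gpuFrame.ptr() is then what I pass
// to the TensorRT context as the input binding (preprocessing omitted)
cv::cuda::GpuMat gpuFrame;
gpuFrame.upload(cpuMat);
void *inputDevPtr = gpuFrame.ptr();
// note: GpuMat allocations can be row-padded (step > width * elemSize),
// so the data may need to be made continuous before the engine reads it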
My next goal is to understand how to connect the GPU image I get from eglFrame.frame.pPitch[0] to my inference code. (Any advice, or should I just create a new post?)
Also, if I understand it correctly, using DeepStream I don't need to copy the image from NVMM memory to a GpuMat, right? It passes it directly to gst-nvinfer?
1.
Suppose you have got the GPU data buffer from eglFrame.frame.pPitch[0].
Then you can follow jetson_inference to pass the data buffer directly.
For example:
mBindings = (void**)malloc(bindingSize);    // one void* slot per engine binding

// point each binding at a CUDA device pointer (no CPU copy needed)
for( uint32_t n=0; n < GetInputLayers(); n++ )
    mBindings[mInputs[n].binding] = /* input CUDA ptr */;

for( uint32_t n=0; n < GetOutputLayers(); n++ )
    mBindings[mOutputs[n].binding] = /* output CUDA ptr */;

...

// launch inference asynchronously on the CUDA stream
context->enqueueV2(mBindings, mStream, NULL);
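Tying that back to the earlier posts, the input binding can point straight at the mapped NVMM frame, roughly like this (a sketch; the binding names and the single-output layout are assumptions, a real Detectron2 engine has several outputs):

void *bindings[2];

// input: the device pointer mapped from NVMM (after whatever
// preprocessing / format conversion the engine expects)
bindings[engine->getBindingIndex("input")] = eglFrame.frame.pPitch[0];

// output: a plain CUDA allocation sized for the output tensor
void *outBuf = nullptr;
cudaMalloc(&outBuf, outputSizeBytes);
bindings[engine->getBindingIndex("output")] = outBuf;

context->enqueueV2(bindings, stream, nullptr);
cudaStreamSynchronize(stream);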
2. "Also, if I understand it correctly, using DeepStream I don't need to copy the image from NVMM memory to a GpuMat, right? It passes it directly to gst-nvinfer?"