I have managed to feed GPU frames directly into the stream by using the hardware encoder on the NX. Here is that pipeline:
std::string pipeline =
"appsrc name=DigiViewSrc is-live=true ! "
"video/x-raw,format=BGRx ! "
"nvvideoconvert ! "
"video/x-raw(memory:NVMM),format=NV12 ! "
"nvv4l2h264enc bitrate=4000000 preset-level=2 insert-sps-pps=true idrinterval=10 ! "
"rtph264pay name=pay0 pt=96";
But I have not managed to get it working on the Nano, which uses the following pipeline:
gst_rtsp_media_factory_set_launch(
factory,
"( appsrc name=DigiViewSrc ! videoconvert ! video/x-raw,format=I420 ! "
"x264enc tune=zerolatency speed-preset=ultrafast key-int-max=10 ! rtph264pay name=pay0 config-interval=1 pt=96 )");
From what I’ve understood, it isn’t possible because x264enc only accepts host memory, which forces me to do a device-to-host memcpy that I’d rather avoid. Is that correct, or is there some way of feeding frames in device memory directly into my GStreamer stream on the Nano as well?
In your NX pipeline you actually have a memory copy from host memory to device memory. That can be determined because you are using an nvvideoconvert element to convert video/x-raw,format=BGRx to video/x-raw(memory:NVMM),format=NV12. Notice the (memory:NVMM) in the capsfilter after the nvvideoconvert.
Given that on your NX you convert the buffers your custom app puts out into NVMM memory, we can conclude that your app produces buffers in host memory. Is that correct?
You are correct in stating that x264enc does not support device memory, so it needs host-memory buffers to work. However, given that we concluded your app produces host-memory buffers, I suspect the issue might not be the memory in which the buffers are allocated, since the Nano pipeline should then work without the nvvideoconvert.
Given all those points, may we ask a few questions:
Can you provide the output logs from the failing Nano pipeline?
Can you provide a bit of context on your custom app? What output memory should we expect?
best regards,
Andrew
Embedded Software Engineer at ProventusNova
I just have the nvvideoconvert element to convert from BGRx to NV12. I tried using the (memory:NVMM) feature in the first video/x-raw caps as well, but I couldn’t get it to work.
Actually, the app produces buffers in device memory; this is the last part of the app that uses any host memory, which is why I’m trying to get rid of it.
The videoconvert in the Nano pipeline is likewise just used to convert BGR to I420.
Questions:
1&2: The Nano pipeline works; it’s just that I’m trying to get rid of the device-to-host memcpy I currently have. Basically, the app takes video frames directly into device memory, does a lot of image processing in device space using CUDA kernels, and then sends the processed device frame to GStreamer, which handles it depending on whether I’m running the NX or the Nano:
NX: Device-to-device memcpy to a NVMM surface which is then fed into the NX-specific GStreamer pipeline and streamed.
Nano: Device-to-host memcpy which is then fed into the nano-specific GStreamer pipeline and streamed.
Please tell me if anything is unclear or if I missed answering something!
Yeah, for the NX I tried setting memory:NVMM both in the caps and in the pipeline, but that got me nowhere, so I removed it from both and now it works. The CPU usage went down a lot, and the whole GST pipeline is about 3x as fast as it was when I used the same code as on the Nano.
I do the memory copy manually before handing the frame to the GStreamer part of the app, so my Nano code is currently written to expect a host-memory buffer for the frames.
Hi,
Orin Nano does not have a hardware encoder, so you have to put the frame data in a CPU buffer and send it to a software encoder. CPU usage is significant, so please run $ sudo jetson_clocks to fix the CPU cores at maximum frequency.