Low-latency, reliable way to overlay on top of camera images with Orin and ISP

I’m trying to implement what seems like pretty basic functionality, but I can’t find any solutions. I need all of these:

  • Displaying camera image with low latency
  • Using the ISP in the SoC to process these camera images
  • Displaying a Qt UI overlaid on top of the camera image
  • Being able to unplug and replug a camera (from a GMSL deserializer) without interrupting display of the UI
  • Recording both the camera image and the UI

I’m willing to rewrite a lot of my software if I have to. I’ve already looked at and/or tried these:

  • nvarguscamerasrc: can’t reload the kernel modules to recover from some camera errors, because it keeps Argus objects in global variables
  • Using Argus directly in single-process mode: tends to crash the whole process when trying to shut down after the camera is unplugged, also leaks memory
  • Using Argus directly in multi-process mode, shutting it down to reload kernel modules: it leaks memory every time I restart the camera
  • Using Argus in a separate process I wrote, and creating an EGLStream::FrameConsumer in the main process handling display: leaks memory every time I restart the camera
  • Using Argus in a separate process I wrote, and cuEGLStreamConsumerConnect in the main process handling display (a sketch of this consumer side follows this list): leaks memory every time I restart the camera
  • Displaying the camera and the UI on separate planes: apparently Orin only supports a single plane
  • Copying NvBufSurfaces between two processes I write: there doesn’t seem to be an API to do this. The old NvBuffer has one, but that seems to have been removed in the latest Jetpack releases.
  • CUDA IPC: I haven’t tried it, but 1. Introduction — CUDA C Programming Guide says “Since CUDA 11.5, only events-sharing IPC APIs are supported on L4T and embedded Linux Tegra devices with compute capability 7.x and higher. The memory-sharing IPC APIs are still not supported on Tegra platforms.” Please let me know if this is outdated, and I can try it.
  • Copying data with the CPU to get it between two processes is going to add more latency than I want. I might have to resort to that and accept the worse performance, but I really don’t want to.
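For concreteness, the cuEGLStreamConsumerConnect consumer mentioned above looks roughly like this. This is only a trimmed sketch, not my exact code: the EGLStream handle and CUDA context are assumed to be set up elsewhere, error handling is minimal, and consumeOneFrame is just an illustrative name (real code would connect once and loop on acquire/release):

```cpp
// Sketch only: a CUDA consumer attached to an EGLStream whose producer is the
// separate Argus process. Assumes cuInit()/a current CUDA context and a valid
// EGLStreamKHR obtained elsewhere.
#include <cuda.h>
#include <cudaEGL.h>

bool consumeOneFrame(EGLStreamKHR eglStream)
{
    CUeglStreamConnection conn;
    if (cuEGLStreamConsumerConnect(&conn, eglStream) != CUDA_SUCCESS)
        return false;

    CUgraphicsResource resource = nullptr;
    CUstream stream = nullptr;
    // Wait for the next frame (the last argument is the acquire timeout).
    if (cuEGLStreamConsumerAcquireFrame(&conn, &resource, &stream, 16000) != CUDA_SUCCESS) {
        cuEGLStreamConsumerDisconnect(&conn);
        return false;
    }

    CUeglFrame frame;
    cuGraphicsResourceGetMappedEglFrame(&frame, resource, 0, 0);
    // frame.frame.pArray[0] / frame.frame.pPitch[0] now refers to the image planes.

    cuEGLStreamConsumerReleaseFrame(&conn, resource, &stream);
    cuEGLStreamConsumerDisconnect(&conn);
    return true;
}
```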

I’m just about out of ideas here. This has been many weeks of writing code, only to throw it away upon discovering that yet another NVIDIA API is broken in some new way. Can you please tell me how to make this work?

hello brian199,

I think you’re already able to enable the camera stream, process it with the ISP, and render the camera preview to the display.
Let me list your questions and share some comments on each separately; please check below…


>> Q1. Being able to unplug and replug a camera (from a GMSL deserializer) without interrupting display of the UI

Your question relates to error handling, since you would like to unplug and replug a camera.
May I know what you expect by “without interrupting display of the UI”?
There is an error-recovery mechanism, but when you unplug a camera, the camera pipeline reports timeout failures. Argus will report this via the EVENT_TYPE_ERROR flag, and the application has to shut down. The application side may also report a segmentation fault, which is expected, to force-stop the application.
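For illustration, a minimal sketch of how a client could watch for that flag through the Argus event queue; the capture-session pointer and the timeout value are assumptions, and real code would also inspect the error details:

```cpp
// Sketch only: poll the capture session's event queue for EVENT_TYPE_ERROR,
// which is how Argus reports the unplug/timeout condition described above.
// `session` is assumed to be a valid Argus::CaptureSession created elsewhere.
#include <Argus/Argus.h>
#include <vector>

bool cameraErrorPending(Argus::CaptureSession *session)
{
    auto *iEventProvider = Argus::interface_cast<Argus::IEventProvider>(session);
    if (!iEventProvider)
        return false;

    std::vector<Argus::EventType> types;
    types.push_back(Argus::EVENT_TYPE_ERROR);
    Argus::UniqueObj<Argus::EventQueue> queue(iEventProvider->createEventQueue(types));

    // Wait for an event to be queued (~1 s here, assuming nanosecond units).
    iEventProvider->waitForEvents(queue.get(), 1000000000);

    auto *iQueue = Argus::interface_cast<Argus::IEventQueue>(queue);
    if (!iQueue)
        return false;

    while (const Argus::Event *event = iQueue->getNextEvent()) {
        auto *iEvent = Argus::interface_cast<const Argus::IEvent>(event);
        if (iEvent && iEvent->getEventType() == Argus::EVENT_TYPE_ERROR)
            return true;
    }
    return false;
}
```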


>> Q2. it leaks memory every time

May I know which Jetpack release version you’re working with?
What’s your test pipeline? Can the issue be reproduced with a simple gst pipeline using the nvarguscamerasrc plugin?


>> Q3. Displaying the camera and the UI on separate planes

This looks related to your app implementation.
Anyway, may I have more details on what you expect?


>> Q4. The old NvBuffer has one, but that seems to have been removed in the latest Jetpack releases.

May I know which Jetpack release version you’re working with?
Please share the code snippets as well for a quick check.


>> Q5. CUDA IPC:

Let’s first double-check the Jetpack release version you’re working with.


>> Q6. Copying data with the CPU to get it between two processes is going to add more latency

Please share your actual use case for reference; this may come down to the NvBuffer copy mechanism from Q4 above.

the application has to shut down. The application side may also report a segmentation fault, which is expected, to force-stop the application.

Yes, this is the crux of the problem. I need part of my application to continue displaying a UI on the screen. I am flexible on ways to split up the part that needs to shut down from the rest of it, but there needs to be some separation which can be reconnected to a new instance of the camera part.

May I know which Jetpack release version you’re working with?

Jetpack 5.1.2 (L4T 35.4.1)

What’s your test pipeline? Can the issue be reproduced with a simple gst pipeline using the nvarguscamerasrc plugin?

As I mentioned, nvarguscamerasrc stores global Argus resources in shared_ptr<CameraProviderContainer> g_cameraProvider, which means I can’t even reload the kernel modules to re-initialize the camera hardware in the first place. If I modify execute to just create a new CameraProviderContainer each time, then I can reload the kernel modules after removing nvarguscamerasrc from the GStreamer pipeline. However, each time I create a new nvarguscamerasrc after doing this, it leaks some memory (and dmabuf and other file descriptors, etc). This is reproducible by simply creating and destroying a simple pipeline in a main function. Is this enough detail, or do you need me to actually write out this example?
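If it helps, the repro I have in mind is roughly this sketch (not my actual code; the caps, num-buffers value, and iteration count are arbitrary placeholders), watching RSS and open file descriptors grow between iterations:

```cpp
// Minimal repro sketch: repeatedly build and tear down an nvarguscamerasrc
// pipeline and check memory / fd usage after each iteration.
#include <gst/gst.h>

int main(int argc, char **argv)
{
    gst_init(&argc, &argv);

    for (int i = 0; i < 20; ++i) {
        GError *error = nullptr;
        GstElement *pipeline = gst_parse_launch(
            "nvarguscamerasrc num-buffers=30 ! "
            "video/x-raw(memory:NVMM),width=1920,height=1080 ! fakesink",
            &error);
        if (!pipeline) {
            g_printerr("parse failed: %s\n", error->message);
            g_clear_error(&error);
            return 1;
        }

        gst_element_set_state(pipeline, GST_STATE_PLAYING);

        // Wait for EOS (from num-buffers) or an error, then tear everything down.
        GstBus *bus = gst_element_get_bus(pipeline);
        GstMessage *msg = gst_bus_timed_pop_filtered(
            bus, GST_CLOCK_TIME_NONE,
            (GstMessageType)(GST_MESSAGE_EOS | GST_MESSAGE_ERROR));
        if (msg)
            gst_message_unref(msg);
        gst_object_unref(bus);

        gst_element_set_state(pipeline, GST_STATE_NULL);
        gst_object_unref(pipeline);
        // Inspect RSS and /proc/self/fd here between iterations.
    }
    return 0;
}
```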

>> Q3. Displaying the camera and the UI on separate planes

This looks related to your app implementation.
Anyway, may I have more details on what you expect?

This is just one of my ideas: run separate GStreamer pipelines to separate nvdrmvideosink elements with different planes. I tried prototyping that, and it didn’t work, and according to Orin Nano With NvDRM Overlay - #15 by DaneLLL it’s unsupported.

>> Q4. The old NvBuffer has one, but that seems to have been removed in the latest Jetpack releases.

May I know which Jetpack release version you’re working with?
Please share the code snippets as well for a quick check.

I’m using 5.1.2. I never actually implemented this version, because I can’t find the nvbuf_utils APIs in this release. According to Deprecated Nvbuf_utils is removed from JetPack 5.1.2 · Issue #169 · dusty-nv/jetson-utils · GitHub, that’s expected. I also found the “nvbuf_utils to NvUtils Migration Guide”, which is further evidence that nvbuf_utils/NvBuffer has been removed. How to share buffers across processes using jetpack 5 - #10 by DaneLLL states that there’s no direct replacement for the IPC functionality (EGLStream is mentioned later in that thread, which I have also tried, but it leaks memory).

>> Q6. Copying data with the CPU to get it between two processes is going to add more latency

Please share your actual use case for reference; this may come down to the NvBuffer copy mechanism from Q4 above.

Like I’ve said before, I’m pretty flexible with my use case. If there’s any way to send image buffer data between processes without routing it through the CPU’s limited memory bandwidth and cache, which supports reconnecting without leaking memory, I can probably make use of it.

My baseline idea here is to set up a shared memory region and use a UNIX domain socket to coordinate usage of it. I don’t see any way to get the image data out of Argus besides IEGLOutputStream, so I guess I’ll use that and then copy from there into an NvBufSurface via IImageNativeBuffer’s createNvBuffer/copyToNvBuffer, and then from there copy it into the shared memory region with NvBuffer2Raw (if there’s a faster way to get the image into a memory region I control, I would love to hear about it). Then once it’s in my other process, I can use Raw2NvBufSurface to create the NvBufSurface to feed into my GStreamer pipeline (again, would love to hear of a faster way of doing this). I haven’t implemented this part yet, because I’m hoping that NVIDIA has a usable IPC mechanism somewhere, and I’m also sure all the copies will add some latency which I’m hoping to avoid.
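To make that baseline concrete, here is a rough sketch of just the coordination layer (again, not implemented yet; the shared-memory name, frame size, and single-buffer handshake are placeholder assumptions, and the NvBufSurface raw copies are left as comments):

```cpp
// Sketch of the baseline idea only: a POSIX shared-memory region for the pixel
// data plus a UNIX domain socket for "frame N is ready" messages. A real
// version needs multiple buffers and an acknowledgement path; this shows only
// the plumbing.
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

static constexpr size_t kFrameBytes = 1920 * 1080 * 3 / 2;  // NV12, assumed size

// Producer (the process owning Argus): fill the region, then announce the frame.
void producerLoop(int sock)  // `sock` is an already-connected AF_UNIX socket
{
    int fd = shm_open("/camera_frames", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, kFrameBytes);
    void *region = mmap(nullptr, kFrameBytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    (void)region;

    for (uint64_t frame = 0;; ++frame) {
        // 1. Acquire a frame from Argus (IEGLOutputStream -> NvBufSurface).
        // 2. Copy its pixels into `region` (the raw-copy step described above).
        // 3. Tell the consumer which frame index is now valid.
        send(sock, &frame, sizeof(frame), 0);
    }
}

// Consumer (the UI/display process): wait for announcements, then read the region.
void consumerLoop(int sock)
{
    int fd = shm_open("/camera_frames", O_RDWR, 0600);
    void *region = mmap(nullptr, kFrameBytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    (void)region;

    uint64_t frame;
    while (recv(sock, &frame, sizeof(frame), 0) == (ssize_t)sizeof(frame)) {
        // Wrap the bytes in `region` back into an NvBufSurface for the GStreamer
        // pipeline, then acknowledge so the producer can reuse the buffer.
    }
}
```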

hello brian199,

Let me reply to the error-handling use case first; please allow me to set aside the other questions for the moment.
As of now, the application has to shut down to restart the camera service; that is mandatory.

There are some stability improvements in the latest Jetpack 5 release (r35.5.0).
Please move to the latest release version if that’s feasible.

Unless you can confirm that a newer version allows hotplugging a camera without restarting the application, I’m going to wait until after an upcoming deadline to upgrade.

hello brian199,

It’s not in the plan.
As mentioned, the application has to shut down to restart the camera service; that is mandatory.

Thanks for clarifying that. Do you have any suggestions for a low-latency API to send image data between processes which allows restarting the producer independently of the consumer?

hello brian199,

Argus uses a FIFO.

Here are more details…
On the Argus side, when the ISP is in use, our user-space driver internally initiates two extra captures for sensor exposure programming when Argus (and hence the underlying driver) receives the first capture request from the client.
These two internal captures are dropped at the driver level and are not sent to Argus or the client, so the client receives exactly the output captures it requested.
The sensor will have captured three frames, but the first two might have incorrect exposure settings.

We don’t have examples for that.
The MMAPI examples use EGL, which is what the Argus APIs use for buffer/stream management.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.