I’m trying to implement what seems like pretty basic functionality, but I can’t find any solutions. I need all of these:
- Displaying camera image with low latency
- Using the ISP in the SoC to process these camera images
- Displaying a Qt UI overlaid on top of the camera image
- Being able to unplug and replug a camera (from a GMSL deserializer) without interrupting display of the UI
- Recording both the camera image and the UI
I’m willing to rewrite a lot of my software if I have to. I’ve already looked at and/or tried these:
- nvarguscamerasrc: can’t reload the kernel modules to recover from some camera errors, because it keeps Argus objects in global variables
- Using Argus directly in single-process mode: tends to crash the whole process when trying to shut down after the camera is unplugged, also leaks memory
- Using Argus directly in multi-process mode, shutting it down to reload kernel modules: it leaks memory every time I restart the camera
- Using Argus in a separate process I wrote, and creating an `EGLStream::FrameConsumer` in the main process handling display: leaks memory every time I restart the camera (roughly the first sketch after this list)
- Using Argus in a separate process I wrote, and `cuEGLStreamConsumerConnect` in the main process handling display: leaks memory every time I restart the camera (second sketch below)
- Displaying the camera and the UI on separate planes: apparently Orin only supports a single plane
- Copying `NvBufSurface`s between two processes I write: there doesn’t seem to be an API to do this. The old `NvBuffer` API has one, but that seems to have been removed in the latest JetPack releases. (The generic fd-passing half of this is the third sketch below.)
- CUDA IPC: I haven’t tried it, but the CUDA C Programming Guide says “Since CUDA 11.5, only events-sharing IPC APIs are supported on L4T and embedded Linux Tegra devices with compute capability 7.x and higher. The memory-sharing IPC APIs are still not supported on Tegra platforms.” Please let me know if this is outdated, and I can try it.
- Copying data with the CPU to get it between two processes is going to add more latency than I want. I might have to resort to that and accept the worse performance (the last sketch below is what I mean), but I really don’t want to.
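
To make the Argus attempts concrete, here is a condensed sketch of my producer setup, following the jetson_multimedia_api samples (resolution and pixel format are illustrative, error handling is trimmed). The teardown path after a camera unplug is where it crashes or leaks:

```cpp
#include <Argus/Argus.h>
#include <EGLStream/EGLStream.h>
#include <vector>

using namespace Argus;

int main()
{
    // CameraProvider is the global-state object that nvarguscamerasrc never
    // releases; here it is at least scoped, but teardown still misbehaves.
    UniqueObj<CameraProvider> provider(CameraProvider::create());
    ICameraProvider *iProvider = interface_cast<ICameraProvider>(provider);
    if (!iProvider)
        return 1;

    std::vector<CameraDevice*> devices;
    iProvider->getCameraDevices(&devices);
    if (devices.empty())
        return 1;

    UniqueObj<CaptureSession> session(
        iProvider->createCaptureSession(devices[0]));
    ICaptureSession *iSession = interface_cast<ICaptureSession>(session);

    UniqueObj<OutputStreamSettings> settings(
        iSession->createOutputStreamSettings(STREAM_TYPE_EGL));
    IEGLOutputStreamSettings *iSettings =
        interface_cast<IEGLOutputStreamSettings>(settings);
    iSettings->setPixelFormat(PIXEL_FMT_YCbCr_420_888);
    iSettings->setResolution(Size2D<uint32_t>(1920, 1080)); // illustrative

    UniqueObj<OutputStream> stream(
        iSession->createOutputStream(settings.get()));

    // Consumer end of the EGLStream; in my cross-process variant this part
    // lives in the display process instead.
    UniqueObj<EGLStream::FrameConsumer> consumer(
        EGLStream::FrameConsumer::create(stream.get()));

    UniqueObj<Request> request(iSession->createRequest());
    IRequest *iRequest = interface_cast<IRequest>(request);
    iRequest->enableOutputStream(stream.get());
    iSession->repeat(request.get());

    // ... acquire frames via IFrameConsumer, then stopRepeat() and tear
    // everything down in reverse order -- this shutdown is what crashes or
    // leaks after the camera is unplugged.
    return 0;
}
```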
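The `cuEGLStreamConsumerConnect` variant replaces the `FrameConsumer` with the CUDA driver API’s EGLStream consumer. A minimal sketch of my consumer side, assuming `cuInit()` and a current CUDA context, and an `EGLStreamKHR` handle already obtained from the producer process (the timeout value is illustrative):

```cpp
#include <cuda.h>
#include <cudaEGL.h>

// Consumer side of an EGLStream via the CUDA driver API.
CUresult consumeOneFrame(EGLStreamKHR eglStream)
{
    CUeglStreamConnection conn;
    CUresult r = cuEGLStreamConsumerConnect(&conn, eglStream);
    if (r != CUDA_SUCCESS)
        return r;

    CUgraphicsResource resource;
    CUstream stream = 0;
    r = cuEGLStreamConsumerAcquireFrame(&conn, &resource, &stream,
                                        16000 /* acquire timeout, illustrative */);
    if (r == CUDA_SUCCESS) {
        CUeglFrame frame;
        r = cuGraphicsResourceGetMappedEglFrame(&frame, resource, 0, 0);
        // ... hand frame.frame.pPitch[...] / frame.frame.pArray[...] to the
        // display path here ...
        cuEGLStreamConsumerReleaseFrame(&conn, resource, &stream);
    }

    // Disconnecting is part of the camera-restart path that leaks for me.
    CUresult d = cuEGLStreamConsumerDisconnect(&conn);
    return (r != CUDA_SUCCESS) ? r : d;
}
```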
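On the `NvBufSurface` idea: the IPC half is not the problem. If anyone can tell me how to get a dmabuf fd out of an `NvBufSurface` on current JetPack, passing it between my two processes is just standard SCM_RIGHTS over a Unix domain socket, nothing NVIDIA-specific:

```cpp
#include <sys/socket.h>
#include <cstring>

// Send one file descriptor (e.g. a dmabuf fd) over a connected
// Unix domain socket using SCM_RIGHTS ancillary data.
bool sendFd(int sock, int fd)
{
    char data = 0;
    iovec iov { &data, sizeof(data) };
    char ctrl[CMSG_SPACE(sizeof(int))] = {};

    msghdr msg {};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl;
    msg.msg_controllen = sizeof(ctrl);

    cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == static_cast<ssize_t>(sizeof(data));
}
```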
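And the CPU-copy fallback I mentioned would be plain POSIX shared memory, something like this (the name and size are placeholders):

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

// Map a shared-memory region visible to both processes; the camera process
// memcpys each frame in and the display process copies it out. This works,
// but the extra CPU copies are exactly the latency I'm trying to avoid.
void *mapFrameRegion(std::size_t size)
{
    int fd = shm_open("/camera_frames", O_CREAT | O_RDWR, 0600); // placeholder name
    if (fd < 0)
        return nullptr;
    if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
        close(fd);
        return nullptr;
    }
    void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); // the mapping stays valid after closing the fd
    return (p == MAP_FAILED) ? nullptr : p;
}
```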
I’m just about out of ideas here. This has been many weeks of writing code, only to throw it away upon discovering that yet another NVIDIA API is broken in some new way. Can you please tell me how to make this work?