Nvbuf_utils fails to establish EGL display connection in Docker

Hello everyone,

I’m developing a camera application that uses nvargus as a processor for MIPI cameras. The application works as expected when running natively on the Jetson.

When we compile and run the application inside a Docker container based on the nvcr.io/nvidia/l4t-base image, it fails in about one out of three runs. We use the following command to run the container from a script:

docker run -it --rm --net=host --runtime nvidia \
  -v /etc/localtime:/etc/localtime:ro \
  -v /tmp/.X11-unix:/tmp/.X11-unix -v /tmp/argus_socket:/tmp/argus_socket \
  -e DISPLAY=$DISPLAY --device /dev/video0 --device /dev/video1 \
  $image_name:latest app/camera

The error is the following:

nvbuf_utils: Could not get EGL display connection
nvbuf_utils: ERROR getting proc addr of eglCreateImageKHR
nvbuf_utils: ERROR getting proc addr of eglDestroyImageKHR
nvbufsurface: eglGetDisplay failed with error 0x300c
nvbufsurface: Can't get EGL display

Any help is greatly appreciated!
Cheers

JetPack: 5.1.1
Jetson: Orin NX
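
For reference, the 0x300c in the nvbufsurface line of the log is a standard EGL error code. Here is a minimal Python sketch for decoding such codes. The table reproduces the error values from the Khronos EGL header; the helper name egl_error_name is made up for illustration and is not part of any NVIDIA tool. It only names the code and does not explain why nvbufsurface hits it inside the container.

```python
# EGL error codes as defined in the Khronos EGL header (egl.h).
EGL_ERRORS = {
    0x3000: "EGL_SUCCESS",
    0x3001: "EGL_NOT_INITIALIZED",
    0x3002: "EGL_BAD_ACCESS",
    0x3003: "EGL_BAD_ALLOC",
    0x3004: "EGL_BAD_ATTRIBUTE",
    0x3005: "EGL_BAD_CONFIG",
    0x3006: "EGL_BAD_CONTEXT",
    0x3007: "EGL_BAD_CURRENT_SURFACE",
    0x3008: "EGL_BAD_DISPLAY",
    0x3009: "EGL_BAD_MATCH",
    0x300A: "EGL_BAD_NATIVE_PIXMAP",
    0x300B: "EGL_BAD_NATIVE_WINDOW",
    0x300C: "EGL_BAD_PARAMETER",
    0x300D: "EGL_BAD_SURFACE",
    0x300E: "EGL_CONTEXT_LOST",
}


def egl_error_name(code):
    """Return the symbolic name for an EGL error code (int or hex string)."""
    if isinstance(code, str):
        code = int(code, 16)  # accepts "0x300c" as well as "300c"
    return EGL_ERRORS.get(code, "unknown (0x%04x)" % code)


print(egl_error_name("0x300c"))  # EGL_BAD_PARAMETER
```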

Hi,
You may try the Jetpack container:

NVIDIA L4T JetPack | NVIDIA NGC

It is a more complete environment.

Hi,
Using the JetPack container gives the same issue as described above.

Is there anything else?

Hi,
Please make sure your DISPLAY is correctly set outside docker:

$ export DISPLAY=:0
$ xrandr
Screen 0: minimum 8 x 8, current 1920 x 1080, maximum 32767 x 32767
DP-0 disconnected (normal left inverted right x axis y axis)
DP-1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 510mm x 290mm
   1920x1080     60.00*+  59.94    50.00    49.95
   1680x1050     59.95
   1600x1200     60.00
   1600x900      60.00
   1440x900      59.89
   1280x1024     60.02
   1280x800      59.81
   1280x720      60.00    59.94    50.00
   1024x768      60.00
   800x600       60.32
   720x576       50.00
   720x480       59.94
   640x480       59.94    59.93

And then try

$ sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/:/tmp/ nvcr.io/nvidia/l4t-jetpack:r36.4.0

root@tegra-ubuntu:/# gst-launch-1.0 nvarguscamerasrc ! fakesink

Hi,
Still similar results:

$ echo $DISPLAY
:0
$ xrandr 
Screen 0: minimum 8 x 8, current 1024 x 600, maximum 32767 x 32767
HDMI-0 connected primary 1024x600+0+0 (normal left inverted right x axis y axis) 476mm x 268mm
   1024x600      59.82*+
   1920x1080     59.94    50.00  
   1280x1024     75.02  
   1280x720      59.94    50.00  
   1024x768      75.03    70.07    60.00  
   800x600       75.00    72.19    60.32    56.25  
   720x576       50.00  
   720x480       59.94  
   640x480       75.00    72.81    59.94  
$ sudo docker run -it --rm --net=host --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/:/tmp/ nvcr.io/nvidia/l4t-jetpack:r36.4.0
root@tegra-ubuntu:/# gst-launch-1.0 nvarguscamerasrc ! fakesink

(gst-launch-1.0:21): GStreamer-WARNING **: 08:12:13.412: External plugin loader failed. This most likely means that the plugin loader helper binary was not found or could not be run. You might need to set the GST_PLUGIN_SCANNER environment variable if your setup is unusual. This should normally not be required though.
No protocol specified
No EGL Display 
nvbufsurftransform: Could not get EGL display connection
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
GST_ARGUS: Creating output stream
No protocol specified
CONSUMER: Waiting until producer is connected...
GST_ARGUS: Available Sensor modes :
GST_ARGUS: 2464 x 2064 FR = 41.299998 fps Duration = 24213076 ; Analog Gain range min 0.000000, max 48.000000; Exposure Range min 1000, max 1000000000;

GST_ARGUS: Running with following settings:
   Camera index = 0 
   Camera mode  = 0 
   Output Stream W = 2464 H = 2064 
   seconds to Run    = 0 
   Frame Rate = 41.299998 
GST_ARGUS: Setup Complete, Starting captures for 0 seconds
GST_ARGUS: Starting repeat capture requests.
CONSUMER: Producer has connected; continuing.
Redistribute latency...
^Chandling interrupt.
Interrupt: Stopping pipeline ...
Execution ended after 0:00:17.749971352
Setting pipeline to NULL ...
GST_ARGUS: Cleaning up
CONSUMER: Done Success
GST_ARGUS: Done Success
Freeing pipeline ...

So EGL still cannot find the display.

Do you know if there is a way to use Argus and nvbuffer without a display (or without X11 in general)? Our application does not need a display, because all images are uploaded at some point, so we could drop the display dependency entirely.

Hi,
Please try to run docker without DISPLAY:

$ sudo docker run -it --rm --net=host --runtime nvidia -v /tmp/:/tmp/ nvcr.io/nvidia/l4t-jetpack:r36.4.0

root@tegra-ubuntu:/# gst-launch-1.0 nvarguscamerasrc ! fakesink

Hi,
Here’s the output:

$ sudo docker run -it --rm --net=host --runtime nvidia -v /tmp/:/tmp/ nvcr.io/nvidia/l4t-jetpack:r36.4.0

root@tegra-ubuntu:/# echo $DISPLAY


root@tegra-ubuntu:/# gst-launch-1.0 nvarguscamerasrc ! fakesink

(gst-launch-1.0:21): GStreamer-WARNING **: 07:34:48.074: External plugin loader failed. This most likely means that the plugin loader helper binary was not found or could not be run. You might need to set the GST_PLUGIN_SCANNER environment variable if your setup is unusual. This should normally not be required though.
Setting pipeline to PAUSED ...
Pipeline is live and does not need PREROLL ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
New clock: GstSystemClock
GST_ARGUS: Creating output stream
CONSUMER: Waiting until producer is connected...
GST_ARGUS: Available Sensor modes :
GST_ARGUS: 2464 x 2064 FR = 41.299998 fps Duration = 24213076 ; Analog Gain range min 0.000000, max 48.000000; Exposure Range min 1000, max 1000000000;

GST_ARGUS: Running with following settings:
   Camera index = 0 
   Camera mode  = 0 
   Output Stream W = 2464 H = 2064 
   seconds to Run    = 0 
   Frame Rate = 41.299998 
GST_ARGUS: Setup Complete, Starting captures for 0 seconds
GST_ARGUS: Starting repeat capture requests.
CONSUMER: Producer has connected; continuing.
Redistribute latency...
^Chandling interrupt.
Interrupt: Stopping pipeline ...
Execution ended after 0:00:53.056457608
Setting pipeline to NULL ...
GST_ARGUS: Cleaning up
CONSUMER: Done Success
GST_ARGUS: Done Success
Freeing pipeline ...

It seems the EGL errors do not appear when DISPLAY is not set. So what does that mean in my case?

I tested my own setup without the DISPLAY variable but still get the same error.

I found out that we were still including the old nvbuf_utils API and switched completely to NvUtils and nvbufsurface, but there are still issues:

nvbufsurface: eglGetDisplay failed with error 0x300c
nvbufsurface: Can't get EGL display

This happens during a call to NvBufSurfaceMapEglImage. Any ideas what we can try?
As I said, our application does not need a display, if that opens up other possibilities.

Hi,
It looks like the camera can be launched in this environment. Does your application still fail? If so, please share a patch against jetson_multimedia_api sample 09 or 10, along with the steps to reproduce, so that we can try it on a developer kit.

Hi,

I was able to reproduce the error with the following patch:

diff -Naur 10_argus_camera_recording/main.cpp 10_argus_camera_recording-b/main.cpp
--- 10_argus_camera_recording/main.cpp	2024-11-08 13:50:23.161578827 +0000
+++ 10_argus_camera_recording-b/main.cpp	2024-11-08 14:04:35.184098081 +0000
@@ -758,12 +758,7 @@
     NvApplicationProfiler &profiler = NvApplicationProfiler::getProfilerInstance();
 
     /* Get default EGL display */
-    eglDisplay = eglGetDisplay(EGL_DEFAULT_DISPLAY);
-    if (eglDisplay == EGL_NO_DISPLAY)
-    {
-        printf("Cannot get EGL display.\n");
-        return EXIT_FAILURE;
-    }
+    eglDisplay = EGL_NO_DISPLAY;
 
     if (!ArgusSamples::execute())
         return EXIT_FAILURE;

and a simple Dockerfile:

FROM nvcr.io/nvidia/l4t-jetpack:r35.3.1

# register nvidias deb packages, create OS expected files to fake native environment, install tegra libs
RUN echo "deb https://repo.download.nvidia.com/jetson/t234 r35.3 main" >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list && \
    mkdir -p /opt/nvidia/l4t-packages/ && touch /opt/nvidia/l4t-packages/.nv-l4t-disable-boot-fw-update-in-preinstall && \
    mv /etc/ld.so.conf.d/nvidia-tegra.conf /etc/ld.so.conf.d/nvidia-tegra.conf.old && \
    apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
    nvidia-l4t-core nvidia-l4t-jetson-multimedia-api && \
    rm -rf /var/lib/apt/lists/* && apt-get clean

COPY . /app
WORKDIR /app
RUN rm samples/common/classes/NvDrmRenderer.cpp && \
    sed -i 's\-ldrm\\g' samples/Rules.mk && cd samples/10_argus_camera_recording && make

The patch is applied to sample 10, and the Dockerfile is placed in the root of jetson_multimedia_api, like this:
/some/path/jetson_multimedia_api/Dockerfile
and called with:

$ docker build -t argus-sample -f Dockerfile .
$ docker run -it --rm --runtime nvidia -v /tmp/:/tmp argus-sample samples/10_argus_camera_recording/argus_camera_recording

You should see that the sample will now sometimes crash.
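
Because the failure is intermittent, it can help to count how often it occurs over several runs. The following is a rough Python sketch, assuming the failure always prints the "eglGetDisplay failed" line quoted earlier in this thread. The function name count_egl_failures and the default of 10 runs are illustrative choices, not part of the sample. The commented-out invocation drops -it from the docker command, since no TTY is attached when the container is started from a script.

```python
import subprocess

# Marker copied from the nvbufsurface error line quoted earlier in this thread.
EGL_MARKER = "eglGetDisplay failed"


def count_egl_failures(cmd, runs=10):
    """Run cmd `runs` times and return how many runs printed EGL_MARKER."""
    failures = 0
    for _ in range(runs):
        # Merge stderr into stdout so the marker is found on either stream.
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        if EGL_MARKER in result.stdout:
            failures += 1
    return failures


# Hypothetical invocation for the argus-sample image (same docker run as in
# the post, minus -it because no TTY is attached):
# cmd = ["docker", "run", "--rm", "--runtime", "nvidia", "-v", "/tmp/:/tmp",
#        "argus-sample",
#        "samples/10_argus_camera_recording/argus_camera_recording"]
# print(count_egl_failures(cmd), "of 10 runs hit the EGL error")
```

This only counts runs that print the EGL message; a run that crashes for another reason would need a separate check on result.returncode.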

I think the problem comes from compiling inside the Docker build and installing nvmmapi there, rather than compiling with the runtime. We want our Dockerfile to produce a ready-to-run container, so that is definitely the goal here.

Do you have any clues on how to achieve this?

Hi,
Are you able to try JetPack 6.1 on a developer kit? It would be great if you could, to make sure the issue is not specific to 5.1.1.

Hi,
We don’t observe the issue on Orin NX developer kit + RPi camera v2 with Jetpack 6.1:

$ sudo docker run -it --rm --net=host --runtime nvidia -v /tmp/:/tmp/ nvcr.io/nvidia/l4t-jetpack:r36.4.0


root@tegra-ubuntu:/# cd usr/src/jetson_multimedia_api/samples/10_argus_camera_recording/
root@tegra-ubuntu:/usr/src/jetson_multimedia_api/samples/10_argus_camera_recording# make
Compiling: main.cpp
(...skip)
root@tegra-ubuntu:/usr/src/jetson_multimedia_api/samples/10_argus_camera_recording# ./argus_camera_recording
Set governor to performance before enabling profiler
PRODUCER: Creating output stream
PRODUCER: Launching consumer thread
Opening in BLOCKING MODE
NvMMLiteOpen : Block : BlockType = 4
===== NvVideo: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4
875967048
842091865
create video encoder return true
H264: extProfile = 5 Level = 50
NVMEDIA: Need to set EMC bandwidth : 126000
PRODUCER: Starting repeat capture requests.
CONSUMER: Argus::STATUS_END_OF_STREAM
CONSUMER: Got EOS, exiting...
CONSUMER: Done.
PRODUCER: Done -- exiting.
************************************
Total Profiling Time = 0 sec
************************************

We would suggest trying it on a developer kit. eglGetDisplay(EGL_DEFAULT_DISPLAY) should not fail even when there is no display.

Hi,

Even with r36.4.0 it still fails. As I said, I think this is due to compiling inside the container.

Try the following Dockerfile:

FROM nvcr.io/nvidia/l4t-jetpack:r36.4.0 AS dependencies

# register nvidias deb packages, create OS expected files to fake native environment, install tegra libs
RUN echo "deb [trusted=yes] https://repo.download.nvidia.com/jetson/t234 r36.4 main" >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list && \
    echo "deb [trusted=yes] https://repo.download.nvidia.com/jetson/common r36.4 main" >> /etc/apt/sources.list.d/nvidia-l4t-apt-source.list && \
    mkdir -p /opt/nvidia/l4t-packages/ && touch /opt/nvidia/l4t-packages/.nv-l4t-disable-boot-fw-update-in-preinstall && \
    mv /etc/ld.so.conf.d/nvidia-tegra.conf /etc/ld.so.conf.d/nvidia-tegra.conf.old && \
    apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
    nvidia-l4t-core nvidia-l4t-jetson-multimedia-api && \
    cp -r /usr/src/jetson_multimedia_api/ /app && \
    rm -rf /var/lib/apt/lists/* && apt-get clean

WORKDIR /app
RUN rm samples/common/classes/NvDrmRenderer.cpp && \
    sed -i 's\-ldrm\\g' samples/Rules.mk && cd samples/10_argus_camera_recording && make

There I only install the multimedia API and remove NvDrmRenderer so that I don’t need to link against drm.

Then, when running WITH display:

xhost +
docker run -it --rm --runtime nvidia -e DISPLAY=$DISPLAY -v /tmp/:/tmp/ argus-sample samples/10_argus_camera_recording/argus_camera_recording

This will:

  1. always give a segfault for a reason I cannot see
  2. sometimes fail to detect the EGL_DISPLAY

The segfault does not really matter to me, but the EGL_DISPLAY issue should be the same one I see in my application.

I think this comes from compiling during the build step. Do you have a setup where you compile during docker build?

Do I maybe have to change something in the runtime or the mounted host files?

Hi,
Please use the default nvcr.io/nvidia/l4t-jetpack:r36.4.0. We don’t observe the issue in the docker on developer kit.

Hi,

Another piece of the puzzle: I figured out that if I use l4t-jetpack (instead of l4t-base) as the base image and set nvidia as the default runtime, so that it is also available during build, the image that I build on the Jetson works as I had hoped. The EGL error no longer appears.

So the solution to the issue is to add the following to /etc/docker/daemon.json:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}

The "default-runtime": "nvidia" entry was missing from my configuration. With it, I can now build on the Jetson with JetPack 5.1.1.
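
A quick way to confirm the setting is to parse daemon.json and look at the "default-runtime" key. This Python sketch assumes the file lives at /etc/docker/daemon.json as stated in the post; the function names are hypothetical. Docker only picks up changes to this file after the daemon is restarted.

```python
import json


def parse_default_runtime(text):
    """Return the "default-runtime" value from daemon.json text, or None if unset."""
    return json.loads(text).get("default-runtime")


def nvidia_is_default(path="/etc/docker/daemon.json"):
    """True if the daemon.json at `path` sets nvidia as the default runtime."""
    with open(path) as fh:
        return parse_default_runtime(fh.read()) == "nvidia"


# The configuration from the post, as a string:
sample = """
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}
"""
print(parse_default_runtime(sample))  # nvidia
```

On the Jetson itself, nvidia_is_default() reads the real file. The output of docker info also includes a "Default Runtime" line that can be checked after restarting the daemon.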

That raises another question: is it possible to build this container on something other than a Jetson device? Right now the only way to build is on a Jetson, not on a build server. Is there such an option?

Hi,

Since the Docker image depends on hardware components of the Jetson platform, building it on an x86 host PC may not work. Some functions may depend on hardware components in the Jetson SoC.
