OpenGL cannot see the two GPUs on pegasus docker

Hello, we are developing OpenGL applications inside Docker on a Pegasus machine using offscreen rendering. So far, however, we can only access the Mesa OpenGL software device; the two GPUs are not accessible. I tried a small test program, which works fine on the host but not in the Docker container.
Here is the output from the host:

Detected 4 devices
Device 0 description: EGL_NV_device_cuda EGL_EXT_device_drm
Device 1 description: EGL_NV_device_cuda EGL_EXT_device_drm
Device 2 description: EGL_EXT_device_drm
Device 3 description: EGL_MESA_device_software

Here is the output inside the docker:

Detected 2 devices
Device 0 description: EGL_EXT_device_drm
Device 1 description: EGL_MESA_device_software

The simple test program:

#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <iostream>

  static const EGLint configAttribs[] = {
          EGL_SURFACE_TYPE, EGL_PBUFFER_BIT,
          EGL_BLUE_SIZE, 8,
          EGL_GREEN_SIZE, 8,
          EGL_RED_SIZE, 8,
          EGL_DEPTH_SIZE, 8,
          EGL_RENDERABLE_TYPE, EGL_OPENGL_BIT,
          EGL_NONE
  };

  static const int pbufferWidth = 9;
  static const int pbufferHeight = 9;

  static const EGLint pbufferAttribs[] = {
        EGL_WIDTH, pbufferWidth,
        EGL_HEIGHT, pbufferHeight,
        EGL_NONE,
  };

int main(int argc, char *argv[])
{
  // 1. Initialize EGL

    static const int MAX_DEVICES = 20;
    EGLDeviceEXT eglDevs[MAX_DEVICES];
    EGLint numDevices;

    // Extension entry points have to be resolved at run time.
    PFNEGLQUERYDEVICESEXTPROC eglQueryDevicesEXT =
        (PFNEGLQUERYDEVICESEXTPROC)eglGetProcAddress("eglQueryDevicesEXT");
    PFNEGLQUERYDEVICESTRINGEXTPROC eglQueryDeviceStringEXT =
        (PFNEGLQUERYDEVICESTRINGEXTPROC)eglGetProcAddress("eglQueryDeviceStringEXT");
    if (!eglQueryDevicesEXT || !eglQueryDeviceStringEXT) {
        std::cerr << "EGL device enumeration extensions are not available" << std::endl;
        return 1;
    }

    if (!eglQueryDevicesEXT(MAX_DEVICES, eglDevs, &numDevices)) {
        std::cerr << "eglQueryDevicesEXT failed" << std::endl;
        return 1;
    }

    std::cout << "Detected " << numDevices << " devices" << std::endl;
    for (int i = 0; i < numDevices; i++) {
        const char *attr = eglQueryDeviceStringEXT(eglDevs[i], EGL_EXTENSIONS);
        // Streaming a null char* is undefined behaviour, so guard against it.
        std::cout << "Device " << i << " description: " << (attr ? attr : "(none)") << std::endl;
    }
  EGLDisplay eglDpy = eglGetDisplay(EGL_DEFAULT_DISPLAY);

  EGLint major=0, minor=0;

  eglInitialize(eglDpy, &major, &minor);
  std::cout<<"version="<<major<<"."<<minor<<std::endl;
  // 2. Select an appropriate configuration
  EGLint numConfigs;
  EGLConfig eglCfg;

  eglChooseConfig(eglDpy, configAttribs, &eglCfg, 1, &numConfigs);

  // 3. Create a surface
  EGLSurface eglSurf = eglCreatePbufferSurface(eglDpy, eglCfg,
                                               pbufferAttribs);

  // 4. Bind the API
  eglBindAPI(EGL_OPENGL_API);

  // 5. Create a context and make it current
  EGLContext eglCtx = eglCreateContext(eglDpy, eglCfg, EGL_NO_CONTEXT,
                                       NULL);
  std::cout<<"eglCtx="<<eglCtx<<std::endl;
  eglMakeCurrent(eglDpy, eglSurf, eglSurf, eglCtx);

  // from now on use your OpenGL context

  // 6. Terminate EGL when finished
  eglTerminate(eglDpy);
  return 0;
}
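
Note that the test program enumerates the devices but then opens EGL_DEFAULT_DISPLAY, so the device list and the display actually used are not tied together. A minimal sketch of how a device could be picked from the extension strings printed above follows. The helper names hasExtension and pickDevice are hypothetical and not part of EGL, and the assumption that EGL_NV_device_cuda marks the NVIDIA GPUs is taken only from the host output in this post.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns true if the space-separated extension list contains `name`
// as a whole token (a plain substring search could match a longer name).
bool hasExtension(const std::string &extensions, const std::string &name) {
    std::istringstream tokens(extensions);
    std::string token;
    while (tokens >> token) {
        if (token == name) return true;
    }
    return false;
}

// Returns the index of the first device whose extension string contains
// `wanted`, or -1 if no device advertises it.
// Assumption: EGL_NV_device_cuda identifies an NVIDIA GPU device.
int pickDevice(const std::vector<std::string> &deviceExtensions,
               const std::string &wanted = "EGL_NV_device_cuda") {
    for (std::size_t i = 0; i < deviceExtensions.size(); ++i) {
        if (hasExtension(deviceExtensions[i], wanted)) return static_cast<int>(i);
    }
    return -1;
}
```

With the strings returned by eglQueryDeviceStringEXT, the resulting index could be used to open that device explicitly through eglGetPlatformDisplayEXT(EGL_PLATFORM_DEVICE_EXT, eglDevs[index], nullptr) from the EGL_EXT_platform_device extension, instead of eglGetDisplay(EGL_DEFAULT_DISPLAY). With the device list seen inside the docker, the helper finds no match, which is consistent with only the software device being usable there.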

I did a lot of research on exposing all the GPU devices to the container when starting it, and tried nvidia/cudagl and similar images, but had no success. Please kindly give me some ideas on this. Thanks

@VickNV Can you take a look at my issue? Thanks

Hi, @shangping.guo
I moved this topic to DRIVE AGX General - NVIDIA Developer Forums. Please create DRIVE AGX topics there afterward. Thanks.

Are you asking about GPU support of target docker?

@VickNV Thank you for the reply. Yes, we want to use both CUDA and OpenGL hardware acceleration, but it seems that inside the docker I cannot use hardware OpenGL (only the software implementation is usable).

A target container with GPU support isn’t supported yet. You need to wait for a future DRIVE Orin release.

@VickNV Thank you for the information. I just want to make sure the target container is the same thing I was referring to. Do you mean that OpenGL is not supported in the Pegasus docker yet? I have seen a few posts about this, and also the NVIDIA container nvidia/cudagl; do you mean they are still not available on Pegasus?

Thanks for your double-checking.

As mentioned in https://developer.nvidia.com/blog/nvidia-drive-os-5-2-6-linux-sdk-now-available/, DRIVE OS 5.2.6 has Docker containers for beta testing only, and they don’t include any target container yet.

@VickNV I am not sure if I have access to the nvidia drive docker container. This is what I got from ngc:

@VickNV I am still not quite clear on what the target container means. Do you mean a container running on our target Pegasus?
I also have a question: if OpenGL works on the host, it should also work in the container, provided the driver and libraries are installed in the container. Am I missing something here? Thanks

Please refer to Drive AGX Docker Container not available and see if you have accepted the invitation.

Containers running on the target system aren’t supported yet.

Are you working on containers on host or target? Please check below for the details of the containers on host.

@VickNV Maybe I did not express myself clearly, and I apologize for that. I am working on the target system using a container (directly on the Pegasus system, with a container we built ourselves). So my question is: why is it not supported? We have been using it for a long time already. By "host" I mean the Pegasus system itself, as opposed to the docker container; I am not talking about host vs. target in the sdkmanager sense. I hope this clarifies the question.

Thanks for the clarification. The target container with GPU support is a planned feature starting from a DRIVE OS 6 release for DRIVE Orin.

Did you mean it used to work on your side and this is a regression in some release?

@VickNV No, this is our first attempt to use OpenGL in the Pegasus docker container (I am just not sure I understand your statement that the target container is not supported). OpenGL works outside the container (on the target Pegasus machine).
What we are doing is very similar to this: EGL Eye: OpenGL Visualization without an X Server | NVIDIA Developer Blog
My question is why OpenGL works outside the container but not inside it.

@VickNV I guess the NVIDIA driver files are missing.
On the host (outside the container):

lrwxrwxrwx 1 root root      14 May 10  2019 libGL.so -> libGL.so.1.0.0
lrwxrwxrwx 1 root root      14 May 10  2019 libGL.so.1 -> libGL.so.1.0.0
-rw-r--r-- 1 root root  972968 May 10  2019 libGL.so.1.0.0
lrwxrwxrwx 1 root root      21 May 10  2019 libGLESv1_CM.so -> libGLESv1_CM.so.1.0.0
lrwxrwxrwx 1 root root      21 May 10  2019 libGLESv1_CM.so.1 -> libGLESv1_CM.so.1.0.0
-rw-r--r-- 1 root root  141472 May 10  2019 libGLESv1_CM.so.1.0.0
lrwxrwxrwx 1 root root      18 May 10  2019 libGLESv2.so -> libGLESv2.so.2.0.0
lrwxrwxrwx 1 root root      18 May 10  2019 libGLESv2.so.2 -> libGLESv2.so.2.0.0
-rw-r--r-- 1 root root  153760 May 10  2019 libGLESv2.so.2.0.0
-rwxr-xr-x 1 root root  112968 Sep  2  2020 libGLESv2_nvidia.so.2*
lrwxrwxrwx 1 root root      19 Feb 22  2019 libGLEWmx.so.1.13 -> libGLEWmx.so.1.13.0
-rw-r--r-- 1 root root  505624 Feb 22  2019 libGLEWmx.so.1.13.0
-rw-r--r-- 1 root root  850714 May 21  2016 libGLU.a
lrwxrwxrwx 1 root root      15 May 21  2016 libGLU.so -> libGLU.so.1.3.1
lrwxrwxrwx 1 root root      15 May 21  2016 libGLU.so.1 -> libGLU.so.1.3.1
-rw-r--r-- 1 root root  400040 May 21  2016 libGLU.so.1.3.1
lrwxrwxrwx 1 root root      15 May 10  2019 libGLX.so -> libGLX.so.0.0.0
lrwxrwxrwx 1 root root      15 May 10  2019 libGLX.so.0 -> libGLX.so.0.0.0
-rw-r--r-- 1 root root   63880 May 10  2019 libGLX.so.0.0.0
lrwxrwxrwx 1 root root      16 Jun 12  2020 libGLX_indirect.so.0 -> libGLX_mesa.so.0
lrwxrwxrwx 1 root root      20 Jun 12  2020 libGLX_mesa.so.0 -> libGLX_mesa.so.0.0.0
-rw-r--r-- 1 root root  430704 Jun 12  2020 libGLX_mesa.so.0.0.0
-rwxr-xr-x 1 root root 1059488 Sep  2  2020 libGLX_nvidia.so.0*
lrwxrwxrwx 1 root root      22 May 10  2019 libGLdispatch.so -> libGLdispatch.so.0.0.0
lrwxrwxrwx 1 root root      22 May 10  2019 libGLdispatch.so.0 -> libGLdispatch.so.0.0.0
-rw-r--r-- 1 root root 1030368 May 10  2019 libGLdispatch.so.0.0.0

Inside the container:

root@pegasus1b:/ota/pkg_data2/sguo# ll /usr/lib/aarch64-linux-gnu/libGL*
lrwxrwxrwx 1 root root      22 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLdispatch.so -> libGLdispatch.so.0.0.0
lrwxrwxrwx 1 root root      22 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLdispatch.so.0 -> libGLdispatch.so.0.0.0
-rw-r--r-- 1 root root 1030368 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLdispatch.so.0.0.0
lrwxrwxrwx 1 root root      21 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLESv1_CM.so -> libGLESv1_CM.so.1.0.0
lrwxrwxrwx 1 root root      21 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLESv1_CM.so.1 -> libGLESv1_CM.so.1.0.0
-rw-r--r-- 1 root root  141472 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLESv1_CM.so.1.0.0
lrwxrwxrwx 1 root root      18 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLESv2.so -> libGLESv2.so.2.0.0
lrwxrwxrwx 1 root root      18 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLESv2.so.2 -> libGLESv2.so.2.0.0
-rw-r--r-- 1 root root  153760 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLESv2.so.2.0.0
lrwxrwxrwx 1 root root      14 May 10  2019 /usr/lib/aarch64-linux-gnu/libGL.so -> libGL.so.1.0.0
lrwxrwxrwx 1 root root      14 May 10  2019 /usr/lib/aarch64-linux-gnu/libGL.so.1 -> libGL.so.1.0.0
-rw-r--r-- 1 root root  972968 May 10  2019 /usr/lib/aarch64-linux-gnu/libGL.so.1.0.0
-rw-r--r-- 1 root root  850714 May 21  2016 /usr/lib/aarch64-linux-gnu/libGLU.a
lrwxrwxrwx 1 root root      15 May 21  2016 /usr/lib/aarch64-linux-gnu/libGLU.so -> libGLU.so.1.3.1
lrwxrwxrwx 1 root root      15 May 21  2016 /usr/lib/aarch64-linux-gnu/libGLU.so.1 -> libGLU.so.1.3.1
-rw-r--r-- 1 root root  400040 May 21  2016 /usr/lib/aarch64-linux-gnu/libGLU.so.1.3.1
lrwxrwxrwx 1 root root      16 Jun 12  2020 /usr/lib/aarch64-linux-gnu/libGLX_indirect.so.0 -> libGLX_mesa.so.0
lrwxrwxrwx 1 root root      20 Jun 12  2020 /usr/lib/aarch64-linux-gnu/libGLX_mesa.so.0 -> libGLX_mesa.so.0.0.0
-rw-r--r-- 1 root root  430704 Jun 12  2020 /usr/lib/aarch64-linux-gnu/libGLX_mesa.so.0.0.0
lrwxrwxrwx 1 root root      15 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLX.so -> libGLX.so.0.0.0
lrwxrwxrwx 1 root root      15 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLX.so.0 -> libGLX.so.0.0.0
-rw-r--r-- 1 root root   63880 May 10  2019 /usr/lib/aarch64-linux-gnu/libGLX.so.0.0.0

These 4 library files exist on the host but are not present in the docker:

nvidia@pegasus1b:/usr/lib/aarch64-linux-gnu$ ll *nvidia*
-rwxr-xr-x 1 root root 1111976 Sep  2  2020 libEGL_nvidia.so.0*
-rwxr-xr-x 1 root root  112968 Sep  2  2020 libGLESv2_nvidia.so.2*
-rwxr-xr-x 1 root root 1059488 Sep  2  2020 libGLX_nvidia.so.0*
lrwxrwxrwx 1 root root      28 Sep 20 06:24 libnvidia-container.so.1 -> libnvidia-container.so.1.5.1*
-rwxr-xr-x 1 root root  150408 Sep 20 06:24 libnvidia-container.so.1.5.1*

Are these files related to the issue? Thanks
Update: I copied these NVIDIA files into the container, but the issue remains the same. I guess the program is not calling NVIDIA’s libraries.

Where is the container image from? released by nvidia?

@VickNV No, we built it from scratch based on the DRIVE OS 5.1.6 Pegasus Ubuntu system.

As I told you previously, it’s a feature of a DRIVE Orin release.
I’m not sure how much I can help you here. May I know how important this is to you?

@VickNV Thank you for your help. I just want to make sure you understand my question clearly. If it is not supported yet, that is fine; I just want to be sure the conclusion is solid. Sorry for the trouble.
It is important because we are evaluating the NVIDIA Pegasus as our autonomous driving platform.