Nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory

I am setting up Isaac Sim on an EC2 server with 4 A10G Tensor Core GPUs, following these instructions, and I am getting the following error:

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown

Attached details from nvidia-smi
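
As far as I understand, nvidia-container-cli runs on the host, so the library it fails to load would be the host driver's copy of libnvidia-ml.so.1. Here is a minimal sketch for checking whether the host's linker cache knows about it. It assumes a Linux host with `ldconfig` available; the helper name `has_nvml` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text on stdin (expected to be
# `ldconfig -p` output) mentions libnvidia-ml.so.1.
has_nvml() {
    grep -q 'libnvidia-ml\.so\.1'
}

# Query the dynamic linker cache on the host, not inside a container.
if command -v ldconfig >/dev/null 2>&1 && ldconfig -p 2>/dev/null | has_nvml; then
    echo "libnvidia-ml.so.1 is listed in the linker cache"
else
    echo "libnvidia-ml.so.1 not found; the NVIDIA driver libraries may be missing or not visible on the host"
fi
```

If the library is not listed even though nvidia-smi works, the way Docker or the driver was installed is one place to look, though that is a guess rather than a confirmed cause.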

Hi @asriaws,

I’m currently facing the same error and it seems that I need to run my container as superuser so it can access GPUs.

I’m still trying to find a solution so I won’t have to create such a privileged (and possibly insecure) environment. I’ll post my findings here as an answer if they turn out to be useful.

In the meantime, you said you followed NVIDIA Omniverse’s instructions, so at the second step of Container Setup (Install Docker) you should have executed the post-install steps for Docker. These create a group named docker, which grants root-level privileges, add your user to it, and activate the change (according to Docker’s documentation).
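
For reference, a small sketch of how to check whether those post-install steps took effect. The helper name `in_docker_group` is made up for illustration, and the script only prints the commands from Docker's post-install documentation instead of running them.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the space-separated group list in $1
# contains a group named exactly "docker".
in_docker_group() {
    case " $1 " in
        *" docker "*) return 0 ;;
        *) return 1 ;;
    esac
}

# `id -nG` prints the names of all groups the current user belongs to.
if in_docker_group "$(id -nG)"; then
    echo "Current user is already in the docker group."
else
    # Printed for the reader to run manually; nothing is changed here.
    echo "Not in the docker group. Docker's post-install steps are:"
    echo "  sudo groupadd docker"
    echo "  sudo usermod -aG docker \$USER"
    echo "  newgrp docker"
fi
```

Note that group membership is only picked up by new login sessions, which is why the last step (or logging out and back in) is needed.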

So your issue might come from elsewhere, but that might be a good starting point to look at!

I installed Docker Desktop 4.24.06 on Windows 10 19042.928. The graphics card driver is 537.42, and the following error is reported. How should I solve it?

The image has been installed without any problems
docker run --gpus all

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as ‘legacy’
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

Hi, we do not support the Isaac Sim container in Docker on Windows. We currently only support our container on Linux. Please try the latest instructions here.

You could try building a custom container that does not need a privileged environment. Take a look at our Dockerfiles.

Hello, I am facing the same problem. I have the drivers and the NVIDIA Container Toolkit installed and configured correctly, and the same goes for Docker. Does anyone know what it could be?
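
One way to double-check that setup is to confirm that each tool is actually on the PATH of the user who runs Docker. This is only a sketch: the helper name `check_tool` is made up, and finding a binary does not prove it is configured correctly.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the command named in $1 is on the PATH.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Tools involved in the failing hook, checked on the host.
for tool in nvidia-smi nvidia-container-cli docker; do
    if check_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# Possible follow-ups once all three are found:
#   nvidia-smi                  # driver and GPUs visible on the host
#   nvidia-container-cli info   # same library load, outside Docker
#   docker info                 # lists the configured runtimes
```

If `nvidia-container-cli info` fails on the host with the same libnvidia-ml.so.1 message, the problem is probably outside the container image.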