vkCreateInstance error (-9)

Subject: vkCreateInstance error (-9) when Installing NVIDIA ACE Animation Graph Microservice Docker Container

Hello NVIDIA Community,

I’m facing an issue while trying to install the NVIDIA ACE Animation Graph Microservice Docker container. I’m following the official NVIDIA documentation:

Animation Pipeline Local Docker Containers

When I execute the following command:

docker run -it --rm --gpus all --network=host --name anim-graph-ms -v /default-avatar-scene_v1.0.0:/home/ace/asset nvcr.io/eevaigoeixww/animation/ia-animation-graph-microservice:1.0.1

I receive the following output:

+ ldconfig -p
+ grep libGLX_nvidia.so.0
        libGLX_nvidia.so.0 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
+ [[ -v NOTFOUND ]]
+ export VK_ICD_FILENAMES=/tmp/nvidia_icd.json
+ VK_ICD_FILENAMES=/tmp/nvidia_icd.json
+ export LD_LIBRARY_PATH=:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/opt/nvidia/omniverse/kit-sdk-launcher/plugins/carb_gfx
+ LD_LIBRARY_PATH=:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/opt/nvidia/omniverse/kit-sdk-launcher/plugins/carb_gfx
+ /opt/nvidia/omniverse/vkapiversion/bin/vkapiversion /tmp/nvidia_icd.json
Writing disposable ICD file (/tmp/tmp_icd_HF6iIQ.json)...
vkCreateInstance error (-9)

Despite ensuring that my NVIDIA drivers and NVIDIA Container Toolkit are correctly configured, the issue persists. I’ve attempted this installation on:

  • WSL Ubuntu 22.04
  • WSL Ubuntu 24.04
  • Dedicated Server with Ubuntu

In all cases, I encounter the same vkCreateInstance error (-9).

Here is the output of nvidia-smi on my system:

root@DECODED:~# nvidia-smi
Wed Oct  2 19:57:17 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.183.01             Driver Version: 561.09       CUDA Version: 12.6     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4070        On  | 00000000:2B:00.0  On |                  N/A |
| 58%   36C    P0              35W / 200W |   3695MiB / 12282MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+

System Details:

  • GPU: NVIDIA GeForce RTX 4070
  • Driver Version: 561.09
  • CUDA Version: 12.6
  • Operating System: Ubuntu 22.04 and 24.04 (both on WSL and dedicated server)
  • Docker Version: Latest
  • NVIDIA Container Toolkit: Installed and configured

What I’ve Tried So Far:

  • Verified that the NVIDIA drivers are up to date.
  • Ensured the NVIDIA Container Toolkit is properly installed and configured.
  • Used the --gpus all flag with Docker to grant GPU access to the container.
  • Tested on both WSL and a dedicated Ubuntu server to rule out environment-specific issues.

Issue Description:

The error seems to occur during the Vulkan instance creation inside the container:

vkCreateInstance error (-9)

This suggests there might be an issue with Vulkan setup or GPU access within the container.
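
For reference, the numeric code in the log can be translated to its Vulkan name. The sketch below is a small lookup for a few VkResult values from the Vulkan specification; the function name vk_result_name is only illustrative and is not part of any NVIDIA tool.

```shell
# Map a VkResult code (as printed by vkapiversion) to its Vulkan name.
# Only a handful of values from the VkResult enum are listed here.
vk_result_name() {
    case "$1" in
        0)  echo "VK_SUCCESS" ;;
        -1) echo "VK_ERROR_OUT_OF_HOST_MEMORY" ;;
        -2) echo "VK_ERROR_OUT_OF_DEVICE_MEMORY" ;;
        -3) echo "VK_ERROR_INITIALIZATION_FAILED" ;;
        -6) echo "VK_ERROR_LAYER_NOT_PRESENT" ;;
        -7) echo "VK_ERROR_EXTENSION_NOT_PRESENT" ;;
        -9) echo "VK_ERROR_INCOMPATIBLE_DRIVER" ;;
        *)  echo "UNKNOWN_VK_RESULT($1)" ;;
    esac
}
```

Calling vk_result_name -9 prints VK_ERROR_INCOMPATIBLE_DRIVER, which the specification describes as the requested Vulkan version not being supported by the driver, or the driver being otherwise incompatible.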

Request for Assistance:

Has anyone experienced a similar issue, or can anyone provide guidance on how to resolve this error? I’m wondering whether there’s a compatibility issue with the RTX 4070 or a misconfiguration that I’m overlooking.

Any insights or suggestions would be greatly appreciated!

Thank you in advance for your help!

Best regards,

DECODED

I am having the same issue. Any help?

Ubuntu: 22.04
Driver Version: 535.183.06
CUDA Version: 12.2
vulkan-tools installed

More info:

sudo docker run -it --rm --gpus all --network=host --name anim-graph-ms -v /home/eus/default-avatar-scene_v1.0.0:/home/ace/asset nvcr.io/eevaigoeixww/animation/ia-animation-graph-microservice:1.0.1

+ ldconfig -p
+ grep libGLX_nvidia.so.0
        libGLX_nvidia.so.0 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
        libGLX_nvidia.so.0 (libc6) => /usr/lib/i386-linux-gnu/libGLX_nvidia.so.0
+ [[ -v NOTFOUND ]]
+ export VK_ICD_FILENAMES=/tmp/nvidia_icd.json
+ VK_ICD_FILENAMES=/tmp/nvidia_icd.json
+ export LD_LIBRARY_PATH=:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/opt/nvidia/omniverse/kit-sdk-launcher/plugins/carb_gfx
+ LD_LIBRARY_PATH=:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/opt/nvidia/omniverse/kit-sdk-launcher/plugins/carb_gfx
+ /opt/nvidia/omniverse/vkapiversion/bin/vkapiversion /tmp/nvidia_icd.json
Writing disposable ICD file (/tmp/tmp_icd_2cb3WU.json)...
vkCreateInstance error (-9)

I have a different issue. As shown below, the NVIDIA Container Toolkit is installed, but the container fails with a different GPU-related error:

stefan@tul-pryanik:~$ sudo dpkg -l | grep nvidia-container-toolkit
ii  nvidia-container-toolkit               1.16.1-1                                amd64        NVIDIA Container toolkit
ii  nvidia-container-toolkit-base          1.16.1-1                                amd64        NVIDIA Container Toolkit Base
stefan@tul-pryanik:~$ docker run -it --rm --runtime nvidia --network=host --name anim-graph-ms -v $(pwd)/default-avatar-scene_v1.0.0:/home/ace/asset nvcr.io/eevaigoeixww/animation/ia-animation-graph-microservice:1.0.1
+ ldconfig -p
+ grep libGLX_nvidia.so.0
+ NOTFOUND=1
+ [[ -v NOTFOUND ]]
+ cat

Fatal Error: Can't find libGLX_nvidia.so.0...

Ensure running with NVIDIA runtime. (--gpus all) or (--runtime nvidia)

+ exit 1
stefan@tul-pryanik:~$ 
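
The trace above shows the entrypoint piping ldconfig -p into grep for libGLX_nvidia.so.0. A minimal sketch of the same check, written so it can be run on the host as well, might look like this; the function names has_nvidia_glx and report_glx are hypothetical and are not taken from the container image.

```shell
# Succeeds if the given `ldconfig -p` output lists the NVIDIA GLX library,
# mirroring the `ldconfig -p | grep libGLX_nvidia.so.0` step in the trace.
has_nvidia_glx() {
    printf '%s\n' "$1" | grep -q 'libGLX_nvidia\.so\.0'
}

# Print a one-line verdict for the given `ldconfig -p` output.
report_glx() {
    if has_nvidia_glx "$1"; then
        echo "libGLX_nvidia.so.0 found"
    else
        echo "libGLX_nvidia.so.0 missing"
    fi
}
```

Running report_glx "$(ldconfig -p)" on the host shows whether the driver's GLX library is installed there at all, which seems a reasonable first thing to rule out before looking at the container runtime settings.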

Hi,

Is there any resolution for this? I am using an AWS g5.12xlarge machine running Ubuntu 22.04 and am stuck on the same error. I’ve also set up a GNOME desktop to get a display, but that doesn’t seem to solve the issue.

Here is what I get. Can anyone help?

+ ldconfig -p
+ grep libGLX_nvidia.so.0
	libGLX_nvidia.so.0 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libGLX_nvidia.so.0
+ [[ -v NOTFOUND ]]
+ export VK_ICD_FILENAMES=/tmp/nvidia_icd.json
+ VK_ICD_FILENAMES=/tmp/nvidia_icd.json
+ export LD_LIBRARY_PATH=:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/opt/nvidia/omniverse/kit-sdk-launcher/plugins/carb_gfx
+ LD_LIBRARY_PATH=:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/opt/nvidia/omniverse/kit-sdk-launcher/plugins/carb_gfx
+ /opt/nvidia/omniverse/vkapiversion/bin/vkapiversion /tmp/nvidia_icd.json
Writing disposable ICD file (/tmp/tmp_icd_r31iaR.json)...
vkCreateInstance error (-9)

I’m not sure ACE will work with an RTX 4070, because it requires NVAIE (NVIDIA AI Enterprise), which is only supported on servers with certain cards.

First, find your driver version:
nvidia-smi --query-gpu=driver_version --format=csv,noheader --id=0
or
cat /proc/driver/nvidia/version
Then run
vulkaninfo --summary
and note the Vulkan Instance Version (1.3.xxx).

The /tmp/nvidia_icd.json file that the container writes should declare the same version:

{
    "file_format_version" : "1.0.0",
    "ICD": {
        "library_path": "libGLX_nvidia.so.0",
        "api_version" : "1.3.xxx"
    }
}
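
As a rough sketch of that comparison, the helpers below pull api_version out of an ICD manifest and compare it with the version noted from vulkaninfo. The function names icd_api_version and check_icd_version are hypothetical, and the sed pattern assumes the key and value sit on one line, as in the example above.

```shell
# Print the "api_version" value from a Vulkan ICD manifest (JSON file).
# sed is used instead of jq so no extra packages are needed.
icd_api_version() {
    sed -n 's/.*"api_version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Compare the manifest's api_version ($1 = file) with an expected
# version string ($2), e.g. the one reported by `vulkaninfo --summary`.
check_icd_version() {
    found=$(icd_api_version "$1")
    if [ "$found" = "$2" ]; then
        echo "match: $found"
    else
        echo "mismatch: ICD has '$found', expected '$2'"
        return 1
    fi
}
```

For example, check_icd_version /tmp/nvidia_icd.json 1.3.xxx, with 1.3.xxx replaced by the instance version from vulkaninfo, prints either a match or a mismatch line and returns a non-zero status on mismatch.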

Check whether vulkan-utils is installed and, if it is, uninstall it:
sudo apt list --installed | grep vulkan-utils

Reboot to make sure nvidia_uvm is not in use and stop any containers that may use it.

Check whether libnvidia-gl is installed and, if so, which version:
sudo apt list --installed | grep libnvidia-gl

If it is not installed, install the package that matches your driver version, choosing the server variant if one is available:

sudo apt install libnvidia-gl-550-server
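
To pick the matching package name, something like the following could be used. It assumes the package follows the libnvidia-gl-<branch>[-server] pattern seen in the command above, where <branch> is the major part of the driver version; the function names are hypothetical, and it is worth confirming that the package exists (for example with apt-cache policy) before installing.

```shell
# Print the driver branch (major version) from a full version like 535.183.06.
driver_branch() {
    echo "${1%%.*}"
}

# Build a libnvidia-gl package name for a driver version ($1).
# Pass "server" as the second argument to get the -server variant.
libnvidia_gl_package() {
    branch=$(driver_branch "$1")
    if [ "${2:-}" = "server" ]; then
        echo "libnvidia-gl-${branch}-server"
    else
        echo "libnvidia-gl-${branch}"
    fi
}
```

With the driver version reported earlier in this thread, libnvidia_gl_package 535.183.06 server prints libnvidia-gl-535-server.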

I still don’t have it working; however, this is a step in the right direction.

I reinstalled WSL Ubuntu, installed the drivers using the CUDA Toolkit, and got a new error.

decoded@DESKTOP-R5O4QTL:~$ docker run -it --rm --gpus all --network=host --name anim-graph-ms -v $(pwd)/default-avatar-scene_v1.0.0:/home/ace/asset nvcr.io/nvidia/ace/ia-animation-graph-microservice:1.0.2
+ ldconfig -p
+ grep libGLX_nvidia.so.0
+ NOTFOUND=1
+ [[ -v NOTFOUND ]]
+ cat

Fatal Error: Can't find libGLX_nvidia.so.0...

Ensure running with NVIDIA runtime. (--gpus all) or (--runtime nvidia)

+ exit 1
decoded@DESKTOP-R5O4QTL:~$ nvidia-smi
Mon Jan 20 07:45:43 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.77.01              Driver Version: 566.36         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070        On  |   00000000:01:00.0  On |                  N/A |
| 50%   44C    P2             38W /  216W |    1732MiB /  12282MiB |      4%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A         1      C   /python3.12                                 N/A      |
|    0   N/A  N/A        27      G   /Xwayland                                   N/A      |
|    0   N/A  N/A        35      G   /Xwayland                                   N/A      |
+-----------------------------------------------------------------------------------------+
decoded@DESKTOP-R5O4QTL:~$