"docker: Error response from daemon: exec: "nvidia-container-runtime-hook": executable file not found in $PATH"?

I am trying to run the h2oGPT chatbot on my computer, but I am having trouble using the NVIDIA graphics card. The error output includes “Auto-detected mode as ‘legacy’”, which suggests the NVIDIA container runtime cannot talk to the graphics card. I suspect the NVIDIA drivers are not installed or configured correctly, and yet nvidia-smi works. Here is the error message:

(base) user@user-16GB-computer:~/dev/project/chatbot-rag/v2_h2ogpt/h2ogpt-docker$ sudo docker compose up
[sudo] password for user: 
Attaching to h2ogpt
Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown
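The hook's complaint is that it cannot load the driver's management library. A quick way to see what the host actually has (a diagnostic sketch; it prints a note instead of failing when something is absent):

```shell
# Is the driver's management library in the linker cache? On a healthy
# install this prints a path to libnvidia-ml.so.1.
ldconfig -p | grep libnvidia-ml || echo "libnvidia-ml.so.1 not in linker cache"

# Are the NVIDIA Container Toolkit binaries on PATH? These come from the
# nvidia-container-toolkit package, not from the driver itself.
command -v nvidia-container-cli || echo "nvidia-container-cli not found"
command -v nvidia-container-runtime-hook || echo "nvidia-container-runtime-hook not found"
```

If the first line prints nothing but the note, the driver's userspace libraries are missing from the linker cache even though the kernel side may be fine.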

It also seems that I can’t manage any NVIDIA service:

(base) user@user-16GB-computer:~/dev/project/chatbot-rag/v2_h2ogpt/h2ogpt-docker$ sudo systemctl start nvidia-container-runtime
Failed to start nvidia-container-runtime.service: Unit nvidia-container-runtime.service not found.
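That failure is actually expected: nvidia-container-runtime is not a systemd service but a runtime binary that dockerd invokes per container, so there is no unit to start. What can be checked is whether the binaries exist (a sketch):

```shell
# There is no nvidia-container-runtime.service; dockerd execs the runtime
# binary directly. Verify the binaries are installed and on PATH instead:
command -v nvidia-container-runtime || echo "nvidia-container-runtime not installed"
command -v nvidia-container-runtime-hook || echo "nvidia-container-runtime-hook not installed"
```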

But the driver itself seems to work:

(base) user@user-16GB-computer:~/dev/project/chatbot-rag/v2_h2ogpt/h2ogpt-docker$ nvidia-smi
Mon Jan 15 18:29:04 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.146.02             Driver Version: 535.146.02   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3080 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   43C    P0              N/A / 125W |      8MiB / 16384MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      2440      G   /usr/lib/xorg/Xorg                            4MiB |
+---------------------------------------------------------------------------------------+

Here is part of my docker-compose.yaml:

version: '3'

services:
  h2ogpt:
    image: gcr.io/vorvan/h2oai/h2ogpt-runtime:latest
    container_name: h2ogpt
    shm_size: '2gb'
    environment:
      - ANONYMIZED_TELEMETRY=False
      - HF_DATASETS_OFFLINE=1
      - TRANSFORMERS_OFFLINE=1
    volumes:
       - $HOME/.cache:/workspace/.cache
       - ./data/models:/workspace/models:ro
       - ./data/save:/workspace/save
       - ./data/user_path:/workspace/user_path
       - ./data/db_dir_UserData:/workspace/db_dir_UserData
       - ./data/users:/workspace/users
       - ./data/db_nonusers:/workspace/db_nonusers
       - ./data/llamacpp_path:/workspace/llamacpp_path
       - ./data/h2ogpt_auth:/workspace/h2ogpt_auth
    ports:
      - 7860:7860
    restart: always
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            count: all
            capabilities: [gpu]
    command: >
      /workspace/generate.py 
      --base_model=mistralai/Mistral-7B-Instruct-v0.2 
      --hf_embedding_model=intfloat/multilingual-e5-large  
      --load_4bit=True  
      --use_flash_attention_2=True 
      --score_model=None 
      --top_k_docs=10 
      --max_input_tokens=2048  
      --visible_h2ogpt_logo=False 
      --dark=True 
      --visible_tos_tab=True 
      --langchain_modes="['UserData', 'LLM']" 
      --langchain_mode_paths="{'UserData':'/workspace/user_path/sample_docs'}" 
      --langchain_mode_types="{'UserData':'shared'}"  
      --enable_pdf_doctr=off 
      --enable_captions=False 
      --enable_llava=False 
      --use_unstructured=False 
      --enable_doctr=False 
      --enable_transcriptions=False 
      --enable_heap_analytics=False  
      --use_auth_token=hf_XXXX
      --prompt_type=mistral
      --pre_prompt_query="Use the following pieces of information to answer; don't try to make up an answer, just say I don't know if you don't know."
      --prompt_query="Cite relevant passages from context to justify your answer."
      --use_safetensors=False --verbose=True
    networks:
      - h2ogpt-net
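As an aside, once an `nvidia` runtime is registered in /etc/docker/daemon.json, the `deploy` block above can be replaced by pinning the runtime on the service directly (a sketch; the two forms should be roughly equivalent here):

```yaml
services:
  h2ogpt:
    runtime: nvidia   # uses the "nvidia" entry from /etc/docker/daemon.json
```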

I don’t know if it is related, but right now my computer is very slow. I had read something about the GeForce driver bringing a bunch of modules running in the background that serve no purpose and slow down the machine.

My /etc/docker/daemon.json didn’t look right:

ubuntu@ubuntu-GE66-Raider-11UH:~/dev/chatbot-rag/v2_h2ogpt/h2ogpt-docker$ cat /etc/docker/daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
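The entry itself looks like the standard one. Rather than hand-editing this file, the toolkit can (re)write it; a sketch, assuming `nvidia-ctk` from the nvidia-container-toolkit package is available:

```shell
# Let the toolkit register/repair the "nvidia" runtime entry in
# /etc/docker/daemon.json, then restart dockerd so it picks up the change.
if command -v nvidia-ctk >/dev/null 2>&1; then
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker
else
    echo "nvidia-ctk not found: install nvidia-container-toolkit first"
fi
```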

I modified it and ran the command again:

ubuntu@ubuntu-GE66-Raider-11UH:~/dev/chatbot-rag/v2_h2ogpt/h2ogpt-docker$ sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown.
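The "driver not loaded" message comes from NVML and usually means the nvidia kernel module is not loaded on the host at that moment (a driver package change can leave it unloaded until the next reboot), even if nvidia-smi worked earlier. A quick host-side check (a sketch):

```shell
# If nothing is listed, the kernel driver is not loaded;
# 'sudo modprobe nvidia' or a reboot usually fixes that after a driver change.
lsmod | grep '^nvidia' || echo "nvidia kernel module not loaded"
```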

So I tried the third suggested solution, downgrading my NVIDIA driver, but then I got the missing runtime hook error:

ubuntu@ubuntu-GE66-Raider-11UH:~/dev/chatbot-rag/v2_h2ogpt/h2ogpt-docker$ sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
docker: Error response from daemon: exec: "nvidia-container-runtime-hook": executable file not found in $PATH.
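That hook is shipped by the nvidia-container-toolkit package, and a driver downgrade or purge can remove it along the way. A non-destructive check for Ubuntu, with the reinstall steps as comments (package name per NVIDIA's toolkit install docs; the apt lines assume their repository is configured):

```shell
# Is the package that provides nvidia-container-runtime-hook still installed?
dpkg -s nvidia-container-toolkit >/dev/null 2>&1 \
  && echo "nvidia-container-toolkit installed" \
  || echo "nvidia-container-toolkit missing"

# If missing, reinstall it and re-register the runtime with Docker:
#   sudo apt-get update
#   sudo apt-get install --reinstall nvidia-container-toolkit
#   sudo nvidia-ctk runtime configure --runtime=docker
#   sudo systemctl restart docker
```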
