nvidia-container-cli: device error: 1: unknown device

Hi

I'm trying to run the startDocker.sh file that I downloaded from https://ngc.nvidia.com/catalog/resources/nvidia:med:getting_started.

Below is the script from the .sh file.
#!/bin/bash

DOCKER_IMAGE=nvcr.io/nvidia/clara-train-sdk:v3.0

DOCKER_Run_Name=claradevday

jnotebookPort=$1
GPU_IDs=$2
AIAA_PORT=$3
#################################### check if parameters are empty
if [[ -z $jnotebookPort ]]; then
    jnotebookPort=8890
fi
if [[ -z $GPU_IDs ]]; then #if no gpu is passed
    GPU_IDs='0,1,2,3'
fi
if [[ -z $AIAA_PORT ]]; then
    AIAA_PORT=5000
fi
#################################### check if name is used then exit
docker ps -a|grep ${DOCKER_Run_Name}
dockerNameExist=$?
if ((${dockerNameExist}==0)) ;then
    echo --- dockerName ${DOCKER_Run_Name} already exist
    echo ----------- attaching into the docker
    docker exec -it ${DOCKER_Run_Name} /bin/bash
    exit
fi

echo -----------------------------------
echo starting docker for ${DOCKER_IMAGE} using GPUS ${GPU_IDs}
echo -----------------------------------

extraFlag="-it "
cmd2run="/bin/bash"

extraFlag=${extraFlag}" -p "${jnotebookPort}":8888 -p "${AIAA_PORT}":80"
echo starting please run "./installDashBoardInDocker.sh" to install the lab extensions then start the jupeter lab
echo once completed use web browser with token given yourip:${jnotebookPort} to access it

docker run --rm ${extraFlag} \
    --name=${DOCKER_Run_Name} \
    -e NVIDIA_VISIBLE_DEVICES=${GPU_IDs} \
    -v ${PWD}/…/:/claraDevDay/ \
    -w /claraDevDay/scripts \
    --runtime=nvidia \
    --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    ${DOCKER_IMAGE} \
    ${cmd2run}

I am encountering the error below:

ml@ml-GL503VM:~/Documents/claraDevDay/scripts$ ./startDocker.sh


-----------------------------------
starting docker for nvcr.io/nvidia/clara-train-sdk:v3.0 using GPUS 0,1,2,3
-----------------------------------


starting please run ./installDashBoardInDocker.sh to install the lab extensions then start the jupeter lab

once completed use web browser with token given yourip:8890 to access it

docker: Error response from daemon: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"process_linux.go:432: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: nvidia-container-cli: device error: 1: unknown device\\\\n\\\"\"": unknown.

As I am a novice at this, can you guide me on how to troubleshoot this issue?

Thanks.


Hi
Thanks for your interest in Clara Train SDK.

The default parameters for startDocker.sh are port 8890 and GPU_IDs='0,1,2,3'. It could be that you only have 1 GPU, so device IDs 1 to 3 do not exist on your machine. In that case, change the default to GPU_IDs='0' or run
./startDocker.sh 8890 '0'
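
If you would rather not hardcode the list, here is a minimal sketch of how the default could be derived from what the driver reports. It assumes nvidia-smi is on your PATH; `join_gpu_ids` and `detect_gpu_ids` are hypothetical helper names that are not part of the original startDocker.sh.

```shell
# Join GPU indices (one per line on stdin) into a comma-separated list.
join_gpu_ids() {
    paste -sd, -
}

# List the GPU indices the driver reports as a single comma-separated string.
# Assumes nvidia-smi is installed and on PATH.
detect_gpu_ids() {
    nvidia-smi --query-gpu=index --format=csv,noheader | join_gpu_ids
}

# Possible use inside startDocker.sh (an illustration, not the shipped script):
#   if [[ -z $GPU_IDs ]]; then GPU_IDs=$(detect_gpu_ids); fi
```

With this, the script would pass only the GPUs that actually exist to NVIDIA_VISIBLE_DEVICES when no second argument is given.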

Hi,

I have the same issue. Have you solved the problem?

Kindly refer to @aharouni's post #2 in this thread. One way to check the GPU setup on your system is to run nvidia-smi. For a single-GPU system you may try ./startDocker.sh 8890 '0'; for a 2-GPU system you may try something like ./startDocker.sh 8890 '0,1', and so on. Hope this helps.
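
As a rough sketch, a check like the one below could be run before `docker run` so that a bad GPU list fails with a readable message instead of the OCI runtime error. `check_gpu_ids` is a hypothetical helper, not part of the Clara scripts, and it only handles numeric indices (NVIDIA_VISIBLE_DEVICES also accepts values such as `all` or GPU UUIDs, which this sketch does not cover).

```shell
# Return 0 if every ID in the comma-separated list $1 also appears in the
# comma-separated list $2; otherwise report the first missing ID and return 1.
check_gpu_ids() {
    local requested=$1 available=$2 id
    local IFS=','                       # split the requested list on commas
    for id in $requested; do
        case ",${available}," in
            *",${id},"*) ;;             # this ID exists, keep checking
            *)
                echo "GPU ${id} not found (available: ${available})" >&2
                return 1
                ;;
        esac
    done
}

# Example, assuming a machine where nvidia-smi reports only GPU 0:
#   check_gpu_ids '0,1,2,3' '0' || echo "try: ./startDocker.sh 8890 '0'"
```

The second argument would come from nvidia-smi, for example the output of `nvidia-smi --query-gpu=index --format=csv,noheader` joined with commas.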

Thanks. I was able to run it with ./startDocker.sh 8890 '0'