Docker instantiation failed with error (TAO Toolkit - Yolo_v4_tiny)

Hello,
I was following the TAO Toolkit Quick Start Guide to train a yolo-v4-tiny network.
I opened the Jupyter notebook and completed the retraining procedure without problems, but when I ran the "Evaluate retrained model" section:

!tao yolo_v4_tiny evaluate -e $SPECS_DIR/yolo_v4_tiny_retrain_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_cspdarknet_tiny_epoch_$EPOCH.tlt \
                           -k $KEY

I received the following output:

2023-03-07 10:05:42,107 [INFO] root: Registry: ['nvcr.io']
2023-03-07 10:05:42,218 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
2023-03-07 10:05:42,317 [WARNING] tlt.components.docker_handler.docker_handler: Docker will run the commands as root. If you would like to retain your local host permissions, please add the "user": "UID:GID" in the DockerOptions portion of the "/home/user/.tao_mounts.json" file. You can obtain your users UID and GID by using the "id -u" and "id -g" commands on the terminal.
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver/library version mismatch: unknown")

Thank you for your help!

$ sudo apt purge 'nvidia*' 'libnvidia*'
$ sudo apt install nvidia-driver-520 nvidia-container-toolkit
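
The "driver/library version mismatch" message usually means the loaded NVIDIA kernel module and the installed user-space libraries are at different versions. If you want to confirm that before or after reinstalling, a rough check along these lines may help. This is only a sketch: the helper names `extract_version` and `compare_versions` are made up for illustration, and the library path assumes a 64-bit Debian/Ubuntu layout, so adjust it for your system.

```shell
#!/bin/sh
# Sketch: compare the loaded NVIDIA kernel module version with the
# installed user-space NVML library version.

# Pull the first dotted version number (e.g. 520.61.05) out of text on stdin.
extract_version() {
    grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' | head -n 1
}

# Report whether the two version strings are identical.
compare_versions() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "mismatch: kernel module $1, user-space library $2"
    fi
}

# The kernel module publishes its version here while it is loaded.
proc_file=/proc/driver/nvidia/version

if [ -r "$proc_file" ]; then
    kernel_ver=$(grep NVRM "$proc_file" | extract_version)
    # Assumption: the versioned NVML library lives in the Ubuntu x86_64
    # library directory as libnvidia-ml.so.<version>.
    lib_ver=$(ls /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.*.* 2>/dev/null \
        | head -n 1 | sed 's/.*\.so\.//')
    compare_versions "$kernel_ver" "${lib_ver:-unknown}"
else
    echo "NVIDIA kernel module not loaded (no $proc_file)"
fi
```

If the script reports a mismatch right after the reinstall, the old kernel module is probably still loaded, and a reboot is the usual way to load the new one.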

Refer to Problems of telemetry using detectnet_v2 using tao toolkit - #57 by usuario3602

