Please provide the following information when requesting support.
• Hardware: RTX 4090 (laptop version)
• Network Type: unet
• TLT Version: tlt: command not found
I followed this guide to install all of the TAO requirements, then downloaded TAO 5 and installed its requirements.
I selected unet as a first test. In the unet notebook, every cell runs well until the train cell:
!tao model unet train --gpus $NUM_GPUS \
--gpu_index $GPU_INDEX \
-e $SPECS_DIR/unet_train_resnet_unet_isbi.txt \
-r $USER_EXPERIMENT_DIR/isbi_experiment_unpruned \
-m $USER_EXPERIMENT_DIR/pretrained_resnet18/pretrained_semantic_segmentation_vresnet18/resnet_18.hdf5 \
-n model_isbi
This results in the following error:
2023-09-11 21:05:23,052 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-09-11 21:05:23,102 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 361: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2023-09-11 21:05:23,124 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 267:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/david/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-09-11 21:05:23,125 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("could not select device driver "" with capabilities: [[gpu]]")
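For reference, the warning in the log (which is separate from the final error) asks for a "user":"UID:GID" entry in the DockerOptions portion of ~/.tao_mounts.json. A minimal sketch of adding it, assuming the file is a JSON object with a top-level "DockerOptions" key as the warning describes; `with_docker_user` and `update_mounts_file` are hypothetical helper names, not part of the TAO launcher:

```python
import json
import os
from pathlib import Path


def with_docker_user(config: dict, uid: int, gid: int) -> dict:
    """Return a copy of a .tao_mounts.json-style dict with DockerOptions.user set."""
    updated = dict(config)
    options = dict(updated.get("DockerOptions", {}))
    options["user"] = f"{uid}:{gid}"  # same values that `id -u` and `id -g` print
    updated["DockerOptions"] = options
    return updated


def update_mounts_file(path: Path = Path.home() / ".tao_mounts.json") -> None:
    """Rewrite the mounts file in place using the current user's UID and GID."""
    # Assumption: a missing file is treated as an empty configuration.
    config = json.loads(path.read_text()) if path.exists() else {}
    config = with_docker_user(config, os.getuid(), os.getgid())
    path.write_text(json.dumps(config, indent=4) + "\n")
```

Calling `update_mounts_file()` once would only address the root-user warning; it is not expected to change the "could not select device driver" error.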
I installed Docker following the instructions here, and sudo docker run hello-world runs well.
The output of nvidia-smi is:
nvidia-smi
Mon Sep 11 22:26:02 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.86.05              Driver Version: 535.86.05    CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090 ...    Off | 00000000:01:00.0  On |                  N/A |
| N/A   43C    P8               5W / 115W |     62MiB / 16376MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1846      G   /usr/lib/xorg/Xorg                           55MiB |
+---------------------------------------------------------------------------------------+
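Note that the hello-world test does not request a GPU, so it does not exercise the step that fails in the TAO log. A minimal sketch of a check that does request one, assuming Docker's `--gpus` flag and the public `ubuntu` image (both choices are assumptions for illustration, and `gpu_check_command` and `run_gpu_check` are hypothetical helper names):

```python
import subprocess


def gpu_check_command(image: str = "ubuntu", use_sudo: bool = True) -> list:
    """Build a docker command that asks for all GPUs and runs nvidia-smi inside."""
    cmd = ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]
    return ["sudo"] + cmd if use_sudo else cmd


def run_gpu_check() -> bool:
    """Run the check and report whether the container could see the GPU."""
    result = subprocess.run(gpu_check_command(), capture_output=True, text=True)
    # On failure Docker prints its error on stderr; show whichever stream has content.
    print(result.stdout or result.stderr)
    return result.returncode == 0
```

If `run_gpu_check()` prints the same "could not select device driver" message, the failure can be reproduced outside TAO, which would point at the Docker GPU setup on the host rather than at the notebook.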
Thanks for the help
DBG