Problems with telemetry when using detectnet_v2 with the TAO Toolkit

Hello, I'm new to this group. I'm running detectnet_v2 from the TAO Toolkit, and in this specific part:

!tao detectnet_v2 dataset_convert \
-d $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt \
-o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval

I get the following errors. I'd appreciate it if somebody could help me. Thanks!

2023-02-26 23:58:08,504 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Tfrecords generation complete.
Telemetry data couldn’t be sent, but the command ran successfully.
[WARNING]: <urlopen error [Errno -2] Name or service not known>
Execution status: PASS
2023-02-26 18:58:09,790 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

You can ignore this kind of error.
The tfrecord files should already have been generated successfully.
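
If you want to double-check, you can list the output folder on the host. This is a minimal sketch, assuming the notebook's default $LOCAL_DATA_DIR layout (the host-side counterpart of $DATA_DOWNLOAD_DIR); adjust the path to your own mounts:

ls -rlt $LOCAL_DATA_DIR/tfrecords/kitti_trainval/
# the generated tfrecords shard files should be listed here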

Hello Morganh:

I'm sorry for the late reply. I'm trying to use LPRNet from getting-started_4.0.0, in this part of the code:

print("For multi-GPU, change --gpus based on your machine.")
!tao lprnet train --gpus=1 --gpu_index=$GPU_INDEX \
-e $SPECS_DIR/tutorial_spec.txt \
-r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
-k $KEY \
-m $USER_EXPERIMENT_DIR/pretrained_lprnet_baseline18/lprnet_vtrainable_v1.0/us_lprnet_baseline18_trainable.tlt

I get the following error:

For multi-GPU, change --gpus based on your machine.
2023-03-03 19:08:43,304 [INFO] root: Registry: ['nvcr.io']
2023-03-03 19:08:43,343 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker instantiation failed with error: 400 Client Error: Bad Request ("failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/home/guille/Documentos/getting-started_4.0.0/notebooks/tao_launcher_starter_kit/lprnet/specs" to rootfs at "/workspace/tao-experiments/lprnet/specs": mkdir /var/lib/docker/100000.100000/overlay2/410bb2836bde724ad8bc453ceb5a399cf93576dca921c6aca55d05ddc14aaf56/merged/workspace/tao-experiments/lprnet/specs: permission denied: unknown")


My configuration in /etc/docker/daemon.json:

{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "builder": {
        "gc": {
            "defaultKeepStorage": "20GB",
            "enabled": true
        }
    },
    "experimental": false,
    "features": {
        "buildkit": true
    },
    "userns-remap": "guille",
    "default-shm-size": "1G",
    "default-ulimits": {
        "memlock": { "name": "memlock", "soft": -1, "hard": -1 },
        "stack": { "name": "stack", "soft": 67108864, "hard": 67108864 }
    }
}
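
(Note: after editing /etc/docker/daemon.json, the daemon has to be restarted for the changes to take effect. A quick check that the nvidia runtime is registered, using standard Docker commands, nothing TAO-specific:)

sudo systemctl restart docker
docker info | grep -i runtime   # "nvidia" should appear among the runtimes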

And this is my configuration in .tao_mounts.json

{
    "Mounts": [
        {
            "source": "/home/guille/tlt-experiments",
            "destination": "/workspace/tao-experiments"
        },
        {
            "source": "/home/guille/Documentos/getting-started_4.0.0/notebooks/tao_launcher_starter_kit/lprnet/specs",
            "destination": "/workspace/tao-experiments/lprnet/specs"
        }
    ],
    "DockerOptions": {
        "user": "1000:1000"
    }
}
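
From the mkdir path in the error (/var/lib/docker/100000.100000/...), it looks like the "userns-remap": "guille" setting is involved, so the container runs with remapped UIDs that may not be able to traverse or write under the mounted spec directory. A hypothetical diagnostic sketch (the paths are taken from my setup above; the chmod and the remap removal are experiments, not a confirmed fix):

grep guille /etc/subuid /etc/subgid   # which subordinate UID/GID range 'guille' is remapped to
# experiment 1: make the spec directory readable/traversable for other users
chmod -R a+rX /home/guille/Documentos/getting-started_4.0.0/notebooks/tao_launcher_starter_kit/lprnet/specs
# experiment 2: remove "userns-remap" from /etc/docker/daemon.json, then restart the daemon
sudo systemctl restart docker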

Please, what am I doing wrong?

Best regards

This is information about my drivers:

guille@usuario-pc:~/Documentos/getting-started_4.0.0$ nvidia-smi
Fri Mar 3 20:44:03 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...   On  | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P8     8W /  60W |     10MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1037      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1941      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

guille@usuario-pc:~/Documentos/getting-started_4.0.0$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Jan__6_16:45:21_PST_2023
Cuda compilation tools, release 12.0, V12.0.140
Build cuda_12.0.r12.0/compiler.32267302_0

guille@usuario-pc:~/Documentos/getting-started_4.0.0$ nvidia-container-cli --load-kmods info
NVRM version: 525.85.12
CUDA version: 12.0

Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce RTX 3050 Laptop GPU
Brand: GeForce
GPU UUID: GPU-e912adaf-661f-f728-3bf1-2e4e87428764
Bus Location: 00000000:01:00.0
Architecture: 8.6

Please try to use the 520 driver.

sudo apt purge nvidia-driver-525
sudo apt autoremove
sudo apt autoclean

sudo apt install nvidia-driver-520
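
A reboot is usually needed after switching drivers. To confirm what actually got installed afterwards (standard apt/dpkg tooling, nothing TAO-specific):

sudo reboot
# after the reboot:
nvidia-smi
dpkg -l | grep nvidia-driver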

Hello Morganh:

I used the four commands that you advised me, but the problem continues.

For multi-GPU, change --gpus based on your machine.
2023-03-04 15:27:51,273 [INFO] root: Registry: ['nvcr.io']
2023-03-04 15:27:51,315 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker instantiation failed with error: 400 Client Error: Bad Request ("failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/home/guille/Documentos/getting-started_4.0.0/notebooks/tao_launcher_starter_kit/lprnet/specs" to rootfs at "/workspace/tao-experiments/lprnet/specs": mkdir /var/lib/docker/100000.100000/overlay2/b1bcdc2fdced117cee79af3ee6e0f6b48c92080a12fd527675a567d4cb57c1ba/merged/workspace/tao-experiments/lprnet/specs: permission denied: unknown")

guille@usuario-pc:~/Documentos/getting-started_4.0.0$ nvidia-smi
Sat Mar 4 15:36:32 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...   On  | 00000000:01:00.0 Off |                  N/A |
| N/A   45C    P0    N/A /  60W |     10MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1080      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1929      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

guille@usuario-pc:~/Documentos/getting-started_4.0.0$ nvidia-container-cli --load-kmods info
NVRM version: 525.85.12
CUDA version: 12.0

Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce RTX 3050 Laptop GPU
Brand: GeForce
GPU UUID: GPU-e912adaf-661f-f728-3bf1-2e4e87428764
Bus Location: 00000000:01:00.0
Architecture: 8.6

This was the previous configuration in my .bashrc:

# NVIDIA CUDA TOOLKIT

export PATH=/usr/local/cuda-12.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Now it is commented out:

# NVIDIA CUDA TOOLKIT

#export PATH=/usr/local/cuda-12.0/bin${PATH:+:${PATH}}
#export LD_LIBRARY_PATH=/usr/local/cuda-12.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

I'm not sure which configuration to choose for .bashrc.
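
One option I'm considering (just a sketch, not an official TAO setup: my /usr/local listing below shows a cuda symlink managed by update-alternatives, but no cuda-12.0 directory) is to point at the symlink instead of hard-coding a version:

# NVIDIA CUDA TOOLKIT
export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}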

guille@usuario-pc:/usr/local$ ls -la
total 60
drwxr-xr-x 15 root root 4096 mar 3 21:56 .
drwxr-xr-x 14 root root 4096 ago 31 2022 ..
drwxr-xr-x 2 guille guille 4096 mar 3 21:56 bin
lrwxrwxrwx 1 root root 22 feb 25 22:50 cuda -> /etc/alternatives/cuda
lrwxrwxrwx 1 root root 25 feb 25 23:02 cuda-11 -> /etc/alternatives/cuda-11
drwxr-xr-x 5 root root 4096 feb 25 23:00 cuda-11.1
drwxr-xr-x 3 root root 4096 feb 26 01:16 cuda-11.7
drwxr-xr-x 4 root root 4096 feb 25 23:00 cuda-11.8
drwxr-xr-x 3 guille guille 4096 may 14 2019 doc
drwxr-xr-x 2 root root 4096 ago 31 2022 etc
drwxr-xr-x 2 root root 4096 ago 31 2022 games
drwxr-xr-x 2 root root 4096 ago 31 2022 include
drwxr-xr-x 3 root root 4096 ago 31 2022 lib
drwxr-xr-x 4 guille guille 4096 may 14 2019 man
drwxr-xr-x 2 root root 4096 ago 31 2022 sbin
drwxr-xr-x 13 guille guille 4096 mar 1 09:52 share
drwxr-xr-x 2 root root 4096 ago 31 2022 src

Best regards,

Guillermo

The Docker configuration is under /home/guille and not under /root:

guille@usuario-pc:~$ ls -la
total 212
drwxr-xr-x 35 guille guille 4096 mar 4 15:11 .
drwxr-xr-x 3 root root 4096 feb 25 22:16 ..
drwxrwxr-x 28 guille guille 4096 feb 25 23:54 anaconda3
drwxr-xr-x 3 guille guille 4096 feb 25 23:33 .anydesk
-rw------- 1 guille guille 31499 mar 4 15:13 .bash_history
-rw-r--r-- 1 guille guille 220 feb 25 22:16 .bash_logout
-rw-rw-r-- 1 guille docker 41 feb 26 17:51 .bash_profile
-rw-r--r-- 1 guille guille 4615 mar 4 15:11 .bashrc
drwxr-xr-x 23 guille guille 4096 feb 28 11:50 .cache
drwxrwxr-x 2 guille guille 4096 feb 25 23:54 .conda
-rw-rw-r-- 1 guille guille 26 feb 25 23:55 .condarc
drwx------ 20 guille guille 4096 feb 27 02:26 .config
drwx------ 3 guille guille 4096 feb 26 13:49 .dbus
drwxr-xr-x 3 guille guille 4096 mar 4 15:31 Descargas
drwxrwx--- 3 guille guille 4096 mar 3 20:18 .docker
drwxr-xr-x 5 guille guille 4096 feb 26 18:51 Documentos
drwxrwxr-x 3 guille guille 4096 feb 25 23:05 .eclipse
drwxr-xr-x 2 guille guille 4096 feb 25 22:28 Escritorio
drwx------ 3 guille guille 4096 mar 4 15:16 .gnupg
drwxr-xr-x 2 guille guille 4096 mar 4 15:31 Imágenes
drwxr-xr-x 5 guille guille 4096 feb 26 01:30 .ipython
drwxrwxr-x 2 guille guille 4096 feb 27 00:31 .jupyter
drwxrwxr-x 2 guille guille 4096 feb 27 02:01 .keras
drwxr-xr-x 3 guille guille 4096 feb 25 22:28 .local
drwx------ 4 guille guille 4096 feb 25 22:51 .mozilla
drwxr-xr-x 2 guille guille 4096 feb 25 22:28 Música
drwx------ 2 guille docker 4096 feb 26 17:53 .ngc
drwxr-xr-x 22 guille guille 4096 feb 15 11:01 ngc-cli
drwxrwxr-x 3 guille guille 4096 feb 25 23:05 .nsightsystems
drwx------ 3 guille guille 4096 feb 25 23:43 .nv
-rw-rw-r-- 1 guille guille 477 feb 25 23:06 .nvidia-settings-rc
drwxrwxr-x 2 guille docker 4096 feb 27 01:14 .pip
drwx------ 3 guille guille 4096 feb 25 23:24 .pki
drwxr-xr-x 2 guille guille 4096 feb 25 22:28 Plantillas
-rw-r--r-- 1 guille guille 807 feb 25 22:16 .profile
drwxr-xr-x 2 guille guille 4096 feb 25 22:28 Público
drwx------ 3 guille guille 4096 feb 25 23:18 snap
drwx------ 2 guille guille 4096 feb 25 23:06 .ssh
-rw-r--r-- 1 guille guille 0 feb 25 22:38 .sudo_as_admin_successful
-rw-rw-r-- 1 guille guille 422 mar 4 15:22 .tao_mounts.json
drwxrwxr-x 5 guille guille 4096 mar 4 15:23 tlt-experiments
drwxrwxrwx 7 guille guille 4096 feb 27 01:01 tlt-experiments-Colab
drwxrwxr-x 5 guille guille 4096 feb 26 22:36 tlt-experiments-cv3
drwxr-xr-x 3 guille guille 4096 feb 25 23:33 Vídeos
-rw-rw-r-- 1 guille guille 204 mar 4 15:23 .wget-hsts

And this is the content of /root, for comparison:

root@usuario-pc:~# ls -la
total 52
drwx------ 10 root root 4096 feb 26 12:35 .
drwxr-xr-x 21 root root 4096 feb 27 01:13 ..
drwxr-xr-x 3 root root 4096 feb 26 12:50 .anydesk
-rw------- 1 root root 453 mar 3 17:55 .bash_history
-rw-r--r-- 1 root root 3106 dic 5 2019 .bashrc
drwx------ 6 root root 4096 mar 3 20:24 .cache
drwx------ 7 root root 4096 feb 25 23:22 .config
drwx------ 3 root root 4096 feb 25 22:52 .dbus
drwx------ 2 root root 4096 feb 25 22:38 .gnupg
drwx------ 3 root root 4096 feb 25 22:52 .local
-rw-r--r-- 1 root root 161 dic 5 2019 .profile
drwx------ 3 root root 4096 feb 25 22:28 snap
drwx------ 3 root root 4096 feb 26 12:49 .synaptic

guille@usuario-pc:~$ sudo apt install nvidia-driver-520
[sudo] password for guille:
Reading package lists... Done
Building dependency tree
Reading state information... Done
nvidia-driver-520 is already the newest version (525.85.05-0ubuntu0.20.04.1).
0 upgraded, 0 newly installed, 0 to remove and 8 not upgraded.
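
So, according to that output, the nvidia-driver-520 package in the Ubuntu repository actually ships driver 525.85.05. This can be confirmed with standard apt tooling (a quick sketch, nothing assumed beyond stock Ubuntu commands):

apt-cache policy nvidia-driver-520
ubuntu-drivers devices   # lists the driver packages Ubuntu recommends for this GPU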

Can you share the latest result of $nvidia-smi ?

Hello,

I used the command that you advised me, sudo apt install nvidia-driver-520
but the result of nvidia-smi was the same:

guille@usuario-pc:~/Documentos/getting-started_4.0.0$ nvidia-smi
Fri Mar 3 20:44:03 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.12    Driver Version: 525.85.12    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...   On  | 00000000:01:00.0 Off |                  N/A |
| N/A   49C    P8     8W /  60W |     10MiB /  4096MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1037      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      1941      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+

For this reason, I downloaded the file nvidia-graphics-drivers-520_520.56.06.orig-amd64.tar.gz from this page: 520.56.06-0ubuntu0.20.04.1 : nvidia-graphics-drivers-520 package : Ubuntu.
When I restarted my laptop, I had a problem with the screen, so I ended up reinstalling the whole operating system.


After I reinstalled Ubuntu 20.04, I used your command sudo apt install nvidia-driver-520:

guille@usuario-pc:~$ sudo apt install nvidia-driver-520
Reading package lists... Done
Building dependency tree
Reading state information... Done
nvidia-driver-520 is already the newest version (525.85.05-0ubuntu0.20.04.1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Now, when I run nvidia-smi, there is no answer:

guille@usuario-pc:~$ nvidia-smi
No devices were found
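
To narrow this down, these are standard checks that can be run on the host (a hypothetical diagnostic sketch, none of the commands are TAO-specific):

lsmod | grep nvidia            # is the kernel module loaded at all?
dmesg | grep -i nvrm           # any driver/GPU initialization errors?
dpkg -l | grep nvidia-driver   # which driver packages are actually installed?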

But when I look in Software & Updates, this is the result:

guille@usuario-pc:/etc/docker$ ls -la
total 20
drwxr-xr-x 2 root root 4096 mar 5 21:13 .
drwxr-xr-x 133 root root 12288 mar 5 20:33 ..
-rw-r--r-- 1 root root 408 mar 5 21:13 daemon.json
guille@usuario-pc:/etc/docker$ cat daemon.json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    },
    "builder": {
        "gc": {
            "defaultKeepStorage": "20GB",
            "enabled": true
        }
    },
    "experimental": false,
    "features": {
        "buildkit": true
    },
    "userns-remap": "guille",
    "default-shm-size": "1G",
    "default-ulimits": {
        "memlock": { "name": "memlock", "soft": -1, "hard": -1 },
        "stack": { "name": "stack", "soft": 67108864, "hard": 67108864 }
    }
}

Using this part of the code:

print("For multi-GPU, change --gpus based on your machine.")
!tao lprnet train --gpus=1 --gpu_index=$GPU_INDEX \
-e $SPECS_DIR/tutorial_spec.txt \
-r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
-k $KEY \
-m $USER_EXPERIMENT_DIR/pretrained_lprnet_baseline18/lprnet_vtrainable_v1.0/us_lprnet_baseline18_trainable.tlt

Now, the problem is the following:

For multi-GPU, change --gpus based on your machine.
2023-03-05 21:19:58,871 [INFO] root: Registry: ['nvcr.io']
2023-03-05 21:19:58,918 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: detection error: nvml error: not found: unknown")
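
Since this failure comes from nvidia-container-cli (nvml error: not found) rather than from TAO itself, a useful isolation test is whether any GPU container can see the device at all (a minimal sketch; any CUDA base image will do, the tag below is just an example):

sudo docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu20.04 nvidia-smi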

In this case, I have not yet installed the following:

-- 1 --
Install TensorRT 8.4.1.5

sudo apt-get install libnvinfer8=8.4.1-1+cuda11.6 libnvinfer-plugin8=8.4.1-1+cuda11.6 libnvparsers8=8.4.1-1+cuda11.6 \
libnvonnxparsers8=8.4.1-1+cuda11.6 libnvinfer-bin=8.4.1-1+cuda11.6 libnvinfer-dev=8.4.1-1+cuda11.6 \
libnvinfer-plugin-dev=8.4.1-1+cuda11.6 libnvparsers-dev=8.4.1-1+cuda11.6 libnvonnxparsers-dev=8.4.1-1+cuda11.6 \
libnvinfer-samples=8.4.1-1+cuda11.6 libcudnn8=8.4.1.50-1+cuda11.6 libcudnn8-dev=8.4.1.50-1+cuda11.6 \
python3-libnvinfer=8.4.1-1+cuda11.6 python3-libnvinfer-dev=8.4.1-1+cuda11.6

-- 2 --
sudo apt install \
libssl1.1 \
libgstreamer1.0-0 \
gstreamer1.0-tools \
gstreamer1.0-plugins-good \
gstreamer1.0-plugins-bad \
gstreamer1.0-plugins-ugly \
gstreamer1.0-libav \
libgstreamer-plugins-base1.0-dev \
libgstrtspserver-1.0-0 \
libjansson4 \
libyaml-cpp-dev \
gcc \
make \
git \
python3

-- 3 --
Install the DeepStream SDK

https://developer.nvidia.com/deepstream-6.1_6.1.1-1_amd64.deb
sudo apt-get install ./deepstream-6.1_6.1.1-1_amd64.deb
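
If the DeepStream .deb installs cleanly, it can be sanity-checked with the version query that ships with it (a minimal sketch, assuming the default installation puts deepstream-app on the PATH):

deepstream-app --version-all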