Error training detectNet v2 with TAO

Hello,
I am trying to train DetectNet_v2 with your Jupyter notebook. I get the same error with both my custom data and the data I downloaded from the links inside the notebook.
I ran docker login nvcr.io and logged in successfully, but then I ran the command

jupyter notebook --ip 0.0.0.0 --allow-root --port 8888

without running docker pull or docker run first.

The error is printed below.

• Hardware: GeForce RTX 2080 Ti
• Network Type: Detectnet_v2
• TLT Version:
format_version: 2.0
toolkit_version: 4.0.1
published_date: 03/06/2023
4.0.0-tf1.15.5: docker_registry: nvcr.io

• Training spec file
nanovel_detectnet_v2_train_resnet18_kitti.txt (3.7 KB)

• How to reproduce the issue ?

!tao detectnet_v2 train -e $SPECS_DIR/nanovel_detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS

This is the error I got:

tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: the provided PTX was compiled with an unsupported toolchain.
Telemetry data couldn't be sent, but the command ran successfully.
[WARNING]: <urlopen error [Errno -2] Name or service not known>
File "<frozen iva.detectnet_v2.training.utilities>", line 143, in get_singular_monitored_session
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py", line 1104, in __init__

Thanks for your help

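Please update the NVIDIA driver. The "provided PTX was compiled with an unsupported toolchain" message usually means the installed driver is older than the CUDA version the code in the container was compiled with.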

For example, uninstall the current driver:
sudo apt purge nvidia-driver-510
sudo apt autoremove
sudo apt autoclean

Then install the new driver:
sudo apt install nvidia-driver-520
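
Before and after the update, it may help to confirm which driver is actually loaded. The following is only a minimal sketch: the helper names driver_major and driver_too_old are hypothetical, and the default minimum of 520 is an assumption taken from the example above, not an official TAO requirement. It relies on nvidia-smi being on the PATH.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the installed NVIDIA driver version.
# The minimum major version (520) is an assumption based on the example above.

# Print the major version of a driver string such as "510.47.03".
driver_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed (exit status 0) if driver version $1 is older than major version $2.
driver_too_old() {
    [ "$(driver_major "$1")" -lt "$2" ]
}

MIN_MAJOR="${MIN_MAJOR:-520}"

if command -v nvidia-smi >/dev/null 2>&1; then
    # One line is printed per GPU; the first one is enough here.
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)"
    case "$(driver_major "$installed")" in
        ''|*[!0-9]*)
            echo "Could not read a driver version from nvidia-smi."
            ;;
        *)
            if driver_too_old "$installed" "$MIN_MAJOR"; then
                echo "Driver $installed is older than $MIN_MAJOR; consider updating."
            else
                echo "Driver $installed meets the assumed minimum of $MIN_MAJOR."
            fi
            ;;
    esac
else
    echo "nvidia-smi not found; the NVIDIA driver may not be installed."
fi
```

Set MIN_MAJOR in the environment to check against a different version, for example MIN_MAJOR=525 bash check_driver.sh (the script name is only an example).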

OK, thanks, I'll try. I'll just point out that YOLOv4 training worked for me with the same driver and the same method.

Hi daphna,
Do you still need support here? Or shall we close this topic?
Thank you.

Hi @yingliu, you can close the topic
Thanks!
