Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc)
GeForce RTX 3090
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
Detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
tao info --verbose
Configuration of the TAO Toolkit Instance
dockers:
    nvidia/tao/tao-toolkit:
        4.0.0-tf2.9.1:
            docker_registry: nvcr.io
            tasks:
                1. classification_tf2
                2. efficientdet_tf2
        4.0.0-tf1.15.5:
            docker_registry: nvcr.io
            tasks:
                1. augment
                2. bpnet
                3. classification_tf1
                4. detectnet_v2
                5. dssd
                6. emotionnet
                7. efficientdet_tf1
                8. faster_rcnn
                9. fpenet
                10. gazenet
                11. gesturenet
                12. heartratenet
                13. lprnet
                14. mask_rcnn
                15. multitask_classification
                16. retinanet
                17. ssd
                18. unet
                19. yolo_v3
                20. yolo_v4
                21. yolo_v4_tiny
                22. converter
        4.0.1-tf1.15.5:
            docker_registry: nvcr.io
            tasks:
                1. mask_rcnn
                2. unet
        4.0.0-pyt:
            docker_registry: nvcr.io
            tasks:
                1. action_recognition
                2. deformable_detr
                3. segformer
                4. re_identification
                5. pointpillars
                6. pose_classification
                7. n_gram
                8. speech_to_text
                9. speech_to_text_citrinet
                10. speech_to_text_conformer
                11. spectro_gen
                12. vocoder
                13. text_classification
                14. question_answering
                15. token_classification
                16. intent_slot_classification
                17. punctuation_and_capitalization
format_version: 2.0
toolkit_version: 4.0.1
published_date: 03/06/2023
I am trying to run the detectnet_v2/detectnet_v2.ipynb Jupyter notebook. I get the following error when running this command:
!tao detectnet_v2 train -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
-r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
-k $KEY \
-n resnet18_detector \
--gpus $NUM_GPUS
docker.errors.DockerException: Error while fetching server API version: request() got an unexpected keyword argument 'chunked'
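A `chunked` keyword error of this kind has been reported when an older `docker` Python SDK is combined with a newer `requests`/`urllib3`. Whether that applies to this setup is an assumption, not something confirmed here. A minimal sketch for collecting the relevant versions, assuming `python3` points at the same environment the `tao` launcher is installed in (`report_versions` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Print the installed versions of the Python packages that appear in the
# "unexpected keyword argument 'chunked'" traceback.
# Assumption: python3 is the interpreter the tao launcher was installed with.
report_versions() {
    local pkg version
    for pkg in docker requests urllib3; do
        # `pip show` prints a "Version: x.y.z" line for installed packages.
        version=$(python3 -m pip show "$pkg" 2>/dev/null | awk '/^Version:/ {print $2}')
        echo "$pkg: ${version:-not installed}"
    done
}

report_versions
```

Posting these three lines alongside the traceback makes it easier for others to tell whether the package combination is the problem.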
Running this command:
docker run -it --rm --gpus all \
-v /var/run/docker.sock:/var/run/docker.sock \
nvcr.io/nvidia/tao/tao-toolkit:4.0.0-tf1.15.5
gives:
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
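The second error names `libnvidia-ml.so.1`, a library that ships with the NVIDIA driver. A minimal sketch for checking whether the host can see it, assuming a Linux host with `ldconfig` available (`check_nvml` is a hypothetical helper name, and a "not found" result only means the linker cache does not list the library):

```shell
#!/usr/bin/env bash
# Report whether the host's dynamic linker cache lists libnvidia-ml.so.1,
# the library nvidia-container-cli failed to load.
check_nvml() {
    local ldconfig_bin
    # ldconfig often lives in /sbin, which may not be on a normal user's PATH.
    ldconfig_bin=$(command -v ldconfig || echo /sbin/ldconfig)
    if "$ldconfig_bin" -p 2>/dev/null | grep -q 'libnvidia-ml\.so\.1'; then
        echo "libnvidia-ml.so.1: found"
    else
        echo "libnvidia-ml.so.1: not found"
    fi
}

check_nvml

# nvidia-smi loads the same library, so it is a useful second check.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader \
        || echo "nvidia-smi: present but failed to run"
else
    echo "nvidia-smi: not on PATH"
fi
```

If the library is missing or `nvidia-smi` fails on the host itself, the problem is likely in the host driver installation and not in the TAO container.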
I have Docker up and running and have logged in with `docker login nvcr.io`.
I am not sure what these errors mean, and the TAO Toolkit instructions are not clear.
Every solution I have found says to add `-v /var/run/docker.sock:/var/run/docker.sock`.
Reference: [Run TLT inside docker - #8 by Morganh](https://forums.developer.nvidia.com/t/run-tlt-inside-docker/181992/8)
Where exactly should this go? It appears nowhere in the Jupyter notebook or in the documentation.
Is there any way to avoid Docker and run everything locally?
Can someone please explain how to actually use Docker with TAO? What steps need to be completed before running the Jupyter notebooks? I followed all the steps in the Getting Started section of the documentation, and it still does not work.