I'm attempting to run "tao detectnet_v2 --help" on an Xavier NX and get the following output:
(launcher) nvidia@ubuntu:~/my_apps/testing_tensorRT__files_from_AWS_example_notebook$ tao detectnet_v2 --help
2022-09-07 14:49:48,190 [INFO] root: Registry: ['nvcr.io']
2022-09-07 14:49:48,470 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3
Docker instantiation failed with error: 500 Server Error: Internal Server Error ("failed to create shim: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'csv'
invoking the NVIDIA Container Runtime Hook directly (e.g. specifying the docker --gpus flag) is not supported. Please use the NVIDIA Container Runtime instead.: unknown")
I have run "sudo apt-get install -y nvidia-docker2" and "sudo apt-get install nvidia-container-runtime"; both packages are installed.
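From what I can tell, the error means the NVIDIA hook is being invoked directly (presumably because the launcher passes --gpus to docker). My understanding from NVIDIA's container documentation is that on Jetson the nvidia runtime should be registered in /etc/docker/daemon.json and ideally set as the default runtime, though I haven't confirmed this is the cause here. A sketch of the check and the documented configuration:

$ docker info | grep -i runtime    # "nvidia" should be listed; ideally "Default Runtime: nvidia"
$ cat /etc/docker/daemon.json      # documented layout for registering the nvidia runtime
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
$ sudo systemctl restart docker    # needed for daemon.json changes to take effect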
Running "tao -h" gives the following output:
(launcher) nvidia@ubuntu:~/my_apps/testing_tensorRT__files_from_AWS_example_notebook$ tao -h
usage: tao [-h]
           {list,stop,info,action_recognition,augment,bpnet,classification,converter,detectnet_v2,dssd,efficientdet,emotionnet,faster_rcnn,fpenet,gazenet,gesturenet,heartratenet,intent_slot_classification,lprnet,mask_rcnn,multitask_classification,n_gram,pointpillars,pose_classification,punctuation_and_capitalization,question_answering,retinanet,spectro_gen,speech_to_text,speech_to_text_citrinet,speech_to_text_conformer,ssd,text_classification,token_classification,unet,vocoder,yolo_v3,yolo_v4,yolo_v4_tiny}
           ...

Launcher for TAO Toolkit.

optional arguments:
  -h, --help            show this help message and exit

tasks:
  {list,stop,info,action_recognition,augment,bpnet,classification,converter,detectnet_v2,dssd,efficientdet,emotionnet,faster_rcnn,fpenet,gazenet,gesturenet,heartratenet,intent_slot_classification,lprnet,mask_rcnn,multitask_classification,n_gram,pointpillars,pose_classification,punctuation_and_capitalization,question_answering,retinanet,spectro_gen,speech_to_text,speech_to_text_citrinet,speech_to_text_conformer,ssd,text_classification,token_classification,unet,vocoder,yolo_v3,yolo_v4,yolo_v4_tiny}
I have defined a ~/.tao_mounts.json file as follows:
{
    "Mounts": [
        {
            "source": "/home/nvidia/my_apps/testing_tensorRT__files_from_AWS_example_notebook",
            "destination": "/workspace/tlt-experiments"
        }
    ],
    "Envs": [
        {
            "variable": "CUDA_DEVICE_ORDER",
            "value": "PCI_BUS_ID"
        }
    ],
    "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
        },
        "user": "1000:1000",
        "ports": {
            "8888": 8888
        }
    }
}
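As a sanity check that the launcher isn't failing on this file, it can be validated with Python's built-in JSON parser:

$ python3 -m json.tool ~/.tao_mounts.json

It parses cleanly here, so I don't believe a syntax error in the mounts file is the cause.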
The Docker images on the Xavier are as follows:
REPOSITORY                                         TAG                              IMAGE ID       CREATED         SIZE
nvcr.io/nvidia/l4t-tensorflow                      r35.1.0-tf1.15-py3__my_updates   b16634b62b0a   6 days ago      13GB
nvcr.io/nvidia/l4t-tensorrt                        r8.4.1.5-devel                   9d233de1abe7   12 days ago     10.2GB
nvcr.io/nvidia/l4t-cuda                            11.4.14-runtime                  17b2eaaef496   6 weeks ago     2.48GB
nvcr.io/nvidia/tao/tao-toolkit-tf                  v3.22.05-tf1.15.5-py3            b85103564252   3 months ago    11.7GB
nvcr.io/nvidia/tao/tao-toolkit-tf                  v3.22.05-tf1.15.4-py3            ca92a571a959   3 months ago    16.1GB
nvcr.io/nvidia/deepstream-l4t                      6.1-samples                      6fc8884e47d9   4 months ago    6.07GB
nvcr.io/nvidia/deepstream-l4t                      6.1-base                         0f92b3eb66ba   4 months ago    5.4GB
nvcr.io/nvidia/dli/dli-nano-deepstream             v2.0.0-DS6.0.1                   eb0e1e157f1d   5 months ago    2.22GB
nvcr.io/nvidia/tao/tao-cv-inference-pipeline-l4t   r32.5.0-v0.3-ga-client           bac152d44466   12 months ago   877MB
My final goal is to convert the combination of .etlt and .bin files from an NVIDIA TAO Toolkit example Jupyter notebook into a TensorRT engine file on the Xavier. The command for that purpose ("tao converter resnet18_detector.etlt -k $KEY…") throws the same error as "tao detectnet_v2" above, which is what led me to try the simpler detectnet_v2 command to isolate the problem.
Any ideas on how to get the tao command to run its associated Docker containers using the NVIDIA Container Runtime?
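For completeness, one isolation step I'm considering is bypassing the launcher and starting the container directly with the NVIDIA runtime (untested sketch; that detectnet_v2 is on the container's PATH is my assumption, based on how the launcher invokes tasks):

$ docker run -it --rm --runtime nvidia \
    -v /home/nvidia/my_apps/testing_tensorRT__files_from_AWS_example_notebook:/workspace/tlt-experiments \
    nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.4-py3 \
    detectnet_v2 --help

If that works, the problem is presumably in how the launcher invokes Docker (the --gpus flag) rather than in the runtime installation itself.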