Problem installing TLT

I want to use TLT, but I hit a problem when installing the TLT launcher.

(launcher) (base) vlab@vlab-C180300750:~$ tlt detectnet_v2 --help
2021-03-08 15:43:36,966 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the ~/.tlt_mounts.json file. You can obtain your
user's UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Traceback (most recent call last):
  File "/usr/local/bin/detectnet_v2", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/entrypoint/detectnet_v2.py", line 12, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/entrypoint/entrypoint.py", line 227, in launch_job
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/entrypoint/entrypoint.py", line 47, in get_modules
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/export.py", line 8, in <module>
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/export/exporter.py", line 12, in <module>
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/keras_exporter.py", line 22, in <module>
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/export/_tensorrt.py", line 27, in <module>
  File "/usr/local/lib/python3.6/dist-packages/pycuda/autoinit.py", line 9, in <module>
    context = make_default_context()
  File "/usr/local/lib/python3.6/dist-packages/pycuda/tools.py", line 204, in make_default_context
    "on any of the %d detected devices" % ndevices)
RuntimeError: make_default_context() wasn't able to create a context on any of the 1 detected devices
2021-03-08 15:43:43,809 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

My environment:

tlt_mounts.json (screenshot attached)
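For reference, the ~/.tlt_mounts.json layout the launcher warning refers to looks roughly like this. The mount paths below are hypothetical placeholders, and the "user" entry under DockerOptions is the one the warning suggests adding so the container runs with the host UID:GID instead of root:

```shell
# Sketch of ~/.tlt_mounts.json with the "user" option from the warning.
# The source/destination paths are placeholders; adjust to your setup.
cat > ~/.tlt_mounts.json <<EOF
{
    "Mounts": [
        {
            "source": "/home/vlab/tlt-experiments",
            "destination": "/workspace/tlt-experiments"
        }
    ],
    "DockerOptions": {
        "user": "$(id -u):$(id -g)"
    }
}
EOF
```

The `$(id -u):$(id -g)` expansion bakes your current UID and GID into the file at write time.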

tlt --help works well


Docker works fine.

Can you run tlt info?
Also, did you follow the TLT Launcher — Transfer Learning Toolkit 3.0 documentation?

Please also refer to tlt-export error - #3 by 010akv

Thank you for your reply!
My tlt info:

Configuration of the TLT Instance
dockers: ['nvcr.io/nvidia/tlt-streamanalytics', 'nvcr.io/nvidia/tlt-pytorch']
format_version: 1.0
tlt_version: 3.0
published_date: 02/02/2021

Yes, I followed that doc step by step.

I had read that issue before writing this question.
On my PC, I find:

-rwxr-xr-x 1 root root 241 Sep 13 21:15 /usr/local/bin/tlt-dataset-convert*
-rwxr-xr-x 1 root root 227 Sep 13 21:15 /usr/local/bin/tlt-evaluate*
-rwxr-xr-x 1 root root 225 Sep 13 21:15 /usr/local/bin/tlt-export*
-rwxr-xr-x 1 root root 224 Sep 13 21:15 /usr/local/bin/tlt-infer*
-rwxr-xr-x 1 root root 229 Sep 13 21:15 /usr/local/bin/tlt-int8-tensorfile*
-rwxr-xr-x 1 root root 224 Sep 13 21:15 /usr/local/bin/tlt-prune*
-rwxr-xr-x 1 root root 215 Sep 13 21:15 /usr/local/bin/tlt-pull*
-rwxr-xr-x 1 root root 736 Aug 27 21:09 /usr/local/bin/tlt-train*
-rwxr-xr-x 1 root root 224 Sep 13 21:15 /usr/local/bin/tlt-train-g1*

but neither in /usr/local/bin nor in ~/.local/bin.
Why? Does this mean my TLT installation failed?
Or should I look for them inside Docker? How do I enter the Docker container?
Thank you very much.

To narrow down, one way to debug is to check whether you can log in to the TLT 3.0 docker container:

$ docker run --runtime=nvidia -it nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3 /bin/bash
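Once inside, you can also test the exact step that failed: importing pycuda.autoinit calls make_default_context(), which is where the RuntimeError came from. Below is a minimal sketch of that check; check_cuda_context is a hypothetical helper, not part of TLT, and it assumes pycuda is available (as it is in the TLT image):

```python
# Sketch: reproduce the failing step from the traceback directly.
# pycuda.autoinit calls make_default_context() at import time; here we
# perform the same context creation manually and report the outcome
# instead of crashing. check_cuda_context is a hypothetical helper.
def check_cuda_context():
    try:
        import pycuda.driver as cuda  # needs pycuda + NVIDIA driver
    except ImportError:
        return "pycuda not installed"
    try:
        cuda.init()
        ndevices = cuda.Device.count()
        ctx = cuda.Device(0).make_context()  # same call autoinit makes
        ctx.pop()  # release the context again
        return "context OK (%d device(s) detected)" % ndevices
    except cuda.Error as exc:
        return "CUDA context failed: %s" % exc

if __name__ == "__main__":
    print(check_cuda_context())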

Thank you for your reply.
OK, I can get into the container; it seems to be fine.

Firstly, please check whether you get the error for all of the modules. Try
$ tlt ssd --help

$ tlt dssd --help

etc
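The checks above can be looped in one go; a sketch, with an illustrative (not exhaustive) task list:

```shell
# Sketch: probe several TLT task entrypoints and report which ones
# fail to load. The task list here is illustrative only.
for task in ssd dssd detectnet_v2 classification; do
    if tlt "$task" --help >/dev/null 2>&1; then
        echo "$task: OK"
    else
        echo "$task: FAILED"
    fi
done
```

If every task fails the same way, the launcher-to-docker setup is suspect; if only detectnet_v2 fails, the problem is module-specific.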

Next, please check that you meet the requirements in the TLT Launcher — Transfer Learning Toolkit 3.0 documentation.
Pay attention to GitHub - NVIDIA-AI-IOT/gesture_recognition_tlt_deepstream too.

Last, I am afraid it is a setup issue. Try searching Google for "return _bootstrap._gcd_import(name[level:], package, level)" or other lines from the traceback.

OK, thank you.
I will keep trying until I resolve this issue, and will record the results below.