TLT Classification crash

While running tlt-train classification I am getting the error below:
Using TensorFlow backend.
2021-05-17 10:27:27.402390: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
File "/usr/local/bin/tlt-train-g1", line 8, in
sys.exit(main())
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_train.py", line 39, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/scripts/train.py", line 444, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/scripts/train.py", line 98, in parse_command_line
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/scripts/train.py", line 61, in build_command_line_parser
AttributeError: 'list' object has no attribute 'add_argument'

I am running this command: tlt-train classification -e /workspace/examples/classification/specs/classification_spec.cfg -r /workspace/tlt_training/output -k tlt --gpus 1
Even if I run just tlt-train classification, it gives the same error.

Are you running TLT 3.0 or TLT 2.0?

I think it's 3.0. I started the container with:
docker run -it --gpus all --entrypoint /bin/bash -v /home/ubuntu/tlt_training:/workspace/tlt_training nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3

For TLT 3.0, all commands are meant to be run on the host PC via the tlt launcher, not inside the docker container.
For example, $ tlt classification train xxx
See Migrating to TLT 3.0 — Transfer Learning Toolkit 3.0 documentation and TLT Launcher — Transfer Learning Toolkit 3.0 documentation
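If you take the host-PC route, note that the tlt launcher maps your local directories into the container through a mounts file (by default ~/.tlt_mounts.json, as described in the TLT Launcher documentation). A minimal sketch, assuming your data lives under /home/ubuntu/tlt_training as in your docker run command:

```json
{
    "Mounts": [
        {
            "source": "/home/ubuntu/tlt_training",
            "destination": "/workspace/tlt_training"
        }
    ]
}
```

With that in place, the host-side equivalent of your command would be roughly:

$ tlt classification train -e /workspace/examples/classification/specs/classification_spec.cfg -r /workspace/tlt_training/output -k tlt --gpus 1

One caveat: /workspace/examples/... exists inside the docker image, not on your host, so if the launcher cannot see it you may need to copy the spec file into a mounted directory (e.g. /home/ubuntu/tlt_training) and point -e at the corresponding container path.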

If you still want to run your current command inside the docker container, you can try:
# classification train -e /workspace/examples/classification/specs/classification_spec.cfg -r /workspace/tlt_training/output -k tlt --gpus 1
