Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available

I am trying to run the notebook faster_rcnn.ipynb from the TAO 5 tutorials:
https://api.ngc.nvidia.com/v2/resources/nvidia/tao/tao-getting-started/versions/5.0.0/zip
The notebook path inside the archive is notebooks/tao_launcher_starter_kit/faster_rcnn/faster_rcnn.ipynb

At this cell:

#KITTI trainval
!tao model faster_rcnn dataset_convert --gpu_index $GPU_INDEX -d $SPECS_DIR/frcnn_tfrecords_kitti_trainval.txt \
                     -o $DATA_DOWNLOAD_DIR/faster_rcnn/tfrecords/kitti_trainval/kitti_trainval \
                     -r $USER_EXPERIMENT_DIR/

I get the following error:

2023-08-31 13:57:27,179 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-08-31 13:57:27,249 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2023-08-31 13:57:27,260 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 262: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-08-31 13:57:27,260 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
Using TensorFlow backend.
2023-08-31 13:57:43.648206: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-08-31 13:57:43,696 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2023-08-31 13:57:44,847 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
2023-08-31 13:57:44,878 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
2023-08-31 13:57:44,881 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-08-31 13:57:46,246 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
2023-08-31 13:57:53,792 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.faster_rcnn.scripts.inference 37: Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available.
2023-08-31 13:58:06,138 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.faster_rcnn.export.exporter 38: Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available.
2023-08-31 13:58:19,549 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.common.export.keras_exporter 36: Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available.
Traceback (most recent call last):
  File "/usr/local/bin/faster_rcnn", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/faster_rcnn/entrypoint/faster_rcnn.py", line 12, in main
    launch_job(nvidia_tao_tf1.cv.faster_rcnn.scripts, "faster_rcnn", sys.argv[1:])
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/entrypoint/entrypoint.py", line 276, in launch_job
    modules = get_modules(package)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/entrypoint/entrypoint.py", line 47, in get_modules
    module = importlib.import_module(module_name)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/faster_rcnn/scripts/export.py", line 21, in <module>
    from nvidia_tao_tf1.cv.faster_rcnn.export.exporter import FrcnnExporter as Exporter
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/faster_rcnn/export/exporter.py", line 42, in <module>
    from nvidia_tao_tf1.cv.common.export.keras_exporter import KerasExporter as Exporter
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/export/keras_exporter.py", line 46, in <module>
    from nvidia_tao_tf1.core.export.app import get_model_input_dtype
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/core/export/app.py", line 40, in <module>
    from nvidia_tao_tf1.core.export._tensorrt import keras_to_tensorrt
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/core/export/_tensorrt.py", line 39, in <module>
    import pycuda.autoinit  # noqa pylint: disable=W0611
  File "/usr/local/lib/python3.8/dist-packages/pycuda/autoinit.py", line 5, in <module>
    cuda.init()
pycuda._driver.LogicError: cuInit failed: invalid argument
2023-08-31 13:58:33,051 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 337: Stopping container.

Here are the details of the environment:
• Hardware : T4
• Network Type: Faster_rcnn
• TAO version:
format_version: 3.0
toolkit_version: 5.0.0
published_date: 07/14/2023
• Training spec file: default specs in getting_started_v5.0.0/notebooks/tao_launcher_starter_kit/faster_rcnn/specs

Please reinstall the NVIDIA driver on your machine.
Uninstall:
$ sudo apt purge nvidia-driver-xxx
$ sudo apt autoremove
$ sudo apt autoclean

Install:
$ sudo apt install nvidia-driver-525
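
To check the driver independently of the TAO container, it may help to call cuInit directly. The sketch below does this through ctypes; cuInit is the same driver call that fails inside pycuda.autoinit in the traceback above. It assumes the driver library is installed under its usual Linux name, libcuda.so.1. The helper name check_cuinit is made up for illustration and is not part of TAO or pycuda.

```python
import ctypes


def check_cuinit(lib_name="libcuda.so.1"):
    """Call cuInit(0) from the CUDA driver library.

    Returns the integer CUresult code (0 means success), or None if the
    driver library cannot be loaded at all.
    """
    try:
        libcuda = ctypes.CDLL(lib_name)
    except OSError:
        # The shared library is missing or cannot be loaded.
        return None
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    return libcuda.cuInit(0)


if __name__ == "__main__":
    code = check_cuinit()
    if code is None:
        print("libcuda.so.1 could not be loaded; is the driver installed?")
    elif code == 0:
        print("cuInit succeeded")
    else:
        print(f"cuInit failed with CUresult {code}")
```

If this reports a non-zero code on the host as well, the problem is most likely in the driver installation rather than in the TAO container.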

Thank you @Morganh for your help.

I wasn’t able to execute the commands, but it turned out that the driver installation was on hold (surprisingly, nvidia-smi was still reporting the driver version).

Running dpkg --get-selections | grep hold showed that an NVIDIA package was on hold. I therefore executed:
sudo apt install nvidia-fabricmanager-525

Then I restarted the machine, and it works :)
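
The hold check described above can also be scripted. This is a minimal sketch that parses the output of dpkg --get-selections and lists held packages whose name contains "nvidia". The function names held_packages and list_held_nvidia_packages are illustrative only, and the second function assumes a Debian/Ubuntu host where dpkg is available.

```python
import subprocess


def held_packages(selections_text, name_filter="nvidia"):
    """Return names of packages marked 'hold' in `dpkg --get-selections` output."""
    held = []
    for line in selections_text.splitlines():
        parts = line.split()
        # Each line has the form: "<package>[:arch]    <state>"
        if len(parts) == 2 and parts[1] == "hold" and name_filter in parts[0]:
            held.append(parts[0])
    return held


def list_held_nvidia_packages():
    """Run dpkg and return the held NVIDIA packages (Debian/Ubuntu only)."""
    result = subprocess.run(
        ["dpkg", "--get-selections"],
        capture_output=True, text=True, check=True,
    )
    return held_packages(result.stdout)
```

Any package this returns can be released with sudo apt-mark unhold followed by the package name, after which the reinstall commands should be able to proceed.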
