Please provide the following information when requesting support.
• Hardware (T4)
• Network Type (resnet18)
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file(If have, please share here)
• How to reproduce the issue? (!tao model yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_train.txt -o $DATA_DIR/train/tfrecords/train)
Hello,
This is my first time using TAO Toolkit. My goal is to do transfer learning with YOLOv4, so I opened the link for the YOLOv4 object detection notebook. I was able to run it until step 2.1 (set up the Python environment). I was at step #3 when executing the following code resulted in the error ModuleNotFoundError: No module named ‘uff’: Yolo_V4_Colab_Error…txt (2.7 KB)
!tao model yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_train.txt -o $DATA_DIR/train/tfrecords/train
After the error occurred I needed to close the Colab notebook to conserve my session usage, so the logs are already gone. Running step 2.0 took quite a long time, since TensorRT needs to be installed, plus TensorFlow and its dependencies. I was able to complete the install, but the log was so long that it got clipped in Colab, so I could not see which dependencies/modules were not installed properly. I only got the module error.
OK, I’m re-running the Colab notebook. But I’m a bit confused about why TensorRT needs to be installed at all. There is a YouTube video by NVIDIA about running TAO Toolkit on Colab, and I did not see TensorRT being installed there. Also, my purpose is to do transfer learning, not inference; I plan to do the inference on my Jetson Orin NX. Anyway, I’ll send you a copy of the notebook with the captured logs.
I removed all the files/folders generated in step #2 of the notebook and tried a fresh install. The error “No module named ‘uff’” did not appear this time. But when executing the code that converts the KITTI-formatted annotation files to TFRecords, a new set of warnings and errors appeared:
Using TensorFlow backend.
2024-03-10 19:20:19,389 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2024-03-10 19:20:20,360 [TAO Toolkit] [WARNING] root 329: Limited tf.compat.v2.summary API due to missing TensorBoard installation.
2024-03-10 19:20:20,920 [TAO Toolkit] [WARNING] root 329: Limited tf.compat.v2.summary API due to missing TensorBoard installation.
2024-03-10 19:20:22,837 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.common.export.trt_utils 36: Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available.
2024-03-10 19:20:22,838 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.common.export.base_exporter 44: Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
Using TensorFlow backend.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py:181: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
WARNING:tensorflow:From /usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/dataio/dataset_converter_lib.py:181: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
Telemetry data couldn’t be sent, but the command ran successfully.
[WARNING]: ‘str’ object has no attribute ‘decode’
Execution status: PASS
The warning messages highlight a missing TensorBoard install; I’m not sure whether that affected the error. There is another warning about “‘str’ object has no attribute ‘decode’”; I assume this attribute should be in the generated TFRecords.
The notebook describes another method that uses the KITTI annotation files directly instead of TFRecords:
The default YOLOv4 data format requires generation of TFRecords. Currently, the old sequence data format (image folders and label txt folders) is still supported and if you prefer to use the sequence data format, you can skip this section. To use sequence data format, please use spec file yolo_v4_train_resnet18_kitti_seq.txt and yolo_v4_retrain_resnet18_kitti_seq.txt
The only thing is that there are no yolo_v4_train_resnet18_kitti_seq.txt or yolo_v4_retrain_resnet18_kitti_seq.txt files in the SPECS_DIR (a quick listing like the sketch below could confirm what is actually there).
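For reference, this is the kind of listing I have in mind (just a sketch in Python; it assumes the SPECS_DIR environment variable was set earlier in the notebook, otherwise adjust the fallback path to your own layout):

import glob
import os

# SPECS_DIR is set earlier in the notebook; the fallback path is only a placeholder.
specs_dir = os.environ.get("SPECS_DIR", "/content/drive/MyDrive/specs")

# List the spec files that are actually present, to see whether the *_seq.txt variants exist.
print(sorted(os.path.basename(p) for p in glob.glob(os.path.join(specs_dir, "*.txt"))))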
Another, more concerning warning is: nvidia_tao_tf1.cv.common.export.base_exporter 44: Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available. As I plan to do the inference on my Jetson Orin NX, I will need to convert the model’s TLT file to a TensorRT engine file, so this warning would completely defeat my purpose. I need help to find out why it failed to import TensorRT. Copy of yolo_v4.zip (57.4 KB)
Yes, I can see the output tfrecords in my data folder. But I have some doubts about them, because some have a size of 0 KB and there is no file extension. I’m trying to figure out how to view these files to confirm whether something got corrupted.
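For anyone checking the same thing, this is the kind of peek I have in mind (only a sketch; it assumes a TensorFlow 2.x runtime in the Colab session and uses my DATA_DIR path, so adjust the pattern to yours):

import glob
import os
import tensorflow as tf

# Shards written by dataset_convert (path from my DATA_DIR; adjust to yours).
shard_pattern = "/content/drive/MyDrive/kitti_data/DATA_DIR/train/tfrecords/train*"
shards = sorted(glob.glob(shard_pattern))

# 1. Report the size of every shard so the 0 KB ones stand out.
for path in shards:
    print(os.path.basename(path), os.path.getsize(path), "bytes")

# 2. Parse one record from the first non-empty shard to confirm it is readable.
for path in shards:
    if os.path.getsize(path) == 0:
        continue
    for raw_record in tf.data.TFRecordDataset(path).take(1):
        example = tf.train.Example()
        example.ParseFromString(raw_record.numpy())
        print(sorted(example.features.feature.keys()))
    break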
I’ll try to re-run the training again, but I’m also worried about the TensorRT warning about not being able to convert the TLT model to an engine.
I have 210 training images and 90 validation images. I think I found out why some of the shard files are 0 KB: some of my images don’t have matching extensions, for example *.jpg versus *.JPG. Because the spec file is looking for *.jpg, it ignored most of the images named with *.JPG. I have fixed this and my images now have standard extensions. I then re-ran the TFRecords conversion; it took about 5 minutes to finish but ended with the same warnings and errors as before. I also tried to import tensorrt but got the error “No module named ‘tensorrt’”. I’m not sure why it won’t import, as I checked that TensorRT was extracted to this path: !tar -xzf $trt_tar_path -C /content/trt_untar.
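To narrow down the import problem, I’m thinking of a quick check along these lines (a sketch only; my assumption is that extracting the tar by itself does not put the Python bindings on the Python path):

import glob
import sys

# What did the tar extraction actually produce? (the path used in my notebook)
print(glob.glob("/content/trt_untar/*"))

# Is anything TensorRT-related already on the Python path?
print([p for p in sys.path if "tensorrt" in p.lower() or "trt" in p.lower()])

# Can the Python bindings be imported at all?
try:
    import tensorrt
    print("tensorrt version:", tensorrt.__version__)
except ImportError as err:
    print("tensorrt not importable:", err)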
Warnings and errors:
Using TensorFlow backend.
2024-03-10 18:09:09.265904: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2024-03-10 18:09:09,319 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2024-03-10 18:09:10,325 [TAO Toolkit] [WARNING] root 329: Limited tf.compat.v2.summary API due to missing TensorBoard installation.
2024-03-10 18:09:10,903 [TAO Toolkit] [WARNING] root 329: Limited tf.compat.v2.summary API due to missing TensorBoard installation.
2024-03-10 18:09:12,898 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.common.export.trt_utils 36: Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available.
2024-03-10 18:09:12,898 [TAO Toolkit] [WARNING] nvidia_tao_tf1.cv.common.export.base_exporter 44: Failed to import TensorRT package, exporting TLT to a TensorRT engine will not be available.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
Using TensorFlow backend.
WARNING:root:Limited tf.compat.v2.summary API due to missing TensorBoard installation.
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.
Telemetry data couldn’t be sent, but the command ran successfully.
[WARNING]: ‘str’ object has no attribute ‘decode’
Execution status: PASS
I was able to proceed with the training, although I did not expect to see such a high loss, starting around 26,000 and going down to 482 after 100 epochs. The main issue I’m facing is that only 1 out of my 2 classes is getting predictions, and I’m not sure why. I followed the recommended dataset formatting and folder layout. My training images are slightly mismatched: ‘awake’ has 103 images and ‘drowsy’ has 107. My validation images are 50 each. I combined all the training images into one folder named images, using the following naming format: awake_N.jpg, drowsy_N.jpg.
I’m checking the settings in yolo_v4_train_resnet18_kitti.txt but could not find anything that would cause this issue. I have uploaded the training config file. A quick tally of the label files, as sketched below, should also confirm whether both classes are present.
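Something like this is what I mean (a sketch; the labels path is an assumption based on my folder layout, so adjust it to yours):

import glob
from collections import Counter

# Assumed location of the KITTI label .txt files -- adjust to your own layout.
label_files = glob.glob("/content/drive/MyDrive/kitti_data/DATA_DIR/train/labels/*.txt")

counts = Counter()
for label_file in label_files:
    with open(label_file) as f:
        for line in f:
            fields = line.split()
            if fields:  # the first field in a KITTI label line is the class name
                counts[fields[0].lower()] += 1

print(counts)  # both 'awake' and 'drowsy' should show up here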
This is a new question. Could you create a new forum topic, since the original issue is gone now?
For this latest question, do you have the log from when you generated the tfrecords files /content/drive/MyDrive/kitti_data/DATA_DIR/train/tfrecords/train*?
In that log, we can see how many “awake” objects and how many “drowsy” objects were written.
Since you set validation_fold: 0, the evaluation will use the tfrecords files that have -000-of in the filename.
Thanks for your input on validation_fold: 0. I replaced this setting with paths to my validation TFRecords and images instead, and it worked: I’m now getting predictions for both of my classes. The loss still seems a bit high, but based on the forum discussions about YOLOv4, a high loss value appears to be normal. I’m not sure what the impact on inference will be, but I’ll try the model.