Training YOLOv4 with the TAO toolkit occupies a lot of CPU resources

Hi everyone. I’m training my own dataset with YOLOv4 using the TAO toolkit. The issue is that its computations mostly run on the CPU, so training is quite slow. Furthermore, I have trained other networks such as YOLOv3 and DetectNet_v2, and their training mostly runs on the GPU.
[Screenshots attached: Screenshot from 2021-09-29 08-57-47, Screenshot from 2021-09-29 08-58-23]
I have the following questions:

  • Is this normal, or did I make a mistake somewhere? I also trained YOLOv4 with the TLT toolkit and ran into the same problem.

  • When the training stage finishes, can the model still run on the GPU with DeepStream?

Could you guys help me with these questions? Thanks in advance!

Could you please share your training spec file?

Yeah. Here is my spec file: yolo_v4_train_resnet18_kitti.txt (2.3 KB)

Since you are using tfrecords format, please try to disable mosaic augmentation.
See YOLOv4 - NVIDIA Docs

YOLOv4 supports two data formats: the sequence format (KITTI images folder and raw labels folder) and the tfrecords format (KITTI images folder and TFRecords). From our experience, if mosaic augmentation is disabled (mosaic_prob=0), training with TFRecords format is faster. If mosaic augmentation is enabled (mosaic_prob>0), training with sequence format is faster.
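
In spec-file terms, that means setting mosaic_prob to 0 in the augmentation_config block. A minimal sketch, assuming your spec follows the standard TAO YOLOv4 layout (the rest of your augmentation_config stays as it is in your own yolo_v4_train_resnet18_kitti.txt):

augmentation_config {
  # ... keep your existing augmentation settings ...
  # 0.0 disables mosaic augmentation, which is recommended
  # when training from TFRecords
  mosaic_prob: 0.0
}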

Thank you. I’ll try it.

But I still wonder whether this is normal, because the other networks I have tried trained quite fast.

The augmentation is different.

Yeah, I see. But the other networks reach nearly 100% GPU utilization, while YOLOv4’s is quite a bit lower. Is that due to the augmentation or to the training operations?

Also, I suggest you change
force_on_cpu: true

to
force_on_cpu: false
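
If your spec follows the standard TAO YOLOv4 layout, force_on_cpu sits in the nms_config block; a sketch of the change, with the other fields left exactly as they already are in your file:

nms_config {
  # ... keep your existing NMS settings ...
  # false lets NMS run on the GPU instead of forcing it onto the CPU
  force_on_cpu: false
}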

Thank you, I’ll take note of that.