Invalid Loss

I found one culprit in your spec.
For YOLOv4-tiny, if you use the cspdarknet_tiny arch, only big_grid_xy_extend and mid_grid_xy_extend should be provided to align with the anchor shapes; if you use the cspdarknet_tiny_3l arch, all three should be provided.

So, please add small_grid_xy_extend and retry.
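
For reference, a minimal sketch of the relevant yolov4_config fields for the cspdarknet_tiny_3l arch (the anchor shapes and extend values below are placeholders, not tuned recommendations):

yolov4_config {
  big_anchor_shape: "[(114, 61), (149, 127), (255, 229)]"
  mid_anchor_shape: "[(43, 95), (68, 34), (95, 73)]"
  small_anchor_shape: "[(12, 17), (21, 41), (39, 21)]"
  arch: "cspdarknet_tiny_3l"
  # all three *_grid_xy_extend fields are expected for the 3l arch
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  ...
}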

For the sequence data format, refer to the YOLOv4-tiny — TAO Toolkit 3.22.05 documentation.
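
In that format, the training images and KITTI-style label files are referenced directly in dataset_config rather than through TFRecords; a minimal sketch, with placeholder paths and class names:

dataset_config {
  data_sources {
    label_directory_path: "/workspace/tao-experiments/data/training/label_2"
    image_directory_path: "/workspace/tao-experiments/data/training/image_2"
  }
  target_class_mapping {
    key: "car"
    value: "car"
  }
  ...
}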

Hi, the sequence data format gives me the following error:
INFO: Validation dataset specified by validation_fold requires the training label format to be TFRecords.
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 145, in <module>
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py", line 707, in return_func
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py", line 695, in return_func
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 141, in main
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 126, in main
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/scripts/train.py", line 63, in run_experiment
File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v4/utils/spec_loader.py", line 57, in load_experiment_spec
AssertionError: Validation dataset specified by validation_fold requires the training label format to be TFRecords.

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.


mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[19457,1],1]
Exit code: 1

2022-06-21 14:21:15,559 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Adding small_grid_xy_extend to the .txt spec file does nothing; it still throws the same error.

Now I am facing the same issue even for classification.
edcda39c77a8:158:253 [1] NCCL INFO comm 0x7faf387dbc80 rank 1 nranks 2 cudaDev 1 busId a000 - Init COMPLETE
edcda39c77a8:157:256 [0] NCCL INFO comm 0x7fd7c07e5810 rank 0 nranks 2 cudaDev 0 busId 4000 - Init COMPLETE
edcda39c77a8:157:256 [0] NCCL INFO Launch mode Parallel
1/3633 […] - ETA: 4:38:20 - loss: 2.9416 - acc: 0.0000e+00WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py:186: The name tf.Summary is deprecated. Please use tf.compat.v1.Summary instead.

2022-06-21 08:18:43,918 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/utils.py:186: The name tf.Summary is deprecated. Please use tf.compat.v1.Summary instead.

204/3633 [>…] - ETA: 3:17 - loss: nan - acc: 0.0539^C

According to the above logs, these are not the same issue.

Please share the spec file you are using with the sequence data format.
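
As a side note on the assertion in the log above: validation_fold only applies when the training labels are in TFRecords format. With the sequence data format, the validation set is usually pointed to with a validation_data_sources block instead; a minimal sketch, with placeholder paths:

dataset_config {
  ...
  validation_data_sources {
    label_directory_path: "/workspace/tao-experiments/data/val/label_2"
    image_directory_path: "/workspace/tao-experiments/data/val/image_2"
  }
}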

Hi, I am facing this issue while using multiple GPUs.
Do you know what is causing this? Earlier I trained on multiple GPUs and never faced this issue.

So, do you mean

  • you get NaN loss while training with multiple GPUs?
  • you could train yolov4_tiny with multiple GPUs earlier but hit this issue this week?

Yes, you got it right.

I don't know what happened all of a sudden. Also, when I run the Jupyter notebook, the container just stops once I run any command in the notebook.
Is there a fix for this?

Please update to the new version of the wheel, which has already been released to PyPI:
$ pip3 install nvidia-tao==0.1.24
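
To confirm the upgrade took effect in the environment you launch the notebook from, something like the following should report version 0.1.24:

$ pip3 show nvidia-tao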

Is that the reason for the NaN loss and the container stopping issue?

For the NaN loss, I am afraid it is still related to the training parameters or training images.

  • Experiment 1: Can you run the default Jupyter notebook successfully? The default notebook trains against the public KITTI dataset.
  • Experiment 2: Can you try a small subset of your training images to check whether the issue is still reproducible?

Hi, even after the update I am facing the same issue with multiple GPUs.
d1313b9e72a3:89 [0] NCCL INFO Launch mode Parallel
12/1649 […] - ETA: 10:22:39 - loss: 20093.9831Batch 11: Invalid loss, terminating training

Experiment 1: I am able to run the default notebook successfully.
Experiment 2: I trained with the same dataset earlier and generated a decent model, but now, when I use the same data again, the training fails.
Something that was working fine earlier with a particular dataset is now not working, and I have changed nothing.

Could you please check whether the default notebook still works with multiple GPUs?

So you are telling me to use a new notebook, and not the one in which I have modified the parameters?

What difference does it make?

Just in order to narrow this down. Since you mentioned that multi-GPU training was working fine earlier with your dataset but is not working now, and last week there was a blocking issue (see Chmod: cannot access '/opt/ngccli/ngc': No such file or directory - #2 by Morganh), I am not sure whether that issue is causing your current NaN loss.
So, if possible, please run the default Jupyter notebook (against the KITTI dataset) again to check whether it still works.
If it works, that means the problem is not related to the above-mentioned issue, and we need to look more closely at your training dataset or parameters.

I will try, but again, I have already generated a decent model with the same dataset and parameters.

For more information about alternatives visit: ('Overview — Numba 0.50.1 documentation', '#cudatoolkit-lookup')
warnings.warn(errors.NumbaWarning(msg))
2022-06-23 09:26:41,758 [INFO] iva.common.export.keras_exporter: Using input nodes: ['input_1']
2022-06-23 09:26:41,759 [INFO] iva.common.export.keras_exporter: Using output nodes: ['predictions/Softmax']
NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
DEBUG: convert reshape to flatten node
DEBUG [/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['predictions/Softmax'] as outputs
2022-06-23 09:26:46,204 [INFO] iva.common.export.keras_exporter: Calibration takes time especially if number of batches is large.
terminate called after throwing an instance of 'pybind11::error_already_set'
what(): ValueError: Batch size yielded from data source 8 < requested batch size from calibrator 16

At:
/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/tensorfile_calibrator.py(79): get_data_from_source
/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/tensorfile_calibrator.py(95): get_batch
/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/export/_tensorrt.py(537): __init__
/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/export/_tensorrt.py(696): __init__
/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/keras_exporter.py(445): export
/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py(250): run_export
/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/scripts/export.py(42): main
/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/scripts/export.py(46): <module>

This happens when I run the export command. Do you know how to fix it?
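
For what it is worth, the assertion suggests the INT8 calibration data source yields batches of 8 while the calibrator requested 16, so the --batch_size passed to export likely needs to match the batch size the calibration tensorfile was generated with. A hypothetical invocation (model path, key, and calibration file paths are placeholders; the exact flags should be checked against the classification export help):

tao classification export -m /workspace/output/weights/final_model.tlt \
                          -k $KEY \
                          -o /workspace/output/final_model.etlt \
                          --data_type int8 \
                          --cal_data_file /workspace/output/calibration.tensorfile \
                          --cal_cache_file /workspace/output/cal.bin \
                          --batch_size 8 \
                          --batches 10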

I believe this is a new topic. Please create a new topic and share the full command and log. Thanks.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence, we are closing this topic. If you need further support, please open a new one.
Thanks
