Config file is not saved on wandb

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc)
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc)
YOLOv4 (also YOLOv3 and Faster R-CNN)
• Training spec file(If have, please share here)
output_yolo_v4.txt (2.2 KB)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
cmd : tao model yolo_v4 train -e $DOCKER_SPECS_DIR/output_yolo_v4.txt --gpus 1
Metrics are saved to Weights & Biases, but no config file is saved, even though the doc says the configuration of the experiment is saved.
Here is the overview of the run in Weights & Biases:

The status is “crashed” since I interrupted the training. If that is the issue, is there a way to save the config file at the start of the training?
I also have a YOLOv3 and a Faster R-CNN training currently running. For both of them, no configuration file is saved in the overview.

Thanks for your help.

Could you please double-check? You can run a quick experiment for 1 epoch. After training completes, please check again.

The doc mentions,

But yours shows

I wrote this issue precisely because the docs say the config is supposed to be saved.
I just ran a quick YOLOv3 training for 1 epoch.
Training completed without any problem, and wandb did not log any errors either, but there is no config on my wandb page:

YOLOv3 and YOLOv4 do not show this config. After checking, I find that the config is shown when running the detectnet_v2 network, which matches the doc.

Actually, the config comes from the training spec file. You can check it as well.

So the config file is only uploaded for detectnet_v2?

No, I checked the RetinaNet network and it also shows the config. It may be related to tao_tensorflow1_backend/nvidia_tao_tf1/cv/detectnet_v2/scripts/ at 2ec95cbbe0d74d6a180ea6e989f64d2d97d97712 · NVIDIA/tao_tensorflow1_backend · GitHub in detectnet_v2 and tao_tensorflow1_backend/nvidia_tao_tf1/cv/retinanet/scripts/ at 2ec95cbbe0d74d6a180ea6e989f64d2d97d97712 · NVIDIA/tao_tensorflow1_backend · GitHub in retinanet.

Is there a reason why this has not been implemented for the other models?

We will check the difference further. As a workaround, you can view the config parameters in the training spec file. Thanks for catching this.
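If you want the parameters to appear on the wandb run anyway, one manual option is to parse the spec file yourself and push it to the run config. The sketch below is my own illustration, not TAO code: `spec_to_config` is a hypothetical helper that only handles simple `key: value` pairs and `section { }` nesting in the protobuf text format that TAO spec files use (it does not handle repeated fields), and the `wandb.config.update` call at the end assumes an active wandb run.

```python
def spec_to_config(spec_text: str) -> dict:
    """Flatten a protobuf-text spec into {'section.key': value} pairs."""
    config, stack = {}, []
    for raw in spec_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.endswith("{"):               # open a nested section
            stack.append(line[:-1].strip())
        elif line == "}":                    # close the current section
            stack.pop()
        elif ":" in line:                    # plain key: value field
            key, value = (part.strip() for part in line.split(":", 1))
            config[".".join(stack + [key])] = value.strip('"')
    return config

if __name__ == "__main__":
    sample = """
    training_config {
      batch_size_per_gpu: 8
      num_epochs: 80
    }
    """
    cfg = spec_to_config(sample)
    print(cfg)
    # With an active wandb run, the flattened spec could then be attached:
    # import wandb
    # wandb.config.update(cfg)
```

This only mirrors what the missing feature would record; the spec file itself remains the source of truth.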
