Error when training with multiple GPUs in TAO

Thanks for your work. Could you please try running the training with an older version of the TAO container? With the 21.11 version, we did not see any multi-GPU training errors.
For YOLOv4, please pull:

$ docker pull nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3

Then start the container and run the commands without the “tao” prefix. That is,

$ docker run --runtime=nvidia -it --rm nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3

Once you are inside the container, run

yolo_v4 train -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
  -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
  -k $KEY \
  --gpus 8
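
Before passing --gpus 8, it may be worth confirming that the container actually sees eight GPUs. Below is a minimal sketch, assuming nvidia-smi is available inside the container (it normally is when started with --runtime=nvidia). The helper name count_gpu_lines is only illustrative and not part of TAO.

```shell
# Hypothetical helper (not part of TAO): count GPU entries in the
# output of `nvidia-smi -L`, read from stdin.
# `nvidia-smi -L` prints one line per GPU, each starting with "GPU <index>:".
count_gpu_lines() {
  # grep -c prints the number of matching lines; `|| true` keeps the
  # exit status at zero when the count is 0.
  grep -c '^GPU [0-9]' || true
}

# Usage inside the container:
#   nvidia-smi -L | count_gpu_lines
```

If the printed count is lower than the value given to --gpus, the multi-GPU launch is unlikely to work as requested.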

The container image comes from the “TAO Toolkit for Computer Vision” page on NVIDIA NGC.