I am training with 8 GPUs, and after some number of epochs I randomly get these stall warnings:
[2020-08-03 11:25:36.480844: W horovod/common/operations.cc:588] training_1/SGD/DistributedSGD_Allreduce/HorovodAllreduce_training_1_SGD_gradients_block_1b_bn_2_1_FusedBatchNorm_grad_FusedBatchNormGrad_1 [missing ranks: 4]
[2020-08-03 11:25:36.480864: W horovod/common/operations.cc:588] training_1/SGD/DistributedSGD_Allreduce/HorovodAllreduce_training_1_SGD_gradients_block_1b_bn_2_1_FusedBatchNorm_grad_FusedBatchNormGrad_2 [missing ranks: 4]
[2020-08-03 11:25:36.480884: W horovod/common/operations.cc:588] training_1/SGD/DistributedSGD_Allreduce/HorovodAllreduce_training_1_SGD_gradients_AddN_52_0 [missing ranks: 4]