Training a State-of-the-Art ImageNet-1K Visual Transformer Model using NVIDIA DGX SuperPOD

Originally published at: https://developer.nvidia.com/blog/training-a-state-of-the-art-imagenet-1k-visual-transformer-model-using-nvidia-dgx-superpod/

This post shows how VOLO, a state-of-the-art (SOTA) Visual Transformer model, is trained on the NVIDIA DGX SuperPOD, using the VOLO_D5 model.

Because of the Docker Hub per-image pull limit, we’ve re-uploaded our multi-node, multi-GPU (MNMG) training image to Docker Hub under a new ID:
terryjx/volo:nvdocker_cuda11.1_devel_cudnn8_ubuntu20.04
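
For reference, fetching the re-uploaded image should come down to a single `docker pull` with the ID above. Here is a minimal Python sketch, assuming Docker is installed and on your PATH. The helper name `build_pull_command` is hypothetical and not part of the original post. It only builds and prints the command rather than running it.

```python
import shlex

# Image ID quoted from the post above.
IMAGE = "terryjx/volo:nvdocker_cuda11.1_devel_cudnn8_ubuntu20.04"


def build_pull_command(image: str = IMAGE) -> list[str]:
    """Return the argv list for `docker pull <image>` (hypothetical helper)."""
    return ["docker", "pull", image]


# Print the command so it can be copied into a shell.
print(shlex.join(build_pull_command()))
```

To run the pull from Python instead of copying the printed command, the same list can be passed to `subprocess.run(build_pull_command(), check=True)`.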