Originally published at: https://developer.nvidia.com/blog/validating-distributed-multi-node-av-ai-training-with-dgx-systems-on-openshift-with-dxc-robotic-drive/
Deep neural network (DNN) development for self-driving cars is a demanding workload. In this post, we validate multi-node, multi-GPU distributed training on DGX systems running on Red Hat OpenShift in the DXC Robotic Drive environment. We used OpenShift 3.11, which is also part of the Robotic Drive containerized compute platform, to orchestrate and execute the deep learning (DL) workloads.…