Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model

Originally published at: Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model | NVIDIA Developer Blog

MT-NLG has 3x the number of parameters compared to the existing largest model of this type and demonstrates unmatched accuracy in a broad set of natural language tasks

Awesome. How can end-user get access to this model? Is it integrated with Azure cognitive services - given it is a joint effort by Microsoft?