State-of-the-Art Language Modeling Using Megatron on the NVIDIA A100 GPU

jwitsoe · August 25, 2020, 11:53pm

Originally published at: State-of-the-Art Language Modeling Using Megatron on the NVIDIA A100 GPU | NVIDIA Technical Blog

Recent work has demonstrated that larger language models dramatically advance the state of the art in natural language processing (NLP) applications such as question-answering, dialog systems, summarization, and article completion. However, during training, large models do not fit in the available memory of a single accelerator, requiring model parallelism to split the parameters across multiple…

daniel.levine · April 5, 2023, 1:53pm

Could I expect to run this Megatron Q&A model on a Jetson Xavier NX device if it were the only model loaded?
