MT-NLG - Are we ever getting access to the trained 530B-parameter model?

Pappachuck_renan · July 6, 2022, 9:29pm

Just following up on the blog post "Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, the World’s Largest and Most Powerful Generative Language Model".

The framework is awesome, but it would be great to get the full trained model on NGC. That would be a tremendous head start.

NVES · July 6, 2022, 9:37pm

Hi,
We recommend raising this query in the Issues section of the Triton Inference Server GitHub repository.

Thanks!

Pappachuck_renan · July 6, 2022, 9:38pm

How can I move it there? Should I delete this thread, or can I edit it so that it moves there?

spolisetty · July 7, 2022, 2:43pm

Hi,

Sorry, the question above is not related to Triton either.
This forum mainly covers updates and issues related to TensorRT.
We recommend asking in the comments on the blog post you mentioned.

Thank you.
