Originally published at: https://developer.nvidia.com/blog/deploying-a-1-3b-gpt-3-model-with-nvidia-nemo-megatron/
Large language models (LLMs) are some of the most advanced deep learning models capable of understanding written language. Many modern LLMs are built on the transformer architecture introduced by Google in the 2017 research paper Attention Is All You Need. NVIDIA NeMo Megatron is an end-to-end GPU-accelerated framework for training and deploying…
I loved deploying NeMo Megatron locally to power language-based applications and look forward to seeing exciting new ways to use LLMs. Let me know if you have any questions and I will be happy to help!
Hey, that was a great article! It worked well (20B params), but I'm having no luck changing the temperature. I tried changing it where it was defined, and no luck; tried adding it to the argparser, and still no luck… what am I missing?!
jkjk thank you for your patience and maybe your guidance :D
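One way to expose temperature is sketched below. It assumes the client script from the article sends a JSON payload to the NeMo text generation server over HTTP; the `--temperature` flag, the payload field names, and the port are assumptions rather than the article's exact code, and note that temperature typically only takes effect when greedy decoding is disabled.

```python
import argparse
import json

import requests

# Hypothetical flag names; adjust them to match the client script from the article.
parser = argparse.ArgumentParser()
parser.add_argument("--prompt", default="Tell me an interesting fact about space travel.")
parser.add_argument("--temperature", type=float, default=1.0)
parser.add_argument("--port", type=int, default=5555)
args = parser.parse_args()

# Assumed payload fields for the text generation server. Temperature generally has
# no effect under greedy decoding, so sampling must be enabled (greedy=False).
payload = {
    "sentences": [args.prompt],
    "tokens_to_generate": 50,
    "temperature": args.temperature,
    "greedy": False,
    "top_k": 0,
    "top_p": 0.9,
}

resp = requests.put(
    f"http://localhost:{args.port}/generate",
    data=json.dumps(payload),
    headers={"Content-Type": "application/json"},
)
print(resp.json()["sentences"][0])
```

Run it, for example, as `python client.py --temperature 0.7` and compare outputs across a few temperature values to confirm the setting is actually being picked up.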
Thanks for the excellent tutorial! Everything worked well. I did have one question about the tokenizer: why is the tokenizer GPT-2 even though the model is GPT-3?