Google's New Gemma 2 Model Now Optimized and Available on NVIDIA API Catalog

Originally published at: NVIDIA NIM | gemma-2-27b-it

Gemma 2, the next generation of Google's Gemma models, is now optimized with TensorRT-LLM and packaged as an NVIDIA NIM inference microservice, available through the NVIDIA API Catalog.
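For anyone who wants to try the model right away, here is a minimal sketch of calling it through the API Catalog's OpenAI-compatible endpoint. The base URL and the `google/gemma-2-27b-it` model identifier follow NVIDIA's usual pattern for hosted NIM endpoints but are assumptions here; confirm them against the model page, and supply your own API key.

```python
# Minimal sketch: query the hosted Gemma 2 NIM endpoint via the OpenAI-compatible API.
# Assumes the base URL and model name shown on the NVIDIA API Catalog model page.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",   # assumed API Catalog endpoint
    api_key=os.environ["NVIDIA_API_KEY"],             # key generated from the API Catalog
)

completion = client.chat.completions.create(
    model="google/gemma-2-27b-it",                    # assumed model identifier
    messages=[{"role": "user", "content": "Summarize what a NIM inference microservice is."}],
    temperature=0.5,
    max_tokens=256,
)

print(completion.choices[0].message.content)
```

The same request shape works whether you hit the hosted endpoint or a self-deployed NIM container; only the `base_url` changes.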

That's great news! Gemma 2 with TensorRT-LLM and the NVIDIA NIM inference microservice sounds like a powerful combination for efficient LLM inference. I'm excited to see how it improves performance and deployment options.