Originally published at: NVIDIA NIM | gemma-2-27b-it
Gemma 2, the next generation of Google's Gemma models, is now optimized with TensorRT-LLM and packaged as an NVIDIA NIM inference microservice.
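NIM microservices expose an OpenAI-compatible chat completions API, so trying the model from Python is straightforward. The sketch below is a minimal example, not an official snippet from this post: the base URL, the model id `google/gemma-2-27b-it`, and the `NVIDIA_API_KEY` environment variable name are assumptions based on NVIDIA's hosted API catalog conventions.

```python
import json
import os
import urllib.request

# Assumed endpoint and model id for the hosted NIM catalog (verify against
# the NVIDIA API catalog before use).
BASE_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "google/gemma-2-27b-it"


def build_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completions payload for the NIM endpoint."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,
    }


def chat(prompt: str, api_key: str) -> str:
    """POST the payload and return the first choice's message text."""
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    key = os.environ.get("NVIDIA_API_KEY")  # assumed env var name
    if key:
        print(chat("Summarize what a NIM microservice is in one sentence.", key))
    else:
        # No key set: just show the payload that would be sent.
        print(json.dumps(build_request("hello"), indent=2))
```

Because the request/response shape follows the OpenAI chat completions convention, the same payload also works with a self-hosted NIM container by pointing `BASE_URL` at the local service.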
That’s great news! Gemma 2 with TensorRT-LLM and the NVIDIA NIM inference microservice sounds like a powerful combination for efficient LLM inference. I’m excited to see how this improves performance and deployment options.