Hi,
Can we deploy a Gemma 2 model with the tensorrt_llm backend of Triton?
The TensorRT-LLM examples include Gemma but not Gemma 2. Is there a reference on how to convert a Hugging Face Gemma 2 checkpoint to a TensorRT engine and deploy it in Triton?
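For context, here is roughly the flow I expect, based on the existing Gemma example in the TensorRT-LLM repo, wrapped in a small Python script for illustration. The script path, flags, model names, and directories below are assumptions on my part (copied from the Gemma, not Gemma 2, example), so they may not apply as-is:

```python
# Hypothetical sketch of the checkpoint-convert + engine-build flow I expect,
# adapted from the existing Gemma example in TensorRT-LLM.
# All paths, model names, and flags are assumptions, not verified for Gemma 2.
import subprocess

HF_MODEL_DIR = "google/gemma-2-9b-it"   # Hugging Face checkpoint (assumed name)
CKPT_DIR = "/tmp/gemma2_trtllm_ckpt"    # intermediate TensorRT-LLM checkpoint
ENGINE_DIR = "/tmp/gemma2_trt_engine"   # engine dir the Triton model repo would point at

# Step 1: convert the Hugging Face checkpoint to TensorRT-LLM checkpoint format.
# examples/gemma/convert_checkpoint.py exists for Gemma; whether it accepts
# Gemma 2 (and with which flags) is exactly what I'm asking about.
subprocess.run(
    [
        "python", "examples/gemma/convert_checkpoint.py",
        "--ckpt-type", "hf",
        "--model-dir", HF_MODEL_DIR,
        "--dtype", "bfloat16",
        "--output-model-dir", CKPT_DIR,
    ],
    check=True,
)

# Step 2: build the serving engine with trtllm-build, then reference ENGINE_DIR
# from the tensorrt_llm backend's model repository in Triton.
subprocess.run(
    [
        "trtllm-build",
        "--checkpoint_dir", CKPT_DIR,
        "--output_dir", ENGINE_DIR,
    ],
    check=True,
)
```

If the intended path for Gemma 2 differs from this (different converter script, different flags, or a dedicated example), a pointer to the right reference would be much appreciated.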