Hi,
Can we deploy a Gemma 2 model with the tensorrt_llm backend of Triton?
The TensorRT-LLM examples include Gemma but not Gemma 2. Is there a reference on how to convert a Hugging Face Gemma 2 checkpoint to a TensorRT engine and deploy it in Triton?
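For context, here is roughly the flow I expect, based on the existing Gemma example in the TensorRT-LLM repo, wrapped in a small Python script for illustration. The script path, flags, model names, and directories below are assumptions on my part (copied from the Gemma, not Gemma 2, example), so they may not apply as-is:

```python
# Hypothetical sketch of the checkpoint-convert + engine-build flow I expect,
# adapted from the existing Gemma example in TensorRT-LLM.
# All paths, model names, and flags are assumptions, not verified for Gemma 2.
import subprocess

HF_MODEL_DIR = "google/gemma-2-9b-it"   # Hugging Face checkpoint (assumed name)
CKPT_DIR = "/tmp/gemma2_trtllm_ckpt"    # intermediate TensorRT-LLM checkpoint
ENGINE_DIR = "/tmp/gemma2_trt_engine"   # engine dir the Triton model repo would point at

# Step 1: convert the Hugging Face checkpoint to TensorRT-LLM checkpoint format.
# examples/gemma/convert_checkpoint.py exists for Gemma; whether it accepts
# Gemma 2 (and with which flags) is exactly what I'm asking about.
subprocess.run(
    [
        "python", "examples/gemma/convert_checkpoint.py",
        "--ckpt-type", "hf",
        "--model-dir", HF_MODEL_DIR,
        "--dtype", "bfloat16",
        "--output-model-dir", CKPT_DIR,
    ],
    check=True,
)

# Step 2: build the serving engine with trtllm-build, then reference ENGINE_DIR
# from the tensorrt_llm backend's model repository in Triton.
subprocess.run(
    [
        "trtllm-build",
        "--checkpoint_dir", CKPT_DIR,
        "--output_dir", ENGINE_DIR,
    ],
    check=True,
)
```

If the intended path for Gemma 2 differs from this (different converter script, different flags, or a dedicated example), a pointer to the right reference would be much appreciated.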