Support TensorRT-LLM and the TensorRT-LLM backend on Triton Inference Server

Description

According to the documented backend-platform support matrix for Triton Inference Server, TensorRT-LLM is still not supported.

Will a future release support Jetson or ARM SBSA as well?
I have a Jetson AGX Orin, so I would like to test it on my own, and I am also preparing to buy a DGX Spark.

If not, is there a suggested multi-model serving framework that supports TensorRT-LLM?

I have looked at Dynamo, but its purpose is large-scale infrastructure like a datacenter.
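For reference, this is roughly the check I would run to confirm whether a running Triton server actually exposes a ready TensorRT-LLM model; the localhost:8000 endpoint and the model name tensorrt_llm are only assumptions, taken from the defaults in the tensorrtllm_backend example model repository, so adjust both for your deployment.

```python
# Minimal sketch: query a running Triton server over HTTP and check whether a
# TensorRT-LLM model is loaded and ready.
# Assumptions: Triton's HTTP endpoint is on localhost:8000 and the model is
# registered under the name "tensorrt_llm" (the default in the tensorrtllm_backend
# example model repository).
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

print("server live: ", client.is_server_live())
print("server ready:", client.is_server_ready())

# List every model in the repository and its current state.
for model in client.get_model_repository_index():
    print(model["name"], model.get("state", "UNKNOWN"))

print("tensorrt_llm ready:", client.is_model_ready("tensorrt_llm"))
```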

Environment

TensorRT Version:
GPU Type:
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi @jude84.kim,
I would recommend reaching out to the Triton Inference Server GitHub issues page.

Thanks