Hi,
Are you using an LLM?
If not, you can use TensorRT directly; a minimal sketch is shown below.
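For example, here is a minimal sketch of building a TensorRT engine from an ONNX export with the TensorRT Python API. The file names (`model.onnx`, `model.engine`) are placeholders for your own model; the same build can also be done with the `trtexec` command-line tool.

```python
import tensorrt as trt

# Placeholder paths; point these at your own ONNX export.
ONNX_PATH = "model.onnx"
ENGINE_PATH = "model.engine"

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, logger)

# Parse the ONNX graph into a TensorRT network definition.
with open(ONNX_PATH, "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)  # FP16 usually helps on Jetson

# Build the serialized engine and save it for deserialization at runtime.
engine_bytes = builder.build_serialized_network(network, config)
with open(ENGINE_PATH, "wb") as f:
    f.write(engine_bytes)
```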
For LLMs, you can find some benchmark data with vLLM in the topics below:
Thanks.
| Topic | Replies | Views | Activity |
|---|---|---|---|
| TensorRT Edge-LLM on the AGX Thor | 11 | 805 | December 4, 2025 |
| TensorRT LLM | 6 | 616 | November 18, 2025 |
| Support for TensoRT-LLM and Benchmarking Models | 7 | 393 | September 24, 2025 |
| How to serve TensorRT-LLM engines with Triton Inference Server on Jetson Thor and compare inference speed with vLLM container? | 6 | 135 | February 2, 2026 |
| Announcing new VLLM container & 3.5X increase in Gen AI Performance in just 5 weeks of Jetson AGX Thor Launch | 46 | 3462 | December 14, 2025 |
| Inquiry on any updated support for tensorrt-llm support nvidia orin AGX? | 4 | 273 | June 3, 2025 |
| Optimizing Inference on Large Language Models with NVIDIA TensorRT-LLM, Now Publicly Available | 8 | 1996 | January 25, 2024 |
| TensorRT-LLM for Jetson | 11 | 4058 | July 7, 2025 |
| TensorRT-LLM for Jetson | 0 | 283 | November 13, 2024 |
| NGC Catalog: How to install TensorRT in vllm:25.11-py3 container | 10 | 190 | January 25, 2026 |