Join us to learn more about TensorRT-LLM's new open-development model. In this livestream, you'll learn contribution logistics, how to add new features and tests, and how to use CI/CD and interpret its results.
Related topics
| Topic | Replies | Views | Activity |
|---|---|---|---|
| Optimizing Inference on Large Language Models with NVIDIA TensorRT-LLM, Now Publicly Available | 8 | 2018 | January 25, 2024 |
| Chat with RTX: would the development team actively maintain and update this product? | 1 | 537 | June 24, 2024 |
| Easier. Faster. Open. TensorRT LLM 1.0 is here | 0 | 247 | September 25, 2025 |
| Easier. Faster. Open. TensorRT LLM 1.0 | 0 | 69 | September 25, 2025 |
| Boosting Meta Llama 3 Performance with NVIDIA TensorRT-LLM and NVIDIA Triton Inference Server | 1 | 357 | May 3, 2024 |
| Just Released: NVIDIA TensorRT-LLM 0.13.0 | 1 | 111 | October 7, 2024 |
| TensorRT-LLM Released: New Software That Accelerates Inference Performance | 0 | 681 | September 12, 2023 |
| Beyond the Algorithm: The New PyTorch Architecture for TensorRT-LLM | 1 | 331 | April 21, 2025 |
| TensorRT-LLM Speculative Decoding Boosts Inference Throughput by up to 3.6x | 4 | 199 | January 9, 2025 |
| Supercharging Llama 3.1 across NVIDIA Platforms | 14 | 390 | September 17, 2024 |