Can someone tell me how to benchmark the Llama-2-7B model on a Jetson AGX Orin with different quantization methods, and also how to compute its perplexity score?
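Whichever runtime you end up benchmarking with (TensorRT-LLM, llama.cpp, MLC, etc.), perplexity itself reduces to the same calculation: the exponential of the mean negative log-likelihood over the evaluated tokens. A minimal sketch, assuming you can extract per-token log-probabilities from your runtime of choice (the `token_logprobs` values below are made-up illustrative numbers, not real model output):

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood).

    token_logprobs: natural-log probabilities the model assigned to
    each ground-truth token of the evaluation text.
    """
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token log-probs pulled from an inference runtime
token_logprobs = [-0.5, -1.2, -0.3, -2.0]
print(round(perplexity(token_logprobs), 3))  # -> 2.718 (mean NLL is 1.0)
```

To compare quantization methods (FP16 vs. INT8 vs. INT4, say), run the same evaluation text through each quantized build and compare the resulting perplexities; a higher score indicates more quality loss from quantization.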