LLM Performance Benchmarking: Measuring NVIDIA NIM Performance with GenAI-Perf

Originally published at: https://developer.nvidia.com/blog/llm-performance-benchmarking-measuring-nvidia-nim-performance-with-genai-perf/

This is the second post in the LLM Benchmarking series. It shows how to use GenAI-Perf to benchmark the Meta Llama 3 model when it is deployed with NVIDIA NIM. When building LLM-based applications, it is critical to understand the performance characteristics of these models on a given hardware platform. This serves multiple purposes: Identifying the bottleneck and…