Hey,
I'm trying to launch nim/meta/llama-3.1-405b-instruct on a machine with 8x A100 80GB SXM4 GPUs, and I get an error saying the profile doesn't match the machine, even though it looks like an exact match.
What am I missing?
"
== NVIDIA Inference Microservice LLM NIM ==
NVIDIA Inference Microservice LLM NIM Version 1.1.2
Model: nim/meta/llama-3.1-405b-instruct
Container image Copyright (c) 2016-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
The use of this model is governed by the NVIDIA AI Foundation Models Community License Agreement (found at NVIDIA Agreements | Enterprise Software | NVIDIA AI Foundation Models Community License Agreement).
ADDITIONAL INFORMATION: Llama 3.1 Community License Agreement, Built with Llama.
INFO 11-28 18:28:32.556 ngc_profile.py:222] Running NIM without LoRA. Only looking for compatible profiles that do not support LoRA.
INFO 11-28 18:28:32.556 ngc_profile.py:224] Detected 0 compatible profile(s).
ERROR 11-28 18:28:32.556 utils.py:21] Profile ‘tensorrt_llm-a100-fp16-tp8-latency’ is incompatible with detected hardware. Please check the system information below and select a compatible profile.
SYSTEM INFO
- Free GPUs:
- [20b2:10de] (0) NVIDIA A100-SXM4-80GB (A100 80GB) [current utilization: 0%]
- [20b2:10de] (1) NVIDIA A100-SXM4-80GB (A100 80GB) [current utilization: 0%]
- [20b2:10de] (2) NVIDIA A100-SXM4-80GB (A100 80GB) [current utilization: 0%]
- [20b2:10de] (3) NVIDIA A100-SXM4-80GB (A100 80GB) [current utilization: 0%]
- [20b2:10de] (4) NVIDIA A100-SXM4-80GB (A100 80GB) [current utilization: 0%]
- [20b2:10de] (5) NVIDIA A100-SXM4-80GB (A100 80GB) [current utilization: 0%]
- [20b2:10de] (6) NVIDIA A100-SXM4-80GB (A100 80GB) [current utilization: 0%]
- [20b2:10de] (7) NVIDIA A100-SXM4-80GB (A100 80GB) [current utilization: 0%]
"