I get the error below when trying to run NIM for llama-3.1-8b. This is the script I use to start the container:
#!/bin/bash
export CONTAINER_NAME=llama-3.1-8b-instruct
export IMG_NAME="nvcr.io/nim/meta/${CONTAINER_NAME}:1.1.1"

docker run -d --name=$CONTAINER_NAME \
  --runtime=nvidia \
  --gpus '"device=5,6"' \
  --shm-size=16GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u $(id -u) \
  -p 8004:8000 \
  $IMG_NAME
How do I get compatible profiles? Below is the container log showing the error.
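(For reference, the log can be pulled with docker logs using the container name set in the script above:

docker logs llama-3.1-8b-instruct)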
===========================================
== NVIDIA Inference Microservice LLM NIM ==
NVIDIA Inference Microservice LLM NIM Version 1.0.0
Model: nim/meta/llama-3_1-8b-instruct
Container image Copyright (c) 2016-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
The use of this model is governed by the NVIDIA AI Foundation Models Community License Agreement (found at NVIDIA Agreements | Enterprise Software | NVIDIA AI Foundation Models Community License Agreement.
ADDITIONAL INFORMATION: Llama 3.1 Community License Agreement, Built with Llama.
INFO 08-05 04:44:38.299 ngc_profile.py:222] Running NIM without LoRA. Only looking for compatible profiles that do not support LoRA.
INFO 08-05 04:44:38.299 ngc_profile.py:224] Detected 0 compatible profile(s).
ERROR 08-05 04:44:38.299 utils.py:21] Could not find a profile that is currently runnable with the detected hardware. Please check the system information below and make sure you have enough free GPUs.
SYSTEM INFO
- Free GPUs:
- [20b0:10de] (0) NVIDIA A100-SXM4-40GB [current utilization: 0%]
- [20b0:10de] (1) NVIDIA A100-SXM4-40GB [current utilization: 0%]
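For comparison, I also ran the container's list-model-profiles utility, invoked roughly like this (same image, all GPUs visible to the container):

docker run --rm --runtime=nvidia --gpus=all -e NGC_API_KEY=$NGC_API_KEY $IMG_NAME list-model-profiles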
This is the output for list-model-profiles:
===========================================
== NVIDIA Inference Microservice LLM NIM ==
NVIDIA Inference Microservice LLM NIM Version 1.0.0
Model: nim/meta/llama-3_1-8b-instruct
Container image Copyright (c) 2016-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
The use of this model is governed by the NVIDIA AI Foundation Models Community License Agreement (found at NVIDIA Agreements | Enterprise Software | NVIDIA AI Foundation Models Community License Agreement.
ADDITIONAL INFORMATION: Llama 3.1 Community License Agreement, Built with Llama.
SYSTEM INFO
- Free GPUs:
- [20b0:10de] (2) NVIDIA A100-SXM4-40GB [current utilization: 0%]
- [20b0:10de] (3) NVIDIA A100-SXM4-40GB [current utilization: 0%]
- [20b0:10de] (4) NVIDIA A100-SXM4-40GB [current utilization: 0%]
- [20b0:10de] (5) NVIDIA A100-SXM4-40GB [current utilization: 0%]
- [20b0:10de] (6) NVIDIA A100-SXM4-40GB [current utilization: 0%]
- [20b0:10de] (7) NVIDIA A100-SXM4-40GB [current utilization: 0%]
- Non-free GPUs:
- [20b0:10de] (0) NVIDIA A100-SXM4-40GB [current utilization: 45%]
- [20b0:10de] (1) NVIDIA A100-SXM4-40GB [current utilization: 90%]
MODEL PROFILES
- Compatible with system and runnable:
- Incompatible with system:
- 0bc4cc784e55d0a88277f5d1aeab9f6ecb756b9049dd07c1835035211fcfe77e (tensorrt_llm-h100-fp8-tp2-latency)
- 2959f7f0dfeb14631352967402c282e904ff33e1d1fa015f603d9890cf92ca0f (tensorrt_llm-h100-fp8-tp1-throughput)
- e45b4b991bbc51d0df3ce53e87060fc3a7f76555406ed534a8479c6faa706987 (tensorrt_llm-a10g-bf16-tp4-latency)
- d880feac6596cfd7a2db23a6bcbbc403673e57dec9b06b6a1add150a713f3fe1 (tensorrt_llm-a100-bf16-tp2-latency)
- 7f98797c334a8b7205d4cbf986558a2b8a181570b46abed9401f7da6d236955e (tensorrt_llm-h100-bf16-tp2-latency)
- 0494aafce0df9eeaea49bbca6b25fc3013d0e8a752ebcf191a2ddeaab19481ee (tensorrt_llm-l40s-bf16-tp2-latency)
- ba515cc44a34ae4db8fe375bd7e5ad30e9a760bd032230827d8a54835a69c409 (tensorrt_llm-a10g-bf16-tp2-throughput)
- a534b0f5e885d747e819fa8b1ad7dc1396f935425a6e0539cb29b0e0ecf1e669 (tensorrt_llm-l40s-bf16-tp2-throughput)
- 7ea3369b85d7aee24e0739df829da8832b6873803d5f5aca490edad7360830c8 (tensorrt_llm-a100-bf16-tp1-throughput)
- 9cff0915527166b2e93c08907afd4f74e168562992034a51db00df802e86518c (tensorrt_llm-h100-bf16-tp1-throughput)
- fcfdd389299632ae51e5560b3368d18c4774441ef620aa1abf80d7077b2ced2b (tensorrt_llm-a10g-bf16-tp4-throughput-lora)
- 6b89dc22ba60a07df3051451b7dc4ef418d205e52e19cb0845366dc18dd61bd6 (tensorrt_llm-l40s-bf16-tp2-throughput-lora)
- a506c5bed39ba002797d472eb619ef79b1ffdf8fb96bb54e2ff24d5fc421e196 (tensorrt_llm-a100-bf16-tp1-throughput-lora)
- 40543df47628989c7ef5b16b33bd1f55165dddeb608bf3ccb56cdbb496ba31b0 (tensorrt_llm-h100-bf16-tp1-throughput-lora)