How do I fix "Detected 0 compatible profile(s)"? Where do I get compatible profiles?

I get the error below when trying to run the NIM container for llama-3.1-8b:

#!/bin/bash
export CONTAINER_NAME=llama-3.1-8b-instruct
export IMG_NAME="nvcr.io/nim/meta/${CONTAINER_NAME}:1.1.1"

docker run -d --name=$CONTAINER_NAME \
  --runtime=nvidia \
  --gpus '"device=5,6"' \
  --shm-size=16GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u $(id -u) \
  -p 8004:8000 \
  $IMG_NAME
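For reference, the same image can be run with its built-in list-model-profiles command (the utility whose output is quoted further down in this post) to print compatible and incompatible profiles before starting the service. A minimal sketch, assuming the same variables and device selection as the script above; the docker guard just skips the call on machines without Docker:

```shell
export CONTAINER_NAME=llama-3.1-8b-instruct
export IMG_NAME="nvcr.io/nim/meta/${CONTAINER_NAME}:1.1.1"

# Requires Docker with the NVIDIA runtime; skipped if docker is unavailable.
if command -v docker >/dev/null; then
  docker run --rm --runtime=nvidia --gpus '"device=5,6"' \
    -e NGC_API_KEY=$NGC_API_KEY \
    "$IMG_NAME" list-model-profiles
fi
```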

How do I get compatible profiles?

===========================================
== NVIDIA Inference Microservice LLM NIM ==
===========================================

NVIDIA Inference Microservice LLM NIM Version 1.0.0
Model: nim/meta/llama-3_1-8b-instruct

Container image Copyright (c) 2016-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

The use of this model is governed by the NVIDIA AI Foundation Models Community License Agreement (found at NVIDIA Agreements | Enterprise Software | NVIDIA AI Foundation Models Community License Agreement).

ADDITIONAL INFORMATION: Llama 3.1 Community License Agreement, Built with Llama.

INFO 08-05 04:44:38.299 ngc_profile.py:222] Running NIM without LoRA. Only looking for compatible profiles that do not support LoRA.
INFO 08-05 04:44:38.299 ngc_profile.py:224] Detected 0 compatible profile(s).
ERROR 08-05 04:44:38.299 utils.py:21] Could not find a profile that is currently runnable with the detected hardware. Please check the system information below and make sure you have enough free GPUs.
SYSTEM INFO

  • Free GPUs:
    • [20b0:10de] (0) NVIDIA A100-SXM4-40GB [current utilization: 0%]
    • [20b0:10de] (1) NVIDIA A100-SXM4-40GB [current utilization: 0%]

This is the output of list-model-profiles:


SYSTEM INFO

  • Free GPUs:
    • [20b0:10de] (2) NVIDIA A100-SXM4-40GB [current utilization: 0%]
    • [20b0:10de] (3) NVIDIA A100-SXM4-40GB [current utilization: 0%]
    • [20b0:10de] (4) NVIDIA A100-SXM4-40GB [current utilization: 0%]
    • [20b0:10de] (5) NVIDIA A100-SXM4-40GB [current utilization: 0%]
    • [20b0:10de] (6) NVIDIA A100-SXM4-40GB [current utilization: 0%]
    • [20b0:10de] (7) NVIDIA A100-SXM4-40GB [current utilization: 0%]
  • Non-free GPUs:
    • [20b0:10de] (0) NVIDIA A100-SXM4-40GB [current utilization: 45%]
    • [20b0:10de] (1) NVIDIA A100-SXM4-40GB [current utilization: 90%]
MODEL PROFILES
  • Compatible with system and runnable:
  • Incompatible with system:
    • 0bc4cc784e55d0a88277f5d1aeab9f6ecb756b9049dd07c1835035211fcfe77e (tensorrt_llm-h100-fp8-tp2-latency)
    • 2959f7f0dfeb14631352967402c282e904ff33e1d1fa015f603d9890cf92ca0f (tensorrt_llm-h100-fp8-tp1-throughput)
    • e45b4b991bbc51d0df3ce53e87060fc3a7f76555406ed534a8479c6faa706987 (tensorrt_llm-a10g-bf16-tp4-latency)
    • d880feac6596cfd7a2db23a6bcbbc403673e57dec9b06b6a1add150a713f3fe1 (tensorrt_llm-a100-bf16-tp2-latency)
    • 7f98797c334a8b7205d4cbf986558a2b8a181570b46abed9401f7da6d236955e (tensorrt_llm-h100-bf16-tp2-latency)
    • 0494aafce0df9eeaea49bbca6b25fc3013d0e8a752ebcf191a2ddeaab19481ee (tensorrt_llm-l40s-bf16-tp2-latency)
    • ba515cc44a34ae4db8fe375bd7e5ad30e9a760bd032230827d8a54835a69c409 (tensorrt_llm-a10g-bf16-tp2-throughput)
    • a534b0f5e885d747e819fa8b1ad7dc1396f935425a6e0539cb29b0e0ecf1e669 (tensorrt_llm-l40s-bf16-tp2-throughput)
    • 7ea3369b85d7aee24e0739df829da8832b6873803d5f5aca490edad7360830c8 (tensorrt_llm-a100-bf16-tp1-throughput)
    • 9cff0915527166b2e93c08907afd4f74e168562992034a51db00df802e86518c (tensorrt_llm-h100-bf16-tp1-throughput)
    • fcfdd389299632ae51e5560b3368d18c4774441ef620aa1abf80d7077b2ced2b (tensorrt_llm-a10g-bf16-tp4-throughput-lora)
    • 6b89dc22ba60a07df3051451b7dc4ef418d205e52e19cb0845366dc18dd61bd6 (tensorrt_llm-l40s-bf16-tp2-throughput-lora)
    • a506c5bed39ba002797d472eb619ef79b1ffdf8fb96bb54e2ff24d5fc421e196 (tensorrt_llm-a100-bf16-tp1-throughput-lora)
    • 40543df47628989c7ef5b16b33bd1f55165dddeb608bf3ccb56cdbb496ba31b0 (tensorrt_llm-h100-bf16-tp1-throughput-lora)
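Aside: each profile ID above encodes its hardware requirements directly in its name — engine, GPU model, precision, tensor-parallel degree, and optimization target, with an optional -lora suffix. A small bash sketch of that naming convention (inferred from the list above, not from NIM documentation):

```shell
# Split a NIM profile name into its fields; the convention inferred from
# the list above is engine-gpu-precision-tpN-usecase[-lora].
profile="tensorrt_llm-a100-bf16-tp2-latency"
IFS='-' read -r engine gpu precision tp usecase <<< "$profile"
# Any trailing "-lora" would stay attached to the last field.
echo "engine=$engine gpu=$gpu precision=$precision parallel=$tp mode=$usecase"
```

Note that the tpN field also implies a GPU count: a tp2 profile needs two free GPUs of the matching type.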

Hello,

Thanks for joining the NVIDIA Developer forums. Unfortunately, your GPUs do not meet the system requirements: the prebuilt A100 profiles for this NIM target the 80 GB A100, so the A100-SXM4-40GB is not a supported configuration.

See the requirements here: Support Matrix - NVIDIA Docs

Thanks for the response!

Is there a roadmap for eventually supporting this? It’s rather unfortunate that I cannot run the llama-3.1-8b model when I have six unused A100 GPUs.

Hi Jeremy,

We’re planning to include this in a future version, but no ETA is available at this time.