Multi-LoRA with LLAMA 3 NIM: LoRA models not listed in v1/models API

Hi,
I'm following this doc to set up the LLAMA3 8B NIM with multiple LoRA adapters, but I don't see any LoRA models listed by the API (v1/models). Has anyone encountered the same issue?

Here are the relevant parameters:

export LOCAL_PEFT_DIRECTORY=/mnt/model_repo
export NIM_PEFT_SOURCE=/home/nvs/loras

This is my local directory of LoRAs:

[ec2-user@ip-30-60-80-70 ~]$ ls -l /mnt/model_repo/
total 16
drwxrwxrwx 2 ec2-user ec2-user 6144 Aug 20 09:11 llama3-8b-instruct-lora_vhf-math-v1
drwxrwxrwx 2 ec2-user ec2-user 6144 Aug 20 09:11 llama3-8b-instruct-lora_vhf-squad-v1
drwxrwxrwx 2 ec2-user ec2-user 6144 Aug 20 09:10 llama3-8b-instruct-lora_vnemo-math-v1
drwxrwxrwx 2 ec2-user ec2-user 6144 Aug 20 09:10 llama3-8b-instruct-lora_vnemo-squad-v1

and here is the LoRA directory inside the container:

I have no name!@c5f68f7d2f49:/$ ls -l /home/nvs/loras/
total 16
drwxrwxrwx 2 1000 1000 6144 Aug 20 09:11 llama3-8b-instruct-lora_vhf-math-v1
drwxrwxrwx 2 1000 1000 6144 Aug 20 09:11 llama3-8b-instruct-lora_vhf-squad-v1
drwxrwxrwx 2 1000 1000 6144 Aug 20 09:10 llama3-8b-instruct-lora_vnemo-math-v1
drwxrwxrwx 2 1000 1000 6144 Aug 20 09:10 llama3-8b-instruct-lora_vnemo-squad-v1
I have no name!@c5f68f7d2f49:/$

The command I use to launch NIM is:

sudo docker run -it --name=$CONTAINER_NAME --runtime=nvidia --gpus all \
    --shm-size=16GB \
    -e NGC_API_KEY=$NGC_API_KEY \
    -e NIM_PEFT_SOURCE \
    -e NIM_PEFT_REFRESH_INTERVAL \
    -v $NIM_CACHE_PATH:/opt/nim/.cache \
    -v $LOCAL_PEFT_DIRECTORY:$NIM_PEFT_SOURCE \
    -p 8000:8000 \
    nvcr.io/nim/meta/llama3-8b-instruct:1.0.0
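One thing worth ruling out here (an assumption on my part, not something confirmed in this thread): the bare -e NIM_PEFT_SOURCE form only forwards the variable if it is set in the environment docker itself runs in, and sudo scrubs the caller's exported variables by default (env_reset). If that happens, docker sees an empty value. The sketch below simulates that scrub with env -i, which, like sudo's default policy, starts the child with a clean environment:

```shell
# An ordinary child shell inherits the exported variable...
export NIM_PEFT_SOURCE=/home/nvs/loras
sh -c 'echo "plain shell:  [$NIM_PEFT_SOURCE]"'

# ...but a child started with a scrubbed environment (as sudo does
# by default) sees nothing, so -e NIM_PEFT_SOURCE would pass empty.
env -i sh -c 'echo "scrubbed env: [$NIM_PEFT_SOURCE]"'
```

If that is the cause, passing the value explicitly (-e NIM_PEFT_SOURCE=$NIM_PEFT_SOURCE, which is expanded by your own shell before sudo runs) or using sudo -E would sidestep it.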

When I use the API to list the models, it only shows the base model; none of the LoRAs are listed:

[ec2-user@ip-30-60-80-70 ~]$ curl -s http://0.0.0.0:8000/v1/models | jq
{
  "object": "list",
  "data": [
    {
      "id": "meta/llama3-8b-instruct",
      "object": "model",
      "created": 1724164793,
      "owned_by": "system",
      "root": "meta/llama3-8b-instruct",
      "parent": null,
      "permission": [
        {
          "id": "modelperm-086a0fd69d37425a80de6db7d2ace0f6",
          "object": "model_permission",
          "created": 1724164793,
          "allow_create_engine": false,
          "allow_sampling": true,
          "allow_logprobs": true,
          "allow_search_indices": false,
          "allow_view": true,
          "allow_fine_tuning": false,
          "organization": "*",
          "group": null,
          "is_blocking": false
        }
      ]
    }
  ]
}
[ec2-user@ip-30-60-80-70 ~]$

In addition, I observed that the LoRAs are not loaded by NIM during the initialization phase, as shown in the following log:

INFO 08-20 13:46:12.625 ngc_profile.py:217] Running NIM without LoRA. Only looking for compatible profiles that do not support LoRA.

Hi @ryan_sg, can you check whether the environment variables are getting passed into the container correctly? Try running env inside the container and looking at the value of NIM_PEFT_SOURCE.

As long as NIM_PEFT_SOURCE is set, NIM should at least try to load the profiles that support LoRA.
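A minimal sketch of that check, first on the host and then inside the container (the container name comes from the thread's $CONTAINER_NAME; the docker exec line is a sketch, not verified output):

```shell
# On the host: confirm the variable is actually exported in the
# shell that launches docker.
export NIM_PEFT_SOURCE=/home/nvs/loras
env | grep NIM_PEFT_SOURCE

# Inside the running container, the same filter should show a
# non-empty value if it was forwarded correctly:
#   sudo docker exec $CONTAINER_NAME env | grep NIM_PEFT
```

If the host shows the value but the container does not, the variable is being lost on the way into docker run.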

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.