Hi
I’m following this doc to set up the Llama 3 8B NIM with multiple LoRA adapters, but I don’t see any LoRA models listed by the API (v1/models). Has anyone encountered the same issue?
Here are the relevant parameters:
export LOCAL_PEFT_DIRECTORY=/mnt/model_repo
export NIM_PEFT_SOURCE=/home/nvs/loras
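For completeness, here is a sketch of the full host-side environment I set before launching; everything except the two PEFT paths above is a placeholder/assumption. Note that `docker run -e NIM_PEFT_REFRESH_INTERVAL` (no `=value`) only forwards the variable into the container if it is actually exported on the host:

```shell
# Sketch of my host environment (values other than the two PEFT
# paths are placeholders/assumptions, not my exact values).
export CONTAINER_NAME=llama3-8b-instruct
export NGC_API_KEY="<your-ngc-api-key>"    # placeholder
export NIM_CACHE_PATH=~/.cache/nim          # assumption
export LOCAL_PEFT_DIRECTORY=/mnt/model_repo
export NIM_PEFT_SOURCE=/home/nvs/loras
# `-e NIM_PEFT_REFRESH_INTERVAL` only passes this through if it is
# exported here; the value (seconds) is my assumption.
export NIM_PEFT_REFRESH_INTERVAL=3600
```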
This is my local directory of LoRAs on the host:
[ec2-user@ip-30-60-80-70 ~]$ ls -l /mnt/model_repo/
total 16
drwxrwxrwx 2 ec2-user ec2-user 6144 Aug 20 09:11 llama3-8b-instruct-lora_vhf-math-v1
drwxrwxrwx 2 ec2-user ec2-user 6144 Aug 20 09:11 llama3-8b-instruct-lora_vhf-squad-v1
drwxrwxrwx 2 ec2-user ec2-user 6144 Aug 20 09:10 llama3-8b-instruct-lora_vnemo-math-v1
drwxrwxrwx 2 ec2-user ec2-user 6144 Aug 20 09:10 llama3-8b-instruct-lora_vnemo-squad-v1
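A quick way to confirm each adapter directory actually contains weight/config files, since an empty or oddly laid-out directory is one reason an adapter could be skipped (a sketch; the exact file names expected depend on whether the adapter is HF- or NeMo-format):

```shell
# Sketch: list the contents of every adapter directory so any
# missing files stand out. Defaults to my mount point.
PEFT_DIR="${LOCAL_PEFT_DIRECTORY:-/mnt/model_repo}"
if [ -d "$PEFT_DIR" ]; then
  for d in "$PEFT_DIR"/*/; do
    printf '== %s\n' "$(basename "$d")"
    ls -l "$d"
  done
else
  echo "no such directory: $PEFT_DIR"
fi
```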
And here is the same LoRA directory as seen inside the container:
I have no name!@c5f68f7d2f49:/$ ls -l /home/nvs/loras/
total 16
drwxrwxrwx 2 1000 1000 6144 Aug 20 09:11 llama3-8b-instruct-lora_vhf-math-v1
drwxrwxrwx 2 1000 1000 6144 Aug 20 09:11 llama3-8b-instruct-lora_vhf-squad-v1
drwxrwxrwx 2 1000 1000 6144 Aug 20 09:10 llama3-8b-instruct-lora_vnemo-math-v1
drwxrwxrwx 2 1000 1000 6144 Aug 20 09:10 llama3-8b-instruct-lora_vnemo-squad-v1
I have no name!@c5f68f7d2f49:/$
The command I use to launch NIM is:
sudo docker run -it --name=$CONTAINER_NAME \
  --runtime=nvidia --gpus all --shm-size=16GB \
  -e NGC_API_KEY=$NGC_API_KEY \
  -e NIM_PEFT_SOURCE \
  -e NIM_PEFT_REFRESH_INTERVAL \
  -v $NIM_CACHE_PATH:/opt/nim/.cache \
  -v $LOCAL_PEFT_DIRECTORY:$NIM_PEFT_SOURCE \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama3-8b-instruct:1.0.0
When I use the API to list the models, only the base model shows up; none of the LoRAs are listed:
[ec2-user@ip-30-60-80-70 ~]$ curl -s http://0.0.0.0:8000/v1/models | jq
{
  "object": "list",
  "data": [
    {
      "id": "meta/llama3-8b-instruct",
      "object": "model",
      "created": 1724164793,
      "owned_by": "system",
      "root": "meta/llama3-8b-instruct",
      "parent": null,
      "permission": [
        {
          "id": "modelperm-086a0fd69d37425a80de6db7d2ace0f6",
          "object": "model_permission",
          "created": 1724164793,
          "allow_create_engine": false,
          "allow_sampling": true,
          "allow_logprobs": true,
          "allow_search_indices": false,
          "allow_view": true,
          "allow_fine_tuning": false,
          "organization": "*",
          "group": null,
          "is_blocking": false
        }
      ]
    }
  ]
}
[ec2-user@ip-30-60-80-70 ~]$
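To make this check scriptable, the ids can be pulled out of the response. Here is a sketch using a trimmed copy of the response above instead of a live call (a live check would pipe `curl -s http://0.0.0.0:8000/v1/models` through the same extraction):

```shell
# Sketch: extract just the model ids from a /v1/models response.
# RESPONSE is a trimmed copy of the listing above; with a working
# setup I would expect the LoRA adapter names to appear here too.
RESPONSE='{"object":"list","data":[{"id":"meta/llama3-8b-instruct"}]}'
IDS=$(printf '%s' "$RESPONSE" | python3 -c \
  'import json,sys; print("\n".join(m["id"] for m in json.load(sys.stdin)["data"]))')
echo "$IDS"
```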
In addition, I observed that the LoRAs are not loaded by NIM during the initialization phase, per the following log line:
INFO 08-20 13:46:12.625 ngc_profile.py:217] Running NIM without LoRA. Only looking for compatible profiles that do not support LoRA.