What image do I need to run the "nvidia/llama/mistral-7b-int4-chat:1.2" model?

I’ve downloaded the mistral-7b-int4-chat_v1.2 model onto a PVC and would like to run the model in KServe. What image should I be using for this model?

Thank you.

@rigoberto.corujo we don’t support int4 models in NIM at the moment, but you can deploy mistral 7B in fp8 or fp16 using the following container:

nvcr.io/nim/mistralai/mistral-7b-instruct-v03:1.0.0
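Outside of KServe, a quick way to sanity-check the container is to launch it directly on a GPU host. The following is a sketch based on the usual NIM run pattern; the cache path, port, and key handling are assumptions, so check the NIM documentation for your release:

```shell
# Sketch: run the Mistral 7B Instruct v0.3 NIM locally.
# Assumes a GPU host with the NVIDIA Container Toolkit installed and a
# valid NGC API key; the cache path and port are illustrative defaults.
export NGC_API_KEY="<your NGC API key>"   # placeholder
echo "$NGC_API_KEY" | docker login nvcr.io -u '$oauthtoken' --password-stdin

docker run -it --rm --gpus all \
  -e NGC_API_KEY \
  -v "$HOME/.cache/nim:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/mistralai/mistral-7b-instruct-v03:1.0.0
```

On first start the container downloads the model profile into the mounted cache, then serves an OpenAI-compatible API on the published port.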

I believe the specific model you are referring to was built for use with the ChatRTX app.

Thank you. I have my model in a PVC. What argument do I need to pass to the container so that it runs the model?

% kubectl exec -it pod/model-store-pod -- ls -l /mnt/models/mistral-7b-int4-chat_v1.2/
total 4123108
-rw-r--r-- 1  504 staff         63 Jul 24 21:26 README.txt
-rw-r--r-- 1  504 staff        891 Jul 24 21:26 config.json
-rw-r--r-- 1  504 staff        143 Jul 24 21:26 license.txt
drwxr-xr-x 2 root root        4096 Jul 24 21:26 mistral7b_hf_tokenizer
drwxr-xr-x 2 root root        4096 Jul 24 21:26 mistral_kv_int8_scales
-rw-r--r-- 1  504 staff 4222035384 Jul 24 21:31 rank0.safetensors

Hi @rigoberto.corujo, you can’t run this model with NIM.

Thank you. Perhaps I picked the wrong model. To run the Mistral-7B-v0.1 model from a PVC, what would be the image for that?

Thank you.

We technically don’t support Mistral-7B-v0.1 for deployment with NIM, only the Mistral-7B-v0.3 model.

You can deploy the mistral-7b-instruct-v03 model with the nvcr.io/nim/mistralai/mistral-7b-instruct-v03:1.0.0 container – check out the KServe deployment instructions in the kserve directory of the NVIDIA/nim-deploy repository on GitHub.
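For reference, an InferenceService pointing a NIM container at a PVC typically looks something like the sketch below. The resource names, the storageUri path, and the serving-runtime wiring here are placeholders, not the exact manifests from nim-deploy – follow the repository’s instructions for the supported ClusterServingRuntime and model format:

```yaml
# Hypothetical sketch only: names (mistral-nim, model-store-pvc) and the
# storageUri path are assumptions; use the manifests from nim-deploy/kserve.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: mistral-nim
spec:
  predictor:
    model:
      modelFormat:
        name: nim              # matches the NIM ServingRuntime's modelFormat
      storageUri: pvc://model-store-pvc/mistral-7b-instruct-v03
      resources:
        limits:
          nvidia.com/gpu: "1"
```

KServe mounts the PVC contents into the predictor pod, so the container picks the model up from the mounted path rather than downloading it from NGC.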
