ASGI Exception Error in Hybrid-RAG using Mistral-7B-Instruct-v0.2 Model

Hello,

I am working on the NVIDIA Hybrid-RAG project and have configured the NVCF_RUN_KEY for the mistral-7b-instruct-v0.2 model. However, when I upload a PDF and ask questions, I encounter an ASGI error. I have also tried other models, but the same issue persists. Below are the details of the problem:

  • Setup Information:
    • Model: mistral-7b-instruct-v0.2 (and others tested)
    • Configuration: Generated the NVCF_RUN_KEY as per the instructions.
    • NVIDIA Account: I have active credits on my NVIDIA account.
  • Error Message:
*** ERR: Unable to process query. ***  
Message: Response ended prematurely  
ERROR: Exception in ASGI application  
Traceback (most recent call last):  
File "/home/workbench/.conda/envs/api-env/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi  
...  
Screenshots of the error and configuration are attached for additional context.

I would appreciate any guidance on resolving this issue. Is there a specific configuration or troubleshooting step I might have missed?

Thank you in advance for your help!


Hi, thanks for reaching out. In that final screenshot, can you scroll down in the logs until you see the error message? That will help us pinpoint the source of the issue. Thanks!

I am getting the same error. The lines above match the log I have. The last lines show the error (an OpenAI error, I believe):

File "/home/workbench/.conda/envs/api-env/lib/python3.10/site-packages/openai/_utils/_proxy.py", line 55, in __get_proxied__
return self.__load__()
File "/home/workbench/.conda/envs/api-env/lib/python3.10/site-packages/openai/_module_client.py", line 12, in __load__
return _load_client().chat
File "/home/workbench/.conda/envs/api-env/lib/python3.10/site-packages/openai/__init__.py", line 327, in _load_client
_client = _ModuleClient(
File "/home/workbench/.conda/envs/api-env/lib/python3.10/site-packages/openai/_client.py", line 105, in __init__
raise OpenAIError(
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
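
For context, this error comes from the standard OpenAI Python client: it needs a key either passed to the constructor or set in the OPENAI_API_KEY environment variable. A minimal sketch of the two options (the base URL and the idea of forwarding NVCF_RUN_KEY are assumptions for illustration, not the project's actual wiring):

import os
from openai import OpenAI

# Option 1: set the environment variable the error message asks for.
# os.environ["OPENAI_API_KEY"] = "<your key>"

# Option 2: pass the key explicitly. Forwarding NVCF_RUN_KEY and the
# base_url below are hypothetical; adjust to match your endpoint.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ.get("NVCF_RUN_KEY", ""),
)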

Hi, thanks for reaching out!

I tried to reproduce the issue on my end, and it looks like it is working for me.

Looks like you are using a gated model for local inference with Hugging Face TGI. Have you:

  1. Configured the HUGGING_FACE_HUB_TOKEN in AIWB with your Hugging Face API key?
  2. Accepted the T&C on the Hugging Face model card for Mistral v0.2? You need to apply for access (which should be quick). See the sketch below for a quick way to verify both.

In the Chat app, note the prerequisites on the right hand side. Make sure they are fulfilled, especially if using a gated model.
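
If it helps, here is a quick way to verify both prerequisites from Python before launching the app (a sketch: the token value is a placeholder, and the repo id is the Hugging Face model card for Mistral v0.2):

from huggingface_hub import login, model_info

# Log in with the same token you configured in AIWB (placeholder value).
login(token="hf_xxx")

# This call should fail with a gated-repo error if you have not yet
# accepted the T&C on the model card.
info = model_info("mistralai/Mistral-7B-Instruct-v0.2")
print(info.id)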

Thanks for the follow-up. I was trying this using a cloud endpoint. Do you recommend using a local instance? I do have a development desktop with an RTX 4060 GPU. If running locally, do I also need an inference server?

I was able to get this running. In the logs I noticed that I needed a Hugging Face token with write capabilities. After correcting that, I was able to load the model and start the inference server, and it is now working great. Thanks for the assistance!
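
For anyone who lands here later: once the local inference server is up, a one-request sanity check is a quick way to confirm everything is wired. A sketch, assuming a TGI server exposing the OpenAI-compatible chat route on port 8080 (port and model id may differ in your setup):

from openai import OpenAI

# Recent TGI versions expose an OpenAI-compatible /v1/chat/completions
# route; the api_key is unused locally but the client requires a value.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

resp = client.chat.completions.create(
    model="mistralai/Mistral-7B-Instruct-v0.2",
    messages=[{"role": "user", "content": "Say hello."}],
    max_tokens=32,
)
print(resp.choices[0].message.content)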