Example-hybrid-rag

Trying to run the microservices again on example-hybrid-rag. I load the llama-3.1-8b-instruct container without any issue, using the API key that I created on the build.nvidia.com site; I also put that key in the NVCF secrets area. Once the container is running, I can reach the IP address with the curl test, and that works perfectly. I'm also able to reach it with the NIM ChatUI.html. However, when I try to reach it from the project, using either the remote or local option, I get errors. I know the NIM is running correctly because both the curl test and the NIM ChatUI work. This used to work great; now I have issues. I've reinstalled Ubuntu 22.04. My system has 196 GB of memory and two A6000 GPUs. This is the error that I get:
"*** ERR: Unable to process query. ***
Message: Response ended prematurely"

I should also mention that I am running all of this locally. The NIM and the AI Workbench project are on the same machine.
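For reference, the "curl test" against a local NIM is an OpenAI-compatible chat completions request. A minimal sketch is below; the host, port, and model name are assumptions for a default local NIM deployment, so adjust them to match your container.

```shell
# Hypothetical sanity check for a locally running NIM.
# localhost:8000 and the model name are assumptions for a default deployment.
NIM_URL="http://localhost:8000/v1/chat/completions"

PAYLOAD='{
  "model": "meta/llama-3.1-8b-instruct",
  "messages": [{"role": "user", "content": "Hello"}],
  "max_tokens": 32
}'

# Validate the JSON body first, so a malformed request is not
# mistaken for a server-side failure.
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload OK"

# Send the request (uncomment once the NIM container is up):
# curl -s "$NIM_URL" -H "Content-Type: application/json" -d "$PAYLOAD"
```

If this request succeeds but the project still fails, the problem is between the project and the NIM (proxy, config, or client code) rather than the NIM itself.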

More error information. This is from a fresh start. It looks like a Docker issue as well. However, I'm able to get Local to work with ungated model loads.

Hi - just seeing this.

I believe there have been difficulties running the NIM locally on a Windows system.

The best thing would be to follow the directions to run the NIM on a remote system.

We may consider refactoring the project to run the NIM locally via Docker Compose instead; I think this would resolve the issue.
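A Docker Compose setup of the kind described might look roughly like the sketch below. The image tag, port, and environment variable name are assumptions based on typical NIM container deployments, not the project's actual configuration.

```yaml
# Hypothetical docker-compose.yaml for running the NIM locally.
# Image tag, port mapping, and NGC_API_KEY are assumptions; check the
# project's README for the real values.
services:
  nim:
    image: nvcr.io/nim/meta/llama-3.1-8b-instruct:latest
    runtime: nvidia
    environment:
      - NGC_API_KEY=${NGC_API_KEY}
    ports:
      - "8000:8000"
    shm_size: 16gb
```

With a setup like this, the project would talk to the NIM over a fixed local port instead of depending on how the container was launched by hand.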


I just tried to run this from a remote system. I'm getting proxy errors again. I've pinged the address of the system that has the NIM running, and I do in fact get a reply. I even used the NIM ChatUI, and that works as well. Just a reminder: this used to work flawlessly.
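One quick way to rule out client-side proxy interference (a plausible cause of proxy errors when ping and the ChatUI both work) is to check for proxy environment variables and retry the request with proxies bypassed. The IP address and port below are placeholders, not values from this thread.

```shell
# REMOTE_IP is a placeholder; replace it with the address of the
# machine actually running the NIM.
REMOTE_IP="192.168.1.50"

# List any proxy variables that could intercept HTTP traffic.
env | grep -i proxy || echo "no proxy variables set"

# Retry with proxies bypassed for this one request
# (uncomment when the remote NIM is reachable):
# curl --noproxy '*' -s "http://${REMOTE_IP}:8000/v1/models"
```

If the request succeeds with `--noproxy` but fails without it, the fix is to exclude the NIM host from the client's proxy settings.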

Thanks for letting us know, yes I have been able to reproduce the issue and have pushed a fix. See here.

Also, when using a NIM microservice locally, be sure to add the right configurations to the project for your particular host environment. See the README here.