Example-hybrid-rag

Trying to run the microservices again on example-hybrid-rag. I load the llama-3.1-8b-instruct container without any issue, using the API key that I created on the build.nvidia.com site; I also put that key in the NVCF secrets area. Once the container is running, I can reach the IP address with the curl test, and that works perfectly. I'm also able to reach it with the NIM ChatUI.html. However, when I try to reach it from the project, using either the remote or local option, I get errors. I know the NIM is running correctly because both the curl test and the NIM ChatUI work. This used to work great; now I have issues. I've reinstalled Ubuntu 22.04. My system has 196 GB of memory and two A6000 GPUs. This is the error that I get:
"*** ERR: Unable to process query. ***
Message: Response ended prematurely"

I should also mention that I am running all of this locally. The NIM and the AI Workbench project are on the same machine.
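For reference, the "curl test" against a local NIM is an OpenAI-compatible chat completions request. A minimal sketch is below; the host, port, and model name are assumptions for a default local NIM deployment, so adjust them to match your container.

```shell
# Hypothetical sanity check for a locally running NIM.
# localhost:8000 and the model name are assumptions for a default deployment.
NIM_URL="http://localhost:8000/v1/chat/completions"

PAYLOAD='{
  "model": "meta/llama-3.1-8b-instruct",
  "messages": [{"role": "user", "content": "Hello"}],
  "max_tokens": 32
}'

# Validate the JSON body first, so a malformed request is not
# mistaken for a server-side failure.
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload OK"

# Send the request (uncomment once the NIM container is up):
# curl -s "$NIM_URL" -H "Content-Type: application/json" -d "$PAYLOAD"
```

If this request succeeds but the project still fails, the problem is between the project and the NIM (proxy, config, or client code) rather than the NIM itself.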

More error information. This is from a fresh start. It looks like a Docker issue as well. However, I'm able to get Local to work with ungated model loads.

Hi - just seeing this.

I believe there have been difficulties running the NIM locally on a Windows system.

The best thing would be to follow the directions to run the NIM on a remote system.

We may consider refactoring the project to run the NIM locally via Docker Compose instead; I think this would resolve the issue.
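A Docker Compose setup of the kind described might look roughly like the sketch below. The image tag, port, and environment variable name are assumptions based on typical NIM container deployments, not the project's actual configuration.

```yaml
# Hypothetical docker-compose.yaml for running the NIM locally.
# Image tag, port mapping, and NGC_API_KEY are assumptions; check the
# project's README for the real values.
services:
  nim:
    image: nvcr.io/nim/meta/llama-3.1-8b-instruct:latest
    runtime: nvidia
    environment:
      - NGC_API_KEY=${NGC_API_KEY}
    ports:
      - "8000:8000"
    shm_size: 16gb
```

With a setup like this, the project would talk to the NIM over a fixed local port instead of depending on how the container was launched by hand.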


I just tried to run this from a remote system. I'm getting proxy errors again. I've pinged the address of the system that has the NIM running, and I do in fact get a reply. I even used the NIM ChatUI, and that works as well. Just a reminder: this used to work flawlessly.
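One quick way to rule out client-side proxy interference (a plausible cause of proxy errors when ping and the ChatUI both work) is to check for proxy environment variables and retry the request with proxies bypassed. The IP address and port below are placeholders, not values from this thread.

```shell
# REMOTE_IP is a placeholder; replace it with the address of the
# machine actually running the NIM.
REMOTE_IP="192.168.1.50"

# List any proxy variables that could intercept HTTP traffic.
env | grep -i proxy || echo "no proxy variables set"

# Retry with proxies bypassed for this one request
# (uncomment when the remote NIM is reachable):
# curl --noproxy '*' -s "http://${REMOTE_IP}:8000/v1/models"
```

If the request succeeds with `--noproxy` but fails without it, the fix is to exclude the NIM host from the client's proxy settings.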

Thanks for letting us know, yes I have been able to reproduce the issue and have pushed a fix. See here.

Also, when using a NIM microservice locally, be sure to add the right configurations to the project for your particular host environment. See the README here.