[SUPPORT] Workbench Example Project: Hybrid RAG

You might have the experience you want if you work with this Project on GitHub.

It’s a multi-container Project.

I think you need to register for an Early Access Program or something to get ahold of the containers.

I think Edward posted it before.

I like the multi-container part of that. It’s interesting how Redis, Milvus, and the models all run in their own containers. Something about that project doesn’t run on my Titan RTX, though: it requires bfloat16, which isn’t supported on that card. I tried some other models that were supposed to work with float16, but the project always runs the model with dtype=torch.bfloat16. At least I think that is the problem.
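For reference, here is the kind of dtype guard I was hoping to drop in somewhere in the loading path (just a sketch on my part; the model name is only an example and the project’s real loading code may differ):

import torch
from transformers import AutoModelForCausalLM

# Use bfloat16 only on Ampere (compute capability 8.0) and newer;
# otherwise fall back to float16, which Turing cards like the Titan RTX handle.
dtype = torch.bfloat16 if torch.cuda.get_device_capability(0) >= (8, 0) else torch.float16

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",  # example model only
    torch_dtype=dtype,
    device_map="auto",
)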

Looks like AI Workbench is Ampere and up when it comes to models.

Workbench itself doesn’t have a dependency on the GPU architecture. That kind of dependency happens at the container level, i.e., whether the installed software libraries require one GPU generation versus another.

If the code is backwards compatible and there’s a sufficient CUDA version, you should be good.
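For a quick sanity check of what the project container actually sees, something like this (purely illustrative) prints the CUDA runtime and the card’s compute capability:

import torch

# Report the CUDA runtime PyTorch was built against and the GPU's compute capability.
print("CUDA runtime:", torch.version.cuda)
if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print(f"{torch.cuda.get_device_name(0)}: compute capability {major}.{minor}")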

We will look to see if that particular setting for the precision can be changed to work with your GPU.

Hey, so it sounds like you are unable to run the model in the NIM container for the NIM Anywhere project.

According to the GPU support matrix in the NIM docs, you should be able to run a Llama3-8B Instruct NIM on a Titan RTX (compute capability 7.5, 24 GB VRAM).

This looks like an issue on the NIM side where it’s not recognizing the hardware compute capability and bumping the precision down from bf16 accordingly. We will raise this with the NIM team to investigate. Can you provide the following information so we can forward it appropriately? Thanks!

  • Details & Steps to Reproduce
  • NIM Software version (1.0.0?) + Model used
  • Any relevant logs (e.g. AI Workbench window > Output > NIM/Chat/etc.)
  • Any relevant screenshots

I’ll open an issue

(07/30) Added cloud support for Llama 3.1 family (8B, 70B, 405B), Mamba Codestral, and Mistral-NeMo 12B Instruct

Hi - the hybrid-rag app can only be accessed with localhost or 127.0.0.1. How can I make the app accessible via the server address, e.g. 192.168.1.100?

Hi - You cannot do that, yet…

Not directly, though it should be a fairly simple change in the launch script.

Here is how I did it using an SSH tunnel (the host is an Ubuntu server running AI Workbench and the hybrid-rag project):

ssh -L 20000:127.0.0.1:10000 user@192.168.1.122

Explanation:
The Hybrid RAG chat runs on 192.168.1.122 but is only accessible at 127.0.0.1:10000/projects/hybrid-rag/applications/chat/, e.g. via remote desktop on that machine.

Execute this on the command line of my Windows laptop or in a Mac terminal:

ssh -L 20000:127.0.0.1:10000 user@192.168.1.122

After logging in to 192.168.1.122 with your SSH credentials, you can access the RAG chat at http://localhost:20000/projects/hybrid-rag/applications/chat/ from your laptop (20000 being the local port chosen in the -L option).

Sorry, I meant there is no feature for it yet.

Workbench already lets you access the application on the remote, so I’m curious about your motivation for this approach.

Are you trying to share the application with someone else?

Yes. I am deploying every Workbench example project available on GitHub on an OVX server with 8x L40S GPUs running Ubuntu as the backend.

Then I am setting up a Windows laptop as a demo station for each project in our AI lab to entertain visiting customers. After all, Windows is what enterprises use for work.

I am also setting up all of the available LLM NIMs from NGC, i.e. Llama3 and Mixtral, to showcase as well.

We are an NVIDIA Elite partner, and AI Workbench is IMHO the best effort so far to help us better engage enterprise customers. Great work!

Thank you!

Might be good to schedule a demo of some advanced features that will be coming in a future release.

Hit me at aiworkbench-ea@nvidia.com and let’s get something on the calendar.

Hi! I’m on an M1 Mac and the build is failing with this error:
#0 building with “desktop-linux” instance using docker driver

#1 [internal] load build definition from Containerfile
#1 transferring dockerfile: 1.85kB done
#1 DONE 0.0s

#2 [auth] huggingface/text-generation-inference:pull token for ghcr.io
#2 DONE 0.0s

#3 [internal] load metadata for ghcr.io/huggingface/text-generation-inference:latest
#3 ERROR: no match for platform in manifest: not found

[internal] load metadata for ghcr.io/huggingface/text-generation-inference:latest:


Containerfile:1

1 | >>> FROM ghcr.io/huggingface/text-generation-inference:latest
2 |
3 | WORKDIR /opt/project/build/

ERROR: failed to solve: ghcr.io/huggingface/text-generation-inference:latest: failed to resolve source metadata for ghcr.io/huggingface/text-generation-inference:latest: no match for platform in manifest: not found

I was able to pull the image locally using a platform flag:
docker pull --platform linux/x86_64 ghcr.io/huggingface/text-generation-inference:latest

Hi,

This project uses the Hugging Face TGI container as the base container for the project environment. Unfortunately, it appears that the Hugging Face team does not publish a container image compatible with macOS (Apple silicon), according to the GitHub issue here.

Do you have a remote system you can connect to and run this project on as a workaround?

8/14/2024

Fixed a bug when parsing the IP/Hostname for Remote NIM.

“http://” was previously hardcoded for both IP and hostname inputs, and https-based endpoints required a manual change in the code. Now, smarter parsing has been implemented to differentiate between IPv4 and hostname inputs as well as perform some basic input validation. See the following table for usage examples.
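Roughly, the new behavior can be sketched like this (the function name and the loose hostname check are illustrative, not the project’s exact code):

import ipaddress

def normalize_endpoint(raw: str) -> str:
    # Turn a bare IPv4 address or hostname into a base URL, keeping an
    # explicit http:// or https:// prefix if the user typed one.
    raw = raw.strip().rstrip("/")
    if raw.startswith(("http://", "https://")):
        scheme, host = raw.split("://", 1)
    else:
        scheme, host = "http", raw

    bare_host = host.split(":")[0]           # drop any :port for validation
    try:
        ipaddress.IPv4Address(bare_host)     # valid IPv4 address?
    except ValueError:
        # very loose hostname check
        if not all(part and (part.isalnum() or "-" in part) for part in bare_host.split(".")):
            raise ValueError(f"{raw!r} is not a valid IPv4 address or hostname")

    return f"{scheme}://{host}"

# normalize_endpoint("192.168.1.100")           -> "http://192.168.1.100"
# normalize_endpoint("https://nim.example.com") -> "https://nim.example.com"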

8/15/2024

  1. Hotfix: the previous parsing function broke local NIM deployments. This has been fixed.
  2. The timeout for the local NIM health check has been increased to better accommodate model weight download times (a rough sketch of that kind of check follows below).
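Here is a sketch of that kind of health check; the endpoint path and default values are assumptions rather than the project’s actual ones:

import time
import requests

def wait_for_nim(base_url: str = "http://localhost:8000",
                 timeout_s: int = 1800,
                 poll_every_s: int = 15) -> bool:
    # Poll the (assumed) readiness endpoint until it answers OK or the
    # generous deadline passes, so long model weight downloads don't trip
    # a false failure.
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        try:
            if requests.get(f"{base_url}/v1/health/ready", timeout=5).ok:
                return True
        except requests.RequestException:
            pass  # container still starting or downloading weights
        time.sleep(poll_every_s)
    return False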

8/22/2024

Hotfix pushed for an issue where users could not properly upload their PDF documents and therefore had no access to RAG generation. The fix pins the nltk package dependency in the environment to work around a bug in the unstructured package. See the GitHub issue here for details.

When building the project, I get the Docker Desktop “WSL distro terminated abruptly” error. I even tried restarting the container from Docker Desktop, but the project never proceeds with the build and never finishes. Workbench doesn’t even let me stop the build, as shown in the second screenshot, saying “There was a problem stopping the build for hybrid-rag.” It doesn’t allow me to delete the project either, since it’s building. Is there a way to delete the project completely from my computer locally so that I can try to build it again?

(8/26/2024) Readme updated with deeplinking

Hi, thanks for reaching out.

There is a manual option you can take to remove a project. Close out of your AI Workbench and follow these steps:

  1. Delete the project folder under wsl > NVIDIA-Workbench > home > workbench > nvidia-workbench. This removes the contents of the project and its git repository from your machine.

  2. Remove the project’s json entry from the list of projects in wsl > NVIDIA-Workbench > home > workbench > .nvwb > inventory.json. This clears out the project metadata from the AI Workbench application.

// inventory.json with no projects
{
 "Projects": null
}
  3. Restart your AI Workbench application. The project should be removed from the application UI. (A rough script covering steps 1 and 2 follows below.)
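If you prefer to script steps 1 and 2, something along these lines works when run inside the NVIDIA-Workbench WSL distro. The paths mirror the steps above, but the match against the inventory entry is an assumption about the metadata format, so double-check the file before writing:

import json
import shutil
from pathlib import Path

PROJECT = "hybrid-rag"

project_dir = Path.home() / "nvidia-workbench" / PROJECT   # step 1
inventory_path = Path.home() / ".nvwb" / "inventory.json"  # step 2

# Step 1: delete the project folder and its git repository.
if project_dir.exists():
    shutil.rmtree(project_dir)

# Step 2: drop the project's entry from the inventory metadata.
data = json.loads(inventory_path.read_text())
remaining = [p for p in (data.get("Projects") or []) if PROJECT not in json.dumps(p)]
data["Projects"] = remaining or None   # Workbench uses null when no projects remain
inventory_path.write_text(json.dumps(data, indent=1))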