[SUPPORT] Workbench Example Project: Hybrid RAG

(8/26/2024)

New NVIDIA-hosted Cloud Endpoints have been added for continued parity with the NVIDIA API Catalog.

  • Phi-3 Medium (128k): Longer context window version of the existing Phi-3 Medium (4k) model
  • Phi-3.5 Mini Instruct: Lightweight multilingual LLM powering AI applications in latency bound and memory/compute constrained environments.
  • Phi-3.5 MoE Instruct: Advanced LLM based on MoE architecture to deliver compute-efficient content generation.
  • Nemotron Mini 4B: A distilled model, optimized for on-device inference and fine-tuned for roleplay, RAG and function calling capabilities for game characters
  • Jamba-1.5 Mini Instruct: This MoE model takes advantage of the transformer and Mamba architectures to deliver superior efficiency, latency, and long context handling
  • Jamba-1.5 Large Instruct: This MoE model takes advantage of the transformer and Mamba architectures to deliver superior efficiency, latency, and long context handling

This brings the total number of NVIDIA-hosted cloud endpoints supported by this Hybrid RAG example project to 35 different models!

I have finished building the project, and I’m now getting the warning shown in the attached screenshot. Can anyone tell me whether it’s safe to continue without GPUs? Also, my laptop has only 8 GB of RAM, which is below the 16 GB listed in the requirements for NVIDIA AI Workbench. Can anyone clarify whether it’s safe to run this program with that little RAM? Thanks!

The build has been running for 2 hours and is still stuck at step 1/19. How long is it supposed to take?

My CPU usage is also only at 16%-20%. Is there a setting anywhere to improve this? I feel like the build will crash at some point as well. I have sent a WSL support bundle (.zip) to the NVIDIA team.

I ran into the very same issue: the build was stuck at 1/19 for hours, and I couldn’t get the build process to stop. I ended up uninstalling EVERYTHING to do with Workbench and starting from scratch. The second time, I installed Workbench with the Podman option instead of Docker, and the build did finally finish.

You should be fine. The 16 GB of RAM is a general guideline, not a hard requirement.

The popup you are seeing has to do with the project needing a GPU.

You can always select “continue without GPU” and you will be fine.
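
If you want to see how far a machine is from that guideline, a quick check along these lines could be run in WSL or on Linux/macOS. This is only a sketch: the helper names `total_ram_gib` and `below_guideline` are made up for illustration, and finding `nvidia-smi` on the PATH is just a rough hint that an NVIDIA driver is installed, not a description of how AI Workbench detects GPUs.

```python
import os
import shutil

GUIDELINE_GIB = 16  # the general guideline mentioned above, not a hard limit


def total_ram_gib():
    """Return total physical RAM in GiB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        return None  # e.g. native Windows, where os.sysconf is unavailable
    return page_size * page_count / 2**30


def below_guideline(ram_gib, guideline_gib=GUIDELINE_GIB):
    """True when the RAM figure is known and under the guideline."""
    return ram_gib is not None and ram_gib < guideline_gib


if __name__ == "__main__":
    ram = total_ram_gib()
    print("Total RAM (GiB):", "unknown" if ram is None else round(ram, 1))
    # Reported RAM is usually a little under the nominal size,
    # so treat values close to 16 as meeting the guideline.
    print("Below the guideline:", below_guideline(ram))
    print("nvidia-smi found:", shutil.which("nvidia-smi") is not None)
```

Being below the guideline does not mean the project will fail; it only suggests keeping an eye on memory use while the chat app is running.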

It seems strange that changing the runtime would fix it.

Regardless, you don’t have to uninstall to change the container runtime.

You can swap it out directly by editing a config file.

See here: AI Workbench Container Runtimes - NVIDIA Docs

For builds “stuck” at step 1/19, keep in mind that this step typically pulls the layers of the base container. You can track progress in the logs under Output > Build.

As long as you see progress in the layers being pulled in these logs (e.g. 393.22MB/3.66GB), the application is working normally, and the slowness you are experiencing may be due to a slow network connection. This step should go faster on a faster connection.
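
To turn a progress fragment like the one above into a percentage, something like the following could be used on a line copied from the build log. It is a rough sketch under two assumptions: that the sizes use decimal units (1 GB = 1000 MB), and that the fragment always has the `downloaded/total` shape shown above. The helper name `pull_fraction` is hypothetical.

```python
import re

# Decimal multipliers for the units seen in the pull logs (assumed, not confirmed).
UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9}

# Matches fragments such as "393.22MB/3.66GB".
PROGRESS_RE = re.compile(
    r"(\d+(?:\.\d+)?)\s*([KMG]?B)\s*/\s*(\d+(?:\.\d+)?)\s*([KMG]?B)",
    re.IGNORECASE,
)


def pull_fraction(line):
    """Return downloaded/total as a float, or None if no progress fragment is found."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done_value, done_unit, total_value, total_unit = match.groups()
    done = float(done_value) * UNITS[done_unit.upper()]
    total = float(total_value) * UNITS[total_unit.upper()]
    return done / total if total else None


print(f"{pull_fraction('393.22MB/3.66GB'):.1%}")  # prints 10.7%
```

If the percentage keeps rising between two checks a few minutes apart, the pull is still making progress and the build is not actually stuck.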

Hello, I am trying to install the Hybrid RAG application on macOS Sonoma 14.6.1 with Podman version 5.2.0.
During step 27 of the build process I keep getting the failure below.
Can you advise how I can resolve this?
I created a support bundle and attached it to this reply. Thanks in advance for your guidance.
ai-workbench-support-bundle.zip (282.2 KB)

Downloading grpcio-1.58.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.3/5.3 MB 17.7 MB/s eta 0:00:00
Installing collected packages: grpcio, anyio, transformers, pymilvus
Attempting uninstall: grpcio
Found existing installation: grpcio 1.65.1
Uninstalling grpcio-1.65.1:
Successfully uninstalled grpcio-1.65.1
Attempting uninstall: transformers
Found existing installation: transformers 4.43.1
Uninstalling transformers-4.43.1:
Successfully uninstalled transformers-4.43.1
ERROR: pip’s dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
chromadb 0.4.22 requires fastapi>=0.95.2, which is not installed.
chromadb 0.4.22 requires onnxruntime>=1.14.1, which is not installed.
chromadb 0.4.22 requires uvicorn[standard]>=0.18.3, which is not installed.
grpcio-reflection 1.62.2 requires grpcio>=1.62.2, but you have grpcio 1.58.0 which is incompatible.
grpcio-status 1.62.2 requires grpcio>=1.62.2, but you have grpcio 1.58.0 which is incompatible.
grpcio-tools 1.62.2 requires grpcio>=1.62.2, but you have grpcio 1.58.0 which is incompatible.
text-generation-server 2.0.5.dev0 requires transformers<5.0,>=4.43, but you have transformers 4.40.0 which is incompatible.
text-generation-server 2.0.5.dev0 requires typer<0.7.0,>=0.6.1, but you have typer 0.12.3 which is incompatible.
Successfully installed anyio-4.3.0 grpcio-1.58.0 pymilvus-2.3.1 transformers-4.40.0
WARNING: Running pip as the ‘root’ user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
chown: invalid group: ‘workbench:workbench’
Error: building at STEP “RUN /bin/bash /opt/project/build/postBuild.bash”: while running runtime: exit status

Hi, thanks for reaching out. Hmm, this does seem strange:

  1. In the past we learned that the base container we are using for this project (Hugging Face Text Generation Inference) doesn’t seem to be compatible with M-series Macs, so I’m assuming you are on an Intel-based Mac?
  2. The build process should automatically take care of provisioning a user called “workbench” for each project. See here for my machine–it occurs on step [3/19]:

Are you seeing anything similar, or something different, in your own logs? You can open the build logs under Output > Build.

Perhaps you can try commenting out the problematic lines here in postBuild.bash and trying again?
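
Before commenting anything out, it may help to check both points above directly. The sketch below is only an illustration: the function names are made up, the architecture check is meant to be run on the Mac itself (note that a Python running under Rosetta will report x86_64 even on an M-series machine), and the user/group check is meant to be run inside the project container, where the `chown` in the log failed.

```python
import grp
import platform
import pwd


def describe_arch(machine):
    """Map a platform.machine() string to a coarse CPU family label."""
    machine = machine.lower()
    if machine in ("arm64", "aarch64"):
        return "arm64"  # what M-series Macs report natively
    if machine in ("x86_64", "amd64"):
        return "x86_64"  # what Intel-based Macs report
    return "other"


def account_exists(user, group):
    """Return (user_exists, group_exists) for the given names."""
    try:
        pwd.getpwnam(user)
        user_ok = True
    except KeyError:
        user_ok = False
    try:
        grp.getgrnam(group)
        group_ok = True
    except KeyError:
        group_ok = False
    return user_ok, group_ok


if __name__ == "__main__":
    print("CPU family:", describe_arch(platform.machine()))
    print("workbench user/group present:", account_exists("workbench", "workbench"))
```

A result of `(True, False)` or `(False, False)` inside the container would be consistent with the `chown: invalid group` message and would suggest the provisioning step did not run as expected.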

Hello,

I’m trying to get the basic example working as defined under “Tutorial: Using a Cloud Endpoint”.

Everything builds fine and the chatui launches. Problems start when I try to “Setup the RAG Backend”.

Here are the last few lines of the Chat log output.

pydantic.errors.PydanticSchemaGenerationError: Unable to generate pydantic-core schema for <class 'starlette.requests.Request'>. Set arbitrary_types_allowed=True in the model_config to ignore this error or implement __get_pydantic_core_schema__ on your type to fully support it.

If you got this error by calling handler() within __get_pydantic_core_schema__ then you likely need to call handler.generate_schema(<some type>) since we do not call __get_pydantic_core_schema__ on <some type> otherwise to avoid infinite recursion.

For further information visit Redirecting...

If someone could tell me how to produce a log bundle, I can upload that too. I’m just not seeing anything in the Chat logs that leads me to a solution.

Hi, thanks for reaching out. Yes, we are currently tracking an issue that is breaking some Gradio builds (GitHub issue). Seems like one workaround is to upgrade/pin the gradio package version to 4.43.0 in requirements.txt.

Thanks! I updated the package version in requirements.txt and rebuilt, but I’m still getting the same errors. I’ll keep following the GitHub issue and hopefully find a solution soon.

Use
```
gradio==4.43.0
```
in requirements.txt, then clear the cache and rebuild the project. Can you please provide the output from the Chat log in the AI Workbench Desktop App output window?

Hey, I have pinpointed the issue after reading this thread.

At a high level, installing gradio also installs the fastapi package as a dependency. The problem is the latest version of fastapi breaks gradio which gives the error you are seeing. The solution/workaround for the time being involves pinning an older working version of fastapi.

You can work around this by doing the following:

  • Stop the project container if running
  • Open Environment > Scripts > postBuild.bash
  • Add “fastapi==0.112.2” to the end of this line in the file (I’ve already made this change in the upstream repo)
  • Clear cache and rebuild
  • Start the chat app

This worked when I reproduced the issue, let me know if it fixes things for you as well.
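
To confirm the pin took effect after the rebuild, a check like this could be run with the Python interpreter inside the project container. It is a small sketch using only the standard library; the helper name `installed_version` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # With the workaround applied, fastapi is expected to report 0.112.2.
    for name in ("fastapi", "gradio"):
        print(name, installed_version(name))
```

If fastapi still reports a newer version, the cache was probably not cleared before the rebuild.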

(9/9/2024)

Pinned fastapi==0.112.2 in postBuild.bash to resolve a breaking change in fastapi when using gradio. GitHub issues for gradio are here and here.

Looks like that fixed it. Thanks for your help with this!
