Hi! This is the support thread for the Agentic RAG Example Project on GitHub. Any major updates we push to the project will be announced here. Further, feel free to discuss, raise issues, and ask for assistance in this thread.
Please keep discussion in this thread project-related. Any issues with the Workbench application should be raised as a standalone thread. Thanks!
(8/28/2024) Hotfix pushed for an issue where users would not be able to properly upload their PDF documents and therefore would not have access to RAG generation. Fix is to pin the nltk package dependency in the environment to fix a bug with the unstructured package. See GitHub issue here for details.
Currently tracking an issue that is breaking some Gradio builds (GitHub issue). One workaround is to upgrade/pin the gradio package version to 4.43.0 in requirements.txt.
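Both of the fixes above follow the same pattern: pin the problem package in the project's requirements.txt. A sketch of what the gradio pin might look like (the version comes from the note above; the exact nltk pin depends on the linked GitHub issue, so it is shown only as a placeholder comment):

```
# requirements.txt (excerpt)
gradio==4.43.0   # workaround for the Gradio build issue noted above
# nltk==<version from the linked GitHub issue>
```

After changing requirements.txt, rebuild the project environment so the pinned versions take effect.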
Updated the README to make it clear that to do RAG, you need to (1) upload the documents of interest AND (2) change the Router prompt to point the pipeline to the topics of your own uploaded docs.
Don't forget to do the latter! Otherwise your RAG pipeline will still be focused on the topic of the default documents we provide in the project.
I've got the Agentic RAG project up and running in AI Workbench, but when I try to use it, I get a "Connection errored out" message on the UI. I checked the output in AIWB to see the full error, and it looks like a pydantic issue.
Here's a snippet:
pydantic.errors.PydanticSchemaGenerationError: Unable to generate pydantic-core schema for <class 'starlette.requests.Request'>. Set arbitrary_types_allowed=True in the model_config to ignore this error or implement __get_pydantic_core_schema__ on your type to fully support it.
If you got this error by calling handler() within __get_pydantic_core_schema__ then you likely need to call handler.generate_schema(<some type>) since we do not call __get_pydantic_core_schema__ on <some type> otherwise to avoid infinite recursion.
Hi, thanks for reaching out! I was unable to reproduce this on my workstation, but I was able to reproduce it on my local Windows laptop.
According to the GitHub issue here, it seems a dependency of pydantic is acting up, and pinning the fastapi package to a previous version fixes the issue. I just tried it, and it works for me on my laptop.
The update has been pushed to the upstream repo. Make sure you pull the changes down to your local repo and rebuild the project. Hope this helps!
Basic question, and likely a simple answer: under Documents, when listing Webpages, will it include all sub-pages of a URL, or do you have to enter every single one manually? If manually, is there a simple * to include all sub-pages? I'm trying to test a search/focus on a public site with hundreds of sub-pages under an agreements URL and would love to catch every agreement.
Hi, thanks for reaching out! At the moment, the project just does basic parsing of a list of URLs and ingests them into the vector store. This project, like all example projects in the catalog, is meant as a starting point for developers to fork, build and extend upon.
Adding logic to handle wildcards and sub-pages is a natural extension of this project, so feel free to build it in!
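As a starting point, sub-page matching could be sketched with glob-style patterns applied to links discovered by a crawl or sitemap. Everything here (the `expand_url_patterns` helper and its inputs) is hypothetical and not part of the project:

```python
from fnmatch import fnmatch

def expand_url_patterns(patterns, discovered_links):
    """Return the discovered links that match any glob-style pattern.

    patterns: e.g. ["https://example.com/agreements/*"]
    discovered_links: URLs scraped from a sitemap or crawl of the site.
    """
    matched = []
    for link in discovered_links:
        # fnmatch's "*" matches any characters, including "/", so one
        # pattern can cover an entire sub-tree of the site.
        if any(fnmatch(link, pattern) for pattern in patterns):
            matched.append(link)
    return matched
```

The matched URLs could then be fed into the project's existing URL ingestion path in place of a hand-entered list.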
time="2024-10-29T04:59:52+05:30" level=warning msg="SHELL is not supported for OCI image format, [/bin/bash -c] will be ignored. Must use docker format"
Temporarily starting the container so Podman can copy the container image and fix file permissions. This operation may take several minutes and will freeze Podman, i.e. all Podman commands may hang during this operation. This is due to a known Podman issue. Learn more here: Troubleshoot AI Workbench - NVIDIA Docs
Assuming you are on Podman, there is a limitation that requires the container to start in order to fix certain permissions. This step is expected and can take a few minutes to complete.
"I need to temporarily make a change while the app is running so I can test something."
You can do this in your running application. Expand the Router section on the model settings. Expand the Router prompt accordion. You can edit the textbox in-app. Once edited, just submit another query.
"I need to make a more permanent change to the template so it persists."
Open a JupyterLab app or code editor, navigate to code/chatui/prompts, and edit those files directly with your preferred prompt. Save the file. From then on, every time you open the app, the default prompt will be your updated prompt.
A sample of the prompt is provided as part of the project:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are an expert at routing a user question to a vectorstore or web search. Use the vectorstore for questions on LLM agents, prompt engineering, and adversarial attacks. You do not need to be stringent with the keywords in the question related to these topics. Otherwise, use web-search. Give a binary choice 'web_search' or 'vectorstore' based on the question. Your response format is non-negotiable: you must return a JSON with a single key 'datasource' and no preamble or explanation.
Question to route: {question}
<|eot_id|><|start_header_id|>assistant<|end_header_id|>
I recommend updating this line (which currently points to the sample database provided as part of the project by default):
Use the vectorstore for questions on LLM agents, prompt engineering, and adversarial attacks.
to whatever topic you want your RAG to cover, e.g. whatever custom documents you want to upload to the vector database. This points your chatbot to the retrieval pipeline whenever the router detects a query related to what you have stored/uploaded.
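For reference, the router is expected to reply with a JSON object of the form `{"datasource": "vectorstore"}` or `{"datasource": "web_search"}`. A minimal, hypothetical helper for validating that reply (illustrative only, not code from the project):

```python
import json

def parse_router_choice(raw):
    """Validate the router's JSON reply and return the chosen datasource."""
    choice = json.loads(raw).get("datasource")
    if choice not in ("web_search", "vectorstore"):
        raise ValueError(f"unexpected datasource: {choice!r}")
    return choice
```

If the model adds a preamble or extra keys despite the prompt, this kind of check fails fast instead of silently routing the query to the wrong pipeline.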
Hello, I am following along with the AI Workbench Quickstart Hybrid RAG demo, but I was not able to build it successfully due to the following error:
STEP 1/32: FROM ghcr.io/huggingface/text-generation-inference:2.3.0 Trying to pull ghcr.io/huggingface/text-generation-inference:2.3.0... Error: creating build container: choosing an image from manifest list docker://ghcr.io/huggingface/text-generation-inference:2.3.0: no image found in image index for architecture "arm64", variant "v8", OS "linux"
I saw others dealing with a similar issue as well, with no fixes posted (GitHub issues). I am on a MacBook M3 Pro with 36GB of shared memory, using Podman as my container runtime.
Hi. I assume from this answer that a NIM microservice is not a requirement to get this demo running, just a web URL and some uploaded documents. (This refers to one of the answers earlier in this thread, from October 7th.)
I provided a different URL and some documents in the documents tab and edited the router prompt as you suggested to test.
I am using mistralai/mixtral-8x22b-instruct-v0.1 which I have gotten approval to use on hugging face.
I have never had a question answered. (I do get an error when trying to upload PDF files in the ChatUI, but the files appear to be uploaded correctly.)
This log snippet is from AI Workbench:
File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
And this is the error I get in the ChatUI:
*** ERR: Unable to process query. Check the Monitor tab for details. ***
Exception: [401] Unauthorized
Authentication failed
Please check or regenerate your API key.
I do have both the NVIDIA API Key and the Tavily key loaded.
This is the forum support thread for the Agentic RAG example project. For future queries related to the Hybrid RAG example, please refer to the support thread here.
It looks like you are building the project on an ARM-based macOS system. Unfortunately, Hugging Face does not currently support its text-generation-inference container, which this project uses, on ARM-based Macs (see here).
Do you have a Linux box you can run the project on? Fortunately, AIWB makes it easy to connect your other systems as remote locations to work in.