[SUPPORT] Workbench Example Project: Agentic RAG

Hi! This is the support thread for the Agentic RAG Example Project on GitHub. Any major updates we push to the project will be announced here. Further, feel free to discuss, raise issues, and ask for assistance in this thread.

Please keep discussion in this thread project-related. Any issues with the Workbench application should be raised as a standalone thread. Thanks!

Not going to lie. This one is my favorite.

(8/26/2024) Readme updated with deeplinking

(8/28/2024) Hotfix pushed for an issue where users could not upload their PDF documents and therefore could not use RAG generation. The fix pins the nltk package dependency in the environment to work around a bug in the unstructured package. See GitHub issue here for details.

Currently tracking an issue that is breaking some Gradio builds (GitHub issue). Seems like one workaround is to upgrade/pin the gradio package version to 4.43.0 in requirements.txt.
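
If you would rather script the pin than edit requirements.txt by hand, here is a minimal sketch in Python. The helper name `pin_requirement` is hypothetical and not part of the project, and the code assumes a simple requirements.txt with one package per line and no extras or environment markers.

```python
import re


def pin_requirement(lines, package, version):
    """Return a copy of `lines` with `package` pinned to exactly `version`.

    An existing entry for the package is replaced; if the package is not
    listed at all, a pinned entry is appended. (Hypothetical helper.)
    """
    # Match the bare package name followed by an optional version
    # specifier (==, >=, ~=, ...). Names that merely share the prefix,
    # such as "gradio_client", do not match.
    pattern = re.compile(
        rf"^{re.escape(package)}\s*([=<>!~].*)?$", re.IGNORECASE
    )
    pinned = f"{package}=={version}"
    result, found = [], False
    for line in lines:
        if pattern.match(line.strip()):
            result.append(pinned)
            found = True
        else:
            result.append(line)
    if not found:
        result.append(pinned)
    return result


# In-memory example; the starting entries are made up for illustration.
requirements = ["nltk", "gradio>=4.0", "gradio_client==1.0"]
print("\n".join(pin_requirement(requirements, "gradio", "4.43.0")))
```

To apply it to a real file, read requirements.txt into a list of lines, pass it through the helper, write the result back, and then rebuild the project so the environment picks up the change.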

Updated the README to make it clear that to do RAG, you need to (1) upload the documents of interest AND (2) change the Router prompt to point the pipeline to the topics of your own uploaded docs.

Don’t forget to do the latter! Otherwise your RAG pipeline will still be focused on the topic of those default documents we provide in the project.

I’ve got the Agentic RAG project up and running in AI Workbench, but when I try to use it, I get a “Connection errored out” message in the UI. I checked the output in AIWB to see the full error, and it looks like a pydantic issue.

Here’s a snippet:

pydantic.errors.PydanticSchemaGenerationError: Unable to generate pydantic-core schema for <class 'starlette.requests.Request'>. Set arbitrary_types_allowed=True in the model_config to ignore this error or implement __get_pydantic_core_schema__ on your type to fully support it.
If you got this error by calling handler() within __get_pydantic_core_schema__ then you likely need to call handler.generate_schema(<some type>) since we do not call __get_pydantic_core_schema__ on <some type> otherwise to avoid infinite recursion.
For further information visit Redirecting...

Hi, thanks for reaching out! I was unable to reproduce this on my workstation, but I was able to reproduce it on my local Windows laptop.

According to the GitHub issue here, it seems like a dependency of pydantic is acting up, and pinning the fastapi package to a previous version fixes the issue. Just tried it out and it seems to work for me on my laptop.

The update has been pushed to the upstream repo. Make sure you pull the changes down to your local repo and rebuild the project. Hope this helps!

(10/02) Updated deep link landing page

Basic question, and likely a simple answer: Under Documents, when listing Webpages, will it include all sub-pages of a URL or do you have to enter every single one manually? If manually, is there any simple * wildcard to include all sub-pages? Trying to test a search/focus on a public site with hundreds of sub-pages under an agreements URL and would love to catch every agreement.

Hi, thanks for reaching out! At the moment, the project just does basic parsing of a list of URLs and ingests them into the vector store. This project, like all example projects in the catalog, is meant as a starting point for developers to fork, build and extend upon.

Adding in additional logic to include wildcards and subpages is definitely a logical extension to this project, so feel free to build it in!
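
As a rough starting point, here is a minimal sketch of what one-level sub-page discovery could look like in Python, using only the standard library. The names `find_subpages` and `_LinkCollector` are hypothetical and not part of the project, and the sketch assumes you have already downloaded the HTML of the parent page. It only finds links that appear on that one page; it is not a full crawler.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class _LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_subpages(base_url, html):
    """Return sorted absolute URLs linked from `html` that sit under `base_url`."""
    prefix = base_url.rstrip("/") + "/"
    parser = _LinkCollector()
    parser.feed(html)
    found = set()
    for href in parser.hrefs:
        # Resolve relative links against the parent page and drop "#fragment"
        # parts so the same page is not listed twice.
        absolute, _fragment = urldefrag(urljoin(prefix, href))
        if absolute.startswith(prefix) and absolute != prefix:
            found.add(absolute)
    return sorted(found)


# Made-up page content and URLs, for illustration only.
page = (
    '<a href="nda.html">NDA</a>'
    '<a href="/agreements/msa.html#top">MSA</a>'
    '<a href="/about">About</a>'
)
for url in find_subpages("https://example.com/agreements", page):
    print(url)
```

The resulting list could then be pasted into the Webpages field, or passed to whatever ingestion step your fork uses. For a real site, you would also want to check its robots.txt and terms of use before fetching hundreds of pages.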

Copy that. Thank you. I’m a newbie to this world and to coding, so this is likely not a development I can handle, but it is one I will look into. Appreciate you!