Difficulty with Building RAG Agents with LLMs (Gradio)

Hi,

I am working on the course ‘Building RAG Agents with LLMs’ and I am having trouble finishing it, to the point where I even referenced the solutions and it still isn’t working. My issues are exactly the same as referenced here:

However, I still can’t seem to get it working. I even decided to restart the whole course from scratch using just the provided solutions, and it still doesn’t seem to work. When I connect to port 8090 in Gradio and type a message, I get the error ‘Gradio Stream failed: [Errno 111] Connection refused’.

One thing I’d like to note is that I was never able to get the activities on port 9012 working either. I figured since it wasn’t necessary for the assessment, it wouldn’t be a big deal (I’ll be spending my time trying to understand this until I get a response).

I had port 8090 working the first time without using the solutions; I had just forgotten to add an up-to-date article. Now, regardless of whether there’s an up-to-date article to read from, I keep getting the same error. Any help would be appreciated!

Update: I was able to get some activities on port 9012 working. I am able to run 9012/basic_chat and 9012/generator with the following code:

However, when I try to call the .invoke() method on any of them (specifically the 9012/retriever endpoint, since I believe that is what is needed), I get the following error: Client error ‘404 Not Found’ for url ‘http://0.0.0.0:9012/retriever/invoke’
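For what it’s worth, a LangServe app is a FastAPI app underneath, so the routes it actually mounts can be read from its OpenAPI spec at /openapi.json. A stdlib-only sketch for checking whether /retriever/invoke is really exposed (the base URL is taken from the error message; the helper names are my own):

```python
import json
import urllib.request


def list_paths(openapi_spec: dict) -> list[str]:
    """Return the route paths declared in a FastAPI/LangServe OpenAPI spec."""
    return sorted(openapi_spec.get("paths", {}))


def fetch_routes(base: str = "http://0.0.0.0:9012") -> list[str]:
    # FastAPI serves its spec at /openapi.json; every mounted route
    # (e.g. /basic_chat/invoke, /retriever/stream) shows up here.
    with urllib.request.urlopen(base + "/openapi.json") as resp:
        return list_paths(json.load(resp))
```

If “/retriever/invoke” is missing from the returned list, the 404 is coming from the route never being registered, not from the retriever itself failing.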

I’d also like to note that the Basic chat option is now working on port 8090, but when I switch to RAG, I get “Expected response header Content-Type to contain ‘text/event-stream’, got ‘application/json’”.
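That Content-Type error usually means the frontend asked for a server-sent-event stream but got a plain JSON body back instead, which is often an error payload rather than a stream. A small stdlib sketch for probing a /stream endpoint directly; the URL and the {"input": ...} payload shape are assumptions based on LangServe’s conventions:

```python
import json
import urllib.request


def diagnose_content_type(content_type: str) -> str:
    """Classify a response's Content-Type header from a streaming request."""
    if "text/event-stream" in content_type:
        return "streaming OK"
    if "application/json" in content_type:
        return "server returned JSON (often an error body) instead of an SSE stream"
    return "unexpected Content-Type: " + content_type


def probe_stream(url: str = "http://0.0.0.0:9012/retriever/stream",
                 query: str = "test") -> str:
    # Hypothetical probe: POST the LangServe-style {"input": ...} body and
    # inspect only the response headers, not the stream contents.
    req = urllib.request.Request(
        url,
        data=json.dumps({"input": query}).encode(),
        headers={"Content-Type": "application/json",
                 "Accept": "text/event-stream"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return diagnose_content_type(resp.headers.get("Content-Type", ""))
```

Reading the JSON body in that case will often show the real underlying error that the frontend is hiding behind the Content-Type complaint.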

If anyone sees anything I should adjust, please let me know.

I have encountered exactly the same problem as you!!!

@luziferangle @malley
Hey yall! Sorry about the delay; feel free to @ me directly as necessary.
A blanket recommendation is to replicate the RemoteRunnable’s usage from the frontend Python server and check whether it still works when the chain is inserted exactly as-is. For example, if the frontend server has the usage:

chain = some_prompt | remote_chain | StrOutputParser()

then the chain that is being deployed should just be the LLM.

I also feel like maybe invoke isn’t directly implemented for just the retriever component (didn’t confirm, just gut feeling). When it’s used in a chain, the default call gets exercised as part of the invocation passthrough. Perhaps you’ll have better luck if you bake the RemoteRunnable into a RunnableLambda?

Hi!!! My hero, you’ve finally arrived. The issues I’ve encountered are outlined in this article. Would you please lend me a hand? Thanks ; )

Hi @vkudlay, I’ve been trying to reach out to the DLI help desk via email. Unfortunately, I reached the max number of restarts for this certification despite only using 12/32 hours on the GPUs. Is there anyone else I can contact to have this overridden? I still have at least 15 hours of time left to try to complete the assessment, and I’d really like to figure out what I am doing wrong. Currently I do not have access.