Don’t understand how to finish - DLI Course ‘Building RAG Agents for LLMs’

Hi, I cannot finish the assessment.
I’ve finished both notebooks 7 and 8.
It is not clear what needs to be added or solved; there are no informative explanations, only the generic message that I didn’t finish the assessment.
@TomNVIDIA @tdahlin

Hello,

This will require one of the DLI engineers to investigate.
@tdahlin please assist when you can.

Thanks,
Tom

@user157430 I’ve forwarded this thread along to our course development team to provide you with assistance.

Hey @user157430! Thanks for reaching out.

The final assessment instructions are found at the end of notebook 8, in the Objective section:

Hi @vkudlay, thanks for your response.
I’ve already done all of the “make sure” steps.
It’s not clear what needs to be done (if anything) in the Objective and Evaluation sections, and there are no specific error messages, so I can’t debug it.

I see. Just to verify, did you get the following screen?

  • If yes, then I’d like to see where things might have gone wrong. Did you maybe accidentally override the running frontend_server instance with your own deployment? (That is now harder to do than it used to be, but still possible.)
  • If no, then let me know where you are along the path and I can help out.
    (screenshot)

Hi,
I don’t get this screen.
As I shared before, I completed both notebooks 7 and 8,
and when I press the “Link To Gradio Frontend” in notebook 8, I get a connection error when interacting with the chat.
There is also now a new error in the notebook at the section “Step 3: Generating Synthetic Question-Answer Pairs”;
the cell throws an exception (attached):
exception.txt (7.1 KB)

Please note the exception is a new issue as of now.
In any case, it is not clear what needs to be changed apart from notebooks 7 and 8, because there are no logs or informative debugging info.

@TomNVIDIA @vkudlay @tdahlin can you assist?

Hey @user157430

Completing notebooks 7 and 8 is important for understanding how to solve the assessment, but they themselves do not make the frontend service work. If you just run through notebooks 7 and 8, the “no connection” issue in the frontend is the expected result.

Please check over notebooks 3 and 3.5.

  • The end of Notebook 3 shows how to interact with the frontend and deploy endpoints for it to use.
  • Notebook 3.5 shows an example of implementing features. If you run through the process outlined in Notebook 3.5, you should be able to get a response in the frontend via the “Basic” route (i.e. your LLM should at least be working in the frontend).

To implement the “RAG” route, deploy the additional endpoints as described at the end of Notebook 3. The definitions of these are constructed over the course of Notebooks 7 and 8.
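
For orientation, here is a minimal sketch of that deployment pattern, assuming the course defaults (the model name is an assumption; add_routes will serve any runnable, not just a raw LLM):

## Minimal sketch of a server_app.py skeleton; the model name is an assumption.
from fastapi import FastAPI
from langserve import add_routes
from langchain_nvidia_ai_endpoints import ChatNVIDIA

app = FastAPI()
llm = ChatNVIDIA(model="mistralai/mixtral-8x22b-instruct-v0.1")  ## use whatever model your notebook uses

## add_routes serves any LangChain runnable: an LLM, a retriever, or a composed chain.
add_routes(app, llm, path="/basic_chat")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=9012)  ## 9012 is the port the frontend expects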

Thanks for your response.
Can you advise regarding the exception issue?

Oh! I see that (actually… I’m glad the issue is now showing up since this means they changed the server specs… but I’ll need to push an update to fix the course defaults now). The following modification should fix it:

import random
from langchain_core.prompts import ChatPromptTemplate

## `llm`, `docs`, and `format_chunk` are defined earlier in the notebook.
num_questions = 3
synth_questions = []
synth_answers = []

## simple_prompt = ChatPromptTemplate.from_messages([('system', '{system}'), ('user', '{input}')])  ## OLD
simple_prompt = ChatPromptTemplate.from_messages([('user', 'INSTRUCTION:\n{system}\n\n\nINPUT:\n{input}')])

for i in range(num_questions):
    doc1, doc2 = random.sample(docs, 2)
    sys_msg = (
        "Use the documents provided by the user to generate an interesting question-answer pair."
        " Try to use both documents if possible, and rely more on the document bodies than the summary."
        " Use the format:\nQuestion: (good question, 1-3 sentences, detailed)\n\nAnswer: (answer derived from the documents)"
    )
    usr_msg = (
        f"Document1: {format_chunk(doc1)}\n\n"
        f"Document2: {format_chunk(doc2)}"
    )

    qa_pair = (simple_prompt | llm).invoke({'system': sys_msg, 'input': usr_msg})
    ## ...

If you’re curious: they’ve updated the mixtral-8x22b prompt template to no longer accept system messages, which matches the original mixtral prompt template.

I’m not sure I understand.
I get: “Error 404: Session not found.”
I put:

add_routes(
    app,
    llm,
    path="/basic_chat",
)

add_routes(
    app,
    llm,
    path="/retriever",
)

add_routes(
    app,
    llm,
    path="/generator",
)

What am I missing?

That’s a reasonable start! The first cell you run merely writes a file (the %%writefile cell). After that, you should run the FastAPI kickstart cell (!python server_app.py). When you do that, what kinds of logs do you get? And just to confirm: the issue is happening in the “Basic” chat, not the “RAG” chat?

Can you assist @tdahlin @vkudlay @TomNVIDIA?
I get:
Overwriting server_app.py

## Works, but will block the notebook:
## !python server_app.py

## Will technically work, but not recommended in a notebook.
## You may be surprised at the interesting side effects…
import os

os.system("python server_app.py &")

INFO: Started server process [437]
INFO: Waiting for application startup.

[ASCII art “LANGSERVE” banner]

LANGSERVE: Playground for chain "/retriever/" is live at:
LANGSERVE: │
LANGSERVE: └──> /retriever/playground/
LANGSERVE:
LANGSERVE: Playground for chain "/generator/" is live at:
LANGSERVE: │
LANGSERVE: └──> /generator/playground/
LANGSERVE:
LANGSERVE: Playground for chain "/basic_chat/" is live at:
LANGSERVE: │
LANGSERVE: └──> /basic_chat/playground/
LANGSERVE:
LANGSERVE: See all available routes at /docs/

LANGSERVE: ⚠️ Using pydantic 2.7.1. OpenAPI docs for invoke, batch, stream, stream_log endpoints will not be generated. API endpoints and playground should work as expected. If you need to see the docs, you can downgrade to pydantic 1. For example, pip install pydantic==1.10.13. See "Working with Pydantic v1 while having v2 installed · Issue #10360 · tiangolo/fastapi · GitHub" for details.

INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:9012 (Press CTRL+C to quit)
INFO: 127.0.0.1:50186 - "POST /basic_chat/stream HTTP/1.1" 200 OK
INFO: 127.0.0.1:50190 - "POST /retriever/stream HTTP/1.1" 200 OK
INFO: 127.0.0.1:47804 - "POST /generator/stream HTTP/1.1" 200 OK
INFO: 172.18.0.7:59872 - "POST /basic_chat/stream HTTP/1.1" 200 OK
INFO: 172.18.0.7:42562 - "POST /basic_chat/stream HTTP/1.1" 200 OK
INFO: 172.18.0.7:57824 - "POST /basic_chat/stream HTTP/1.1" 200 OK

Can you assist @tdahlin @vkudlay @TomNVIDIA?

Hey @user157430. This is exactly correct! The basic chain should work, but the other chains need to be delivered per the specification in the frontend. Instead of just sending over an LLM to try to function as all three components, the retriever route should deliver a retriever chain and the generator route should deliver a generator chain. Not much more I can do to help there.

## This is using the LLM client as your basic_chat chain when accessed in the frontend.
add_routes(app, llm, path="/basic_chat")
## But you're still sending in your LLM as your retriever and generator:
add_routes(app, llm, path="/retriever")
add_routes(app, llm, path="/generator")

This is part of the assignment, and I’m not sure how much more I can say without implementing it for you.

  • The frontend code shows how the routes are being incorporated server-side.
  • Notebooks 7 and 8 show how the RAG pipeline can be incorporated.
  • The assignment is to replicate the pipeline just by sending over a few things to the frontend.

Hi @vkudlay, please note I already put this code in my earlier reply of Jun 5.
That’s why I guess I’m missing something.

@user157430 Yup. The main issues are the second and third lines:

add_routes(app, llm, path="/basic_chat")  ## ← Line 1 (this one) is ok
add_routes(app, llm, path="/retriever")   ## ← Line 2: the problem
add_routes(app, llm, path="/generator")   ## ← Line 3: the problem

Those chains should be replaced with ones that satisfy the connecting code in server_blocks.py:

## Necessary Endpoints
chains_dict = {
    'basic' : RemoteRunnable("http://lab:9012/basic_chat/"),  ## Line 1 solves this already
    'retriever' : RemoteRunnable("http://lab:9012/retriever/"),  ## Line 2
    'generator' : RemoteRunnable("http://lab:9012/generator/"),  ## Line 3
}

basic_chain = chains_dict['basic']  ## <- Your code handles this fine

## Retrieval-Augmented Generation Chain

retrieval_chain = (
    {'input' : (lambda x: x)}
    | RunnableAssign({
        'context' : itemgetter('input') 
        | chains_dict['retriever']      ## Chain from line 2 needs to slot in here
        | LongContextReorder().transform_documents
        | docs2str
    })
)

output_chain = RunnableAssign({
    "output" : chains_dict['generator']  ## Chain from line 3 needs to slot in here
}) | output_puller

rag_chain = retrieval_chain | output_chain
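
For orientation only, a rough sketch (not the official solution) of the kinds of runnables that could satisfy those two routes. Here `docstore`, `llm`, and `app` are assumptions standing in for objects built earlier in server_app.py per the Notebook 7 and 8 material; the exact definitions are the assignment:

## Rough sketch only; `docstore`, `llm`, and `app` are assumed defined earlier.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

## The /retriever route should serve something retriever-shaped:
## a runnable that maps a question string to a list of documents.
retriever = docstore.as_retriever()

## The /generator route should serve a chain that consumes the
## {'input': ..., 'context': ...} dict assembled by server_blocks.py.
gen_prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the context."
    "\n\nQuestion: {input}\n\nContext: {context}"
)
generator = gen_prompt | llm | StrOutputParser()

add_routes(app, retriever, path="/retriever")
add_routes(app, generator, path="/generator")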

Hi @vkudlay, thanks for your response.
Again, in my code I already have:

add_routes(app, llm, path="/basic_chat")
add_routes(app, llm, path="/retriever")
add_routes(app, llm, path="/generator")

and to my understanding, the other code containing chains_dict
is already generated, and I don’t need to modify it manually.

So after completing both notebooks 7 and 8, I still get the mentioned exception.

Hey @user157430. Right, I saw how you wrote your code before, and what I’m saying is that the second argument (llm) should be replaced with something else:

That “something else” is laid out in Notebook 8.
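
In other words, the shape of the fix is roughly the following, where retriever_chain and generator_chain are placeholder names for the runnables you assemble per Notebook 8:

add_routes(app, llm, path="/basic_chat")             ## the raw LLM is fine here
add_routes(app, retriever_chain, path="/retriever")  ## not llm
add_routes(app, generator_chain, path="/generator")  ## not llm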