DLI Course 'Building RAG Agents for LLMs' - Assessment Support

Hey @ricordo.yan! Sorry about the delay, and congrats on finishing!
I checked our records and it does look like you got it! Does it not show up when you look at the web interface (My Learning | NVIDIA)?

Interesting. I do believe you, actually. I raised the issue through the platform, but was also able to find your certificate through the interface. (EDIT: Removed link and DM’d it in case you didn’t want it posted. Feel free to use it as you’d like :D)

Very sorry about that, and let me know if you can’t access it for some reason!

Hi, can you assist me with how to finish this assessment? @leonardo.ti.bruno

Have you managed to solve your problem? I have the same issue but cannot solve it.

No. Please feel free to let me know if you manage to as well.

Hi @vkudlay,

I managed to get it up and running; Gradio ran once completely but said I did not achieve a high enough score to pass. Now it only runs a single ‘generating synthetic answer’ pair with no actual assessment being done, and I can’t keep using Gradio at that point.

Is there a way to fix this issue?

For reference, I have not edited the frontend block file at all; all I did was run code with

add_routes(
    app,
    llm,
    path="/basic_chat",
)

and then added two routes, for the generator and retriever respectively, from the chains_dict, and had that hosted and running on port 9012 (a sketch of this setup appears below).

I don’t really get why it worked once and then didn’t seem to work again?
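
For anyone reading along, a minimal sketch of that kind of setup (the chains_dict keys and chain objects here are assumptions standing in for whatever was built earlier in the notebook, not the course's exact solution):

import uvicorn
from fastapi import FastAPI
from langserve import add_routes
from langchain_nvidia_ai_endpoints import ChatNVIDIA

llm = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")
app = FastAPI(title="LangChain Server", version="1.0")

# One route per endpoint the assessment frontend expects on port 9012.
# chains_dict is assumed to hold the retriever and generator chains
# built earlier in the notebook.
add_routes(app, llm, path="/basic_chat")
add_routes(app, chains_dict["retriever"], path="/retriever")
add_routes(app, chains_dict["generator"], path="/generator")

uvicorn.run(app, host="0.0.0.0", port=9012)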

Hi, can you please help me with the solution?

Hey @bkwakkel. Interesting… I’m curious if things improve when you use :8090 instead of /8090? If the issue persists, feel free to DM me and I can try a live debug.
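
In other words, the port belongs after a colon in the host part of the URL, not as a path segment. A hypothetical example with langserve's client:

from langserve import RemoteRunnable

llm = RemoteRunnable("http://lab:8090/basic_chat/")    # ':8090' targets port 8090
# llm = RemoteRunnable("http://lab/8090/basic_chat/")  # '/8090' is parsed as a path,
#                                                      # so the request never reaches port 8090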

Hi,
Thank you for introducing this course to the trainees. I have an issue passing the assessment.

I have gone through notebooks 7 and 8, added new papers from arXiv, and obtained a score of 0.666 in notebook 8. I am now in the assessment part and am unable to get the UI running properly.

Here are the steps I have been taking:
I did not modify anything in the course, except that I am running the following code in my 35_langserve-Copy1.ipynb notebook:

import nest_asyncio
import asyncio
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain_community.chat_models import ChatAnthropic, ChatOpenAI
from langserve import add_routes
import uvicorn
import threading
from operator import itemgetter

# Define composable operation classes
class RemoteRunnable:
    def __init__(self, url):
        self.url = url

    def run(self, input_data):
        # Validate and prepare input data
        input_data = validate_and_prepare_input(input_data)
        # Add logic to call the remote service
        print(f"Calling {self.url} with input: {input_data}")  # Add logging here
        # Simulate a response for debugging
        response = {"output": f"Response from {self.url} for input {input_data}"}
        return response

    def __call__(self, *args, **kwargs):
        return self.run(*args, **kwargs)

class RunnableAssign:
    def __init__(self, mapping):
        self.mapping = mapping

    def run(self, input_data):
        # Merge assigned keys into a copy of the input, like LangChain's RunnableAssign
        result = dict(input_data)
        for key, operation in self.mapping.items():
            result[key] = operation.run(input_data) if hasattr(operation, 'run') else operation(input_data)
        return result

    def __or__(self, other):
        return Chain(self, other)

class Chain:
    def __init__(self, *operations):
        self.operations = operations

    def run(self, input_data):
        data = input_data
        for operation in self.operations:
            print(f"Running operation {operation} with data: {data}")  # Add logging
            # Support objects exposing .run() as well as plain callables
            data = operation.run(data) if hasattr(operation, 'run') else operation(data)
        return data

# Placeholder definitions
class LongContextReorder:
    def transform_documents(self, documents):
        # Transformation logic here
        print(f"Transforming documents: {documents}")  # Add logging here
        return documents

def docs2str(documents):
    result = " ".join(documents)
    print(f"Converted documents to string: {result}")  # Add logging here
    return result

def output_puller(output):
    print(f"Pulling output: {output}")  # Add logging here
    return output["output"]  # Ensure the correct key is accessed

def validate_and_prepare_input(input_data):
    # Log the input data
    print(f"Validating input data: {input_data}")
    # Ensure input_data is a dictionary
    if not isinstance(input_data, dict):
        raise ValueError("Input data must be a dictionary.")
    # Add any additional validation logic here
    return input_data

# Patch asyncio to allow nested event loops
nest_asyncio.apply()

# Explicitly set the default event loop policy to avoid using uvloop
asyncio.set_event_loop_policy(asyncio.DefaultEventLoopPolicy())

# Initialize the language model
llm = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")

# Initialize the FastAPI app
app = FastAPI(
    title="LangChain Server",
    version="1.0",
    description="A simple API server using LangChain’s Runnable interfaces",
)

# Add routes for basic chat, retriever, and generator
add_routes(app, llm, path="/basic_chat")
add_routes(app, llm, path="/retriever")
add_routes(app, llm, path="/generator")

# Define the chains dictionary with necessary endpoints
chains_dict = {
    'basic': RemoteRunnable("http://lab:9012/basic_chat/"),
    'retriever': RemoteRunnable("http://lab:9012/retriever/"),
    'generator': RemoteRunnable("http://lab:9012/generator/"),
}

# Define the basic chain
basic_chain = chains_dict['basic']

# Define a lambda to chain operations
def chain_operations(*operations):
    def chained(input_data):
        data = input_data
        for operation in operations:
            print(f"Running operation {operation} with data: {data}")  # Add logging
            data = operation.run(data) if hasattr(operation, 'run') else operation(data)  # plain callables lack .run
        return data
    return chained

# Define the retrieval chain
retrieval_chain = RunnableAssign({
    'context': chain_operations(
        itemgetter('input'),
        chains_dict['retriever'],
        LongContextReorder().transform_documents,
        docs2str
    )
})

# Define the output chain
output_chain = RunnableAssign({"output": chains_dict['generator']}) | output_puller

# Define the retrieval-augmented generation chain
rag_chain = Chain(retrieval_chain, output_chain)

# Function to run the app in a separate thread
def run_server():
    config = uvicorn.Config(app, host="0.0.0.0", port=9012, log_level="info", loop="asyncio")
    server = uvicorn.Server(config)
    # A fresh thread has no event loop, so create and register one
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    loop.run_until_complete(server.serve())

# Start the server in a separate thread
thread = threading.Thread(target=run_server)
thread.start()

When I run basic, I obtain a low score. When I run RAG, I get the attached message. Can you please help?

Any recommendations on the issue? Thanks.
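
One thing worth flagging in the code above: RemoteRunnable, RunnableAssign, and LongContextReorder are redefined locally, shadowing the real LangServe/LangChain classes, and the local stand-ins only print simulated responses instead of calling the endpoints, so no actual assessment traffic is generated. A rough sketch of the same client-side wiring using the library imports instead (same URLs; docs2str and output_puller meaning the course-notebook helpers, assumed in scope):

from operator import itemgetter
from langserve import RemoteRunnable
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from langchain_community.document_transformers import LongContextReorder

# Real client-side runnables: each invocation is an HTTP call to langserve.
chains_dict = {
    'basic': RemoteRunnable("http://lab:9012/basic_chat/"),
    'retriever': RemoteRunnable("http://lab:9012/retriever/"),
    'generator': RemoteRunnable("http://lab:9012/generator/"),
}

retrieval_chain = RunnablePassthrough.assign(context=(
    itemgetter('input')
    | chains_dict['retriever']
    | RunnableLambda(LongContextReorder().transform_documents)
    | RunnableLambda(docs2str)  # docs2str: the notebook helper, assumed in scope
))
output_chain = RunnablePassthrough.assign(output=chains_dict['generator']) | output_puller
rag_chain = retrieval_chain | output_chain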

Can anyone direct me to an example of the retriever’s streamed data being correctly passed to the generator?

The code below results in a client-side error; rag_chain is never called.

# https://python.langchain.com/docs/langserve#server
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langserve import add_routes
from langchain_community.document_transformers import LongContextReorder
from langchain_core.runnables.passthrough import RunnableAssign
from langchain_core.runnables import RunnablePassthrough
from operator import itemgetter
from langserve import RemoteRunnable

from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_core.runnables import RunnableLambda
from langchain_core.output_parsers import StrOutputParser

from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import MessagesPlaceholder
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

from langchain.chains import create_retrieval_chain


#####################################################################

# NVIDIAEmbeddings.get_available_models(base_url="http://llm_client:9000/v1")
embedder = NVIDIAEmbeddings(model="nvidia/embed-qa-4", truncate="END")


llm = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")

chat_prompt = ChatPromptTemplate.from_template(
    "You are a document chatbot. Help the user as they ask questions about documents."
    " User messaged just asked you a question: {input}\n\n"
    " The following information may be useful for your response: "
    " Document Retrieval:\n{context}\n\n"
    " (Answer only from retrieval. Only cite sources that are used. Make your response conversational)"
    "\n\nUser Question: {input}"
)

def docs2str(docs, title="Document"):
    """Useful utility for making chunks into context string. Optional, but useful"""
    out_str = ""
    for doc in docs:
        doc_name = getattr(doc, 'metadata', {}).get('Title', title)
        if doc_name: out_str += f"[Quote from {doc_name}] "
        out_str += getattr(doc, 'page_content', str(doc)) + "\n"
    return out_str

def output_puller(inputs):
    """If you want to support streaming, implement final step as a generator extractor."""
    for token in inputs:
        if token.get('output'):
            yield token.get('output')


long_reorder = RunnableLambda(LongContextReorder().transform_documents)  
docstore = FAISS.load_local("docstore_index", embedder, allow_dangerous_deserialization=True)

retrieval_chain = (
    {"context": docstore.as_retriever() | long_reorder, "input": RunnablePassthrough()}
    | chat_prompt
    | llm
    | StrOutputParser()
)

contextualize_q_system_prompt = (
    "Given a chat history and the latest user question "
    "which might reference context in the chat history, "
    "formulate a standalone question which can be understood "
    "without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)

system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)


contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")
    ]
)

history_aware_retriever = create_history_aware_retriever(llm, docstore.as_retriever(), contextualize_q_prompt)

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)

question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)

### Statefully manage chat history ###
store = {}


def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="chat_history",
    output_messages_key="answer",
)


app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple API server using LangChain's Runnable interfaces",
)

add_routes(
    app,
    llm,
    path="/basic_chat",
)

add_routes(
    app,
    retrieval_chain,
    path="/retriever",
)

add_routes(
    app,
    rag_chain,
    path="/generator",
)

## Only needed if this were a standalone Python file...
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=9012)
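
A quick way to localize a client-side 422 is to invoke each route directly before involving the Gradio frontend. A hypothetical smoke test, with payload shapes guessed from the chains above:

from langserve import RemoteRunnable

# 422 means the request body failed the route's input-schema validation,
# so probe each endpoint with the shape its chain expects.
basic = RemoteRunnable("http://lab:9012/basic_chat/")
generator = RemoteRunnable("http://lab:9012/generator/")

print(basic.invoke("Hello!"))  # a chat model accepts a plain string
print(generator.invoke({       # create_retrieval_chain expects 'input', and the
    "input": "What is RAG?",   # MessagesPlaceholder('chat_history') requires
    "chat_history": [],        # 'chat_history' to be present
}))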

I am getting the same error; I have not been able to resolve it yet.

I was able to run the basic version and get a low score. However, when I run RAG through the gradio UI, I get this error.


I have the same 422 error. Were you able to solve it?

Hello @vkudlay, I had the same issue as leonardo. Do you think you could support me by adding more time? My email is omar.diaz1@dell.com

Hey @omar.diaz1. I’ve added some time. Let me know if you’re still experiencing issues after a while, and I can try and provide a hint.