Building RAG Agents with LLMs: stuck with the final test

Hello everyone

Currently I'm lost about what needs to be done to complete the course.
Could you please assist me?

I tried many ways to solve the final assessment: I copy-pasted from the frontend microservice, and I also copied what I did in Notebooks 7 and 8 about the retriever and generator.

So the requirements are unclear to me: what exactly do I need to do for the retriever and generator endpoints? Use the already existing implementation? Implement them exactly as I did in Notebooks 7/8?

I tried to do this, but it doesn’t work

%%writefile server_app.py
# https://python.langchain.com/docs/langserve#server
from fastapi import FastAPI
from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
from langserve import add_routes, RemoteRunnable  ## RemoteRunnable is used below

## May be useful later
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.prompt_values import ChatPromptValue
from langchain_core.runnables import RunnableLambda, RunnableBranch, RunnablePassthrough
from langchain_core.runnables.passthrough import RunnableAssign
from langchain_community.document_transformers import LongContextReorder
from functools import partial
from operator import itemgetter

from langchain_community.vectorstores import FAISS
from langchain_core.documents import Document  ## needed by assert_docs below
import gradio as gr  ## needed for gr.Warning in assert_docs below

## TODO: Make sure to pick your LLM and do your prompt engineering as necessary for the final assessment
embedder = NVIDIAEmbeddings(model="nvidia/nv-embed-v1", truncate="END")
instruct_llm = ChatNVIDIA(model="meta/llama3-8b-instruct")

app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple api server using Langchain's Runnable interfaces",
)


## Necessary Endpoints
chains_dict = {
    'basic' : RemoteRunnable("http://lab:9012/basic_chat/"),
    'retriever' : RemoteRunnable("http://lab:9012/retriever/"),
    'generator' : RemoteRunnable("http://lab:9012/generator/"),
}

basic_chain = chains_dict['basic']
retriever = chains_dict['retriever']
generator = chains_dict['generator']


## Retrieval-Augmented Generation Chain
## NOTE: docs2str and output_puller are used below but not defined in this
## file; they are helper functions from the course's frontend server code.

def assert_docs(d):
    if isinstance(d, list) and len(d) and isinstance(d[0], (Document, dict)):
        return d
    gr.Warning(f"Retriever outputs should be a list of documents, but instead got {str(d)[:100]}...")
    return []


retrieval_chain = (
    {'input' : (lambda x: x)}
    | RunnableAssign(
        {'context' : itemgetter('input') 
        | chains_dict['retriever'] 
        | assert_docs
        | LongContextReorder().transform_documents
        | docs2str
    })
)

output_chain = RunnableAssign({"output" : chains_dict['generator']}) | output_puller
rag_chain = retrieval_chain | output_chain

## PRE-ASSESSMENT: Run as-is and see the basic chain in action

add_routes(
    app,
    instruct_llm,
    path="/basic_chat",
)

## ASSESSMENT TODO: Implement these components as appropriate

add_routes(
    app,
    retriever,
    path="/retriever",
)

add_routes(
    app,
    generator,
    path="/generator",
)

## Might be encountered if this were for a standalone python file...
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=9012)

Hey @dmitrii.verbetchii. You’re pretty close actually! The solution is somewhere in between. The frontend microservice “partially” implements much of the functionality. You can see the source code of the server, and will notice how much of the pipeline is already implemented. Then, you just need to make sure that what you’re passing in slots into the pipeline correctly. Specifically, you need to deploy the routes that would be accessed via this syntax in the server:

chains_dict = {
    'basic' : RemoteRunnable("http://lab:9012/basic_chat/"),
    'retriever' : RemoteRunnable("http://lab:9012/retriever/"),
    'generator' : RemoteRunnable("http://lab:9012/generator/"),
}

To put it another way, the retriever endpoint is almost asking you to solve the following question using the syntax at the end of Notebook 3:

retrieval_chain = (
    {'input' : (lambda x: x)}
    | RunnableAssign(
        {'context' : itemgetter('input') 
        | ## <- <what goes here??>
        | assert_docs
        | LongContextReorder().transform_documents
        | docs2str
    })
)

Make sure you register:

add_routes(
    app,
    retriever,  # This should be your retriever logic
    path="/retriever",
)

add_routes(
    app,
    generator,  # This should be your generator logic
    path="/generator",
)