DLI Course "Building RAG Agents for LLMs" - Help With Assessment

Hello,

I’m at the end of the “Building RAG Agents for LLMs” DLI course and need some help with the final assessment to get credit for the course.

My understanding is that while we’re in the course environment, we…

  1. Update the vector store with an Arxiv paper less than 30 days old
  2. Launch the Gradio UI with new RAG components
  3. Click “Evaluate” within the Gradio UI and hopefully pass this
  4. Click “Assess Task” from where the course is launched

Is that correct so far?

When I launch the UI at http://<>.aws.labs.courses.nvidia.com:8090 I get this error:

  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/usr/local/lib/python3.11/site-packages/pydantic/v1/main.py", line 716, in validate
    |     value_as_dict = dict(value)
    |                     ^^^^^^^^^^^
    | ValueError: dictionary update sequence element #0 has length 1; 2 is required
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File "/usr/local/lib/python3.11/site-packages/langserve/api_handler.py", line 870, in stream
    |     config, input_ = await self._get_config_and_input(
    |                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/usr/local/lib/python3.11/site-packages/langserve/api_handler.py", line 639, in _get_config_and_input
    |     input_ = schema.validate(body.input)
    |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |   File "/usr/local/lib/python3.11/site-packages/pydantic/v1/main.py", line 718, in validate
    |     raise DictError() from e
    | pydantic.v1.errors.DictError: value is not a valid dict
    | 
    | The above exception was the direct cause of the following exception:
    | 
    | Traceback (most recent call last):
    |   File "/usr/local/lib/python3.11/site-packages/sse_starlette/sse.py", line 258, in wrap
    |     await func()
    |   File "/usr/local/lib/python3.11/site-packages/sse_starlette/sse.py", line 245, in stream_response
    |     async for data in self.body_iterator:
    |   File "/usr/local/lib/python3.11/site-packages/langserve/api_handler.py", line 902, in _stream
    |     raise AssertionError(
    | AssertionError: Internal server error
+------------------------------------

Would anyone be able to help me with this? Here’s my approach:

%%writefile server_app.py
# https://python.langchain.com/docs/langserve#server
from fastapi import FastAPI
from langserve import add_routes

from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
from langchain.prompts import ChatPromptTemplate

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnableBranch
from langchain_core.runnables.passthrough import RunnableAssign
from langchain.document_transformers import LongContextReorder
from langchain_community.vectorstores import FAISS

from operator import itemgetter

app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple api server using Langchain's Runnable interfaces",
)

llm = ChatNVIDIA(model='mixtral_8x7b')

chat_prompt = ChatPromptTemplate.from_messages([("system",
    "You are a document chatbot. Help the user as they ask questions about documents."
    " User messaged just asked you a question: {input}\n\n"
    " The following information may be useful for your response: "
    " Document Retrieval:\n{context}\n\n"
    " (Answer only from retrieval. Only cite sources that are used. Make your response conversational)"
), ('user', '{input}')])

embedder = NVIDIAEmbeddings(model='nvolveqa_40k')

docstore = FAISS.load_local("docstore_index", embedder)
docs = list(docstore.docstore._dict.values())

def docs2str(docs, title="Document"):
    """Useful utility for making chunks into context string. Optional, but useful"""
    out_str = ""
    for doc in docs:
        doc_name = getattr(doc, 'metadata', {}).get('Title', title)
        if doc_name: out_str += f"[Quote from {doc_name}] "
        out_str += getattr(doc, 'page_content', str(doc)) + "\n"
    return out_str

def output_puller(inputs):
    """"Output generator. Useful if your chain returns a dictionary with key 'output'"""
    for token in inputs:
        if token.get('output'):
            yield token.get('output')

long_reorder = RunnableLambda(LongContextReorder().transform_documents)  ## GIVEN
context_getter = itemgetter('input') | docstore.as_retriever() | long_reorder | docs2str
retrieval_chain = {'input' : (lambda x: x)} | RunnableAssign({'context' : context_getter})

generator_chain = RunnableAssign({"output" : chat_prompt | llm })  ## TODO
generator_chain = generator_chain | output_puller  ## GIVEN

rag_chain = retrieval_chain | generator_chain

add_routes(
    app,
    llm,
    path="/basic_chat",
)

add_routes(
    app,
    retrieval_chain,
    path="/retriever",
)

add_routes(
    app,
    generator_chain,
    path="/generator",
)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=9012)
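
A note on the first ValueError in the traceback above: `dict(value)` raises exactly this message when `value` is a plain string, because each character is a length-1 item rather than a key/value pair. The sketch below is plain Python only. It assumes, and the traceback does not confirm this, that the request's input was a bare string sent to a route whose input schema is a dict. `try_dict` is a hypothetical helper used just for the illustration.

```python
def try_dict(value):
    """Return dict(value), or the ValueError message if the conversion fails."""
    try:
        return dict(value)
    except ValueError as exc:
        return str(exc)

# A bare string is iterated character by character; "H" has length 1, not 2.
string_result = try_dict("Hello World!")

# A mapping keyed by 'input' converts cleanly.
dict_result = try_dict({"input": "Hello World!"})

print(string_result)
print(dict_result)
```

If that is what is happening here, the mismatch is between what the caller sends and what the served chain declares as its input, rather than anything inside the chain itself.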

@nreamaro Heyo! Please feel free to @ me directly. I don’t usually check the forums ( -_-').

W.r.t. the chain misbehaving in the front end, try looking at the front-end Python server to see how RemoteRunnable is being used. You’ll see that it already implements most of the important orchestration features itself, so some of the work (i.e. variable naming, output parsing, etc.) ends up duplicated.

Hello, I seem to be having similar problems to other people.

I have the following code in my server_app.py

%%writefile server_app.py
# https://python.langchain.com/docs/langserve#server
from fastapi import FastAPI
from langserve import add_routes

from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
from langchain.prompts import ChatPromptTemplate

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnableBranch
from langchain_core.runnables.passthrough import RunnableAssign
from langchain.document_transformers import LongContextReorder
from langchain_community.vectorstores import FAISS

from operator import itemgetter
llm = ChatNVIDIA(model="ai-mixtral-8x7b-instruct")

app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple api server using Langchain's Runnable interfaces",
)

chat_prompt = ChatPromptTemplate.from_messages([("system",
    "You are a document chatbot. Help the user as they ask questions about documents."
    " User messaged just asked you a question: {input}\n\n"
    " The following information may be useful for your response: "
    " Document Retrieval:\n{context}\n\n"
    " (Answer only from retrieval. Only cite sources that are used. Make your response conversational)"
), ('user', '{input}')])

embedder = NVIDIAEmbeddings(model='ai-embed-qa-4')

docstore = FAISS.load_local("docstore_index", embedder, allow_dangerous_deserialization=True)
docs = list(docstore.docstore._dict.values())

def docs2str(docs, title="Document"):
    """Useful utility for making chunks into context string. Optional, but useful"""
    out_str = ""
    for doc in docs:
        doc_name = getattr(doc, 'metadata', {}).get('Title', title)
        if doc_name: out_str += f"[Quote from {doc_name}] "
        out_str += getattr(doc, 'page_content', str(doc)) + "\n"
    return out_str

def output_puller(inputs):
    """"Output generator. Useful if your chain returns a dictionary with key 'output'"""
    for token in inputs:
        if token.get('output'):
            yield token.get('output')

long_reorder = RunnableLambda(LongContextReorder().transform_documents)  ## GIVEN
context_getter = itemgetter('input') | docstore.as_retriever() | long_reorder | docs2str
retrieval_chain = {'input' : (lambda x: x)} | RunnableAssign({'context' : context_getter})

generator_chain = RunnableAssign({"output" : chat_prompt | llm })  ## TODO
generator_chain = generator_chain | output_puller  ## GIVEN

rag_chain = retrieval_chain | generator_chain

add_routes(
    app,
    llm,
    path="/basic_chat",
)

add_routes(
    app,
    retrieval_chain,
    path="/retriever",
)

## Might be encountered if this were for a standalone python file...
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=9012)

and the following code in my sample front-end interface.

from langserve import RemoteRunnable
from langchain_core.output_parsers import StrOutputParser

llm = RemoteRunnable("http://0.0.0.0:9012/basic_chat/") | StrOutputParser()

for token in llm.stream("Hello World! How is it going?"):
    print(token, end='')

retriever = RemoteRunnable("http://0.0.0.0:9012/retriever/") | StrOutputParser()

retriever.invoke({"input": "Tell me something interesting"})

I am getting the following error.

pydantic.v1.errors.DictError: value is not a valid dict
INFO: 127.0.0.1:41344 - "POST /basic_chat/stream HTTP/1.1" 200 OK
INFO: 127.0.0.1:41346 - "POST /retriever/invoke HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/uvicorn/protocols/http/httptools_impl.py", line 426, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/uvicorn/middleware/proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/applications.py", line 116, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 55, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 44, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 746, in __call__
    await route.handle(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
    await self.app(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 75, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 55, in wrapped_app
    raise exc
  File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 44, in wrapped_app
    await app(scope, receive, sender)
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 70, in app
    response = await func(request)
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 299, in app
    raise e
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 294, in app
    raw_response = await run_endpoint_function(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langserve/server.py", line 481, in invoke
    return await api_handler.invoke(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langserve/api_handler.py", line 729, in invoke
    output = await self.runnable.ainvoke(input, config=config)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 2536, in ainvoke
    input = await step.ainvoke(
            ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/runnables/passthrough.py", line 483, in ainvoke
    return await self._acall_with_config(self._ainvoke, input, config, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 1675, in _acall_with_config
    output: Output = await asyncio.create_task(coro, context=context)  # type: ignore
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/runnables/passthrough.py", line 470, in _ainvoke
    **await self.mapper.ainvoke(
      ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 3174, in ainvoke
    results = await asyncio.gather(
              ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/runnables/base.py", line 2536, in ainvoke
    input = await step.ainvoke(
            ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 227, in ainvoke
    return await self.aget_relevant_documents(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 384, in aget_relevant_documents
    raise e
  File "/usr/local/lib/python3.11/site-packages/langchain_core/retrievers.py", line 377, in aget_relevant_documents
    result = await self._aget_relevant_documents(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/vectorstores.py", line 716, in _aget_relevant_documents
    docs = await self.vectorstore.asimilarity_search(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/faiss.py", line 555, in asimilarity_search
    docs_and_scores = await self.asimilarity_search_with_score(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/faiss.py", line 436, in asimilarity_search_with_score
    embedding = await self._aembed_query(query)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/faiss.py", line 160, in _aembed_query
    return await self.embedding_function.aembed_query(text)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/embeddings/embeddings.py", line 25, in aembed_query
    return await run_in_executor(None, self.embed_query, text)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/langchain_core/runnables/config.py", line 514, in run_in_executor
    return await asyncio.get_running_loop().run_in_executor(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dli/ai-endpoints/langchain_nvidia_ai_endpoints/embeddings.py", line 76, in embed_query
    return self._embed([text], model_type=self.model_type or "query")[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dli/ai-endpoints/langchain_nvidia_ai_endpoints/embeddings.py", line 60, in _embed
    response = self.client.get_req(
               ^^^^^^^^^^^^^^^^^^^^
  File "/dli/ai-endpoints/langchain_nvidia_ai_endpoints/_common.py", line 394, in get_req
    response, session = self._post(invoke_url, payload)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/dli/ai-endpoints/langchain_nvidia_ai_endpoints/_common.py", line 215, in _post
    self._try_raise(response)
  File "/dli/ai-endpoints/langchain_nvidia_ai_endpoints/_common.py", line 305, in _try_raise
    raise Exception(f"{header}\n{body}") from None
Exception: [400] Bad Request
Inference error
RequestID: fb314007-aad2-483d-9b0d-b92a56cbc51c

I have tried a lot of different combinations but I am still struggling to get the retriever working.

Any help would be great.
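
One possible reading of the 400 above, offered as a guess rather than a diagnosis: the client already sends a dict with an 'input' key, and the first step of retrieval_chain wraps whatever it receives under 'input' again. In that case itemgetter('input') would hand a dict, not a string, to the retriever and on to the embedding endpoint, which matches where the traceback ends (embed_query). The sketch below reproduces only that data flow in plain Python. `wrap_input` and `extract_query` are hypothetical stand-ins for the `{'input' : (lambda x: x)}` and `itemgetter('input')` steps, and no LangChain or NVIDIA calls are made.

```python
from operator import itemgetter

def wrap_input(x):
    # Stand-in for the `{'input' : (lambda x: x)}` step: nests the payload under 'input'.
    return {"input": x}

# Stand-in for the first step of context_getter.
extract_query = itemgetter("input")

# Payload shaped like the client call retriever.invoke({"input": ...}).
dict_payload = {"input": "Tell me something interesting"}
query_from_dict = extract_query(wrap_input(dict_payload))

# Payload shaped like a bare string.
query_from_str = extract_query(wrap_input("Tell me something interesting"))

print(type(query_from_dict).__name__, query_from_dict)
print(type(query_from_str).__name__, query_from_str)
```

Under this reading, the retriever would only see a plain string when the caller passes a bare string, or when the wrapping step is dropped on one side.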

OK, after a long few hours I have now got all this working. So, for my 35_langserve.ipynb I have

%%writefile server_app.py
# https://python.langchain.com/docs/langserve#server
from fastapi import FastAPI
from langserve import add_routes

from langchain_nvidia_ai_endpoints import ChatNVIDIA, NVIDIAEmbeddings
from langchain.prompts import ChatPromptTemplate

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableLambda, RunnableBranch
from langchain_core.runnables.passthrough import RunnableAssign
from langchain.document_transformers import LongContextReorder
from langchain_community.vectorstores import FAISS

from operator import itemgetter

app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple api server using Langchain's Runnable interfaces",
)

llm = ChatNVIDIA(model='ai-mixtral-8x7b-instruct')

chat_prompt = ChatPromptTemplate.from_messages([("system",
    "You are a document chatbot. Help the user as they ask questions about documents."
    " User messaged just asked you a question: {input}\n\n"
    " The following information may be useful for your response: "
    " Document Retrieval:\n{context}\n\n"
    " (Answer only from retrieval. Only cite sources that are used. Make your response conversational)"
), ('user', '{input}')])

embedder = NVIDIAEmbeddings(model='ai-embed-qa-4')

docstore = FAISS.load_local("docstore_index", embedder, allow_dangerous_deserialization=True)
docs = list(docstore.docstore._dict.values())

def docs2str(docs, title="Document"):
    """Useful utility for making chunks into context string. Optional, but useful"""
    out_str = ""
    for doc in docs:
        doc_name = getattr(doc, 'metadata', {}).get('Title', title)
        if doc_name: out_str += f"[Quote from {doc_name}] "
        out_str += getattr(doc, 'page_content', str(doc)) + "\n"
    return out_str

def output_puller(inputs):
    """"Output generator. Useful if your chain returns a dictionary with key 'output'"""
    for token in inputs:
        if token.get('output'):
            yield token.get('output')

long_reorder = RunnableLambda(LongContextReorder().transform_documents)  ## GIVEN
context_getter = itemgetter('input') | docstore.as_retriever() | long_reorder | docs2str
retrieval_chain = {'input' : (lambda x: x)} | RunnableAssign({'context' : context_getter})

generator_chain = RunnableAssign({"output" : chat_prompt | llm })  ## TODO
generator_chain = generator_chain | output_puller  ## GIVEN

rag_chain = retrieval_chain | generator_chain

add_routes(
    app,
    llm,
    path="/basic_chat",
)

add_routes(
    app,
    retrieval_chain,
    path="/retriever",
)

add_routes(
    app,
    generator_chain,
    path="/generator",
)

add_routes(
    app,
    rag_chain,
    path="/rag",
)

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=9012)

====================================================================
I then run my LangServe server:

## Works, but will block the notebook.
!python server_app.py

====================================================================
I can then use this same code to access the different routes

from langserve import RemoteRunnable
from langchain_core.output_parsers import StrOutputParser

llm = RemoteRunnable("http://0.0.0.0:9012/basic_chat/") | StrOutputParser()
for token in llm.stream("Hello World! How is it going?"):
    print(token, end='')

from langserve import RemoteRunnable
from langchain_core.output_parsers import StrOutputParser

retriever = RemoteRunnable("http://0.0.0.0:9012/retrieval/") | StrOutputParser()
for token in llm.stream("Tell me something about attention is all you need"):
    print(token, end='')

generator = RemoteRunnable("http://0.0.0.0:9012/generator/") | StrOutputParser()
for token in llm.stream("Tell me something spatialVLM"):
    print(token, end='')

All of these endpoints work fine.
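
Worth double-checking before relying on that result: in the snippets as posted, every loop streams from llm (the /basic_chat route), so the retriever and generator routes are never actually called, and the retriever URL ends in /retrieval/ while the server registers /retriever. A small sketch of a path check in plain Python follows. The route and URL lists are copied by hand from the code above, and `unregistered_urls` is a hypothetical helper, not part of LangServe.

```python
from urllib.parse import urlparse

# Paths passed to add_routes in server_app.py (copied by hand).
server_routes = ["/basic_chat", "/retriever", "/generator", "/rag"]

# URLs given to RemoteRunnable in the client cells (copied by hand).
client_urls = [
    "http://0.0.0.0:9012/basic_chat/",
    "http://0.0.0.0:9012/retrieval/",
    "http://0.0.0.0:9012/generator/",
]

def unregistered_urls(urls, routes):
    """Return the client URLs whose path does not match any registered route."""
    known = {route.rstrip("/") for route in routes}
    return [url for url in urls if urlparse(url).path.rstrip("/") not in known]

print(unregistered_urls(client_urls, server_routes))
```

Calling retriever.stream(...) and generator.stream(...) on their own objects, with the corrected path, would show whether those two routes really respond.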

====================================================================

I then open up my 01_microservices Gradio interface

%%js
var url = 'http://'+window.location.host+':8090';
element.innerHTML = '<a href="'+url+'" target="_blank">< Link To Gradio Frontend ></a>';

====================================================================

In Gradio, when Basic is checked, I can enter “Tell me about spatialLVM” and it works. When I select RAG, I get the following error:

Gradio Stream failed: Internal Server Error

When I click the Evaluate button with any of the radio buttons checked, I get the following error message:

Please upload a fresh paper (<30 days) inside your saved docstore_index directory that so we can ask our chain some questions

In the docstore_index directory I have index.faiss and index.pkl, which contain a paper from within the last 30 days. I was not sure whether this paper had to be vectorised or whether the PDF just had to be placed in the directory, so I also added the PDF to the directory.

Is there something that I am doing wrong?

Kind Regards

Paul
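
On the "fresh paper" message: one thing that can be checked offline is whether any chunk in the index carries a publication date inside the 30-day window. The sketch below assumes, without confirmation from the course material, that each chunk's metadata has a 'Published' entry starting with a YYYY-MM-DD date. The sample metadata and the `has_fresh_paper` helper are hypothetical. In the notebook, the dicts could be taken from the `docs` list built in server_app.py (one `doc.metadata` per chunk).

```python
from datetime import date, timedelta

def has_fresh_paper(metadatas, today, max_age_days=30):
    """True if any metadata dict has a 'Published' date newer than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    for meta in metadatas:
        published = meta.get("Published")
        if not published:
            continue  # this chunk carries no date information
        # Keep only the leading YYYY-MM-DD part in case a time is attached.
        if date.fromisoformat(str(published)[:10]) > cutoff:
            return True
    return False

# Made-up metadata, for illustration only.
sample = [
    {"Title": "Old paper", "Published": "2017-06-12"},
    {"Title": "Recent paper", "Published": "2024-03-01"},
]

print(has_fresh_paper(sample, today=date(2024, 3, 20)))
print(has_fresh_paper(sample[:1], today=date(2024, 3, 20)))
```

Separately, FAISS.load_local reads only index.faiss and index.pkl, so a PDF placed next to them would not become part of the loaded store. The paper's chunks have to be embedded and saved into the index itself.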