Originally published at: Tips for Building a RAG Pipeline with NVIDIA AI LangChain AI Endpoints | NVIDIA Technical Blog
Retrieval-augmented generation (RAG) is a technique that combines information retrieval with a set of carefully designed system prompts to provide more accurate, up-to-date, and contextually relevant responses from large language models (LLMs). By incorporating data from various sources such as relational databases, unstructured document repositories, internet data streams, and media news feeds, RAG can significantly…
Good morning,
I am trying out a RAG implementation, following the instructions on your blog.
I can't connect using: from langchain_nvidia_ai_endpoints import ChatNVIDIA
I can only get it working as specified by the "NVIDIA API catalog", through: from openai import OpenAI
If I make the invocation with ChatNVIDIA and the URL they provide in the "NVIDIA API catalog":
from langchain_nvidia_ai_endpoints import ChatNVIDIA
llm = ChatNVIDIA(model="meta/llama3-70b-instruct", nvidia_api_key=nvapi_key)
result = llm.invoke("What interfaces does Triton support?")
print(result.content)
answer: "ValueError: Unknown model name meta/llama3-70b-instruct specified." Available models: ai-llama3-70b - a88f115a-4a47-4381-ad62-ca25dc33dc1b …
If I make the invocation with model="ai-llama3-70b":
llm = ChatNVIDIA(model="ai-llama3-70b", nvidia_api_key=nvapi_key)
answer: Exception: [404] Not Found \n The model gpt43b does not exist.
Note: the blog uses the model "ai-llama2-70b".
You can follow a full notebook version of the blog here: Build a RAG chain by generating embeddings for NVIDIA Triton documentation – NVIDIA Generative AI Examples 24.4.0 documentation
In order to see the supported models, call ChatNVIDIA.get_available_models()
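When the model name from the blog is rejected, the catalog's own list is authoritative. Below is a small illustrative sketch (the helper name and the sample ids are mine, and the real call needs a valid API key): it filters plain model-id strings, such as the ids you would pull out of the objects returned by ChatNVIDIA.get_available_models().

```python
def find_models(model_ids, family):
    """Return the ids containing the given family substring (case-insensitive).

    model_ids would typically be [m.id for m in ChatNVIDIA.get_available_models()].
    """
    needle = family.lower()
    return [m for m in model_ids if needle in m.lower()]

# Hypothetical ids, illustrating the two naming schemes seen in this thread:
ids = ["meta/llama3-70b-instruct", "ai-llama3-70b", "mistralai/mistral-7b-instruct-v0.3"]
print(find_models(ids, "llama3"))  # -> ['meta/llama3-70b-instruct', 'ai-llama3-70b']
```

Whichever id the filter surfaces for your endpoint is the one to pass as model=.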
Hi - I am facing an issue with LangChain, using ChatNVIDIA with an LLM deployed locally via NIM.
My Code:
from langchain_nvidia_ai_endpoints import ChatNVIDIA
llm = ChatNVIDIA(
    base_url="https://ashish-mistral-nim-deploy-1-predictor.**********/v1/completions",
    model="mistralai/mistral-7b-instruct-v0.3"
)
result = llm.invoke("Write a ballad about LangChain.")
This code throws the error below:
SSLError: HTTPSConnectionPool(host="ashish-mistral-nim-deploy-1-predictor.***********", port=443): Max retries exceeded with url: /v1/completions/chat/completions (Caused by SSLError(SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)")))
I am not sure how to apply the SSL certificate. Any guidance here?
The following curl-based API request works fine, though:
curl --cacert test.crt -X "POST" "https://ashish-mistral-nim-deploy-1-predictor.**********/v1/completions" -H "accept: application/json" -H "Content-Type: application/json" -d '{"model": "mistralai/mistral-7b-instruct-v0.3", "prompt": "Write a ballad about LangChain.", "max_tokens": 64}'
Hi Ashish,
When running in LangChain you need to remove "/completions" from the base_url.
See this line in the notebook example: llm = ChatNVIDIA(base_url="http://0.0.0.0:8000/v1", model="meta/llama3-8b-instruct", temperature=0.1, max_tokens=1000, top_p=1.0)
Thanks,
Amit
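The reason the suffix matters: the client appends /chat/completions to the base_url itself, which is why the SSLError earlier in the thread shows the doubled path /v1/completions/chat/completions. A minimal sketch of the idea (the helper name is mine, not part of the library):

```python
def normalize_base_url(url: str) -> str:
    """Trim endpoint-specific suffixes so that only the /v1 root is passed
    to ChatNVIDIA, which appends /chat/completions on its own."""
    url = url.rstrip("/")
    for suffix in ("/chat/completions", "/completions"):
        if url.endswith(suffix):
            url = url[: -len(suffix)]
            break
    return url

print(normalize_base_url("https://example.invalid/v1/completions"))  # https://example.invalid/v1
```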
Hi - thanks for the response, but unfortunately the hint did not help here.
The API call below fails:
llm = ChatNVIDIA(
    base_url="https://ashish-mistral-nim-deploy-1-predictor.******/v1",
    model="mistralai/mistral-7b-instruct-v0.3"
)
Error
SSLError: HTTPSConnectionPool(host="ashish-mistral-nim-deploy-1-*********", port=443): Max retries exceeded with url: /v1/chat/completions (Caused by SSLError(SSLCertVerificationError(1, "[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)")))
# The following piece of code works fine as well:
import requests

langchain_nvidia_mistral_api_url = "https://ashish-mistral-nim-deploy-1-*******/v1/completions"
tokenheaders = {"Content-Type": "application/json", "accept": "application/json"}
payload = {"model": "mistralai/mistral-7b-instruct-v0.3", "prompt": "Tell me something about Langchain.", "max_tokens": 64}
# Note: verify=False disables certificate verification entirely (insecure; for testing only)
response = requests.post(langchain_nvidia_mistral_api_url, json=payload, headers=tokenheaders, verify=False)
print(response)
print(response.json())
How do I pass my SSL CA certificate to the LangChain API call? Or how can I specify verify=False in the ChatNVIDIA call? Just adding verify=False does not work.
Please consider my open query closed. Here is the code that works for me; I am posting it here for the benefit of others as well:
- Create the right chain of certificates
- Reference the certificate in the API call as below:
import os

os.environ["SSL_CERT_FILE"] = "/usr/local/share/ca-certificates/chain.pem"
llm = ChatNVIDIA(
    base_url="https://ashish-mistral-nim-deploy-1-predictor.************/v1",
    model="mistralai/mistral-7b-instruct-v0.3",
    verify="/usr/local/share/ca-certificates/chain.pem"
)
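One extra check that can save time with the "unable to get local issuer certificate" error: make sure the chain file actually contains every certificate in the chain. A hedged sketch (the helper name is mine, not from the thread) that counts the PEM blocks before you point SSL_CERT_FILE or verify= at the file:

```python
import os

def count_pem_certificates(path):
    """Sanity-check a PEM chain file; return how many certificates it holds."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    with open(path) as f:
        data = f.read()
    count = data.count("-----BEGIN CERTIFICATE-----")
    if count == 0:
        raise ValueError(f"no PEM certificates found in {path}")
    return count
```

If the count is lower than you expect (typically server + intermediate(s) + root), re-concatenate the certificates into a single PEM file.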
Thank you Ashish for sharing this solution with the developer community!