Hi,
I have deployed the nv-rerankqa-mistral-4b-v3:1.0.1 model using the Helm chart available in the NGC Catalog. The deployment was successful.
Status of the deployed model:
curl -X 'GET' 'http://10.209.219.165:31893/v1/health/ready'
{"ready":true}
I am also able to query using curl
curl -X 'POST' \
  'http://10.209.219.165:31893/v1/ranking' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {"text": "which way should i go?"},
    "model": "nvidia/nv-rerankqa-mistral-4b-v3",
    "passages": [
      {
        "text": "two roads diverged in a yellow wood, and sorry i could not travel both and be one traveler, long i stood and looked down one as far as i could to where it bent in the undergrowth;"
      },
      {
        "text": "then took the other, as just as fair, and having perhaps the better claim because it was grassy and wanted wear, though as for that the passing there had worn them really about the same,"
      },
      {
        "text": "and both that morning equally lay in leaves no step had trodden black. oh, i marked the first for another day! yet knowing how way leads on to way i doubted if i should ever come back."
      }
    ]
  }'
The output of the above command is as follows:
{"rankings":[{"index":0,"logit":0.93212890625},{"index":2,"logit":-3.07421875},{"index":1,"logit":-4.9921875}]}
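For comparison with the LangChain client, the same request can also be sent from plain Python. This is a minimal sketch using only the standard library; the URL, model name, and payload shape are copied from the curl call above, while the helper names (build_payload, order_passages, rank) are made up for illustration and are not part of any NVIDIA or LangChain API.

```python
import json
import urllib.request

# Endpoint and model name are taken from the curl call above.
RANKING_URL = "http://10.209.219.165:31893/v1/ranking"
MODEL = "nvidia/nv-rerankqa-mistral-4b-v3"


def build_payload(query, passages, model=MODEL):
    """Build the JSON body in the same shape as the curl request."""
    return {
        "query": {"text": query},
        "model": model,
        "passages": [{"text": p} for p in passages],
    }


def order_passages(response, passages):
    """Return (passage, logit) pairs in the order given by the 'rankings' field."""
    return [(passages[r["index"]], r["logit"]) for r in response["rankings"]]


def rank(query, passages, url=RANKING_URL, model=MODEL):
    """POST the payload to the ranking endpoint and return the parsed JSON."""
    body = json.dumps(build_payload(query, passages, model)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return json.loads(reply.read().decode("utf-8"))
```

Calling rank(query, passages) and passing the result to order_passages should give the passages in ranked order. Changing the model argument makes it easy to see which model strings the server accepts and which ones it rejects with a 404.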
However, when I use the LangChain APIs for reranking, I get an error:
vector_store = Milvus(
    embedding_function=query_embedder,
    connection_args={"host": "10.209.219.165", "port": "32060"},
    collection_name="LangChainCollection",
)
retriever = vector_store.as_retriever()
reranker = NVIDIARerank(base_url="http://10.209.219.165:31893/v1", model="nvidia/nv-rerankqa-mistral-4b-v3")
reranking_retriever = ContextualCompressionRetriever(base_compressor=reranker, base_retriever=retriever)
llm = ChatNVIDIA(base_url="https://ashish-mistralai-deployment-1/v1", model="mistralai/mistral-7b-instruct-v0.3")
chain = (
    {"context": reranking_retriever, "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | StrOutputParser()
)
chain.invoke("Tell me something about langchain")
The error is as follows:
2024-08-12T15:15:59Z INFO: uvicorn.access - 10.209.219.165:59328 - "GET /health HTTP/1.1" 200
2024-08-12T15:15:59Z INFO: uvicorn.access - 10.209.219.165:59328 - "GET /health HTTP/1.1" 200
2024-08-12T15:16:00Z ERROR: root - Type: 404: Unknown model. Available models are: ['nvidia/nv-rerankqa-mistral-4b-v3']
2024-08-12T15:16:00Z ERROR: root - Type: 404: Unknown model. Available models are: ['nvidia/nv-rerankqa-mistral-4b-v3']
10.209.219.165:4828 - "POST /v1/ranking HTTP/1.1" 404
2024-08-12T15:16:00Z INFO: uvicorn.access - 10.209.219.165:4828 - "POST /v1/ranking HTTP/1.1" 404
2024-08-12T15:16:00Z INFO: uvicorn.access - 10.209.219.165:4828 - "POST /v1/ranking HTTP/1.1" 404
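The 404 line lists the model names the server knows about, so one quick check is to compare that list character by character with the name passed to the client. The sketch below is only a diagnostic aid under that assumption, not a fix; the function names are hypothetical, and it only parses log lines in the format shown above.

```python
import ast
import re

# Curly quotes can appear when logs are pasted through a rich-text editor,
# so they are normalised to straight quotes before parsing.
_QUOTE_FIXES = str.maketrans({"‘": "'", "’": "'", "“": '"', "”": '"'})


def available_models(log_line):
    """Extract the model list from an 'Unknown model' log line, or [] if absent."""
    cleaned = log_line.translate(_QUOTE_FIXES)
    match = re.search(r"Available models are:\s*(\[.*?\])", cleaned)
    return ast.literal_eval(match.group(1)) if match else []


def describe_mismatch(requested, log_line):
    """Show repr() of both sides so stray whitespace or odd characters stand out."""
    served = available_models(log_line)
    return {
        "requested": repr(requested),
        "served": [repr(name) for name in served],
        "exact_match": requested in served,
    }
```

If exact_match comes back True for the string given to NVIDIARerank, the name sent over the wire may differ from the one passed to the constructor. That is a guess rather than something these logs confirm, so the next step would be to inspect the actual request body the client sends.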