A 401 error typically means an incorrect, expired, malformed, or missing API key (as opposed to a 403 error, which indicates a valid API key with insufficient permissions).
Do you know if your keys have expired or been rotated recently? You may need to regenerate your API key and try again. Note that if you regenerated your key for something else, all previous keys typically become invalidated.
Thanks for the follow-up; I can check that out. Does each model require a separate key? I tried using a key I had generated for the basic Hybrid RAG demo, which is working fine.
But from googling, it seems this might be an NVCF Run Key when an NGC API Key is needed?
The personal keys I am generating start with __autogenerated_playgrounds followed by a set of numbers. I can only save the secret in AI Workbench if I enter it that way.
The log file from when I start the ChatUI app is lengthy but ends with the 403 error below:
File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden
I have Secrets Manager and NGC enabled as services. I am not able to add Public API Endpoints as a service
I have continued to work on resolving this without much luck.
It seems like there is something missing in the requests.py file that prevents the successful retrieval of the URLs.
Researching the 403 errors, it seems many sites reject requests coming from API-based applications; the common suggestion is to add headers so the site believes the request is coming from a browser.
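For what it's worth, the "add browser-like headers" approach mentioned above can be sketched with the standard library's urllib, which by default sends a `Python-urllib/3.x` User-Agent that many sites block. The URL and header values here are placeholders, not anything from the demo:

```python
# Hypothetical sketch: attaching browser-like headers to a urllib request.
# The URL is a placeholder; the User-Agent string is just an example.
import urllib.request

url = "https://example.com/page"
req = urllib.request.Request(
    url,
    headers={
        # A common desktop browser User-Agent string; adjust as needed.
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
        "Accept": "text/html,application/xhtml+xml",
    },
)
# The request now advertises itself as a browser rather than urllib.
print(req.get_header("User-agent"))
```

You would then pass `req` to `urllib.request.urlopen(req)` instead of the raw URL. Whether this actually resolves your 403 depends on what the server is checking.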
I can get two different errors from the UI:
if I request an http:// site I get a 401 error
if I request an https:// site I get a 403 error
I wanted to try editing that requests.py file, but the demo seems to have file permissions locked down.
Any suggestions are appreciated, and if this is beyond the scope of this support thread, please just let me know; I will only reply if I am able to resolve it.
Apologies for the delay. I just did a clean clone of the project and unfortunately I'm unable to reproduce your issue.
A 403 error typically means your key is recognized but does not have access to the resource (Build API Endpoints). Some troubleshooting tips:
Make sure you are using a key that starts with nvapi-....
If using a personal key generated from NGC, make sure you selected API Endpoints under the scope of the key.
Make sure your key is current and not expired/rotated.
If you are still seeing an error, please provide the error stack trace from the Chat app logs (in the bottom left corner, go to Outputs > Chat from the dropdown).
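The first tip above (a key starting with `nvapi-`) is easy to sanity-check in code before debugging anything else. This is a hypothetical helper, not part of the demo:

```python
# Hypothetical helper to sanity-check an API key before use. The prefix
# check mirrors the guidance that valid keys start with "nvapi-".
def looks_like_nvidia_api_key(key: str) -> bool:
    """Return True if the key has the expected nvapi- prefix and a body."""
    return key.startswith("nvapi-") and len(key) > len("nvapi-")

print(looks_like_nvidia_api_key("nvapi-abc123"))                  # True
print(looks_like_nvidia_api_key("__autogenerated_playgrounds1"))  # False
```

A key starting with `__autogenerated_playgrounds`, as described earlier in the thread, would fail this check, which is consistent with the 401/403 behavior reported.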
I was having a similar issue uploading PDFs; using a personal NVIDIA API key and setting it to Public API Endpoints seemed to do the trick (after a clean clone). For safety, I also named mine NVIDIA_API_KEY to match the AI Workbench environment.
Follow-up: How do you store the uploaded documents (PDFs) so that when you open AI Workbench the next time they are still there (and you do not need to upload them again)?
I uploaded a PDF and modified the Router prompt to best fit the information in the RAG, but when I asked a question whose answer I knew was in one PDF, it was clear that the chat pulled the answer from the web. How do I broaden the Router prompt so that it first looks for keywords from the question in the vectorstore, rather than only when specific topic areas are mentioned? (Example: my chat is focused on dementia caregiving; when a question is asked about how a specific technique works, without specifically mentioning dementia, Alzheimer's, or caregiving, how do I still get it to try to pull from the vectorstore primarily/first?)
Also: How do you store the uploaded documents (PDFs) so that when you open AI Workbench the next time they are still there (and you do not need to upload them again)?
How do I broaden the Router prompt so that it first looks for keywords from the question in the vectorstore rather than only if specific topic areas are mentioned first?
The Router prompt is fully customizable; we just provide a "search RAG if these topics are mentioned" clause as an example. You can customize it however best works for your use case, e.g. something like "search RAG if the following keywords are present", etc.
In your example, the Router LLM is making an evaluation between what is in the prompt and what the user query is, so your query and prompt may not be matching up to route to RAG if you don't specifically mention certain domain-specific keywords about what it is you're looking for.
Once you are happy with your custom prompt, feel free to solidify it in code/chatui/prompts so that it becomes the default prompt every time you open the app.
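As a concrete illustration of the keyword-based approach suggested above, a router prompt could be assembled like this. The wording and keyword list are examples for the dementia-caregiving use case, not the demo's defaults:

```python
# Hypothetical keyword-based router prompt; keywords and phrasing are
# examples only, not the demo's built-in defaults.
KEYWORDS = ["dementia", "alzheimer", "caregiving", "memory care"]

ROUTER_PROMPT = (
    "You are an expert at routing a user question to a vectorstore or web "
    "search. Use the vectorstore if the question contains or relates to any "
    "of the following keywords: " + ", ".join(KEYWORDS) + ". "
    "Otherwise, use web search."
)
print(ROUTER_PROMPT)
```

Because the Router LLM judges relatedness rather than doing literal string matching, phrasing like "contains or relates to" tends to route technique-specific questions to the vectorstore even when the domain words are not mentioned verbatim.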
How do you store the uploaded documents (PDFs) so that when you open AI Workbench the next time they are still there (and you do not need to upload them again)?
Uploaded documents should be persistent (check the data directory). Whenever you re-open the app, the previously uploaded documents may not show up on a fresh UI since it is rendered per-session, but any previously uploaded documents are still in the vector store until you physically delete them from the data directory.
You can always empty the database by clearing out the data directory.
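Clearing the data directory can be scripted as well. This is a minimal sketch assuming the directory is named `data` and sits at the project root; check where your project actually keeps it before deleting anything:

```python
# Hypothetical sketch of emptying the vector-store data directory so
# previously uploaded documents are removed. The "data" path is an
# assumption; verify your project's actual data directory first.
import shutil
from pathlib import Path

data_dir = Path("data")  # adjust to your project's data directory

if data_dir.exists():
    shutil.rmtree(data_dir)      # remove the directory and its contents
data_dir.mkdir(exist_ok=True)    # recreate it empty for future uploads
```

After this, re-opening the app should start you with an empty vector store.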
Thanks. I have been extremely busy too. I do believe I had the wrong type of certificate error.
I am very close to having it working.
I am doing a simple request about world series winners
I edited my router prompt to say
"You are an expert at routing a user question to a vectorstore or web search. Use the vectorstore for questions about world series winners."
and I have uploaded a PDF called world series winners
When I ask who won the world series in 2018 I get this in the monitor tab
who won the world series in 2018
{'datasource': 'vectorstore'}
"ROUTE QUESTION TO RAG"
"RETRIEVE"
"CHECK DOCUMENT RELEVANCE TO QUESTION"
"ASSESS GRADED DOCUMENTS"
"DECISION: ALL DOCUMENTS ARE NOT RELEVANT TO QUESTION, INCLUDE WEB SEARCH"
"WEB SEARCH"
"GENERATE"
"CHECK HALLUCINATIONS"
"DECISION: GENERATION IS GROUNDED IN DOCUMENTS"
"GRADE GENERATION vs QUESTION"
I would think it would use the vector store, given I have a PDF called world series winners, but it doesn't seem to use it.
Exception: [429] Too Many Requests
{'status': 429, 'title': 'Too Many Requests'}
The NVIDIA API Catalog recently moved from a limited-credit system to an unlimited-credit system. However, there are rate-limiting controls implemented.
Are you using the Llama or the Mixtral model? I would recommend the Llama, since that model appears to be rate limited to 7 calls/sec while the Mixtral one is limited to 1 call/sec.
When I tested it myself on the public endpoints, the Mixtral was throwing the rate-limiting error, while the Llama model worked fine.
Just something to keep in mind as you are navigating the model endpoints to hit moving forward, apologies for the inconvenience.