Build an Agentic RAG Pipeline with Llama 3.1 and NVIDIA NeMo Retriever NIMs

Originally published at: https://developer.nvidia.com/blog/build-an-agentic-rag-pipeline-with-llama-3-1-and-nvidia-nemo-retriever-nims/

Employing retrieval-augmented generation (RAG) is an effective strategy for ensuring large language model (LLM) responses are up-to-date and not hallucinated.  While various retrieval strategies can improve the recall of documents for generation, there is no one-size-fits-all approach. The retrieval pipeline depends on your data, from hyperparameters like the chunk size, and number of documents returned,…