RAGJet: Retrieval-Augmented Generation on Jetson Xavier AGX (LLaMA + FastAPI)


diogo.lima September 3, 2024, 2:00pm

As a proof of concept (PoC), I specialized a large language model (LLM) on the reference documentation of the cuDNN library using retrieval-augmented generation (RAG). With the relevant documentation supplied as context, the model can give precise, context-aware answers about cuDNN. The model is deployed in a FastAPI application running on the Jetson Xavier hardware, so it can be integrated with other software and queried in a real-world setting. GitHub repo.
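
For readers new to RAG, here is a minimal sketch of the retrieval and prompt-assembly step that such a setup relies on. It is not the code from the repo: the sample chunks, the function names (`retrieve`, `build_prompt`) and the bag-of-words cosine similarity are all assumptions made for illustration. A real deployment would more likely use an embedding model and a vector store, and would pass the resulting prompt to the LLaMA model.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for chunks of the cuDNN reference documentation.
CHUNKS = [
    "cudnnCreate initializes the cuDNN library and creates a handle.",
    "cudnnConvolutionForward executes a convolution on the input tensor.",
    "cudnnDestroy releases the resources used by a cuDNN handle.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def vectorize(text):
    """Bag-of-words term counts (an assumed substitute for real embeddings)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = vectorize(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_prompt("How do I run a convolution?", CHUNKS, k=1))
```

In a FastAPI application, a request handler would typically call something like `build_prompt` with the user's question and return the model's completion as the response.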
