As a proof of concept (PoC), I have specialized a large language model (LLM) on the reference documentation of the cuDNN library. This specialization enables the model to give precise, context-aware answers to cuDNN-specific questions. The model is deployed inside a FastAPI application, so it can be queried and integrated in a real-world scenario on Jetson Xavier hardware. A sketch of the serving layer is shown below; the full code is in the GitHub repo.
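To illustrate the deployment side, here is a minimal sketch of what such a FastAPI service could look like. It assumes the specialized model is stored as a Hugging Face causal-LM checkpoint; the `MODEL_PATH` value, the `/ask` route, and the request schema are illustrative placeholders, not taken from the actual repo.

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Hypothetical path to the cuDNN-specialized checkpoint (assumption, not from the repo).
MODEL_PATH = "cudnn-docs-llm"

app = FastAPI(title="cuDNN documentation assistant")

# Load the fine-tuned model once at startup; fp16 keeps memory usage
# within Jetson Xavier limits.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_PATH,
    torch_dtype=torch.float16,
    device_map="auto",
)

class Query(BaseModel):
    question: str
    max_new_tokens: int = 256

@app.post("/ask")
def ask(query: Query) -> dict:
    # Tokenize the question and generate a cuDNN-specific answer.
    inputs = tokenizer(query.question, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=query.max_new_tokens)
    # Strip the prompt tokens and return only the generated continuation.
    answer = tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:],
        skip_special_tokens=True,
    )
    return {"answer": answer}
```

Once started with `uvicorn`, a service like this could be queried with, for example, `curl -X POST http://localhost:8000/ask -H "Content-Type: application/json" -d '{"question": "How do I create a cuDNN convolution descriptor?"}'`.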