New demo of Jetson Orin running LLaVA vision-language models on live video streams! This multimodal pipeline has been optimized with 4-bit quantization and tuned CUDA kernels to achieve interactive latency on edge devices. Try it yourself with the tutorial on Jetson AI Lab!
Next up: extracting constrained JSON output from LLaVA and using it to trigger user-promptable alerts and actions in always-on applications.
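As a rough illustration of the constrained-JSON idea, here is a minimal sketch of how a reply from the VLM might be parsed and matched against user-defined alert rules. The model call itself is stubbed out with a sample string, and the rule names and JSON schema are hypothetical, not part of the actual pipeline:

```python
import json

# Hypothetical user-promptable alert rules: each fires when the model's
# structured observation of the video stream matches a condition.
ALERT_RULES = {
    "person_at_door": lambda obs: obs.get("person") and obs.get("location") == "door",
}

def parse_observation(reply: str) -> dict:
    """Parse the model's JSON reply, tolerating malformed output."""
    try:
        return json.loads(reply)
    except json.JSONDecodeError:
        return {}

def check_alerts(reply: str) -> list:
    """Return the names of all alert rules triggered by this reply."""
    obs = parse_observation(reply)
    return [name for name, rule in ALERT_RULES.items() if rule(obs)]

# Stubbed model reply; in the real pipeline this would come from LLaVA
# prompted to emit JSON describing the current frame.
reply = '{"person": true, "location": "door", "description": "delivery courier"}'
print(check_alerts(reply))  # → ['person_at_door']
```

Constraining the model to a fixed JSON schema keeps parsing trivial, and the try/except guard means an occasional malformed reply simply triggers no alerts rather than crashing an always-on loop.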
Jetson AI Lab: Live LLaVA 🆕 - NVIDIA Jetson Generative AI Lab
Jetson Containers: jetson-containers/packages/llm/local_llm at master · dusty-nv/jetson-containers · GitHub