Running the LMDeploy inference engine on the NVIDIA Jetson AGX Orin Devkit

Greetings to all,

Here are demo videos of running LMDeploy on the NVIDIA Jetson AGX Orin Devkit:

#1 Vanilla Llama-3.1-8B-Instruct model

#2 AWQ version of the Llama-3.1-8B-Instruct model
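For anyone who wants to try this themselves, here is a minimal sketch of loading both checkpoints through LMDeploy's Python `pipeline` API. The model ID, AWQ work directory, and KV-cache fraction are assumptions for the Orin's unified memory, not the exact settings used in the videos; adjust them for your setup.

```python
# Minimal sketch: loading the vanilla and AWQ checkpoints with LMDeploy's
# pipeline API. Model paths and memory fractions are assumptions -- point
# them at the checkpoints you actually downloaded or quantized.
from lmdeploy import pipeline, TurbomindEngineConfig

# Vanilla FP16 Llama-3.1-8B-Instruct (assumed Hugging Face model ID).
pipe_fp16 = pipeline(
    "meta-llama/Meta-Llama-3.1-8B-Instruct",
    backend_config=TurbomindEngineConfig(
        cache_max_entry_count=0.3,  # keep the KV cache small on unified memory
    ),
)
print(pipe_fp16(["Explain what the Jetson AGX Orin is in one sentence."]))

# AWQ 4-bit checkpoint (assumed local path, e.g. produced by `lmdeploy lite auto_awq`).
pipe_awq = pipeline(
    "./llama-3.1-8b-instruct-awq",
    backend_config=TurbomindEngineConfig(
        model_format="awq",
        cache_max_entry_count=0.3,
    ),
)
print(pipe_awq(["Explain what the Jetson AGX Orin is in one sentence."]))
```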


Running INT8 Weight Quantization using LMDeploy on the NVIDIA Jetson AGX Orin:
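Here is a minimal sketch of how the INT8 run might be reproduced, assuming the weights were quantized with LMDeploy's smooth-quant tooling beforehand; the work directory and prompt are placeholders.

```python
# Minimal sketch, assuming the INT8 (W8A8) weights were produced with
# LMDeploy's smooth-quant tooling, e.g. something like:
#   lmdeploy lite smooth_quant meta-llama/Meta-Llama-3.1-8B-Instruct \
#       --work-dir ./llama-3.1-8b-instruct-w8a8
# The work-dir path below is an assumption.
from lmdeploy import pipeline, PytorchEngineConfig

pipe_int8 = pipeline(
    "./llama-3.1-8b-instruct-w8a8",
    backend_config=PytorchEngineConfig(),  # W8A8 models run on the PyTorch engine
)
print(pipe_int8(["Summarize the benefits of INT8 weight quantization."]))
```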

Running the VLM Phi-3 Vision through an OpenAI-compatible endpoint:
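For reference, a minimal client-side sketch against an LMDeploy `api_server`, assuming the server was started on the Orin with something like `lmdeploy serve api_server microsoft/Phi-3-vision-128k-instruct`; the host, port, and image URL below are placeholders.

```python
# Minimal sketch: querying an LMDeploy OpenAI-compatible endpoint that is
# serving Phi-3 Vision. Host, port, and image URL are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:23333/v1", api_key="not-needed")

# Ask the server which model it is serving so the sketch works regardless
# of the exact model name used when launching api_server.
model_name = client.models.list().data[0].id

response = client.chat.completions.create(
    model=model_name,
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/test.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```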