Can an NVIDIA Jetson AGX Orin 64GB developer kit run Deepseek-V3, given that it’s a MoE model that only activates 37B of its 671B total parameters per token during inference? Has anyone benchmarked this?
Hi,
Here are some suggestions for the common issues:
1. Performance
Please run the commands below before benchmarking a deep learning use case:
$ sudo nvpmodel -m 0
$ sudo jetson_clocks
2. Installation
Installation guide of deep learning frameworks on Jetson:
- TensorFlow: Installing TensorFlow for Jetson Platform - NVIDIA Docs
- PyTorch: Installing PyTorch for Jetson Platform - NVIDIA Docs
We also have containers that have frameworks preinstalled:
Data Science, Machine Learning, AI, HPC Containers | NVIDIA NGC
3. Tutorial
Deep learning tutorials to get started:
- Jetson-inference: Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson
- TensorRT sample: Jetson/L4T/TRT Customized Example - eLinux.org
4. Report issue
If these suggestions don’t help and you want to report an issue to us, please attach the model, the command/steps, and the customized app (if any) so we can reproduce it locally.
Thanks!
Hi @sprime01, sparse MoE still requires all of the model weights to be loaded in memory, so I’m going to go with ‘unlikely’. That said, we do have Deepseek-R1-Llama-70B running here.
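To make the ‘unlikely’ concrete, here is a rough back-of-envelope sketch (hypothetical figures: 4-bit quantization at ~0.5 bytes/param, ignoring KV cache, activations, and runtime overhead). MoE routing reduces per-token compute, not resident memory — all expert weights still have to be loaded:

```python
# Back-of-envelope weight-memory estimate.
# MoE saves compute per token, not memory: every expert's weights must be resident.

def weights_gib(num_params, bytes_per_param):
    """Approximate weight footprint in GiB (weights only, no KV cache/activations)."""
    return num_params * bytes_per_param / 1024**3

deepseek_v3_params = 671e9   # total parameters across all experts
llama_70b_params   = 70e9
int4 = 0.5                   # 4-bit quantization ~= 0.5 bytes/param (approximation)

print(f"Deepseek-V3 @ 4-bit: ~{weights_gib(deepseek_v3_params, int4):.0f} GiB")
print(f"Llama-70B   @ 4-bit: ~{weights_gib(llama_70b_params, int4):.0f} GiB")
print("Jetson AGX Orin:      64 GiB unified memory")
```

Even at 4-bit, the full 671B weight set is roughly 5x the Orin’s 64GB unified memory, while a 4-bit 70B model fits with room for KV cache.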
Thanks for following up and the info. 70B it is then
@dusty_nv New Jetson user here. I have a Jetson AGX Orin, and after flashing to the latest Jetpack I have a ~57G drive with 30GB left after the dustynv/mlc:r36.4.0 container is downloaded. Then it starts pulling the safetensors, and it looks like it will rapidly exceed the remaining drive space. Is that your experience as well? What are the recommended system requirements for running Deepseek-R1-Llama-70B?
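For a rough sense of why 30GB free runs out so fast, here is a hypothetical sizing sketch (assumes the runtime first pulls full-precision fp16 safetensors before quantizing; exact checkpoint sizes vary by repo and format):

```python
# Rough disk estimate for pulling a 70B-parameter checkpoint.
# Figures are approximations; actual repo sizes vary with format and metadata.

def checkpoint_gib(num_params, bytes_per_param):
    """Approximate on-disk checkpoint size in GiB."""
    return num_params * bytes_per_param / 1024**3

params = 70e9
print(f"fp16 safetensors: ~{checkpoint_gib(params, 2):.0f} GiB")    # full-precision download
print(f"4-bit quantized:  ~{checkpoint_gib(params, 0.5):.0f} GiB")  # after quantization
```

An fp16 70B checkpoint is on the order of 130 GiB, and even the 4-bit artifact exceeds 30GB of free space, so an external NVMe drive (or mounting the model cache there) is the practical route on a 57G root filesystem.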