Hi all,
Is it possible to use Kubernetes to leverage the GPU resources of two AGX Orins (connected via Ethernet) and run LLM inference across them?
Are there any references on this topic we can study?
Hi,
Here are some suggestions for common issues:
1. Performance
Please run the commands below before benchmarking a deep learning use case:
$ sudo nvpmodel -m 0
$ sudo jetson_clocks
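To confirm the settings took effect, you can query the active power mode and watch the live clocks (a quick check, assuming a standard JetPack install; on AGX Orin, mode 0 is MAXN):
$ sudo nvpmodel -q      # should report the MAXN power mode (mode 0)
$ sudo tegrastats       # live clock/utilization readout; press Ctrl+C to stop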
2. Installation
Installation guide of deep learning frameworks on Jetson:
- TensorFlow: Installing TensorFlow for Jetson Platform - NVIDIA Docs
- PyTorch: Installing PyTorch for Jetson Platform - NVIDIA Docs
We also have containers that have frameworks preinstalled:
Data Science, Machine Learning, AI, HPC Containers | NVIDIA NGC
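For example, the Jetson PyTorch container can be pulled and started like this (a sketch only; the tag r35.2.1-pth2.0-py3 is an illustration, so pick the tag on NGC that matches your JetPack/L4T version):
$ sudo docker run -it --rm --runtime nvidia --network host nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3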
3. Tutorial
Getting-started deep learning tutorials:
- Jetson-inference: Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson
- TensorRT sample: Jetson/L4T/TRT Customized Example - eLinux.org
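As a concrete starting point, the jetson-inference project provides a prebuilt container (a sketch following the repo's own quickstart; see its documentation for details):
$ git clone --recursive https://github.com/dusty-nv/jetson-inference
$ cd jetson-inference
$ docker/run.sh         # pulls and starts the container matching your JetPack version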
4. Report issue
If these suggestions don’t help and you want to report an issue to us, please share the model, the commands/steps, and any customized app so we can reproduce the issue locally.
Thanks!
Hi,
Kubernetes should work on Orin, but you might need a workaround for the NVIDIA Container Toolkit.
Please see the following topic for more info:
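In the meantime, here is a rough sketch of one way to form a two-node cluster over Ethernet with a lightweight distribution such as k3s. This is not an official guide: the k3s install method is taken from k3s.io, <server-ip> and <token> are placeholders, and the device-plugin version v0.14.1 is just an example (and may need the Jetson workaround mentioned above):

# On the first Orin (server node):
$ curl -sfL https://get.k3s.io | sh -
$ sudo cat /var/lib/rancher/k3s/server/node-token    # join token for the second node

# On the second Orin (agent node), using the server's IP and the token from above:
$ curl -sfL https://get.k3s.io | K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<token> sh -

# Back on the server: deploy the NVIDIA device plugin so pods can request
# nvidia.com/gpu resources. This is the step where the Container Toolkit
# workaround for Jetson's integrated GPU comes into play.
$ kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.14.1/nvidia-device-plugin.yml
$ kubectl get nodes    # both Orins should show up as Ready

One caveat on the original question: this gives you two schedulable single-GPU nodes, not one combined GPU. Kubernetes will not split a single LLM across both Orins by itself; you would either run one model replica per node behind a service, or use an inference framework that shards the model across nodes at the application level.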
Thanks.