I created a Conda environment on a Jetson AGX Orin to run MiniCPM-o 2.6. In this virtual environment, the torch package I installed is the pre-compiled wheel provided by NVIDIA, "torch-2.5.0-cp310-cp310-linux_aarch64.whl". Torch can detect CUDA, but why isn't the GPU effectively utilized when running the model?
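For context, this is roughly how I confirmed that torch detects CUDA inside the conda environment (a minimal check; the printed values are illustrative):

```python
# Minimal sanity check that the NVIDIA wheel sees the Orin's GPU.
import torch

print(torch.__version__)          # e.g. 2.5.0
print(torch.cuda.is_available())  # True when CUDA is detected
if torch.cuda.is_available():
    # Device name on a correctly set-up Jetson, e.g. "Orin"
    print(torch.cuda.get_device_name(0))
```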
Hi,
PyTorch in the conda environment is using the GPU (33%-44%).
The GPU utilization percentage depends on the implementation.
It's recommended to convert the model to ONNX and deploy it with TensorRT.
TensorRT contains optimized algorithms for the Jetson GPU.
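A minimal sketch of that flow, assuming the model has already been exported to ONNX (file names here are placeholders):

```shell
# Sketch only: first export the model to ONNX (e.g. with torch.onnx.export),
# then build a serialized TensorRT engine with trtexec, which ships with
# JetPack under /usr/src/tensorrt/bin.
# --fp16 enables half-precision kernels, which the Orin GPU accelerates well.
/usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.engine --fp16
```

The resulting engine file can then be loaded with the TensorRT runtime API instead of running the model through PyTorch.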
Thanks.
Thank you very much. From your advice, I gather that conda is not the factor causing the low GPU utilization. I once used jetson-containers with Ollama to deploy an LLM, and the GPU utilization was very high, but I don't know the reason.
My friend, please accept my sincere thanks.
Hi,
This is related to the implementation and how the GPU is used.
We have put a lot of effort into optimizing GPU usage.
That's why we recommend using TensorRT or TensorRT-LLM to run models on Jetson.
Thanks.
Thank you very much. I will try it.
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.