Is the Jetson Nano Developer Kit capable of loading LLMs like LLaMA 3?

I am trying to load LLaMA 3 on the Jetson Nano, which has 4 GB of memory shared between the CPU and GPU. However, I am unsure whether it can handle such a large model. I managed to load the model after adding 8 GB of swap space, but the response time was extremely slow.
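
For reference, I set up the swap file with the standard steps, roughly like this (exact sizes and paths are from memory):

# Create and enable an 8 GB swap file (a typical approach on the Nano)
$ sudo fallocate -l 8G /swapfile
$ sudo chmod 600 /swapfile
$ sudo mkswap /swapfile
$ sudo swapon /swapfile
# Verify that the swap is active
$ free -h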

I am considering upgrading the board and would like confirmation on whether the Jetson Nano is truly capable of loading and running such a large model. Any suggestions or insights would be appreciated.

Hi,

We don’t have LLM test data for the Jetson Nano.
But you can find some related information for the Orin series:

Thanks.

OK, thank you. May I also ask what kinds of tasks the Jetson Nano is best suited for, particularly smaller AI workloads?

Hi,

You can check our benchmark sample.
It contains some models that can run on the Jetson Nano.
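
For example, assuming the benchmark sample refers to the public jetson_benchmarks repository (an assumption here; follow whichever sample applies to your JetPack version):

# Assumption: the benchmark sample is NVIDIA's public jetson_benchmarks repo
$ git clone https://github.com/NVIDIA-AI-IOT/jetson_benchmarks.git
$ cd jetson_benchmarks
# See the repository README for Nano-specific setup and run instructions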

Thanks.

Hi,
Thank you so much for your reply. I have one last question: does the GPU on the Jetson Nano need to be enabled manually, or is it used automatically? If the GPU was not enabled, could that explain the slow response times while running an LLM? I’d appreciate any clarification.

Hi,

Sorry for the late update.

Usually, LLM samples and source code run their tasks on the GPU by default.
The long latency is more likely related to memory, since the Nano’s resources are quite limited.

To confirm this, you can run tegrastats concurrently and check whether the GPU is in use (utilization > 0%).

$ sudo tegrastats
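
To watch just the GPU load, you can filter for the GR3D_FREQ field, which reports GPU utilization (the exact output format varies across JetPack releases):

# GR3D_FREQ is the GPU utilization field in tegrastats output
$ sudo tegrastats | grep -o 'GR3D_FREQ [0-9]*%'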

Thanks.

