Running a large ONNX model gets killed automatically due to insufficient memory?

Hello,

I am using a Jetson Orin Nano with L4T 36.4.3 firmware, JetPack 6.2, and a 512 GB SD card.

I am trying to run a large ONNX model (an LLM), but the process is automatically killed. It appears to be a swap memory issue.

```
$ free -h
               total        used        free      shared  buff/cache   available
Mem:           7.4Gi       792Mi       6.4Gi       2.0Mi       251Mi       6.5Gi
Swap:          3.7Gi       1.0Gi       2.7Gi
```

```
2025-03-21 16:18:59.239897845 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 81 Memcpy nodes are added to the graph main_graph for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
2025-03-21 16:18:59.253917028 [W:onnxruntime:, session_state.cc:1168 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-03-21 16:18:59.253963430 [W:onnxruntime:, session_state.cc:1170 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
Killed
```

Hi,

Which LLM do you use?
We recommend using a model with fewer than 4B parameters, as the Orin Nano has relatively limited memory.

You can find our tests of different LLM models at the link below:

Thanks.

Hello @AastaLLL,

I am currently working on an NLP model (about 2 GB) with 546M parameters.
Total Parameters: 546135121
Total Size: 2083.34 MB
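As a quick sanity check (my own arithmetic, not from the thread), the reported size is exactly what you would expect for float32 weights:

```python
# Sanity check: does the reported model size match float32 weights?
PARAMS = 546_135_121          # total parameter count reported above
BYTES_PER_PARAM = 4           # float32 assumption

size_mb = PARAMS * BYTES_PER_PARAM / (1024 * 1024)
print(f"{size_mb:.2f} MB")    # → 2083.34 MB, matching the reported size
```

So the on-disk size is consistent with unquantized float32 weights; actual runtime memory will be higher once activations and runtime buffers are added.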

Hi,

Could you share more information about which model you are using?
And how do you run inference? Is any quantization applied?

For example, we need to use 4-bit group quantization (q4f16_1) to load and run a Gemma 2B model on the Orin Nano.
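To see why quantization matters on an 8 GB board, here is a rough back-of-the-envelope estimate (my own numbers, assuming a nominal 2B parameter count; real footprints also include activations, KV cache, and runtime overhead):

```python
# Rough weight-memory estimate for a nominal 2B-parameter model.
PARAMS = 2_000_000_000
GIB = 1024 ** 3

fp16_gib = PARAMS * 2 / GIB    # 16-bit weights: 2 bytes per parameter
q4_gib = PARAMS * 0.5 / GIB    # 4-bit weights: ~0.5 bytes per parameter

print(f"fp16: {fp16_gib:.2f} GiB, 4-bit: {q4_gib:.2f} GiB")
# → fp16: 3.73 GiB, 4-bit: 0.93 GiB
```

At fp16 the weights alone consume roughly half of the Orin Nano's usable RAM, while 4-bit quantization brings them under 1 GiB, leaving room for the rest of the runtime.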

Also, could you try running inference outside of the container to see whether this is a Docker-related issue?

Thanks.

Hi @AastaLLL, the issue was the swap file size. I tried increasing it to 4 GB, and it worked.

For anyone facing a similar issue, check out the link on how to create/update the swap file.
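For reference, the usual way to set up a 4 GB swap file on Ubuntu/L4T looks like this (a sketch only; the path `/var/swapfile` is my choice, so adjust it and the size to your setup):

```shell
# Create and enable a 4 GB swap file (requires root).
sudo fallocate -l 4G /var/swapfile
sudo chmod 600 /var/swapfile      # swap files must not be world-readable
sudo mkswap /var/swapfile
sudo swapon /var/swapfile

# Make it persistent across reboots:
echo '/var/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

# Verify the new swap size:
free -h
```

After this, `free -h` should show the enlarged swap total, and the OOM killer should no longer terminate the model load.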

Hi,

Good to know it works now.
Thanks for sharing the fix.

Thanks.