Regarding the issue with memory usage on jtop

Hi everyone,
Could you please help me understand what “used” and “GPU sh” in jtop mean? I noticed that when my service starts, “GPU sh” shows 5.4 GB, but the “used” value doesn’t increase by the same 5.4 GB. Why is this happening?

From my understanding, “used” represents the memory usage (CPU + GPU). So, when my GPU usage increases by 5.4 GB, I would expect the “used” value to also increase by 5.4 GB. However, this doesn’t seem to be the case. Can anyone explain why?


SW: JetPack 6
HW: AGX Orin 64GB

Hi,
GPU Sh stands for GPU shared memory. Please check

GitHub - rbonghi/jetson_stats: 📊 Simple package for monitoring and control your NVIDIA Jetson [Orin, Xavier, Nano, TX] series

And the code is here:

jetson_stats/jtop/core/memory.py at 45dbf289596db88c0a27edf21a901d6f12bcb850 · rbonghi/jetson_stats · GitHub

So the shared memory appears to be static and is not exposed to the GPU when no GPU task is running.

Hi @DaneLLL
So, the shared memory seems to be static and not exposed to the GPU when no GPU tasks are running. Does this mean that even though memory has already been allocated as GPU shared memory, it won’t show up as ‘used’ if nothing is actively running? Has it actually been allocated already? In my case, my service loads the LLM first; wouldn’t that count as being ‘in use’?

Hi,
It would be great if you could share the steps to reproduce the memory usage on the developer kit, so that we can replicate it and do further analysis.

Hi @DaneLLL
These are the steps:

  1. Run Docker to load an LLM model.
  2. Stop the Docker container.
  3. Use jtop to observe changes in memory usage.

Based on my observations, Figure 1 shows memory usage without loading the LLM, while Figure 2 shows the usage after loading the LLM. As you can see, there is a significant increase in GPU shared memory (GPU sh). However, the increase in used memory is not proportional to the rise in GPU shared memory.
Figure 1: memory usage before loading the LLM

Figure 2: memory usage after loading the LLM

Hi,

These steps are a bit rough; it would be great if you could share the exact commands directly.

@DaneLLL

  • Run Docker to load an LLM model:
    Use the following code to load the model (a fuller, self-contained sketch follows after this list):
load_model(
    model_name="hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4",
    device='cuda',
    torch_dtype=torch.float16,
    quantization_config=quantization_config
)
  • Stop the Docker container:
    Run the following command to stop and remove the container:
docker rm -f <container name>
  • Monitor memory usage with jtop:
    Open jtop and switch to the MEM tab to observe memory usage changes.
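
For context, load_model() and quantization_config in the snippet above come from our own service code, so that part is not self-contained. A rough equivalent using Hugging Face transformers would look something like the sketch below (assuming the transformers and autoawq packages are installed; it is only an approximation of what the service does):

# Rough, self-contained approximation of the load_model() call above (sketch only).
# Assumes `transformers` with AWQ support and `autoawq` are installed.
import torch
from transformers import AutoModelForCausalLM

model_id = "hugging-quants/Meta-Llama-3.1-8B-Instruct-AWQ-INT4"

# The checkpoint is already AWQ-quantized, so its quantization settings are read
# from the model files; only dtype and device placement are set here.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

Loading the model this way should reproduce the jump in GPU sh that the figures above show.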

Hi @kerokerokero

Thank you for your detailed post. I created jtop and am happy to support you as much as possible :-)

The memory usage is defined by these rules: jtop - jetson-stats 4.3.0

The Shared RAM for the NVIDIA Jetson refers to the RAM utilized by the GPU. This value is directly accessible from the board, while other RAM measurements are obtained from the file /proc/meminfo for each field.

Name      Description
tot       MemTotal
used      MemTotal - (Buffers + Cached)
free      MemFree
buffers   Buffers
cached    Cached + SReclaimable
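
If it helps, here is a minimal sketch of how those fields can be read straight from /proc/meminfo (just an illustration, not the actual jtop code):

# Illustration only (not the actual jtop code): read the same fields from /proc/meminfo.
def read_meminfo():
    fields = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, value = line.split(":")
            fields[key] = int(value.split()[0])  # values are reported in kB
    return fields

m = read_meminfo()
tot = m["MemTotal"]
free = m["MemFree"]
buffers = m["Buffers"]
cached = m["Cached"] + m["SReclaimable"]
used = tot - free - (buffers + cached)
print(f"tot={tot} used={used} free={free} buffers={buffers} cached={cached} (kB)")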

Let me know if this helps.

Best,
Raffaello

Hi @Raffaello
Based on the formula you provided, it seems that “Used” does not include “GPU sh,” which means my actual memory usage might be “Used + GPU sh,” right? What I’d like to understand is how to interpret these memory statistics to determine the actual resources required to run my services. For example, in the following case:

  • Used: 28.1 G
  • GPU sh: 12.3 G
  • Buffers: 287 M
  • Cached: 15.1 G
  • Free: 18.2 G
  • TOT: 61.4 G

According to the formula in the code:
Used = TOT - Free - (Buffers + Cached) = 61.4 - 18.2 - (0.287 + 15.1)
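
As a quick sanity check, plugging the rounded values above into that formula:

# Sanity check with the rounded values shown by jtop (in GB)
tot, free, buffers, cached = 61.4, 18.2, 0.287, 15.1
used = tot - free - (buffers + cached)
print(round(used, 1))  # 27.8, close to the reported 28.1 G (rounding differences)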

This raises some questions:

  1. It seems that “Used” cannot fully represent the memory required to run services. Would the actual required memory be Used + GPU sh?
  2. If not, does “GPU sh” mean memory is allocated but not actively used?
  3. If “GPU sh” is actively used, where would it be reflected? For example, would “Free” decrease or “Cached” increase?
  4. If there isn’t enough memory available for “GPU sh” allocation, would the program crash?
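
For completeness, here is a minimal sketch of how these values could also be read through the jtop Python API; the key names below are my assumption from the docs, so they may be off:

# Sketch: read the RAM figures through the jtop Python API.
# The 'RAM'/'shared' key names are assumptions from the docs and may differ
# between jetson-stats versions; values should be in kB.
from jtop import jtop

with jtop() as jetson:
    if jetson.ok():
        ram = jetson.memory['RAM']
        print("used:", ram['used'], "kB")
        print("GPU sh:", ram['shared'], "kB")
        print("used + GPU sh:", ram['used'] + ram['shared'], "kB")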

I apologize for the many questions—it’s a bit overwhelming to figure this out.

Hi @kerokerokero

I’m sorry, but could you please open a GitHub issue on jetson-stats (Issues · rbonghi/jetson_stats · GitHub)? This will allow me to assist you better.

Best,
Raffaello

Hi @Raffaello
I’ve created a new issue, but I’m not sure if the tag I used is appropriate since I couldn’t find any tags specifically for asking questions.