I am using a Jetson Orin Nano Dev Kit (8GB) with JetPack 6.
I noticed that the default installation configures zram for swap.
htop shows 8GB of available RAM plus around 3.7GB of swap.
Question
Wouldn't it be more efficient to place a swap file/partition on disk (in my case, on the SD card)? What about an NVMe SSD?
Is it worth keeping both types of swap (prioritizing one over the other)?
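(For context: Linux supports multiple active swap devices with explicit priorities, so both could coexist in principle. A minimal sketch of the mechanism; the paths and priority values are illustrative:)

```bash
# Higher-priority swap is used first; disk swap then acts as overflow.
sudo swapon -p 100 /dev/zram0    # zram first (fast, in-RAM)
sudo swapon -p 10  /swapfile     # disk-backed swap second
swapon --show                    # verify devices, sizes, and priorities
# Persistent equivalent in /etc/fstab:
#   /swapfile  none  swap  sw,pri=10  0  0
```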
My use case
I tried running one of the jetson-containers, Whisper. When running the default notebook, it got stuck at the very end, during inference. I noticed that memory was exhausted (both RAM and swap). Adding extra swap space on disk (the SD card in my case) was needed to run inference successfully.
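For reference, the extra swap was added with roughly the standard swap-file procedure (size and path are illustrative):

```bash
# Create and enable a disk-backed swap file on the SD card's root filesystem
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile          # swap files must not be world-readable
sudo mkswap /swapfile
sudo swapon /swapfile
# Persist across reboots:
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```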
The model page lists Jetson Orin Nano (8GB) as a supported platform.
Disk is far slower than RAM. If you swap to a solid-state disk and write to it constantly, you will also greatly reduce the lifetime of the storage media.
Some applications require swap (virtual memory) to exist. No kernel-space driver will use swap, though, since drivers tend to work with physical addresses most of the time.
Keep in mind in what follows that Jetsons have an integrated GPU (iGPU) wired directly to the memory controller and do not have their own RAM. This contrasts with a PCI-based GPU, which is discrete (dGPU) and has its own RAM.
Inference and GPU memory are a special case. The GPU drivers, like many drivers, cannot use virtual memory and require physical RAM. Making swap available to user-space programs can free more physical RAM for the GPU, but since that swap is zram, it is something of a tradeoff: zram itself lives in RAM. On the other hand, zram is compressed, so a user-space app whose pages sit in zram takes less RAM than one keeping everything uncompressed. The GPU also at times requires contiguous memory addresses, so even if you have enough RAM for inference, it might still not be enough contiguous RAM.
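If you want to see that tradeoff concretely, the util-linux tools report the compressed footprint of the zram devices (exact output columns vary by version):

```bash
zramctl          # per zram device: algorithm, uncompressed DATA vs. COMPR size
swapon --show    # all active swap devices, with sizes and priorities
free -h          # overall RAM and swap usage
```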
There are good reasons why people will pay more money for a slower PCI-based GPU if it has a lot more VRAM. Things like training can benefit from this.
I'm guessing @dusty_nv can give you some good advice on running out of memory during inference.
@paaabl0 in the case of these large models, I prefer to mount the swap on NVMe and disable zram, as shown in the setup docs:
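The steps are roughly along these lines (the /ssd mount point and the 16G size are examples; adjust for your setup):

```bash
# Disable the zram service Jetson ships with, then create NVMe-backed swap
sudo systemctl disable nvzramconfig       # takes effect on the next boot
sudo fallocate -l 16G /ssd/16GB.swap      # /ssd = example NVMe mount point
sudo mkswap /ssd/16GB.swap
sudo swapon /ssd/16GB.swap
```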
As @linuxdev mentioned, this can cause premature wear on the storage media, particularly noticeable with SD cards (use high-endurance SD cards if not on NVMe).