Hi! While investigating how to fit a rather memory-hungry application into the 4 GB of RAM on a Jetson Nano, I wrote a test program that just allocates memory (host, device, pinned, and unified types) and initializes it, so that I could understand how allocation behaves on the Jetson.
The results are interesting, and I'd like to ask a few questions:
- If I allocate a large amount of memory (say 2-3 GB) via the pinned, unified, or device method, CPU RAM consumption also rises by about 400 MB. Are those page/allocation tables in the driver? (The number of allocations doesn't matter: with one or many, CPU RAM usage goes up by around 400 MB.)
- Does GPU memory oversubscription (via `cudaMallocManaged`) work on Jetson?
- When the test app's usage reaches around 3 GB, device allocations become REALLY slow. Is that because zram swap kicks in and eats up time?
- It looks like I can't allocate more than ~3 GB of GPU RAM (give or take); the app just gets killed by the OOM killer. Is there a way to squeeze out a little more by tuning the OS? For example, I need to allocate 3.2 GB, but only ~3.0-3.1 GB is doable.
Thank you!
Env: Jetson Nano (4 GB RAM), Ubuntu 18.04, JetPack with L4T 32.4.4, jetson_stats for collecting information.
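
In case it helps anyone reproduce the CPU RAM and swap observations above, here is a minimal Python sketch for comparing `/proc/meminfo` snapshots taken before and after an allocation. It is not the test program itself; the helper names (`parse_meminfo`, `read_meminfo`, `delta_mb`) are made up for illustration, and it assumes the standard Linux `/proc/meminfo` layout with sizes reported in kB.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: size in kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines that look like "Name:   12345 kB".
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def read_meminfo(path="/proc/meminfo"):
    """Take a snapshot of the current memory counters (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


def delta_mb(before, after, keys=("MemAvailable", "SwapFree")):
    """Return the change in MB for each key present in both snapshots."""
    return {
        key: (after[key] - before[key]) / 1024.0
        for key in keys
        if key in before and key in after
    }
```

Usage would be to call `read_meminfo()` once before the allocation and once after it, then pass both snapshots to `delta_mb`. A drop in `SwapFree` while allocations slow down would suggest that swap (zram included) is being used, though it does not by itself show that swap is the cause of the slowdown.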