Why is the memory for the GPU so small?

Hi,

When I use TensorFlow, why is the memory reported for the GPU so small, as below?

name: GP10B
major: 6 minor: 2 memoryClockRate (GHz) 1.3005
pciBusID 0000:00:00.0
Total memory: 7.67GiB
Free memory: 734.84MiB

From System Monitor I can see there is still about 4GB of free memory. So why is only ~700MB reported free for the GPU?

Thanks for your help!

It’s unified memory, so GPU memory maps directly to some subset of system memory…it isn’t a fixed carve-out like on a desktop PC. If the GPU needs 5GB, and 5GB is available (I think it has to be a contiguous 5GB), then the GPU will get 5GB. You wouldn’t want memory reserved for the GPU that the GPU doesn’t need.
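If part of the concern is TensorFlow itself reserving most of the free memory at startup, TF 1.x (the version in the log above) lets you grow or cap its allocation. A minimal sketch, assuming TensorFlow 1.x is installed:

```python
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate only as needed
# or cap TF at a fraction of device memory instead:
# config.gpu_options.per_process_gpu_memory_fraction = 0.5
sess = tf.Session(config=config)
```

On a unified-memory board like the TX2 this only changes what TensorFlow grabs; it doesn’t change the total pool shared with the CPU.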

Chengjiu, how did you get this list of details about the RAM usage on the Jetson?

linuxdev, if the GPU memory varies, why can’t Chengjiu get more than 700MB for his application?

By the way, I run openpose (https://github.com/CMU-Perceptual-Computing-Lab/openpose), which uses Caffe to recognize human body movements, and I hit this issue:

------------- console message -------------

Starting pose estimation demo.
Starting thread(s)
F0615 14:54:58.332474 27856 syncedmem.cpp:71] Check failed: error == cudaSuccess (2 vs. 0) out of memory
*** Check failure stack trace: ***
@ 0x7fb5f8b718 google::LogMessage::Fail()
@ 0x7fb5f8d614 google::LogMessage::SendToLog()
@ 0x7fb5f8b290 google::LogMessage::Flush()
@ 0x7fb5f8deb4 google::LogMessageFatal::~LogMessageFatal()
@ 0x7fb53be988 caffe::SyncedMemory::mutable_gpu_data()
@ 0x7fb53c258c caffe::Blob<>::mutable_gpu_data()
@ 0x7fb60a589c op::NmsCaffe<>::Forward_gpu()
@ 0x7fb6093c18 op::PoseExtractorCaffe::forwardPass()
@ 0x7fb609905c op::WPoseExtractor<>::work()
@ 0x7fb60578e4 op::Worker<>::checkAndWork()
@ 0x7fb605ab50 op::SubThread<>::workTWorkers()
@ 0x7fb6063e9c op::SubThreadQueueInOut<>::work()
@ 0x7fb605faa8 op::Thread<>::threadFunction()
@ 0x7fb5e7b280 (unknown)
@ 0x7fb4974fc4 start_thread
Aborted (core dumped)


So I suspect “out of memory” means the GPU ran out of memory.

When I run System Monitor to track the RAM usage, it never goes above 3.5GB (almost 38% RAM usage before openpose crashes).

So, based on Chengjiu’s report and my own observation, I suspect the GPU’s share of memory is limited and fixed.

I would be glad to hear different opinions or comments.

If you agree or see my point, could you please provide me with a way to increase the GPU RAM limit?

Thank you for your help anyway!!!

The memory in a system is normally fragmented and the memory controller creates “virtual” contiguous blocks, but there are cases where the physical memory itself must be contiguous…MMU pretending the memory is contiguous does not always suffice. When your GPU is out of memory it may be telling you that you are out of contiguous physical memory, as opposed to memory the MMU can remap to look contiguous. If this is the case, then reserving a block for the GPU at boot time (perhaps as an argument to the kernel in U-Boot’s extlinux.conf file) may solve the issue.

Thank you very much for your help…

So, my last question: do you know which argument I should change or add in the extlinux.conf file?

I see this:


DEFAULT primary

MENU TITLE p2771-0000 eMMC boot options

LABEL primary
MENU LABEL primary kernel
LINUX /boot/Image
APPEND fbcon=map:0 net.ifnames=0 console=tty0 OS=l4t console=ttyS0,115200n8 memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 gpt tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 android.kerneltype=normal androidboot.serialno=0335115020673 vpr_resize root=/dev/mmcblk0p1 rw rootwait


Thanks again!!

Someone else will have to answer…I remember looking at this about a year ago, but it was for the 3.x kernels and I don’t recall how to bind that memory to the GPU (I just remember it was used on the APPEND parameter of extlinux.conf). Anyone here remember what the APPEND extlinux.conf entry is for reserving a specific amount of memory for the GPU at boot time?

Is there a difference between reserving the block of memory for the GPU on 4.x kernels versus 3.x kernels? I suspect this has changed in part because caching may have changed going from the TX1 to TX2 (TX2 may get to use caching in some cases where a TX1 required disabling cache for pinned memory).

Hi,

We divide memory equally into two areas, one for big pages and the other for small pages.
Currently, our driver doesn’t allow a single allocation to cross areas (i.e., to be larger than half the physical memory size).
For the TX2, the maximum single allocation is ~4G (cudaMalloc, cudaHostAlloc, cudaMallocManaged).

The only exception is that cudaMalloc can still fall back to the small-page area if all the memory in the big-page area is in use.
But this only works for allocations smaller than 4G.
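Quick arithmetic on the split described above, using the device total that appears in the TX2 log later in this thread (the exact accounting is my assumption for illustration, not an official formula):

```python
# Halving the reported TX2 device total gives the per-area ceiling,
# which matches the ~4G single-allocation cap described above.
total_mb = 7854.06        # device total reported by cudaMemGetInfo on a TX2
area_mb = total_mb / 2    # one big-page area, one small-page area
print("per-area ceiling: %.2f MB" % area_mb)  # ~3927 MB
```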

We have already removed this limitation in our next release.
Please wait for our announcement and update.

Thanks.

Hi AastaLLL,

Thanks for your reply.

How do you distinguish big-page from small-page allocations?

In my case ( https://devtalk.nvidia.com/default/topic/1013464/jetson-tx2/gpu-out-of-memory-when-the-total-ram-usage-is-2-8g/post/5164646/#5164646 ), before the “out of memory” error, every cudaMalloc request is less than 60MB, and some are less than 1MB. Are all of these allocations in the half reserved for small pages?

Thanks.

Hi,

Could you get the device memory information before and after each cudaMalloc call?

For example, use the function ‘printMemInfo’ in this comment:
https://devtalk.nvidia.com/default/topic/1013464/jetson-tx2/gpu-out-of-memory-when-the-total-ram-usage-is-2-8g/post/5168834/#5168834
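A printMemInfo-style helper essentially wraps cudaMemGetInfo (free/total in bytes) and formats the used/free/total figures. A hedged sketch: on the device the query would come from pycuda.driver.mem_get_info (or cudaMemGetInfo in C++); here the query function is injectable so the formatting logic can be demonstrated without a GPU, using TX2-sized numbers as a stand-in:

```python
def mem_info_line(tag, query):
    """Format one memory snapshot; query() returns (free, total) in bytes,
    like cudaMemGetInfo."""
    free, total = query()
    mb = 1024.0 ** 2
    return "%s used %.2f MB free %.2f MB total %.2f MB" % (
        tag, (total - free) / mb, free / mb, total / mb)

# Fake query standing in for pycuda.driver.mem_get_info on a TX2-sized board.
fake = lambda: (4818 * 1024 ** 2, 7854 * 1024 ** 2)
print(mem_info_line("before cudaMalloc:", fake))
```

Calling this immediately before and after each cudaMalloc is what produces logs like the one below.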

Hi AastaLLL,

These are the memory readings taken before each cudaMalloc call:

Memory need is 2 MB
GPU is 0
GPU memory used 3035.39 MB
GPU memory free 4818.67 MB
GPU memory total 7854.06 MB
Memory need is 58 MB
GPU is 0
GPU memory used 3045.33 MB
GPU memory free 4808.73 MB
GPU memory total 7854.06 MB
Memory need is 58 MB
GPU is 0
GPU memory used 3118.68 MB
GPU memory free 4735.38 MB
GPU memory total 7854.06 MB
Memory need is 14 MB
GPU is 0
GPU memory used 3179.35 MB
GPU memory free 4674.71 MB
GPU memory total 7854.06 MB
Memory need is 14 MB
GPU is 0
GPU memory used 3194.12 MB
GPU memory free 4659.94 MB
GPU memory total 7854.06 MB
Memory need is 29 MB
GPU is 0
GPU memory used 3209.38 MB
GPU memory free 4644.68 MB
GPU memory total 7854.06 MB
Memory need is 29 MB
GPU is 0
GPU memory used 3239.2 MB
GPU memory free 4614.86 MB
GPU memory total 7854.06 MB
Memory need is 7 MB
GPU is 0
GPU memory used 3268.76 MB
GPU memory free 4585.3 MB
GPU memory total 7854.06 MB
Memory need is 7 MB
GPU is 0
GPU memory used 3276.14 MB
GPU memory free 4577.91 MB
GPU memory total 7854.06 MB
Memory need is 14 MB
GPU is 0
GPU memory used 3285.59 MB
GPU memory free 4568.47 MB
GPU memory total 7854.06 MB
Memory need is 14 MB
GPU is 0
GPU memory used 3302.16 MB
GPU memory free 4551.89 MB
GPU memory total 7854.06 MB
Memory need is 14 MB
GPU is 0
GPU memory used 3319.36 MB
GPU memory free 4534.7 MB
GPU memory total 7854.06 MB
Memory need is 14 MB
GPU is 0
GPU memory used 3337.04 MB
GPU memory free 4517.02 MB
GPU memory total 7854.06 MB
Memory need is 3 MB
GPU is 0
GPU memory used 3352.54 MB
GPU memory free 4501.52 MB
GPU memory total 7854.06 MB
Memory need is 3 MB
GPU is 0
GPU memory used 3352.54 MB
GPU memory free 4501.52 MB
GPU memory total 7854.06 MB
Memory need is 7 MB
GPU is 0
GPU memory used 3355.45 MB
GPU memory free 4498.61 MB
GPU memory total 7854.06 MB
Memory need is 7 MB
GPU is 0
GPU memory used 3382.93 MB
GPU memory free 4471.12 MB
GPU memory total 7854.06 MB
Memory need is 3 MB
GPU is 0
GPU memory used 3396.01 MB
GPU memory free 4458.05 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3400.86 MB
GPU memory free 4453.2 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3400.86 MB
GPU memory free 4453.2 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3401.1 MB
GPU memory free 4452.96 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3401.1 MB
GPU memory free 4452.96 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3403.28 MB
GPU memory free 4450.78 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3405.21 MB
GPU memory free 4448.84 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3409.33 MB
GPU memory free 4444.73 MB
GPU memory total 7854.06 MB
Memory need is 7 MB
GPU is 0
GPU memory used 3411.51 MB
GPU memory free 4442.55 MB
GPU memory total 7854.06 MB
Memory need is 7 MB
GPU is 0
GPU memory used 3427.81 MB
GPU memory free 4426.25 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3435.32 MB
GPU memory free 4418.74 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3435.32 MB
GPU memory free 4418.74 MB
GPU memory total 7854.06 MB
Memory need is 2 MB
GPU is 0
GPU memory used 3435.32 MB
GPU memory free 4418.74 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3442.59 MB
GPU memory free 4411.47 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3449.73 MB
GPU memory free 4404.33 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3455.3 MB
GPU memory free 4398.76 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3460.63 MB
GPU memory free 4393.43 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3465.96 MB
GPU memory free 4388.1 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3471.53 MB
GPU memory free 4382.53 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3476.86 MB
GPU memory free 4377.2 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3482.18 MB
GPU memory free 4371.88 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3487.75 MB
GPU memory free 4366.3 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3493.08 MB
GPU memory free 4360.98 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3495.5 MB
GPU memory free 4358.55 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3497.44 MB
GPU memory free 4356.62 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3499.62 MB
GPU memory free 4354.44 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3499.74 MB
GPU memory free 4354.32 MB
GPU memory total 7854.06 MB
Memory need is 2 MB
GPU is 0
GPU memory used 3499.74 MB
GPU memory free 4354.32 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3506.89 MB
GPU memory free 4347.17 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3513.91 MB
GPU memory free 4340.15 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3519.48 MB
GPU memory free 4334.58 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3524.81 MB
GPU memory free 4329.25 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3530.14 MB
GPU memory free 4323.92 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3535.46 MB
GPU memory free 4318.59 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3540.79 MB
GPU memory free 4313.27 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3546 MB
GPU memory free 4308.06 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3551.45 MB
GPU memory free 4302.61 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3556.82 MB
GPU memory free 4297.23 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3559.25 MB
GPU memory free 4294.81 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3561.21 MB
GPU memory free 4292.84 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3563.39 MB
GPU memory free 4290.66 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3565.33 MB
GPU memory free 4288.73 MB
GPU memory total 7854.06 MB
Memory need is 2 MB
GPU is 0
GPU memory used 3565.33 MB
GPU memory free 4288.73 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3572.9 MB
GPU memory free 4281.16 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3579.46 MB
GPU memory free 4274.59 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3584.9 MB
GPU memory free 4269.16 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3590.23 MB
GPU memory free 4263.83 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3595.8 MB
GPU memory free 4258.26 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3601.16 MB
GPU memory free 4252.9 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3606.37 MB
GPU memory free 4247.69 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3611.7 MB
GPU memory free 4242.36 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3617.02 MB
GPU memory free 4237.04 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3622.35 MB
GPU memory free 4231.71 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3624.77 MB
GPU memory free 4229.29 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3626.59 MB
GPU memory free 4227.47 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3628.77 MB
GPU memory free 4225.29 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3628.77 MB
GPU memory free 4225.29 MB
GPU memory total 7854.06 MB
Memory need is 2 MB
GPU is 0
GPU memory used 3628.77 MB
GPU memory free 4225.29 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3636.04 MB
GPU memory free 4218.02 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3642.86 MB
GPU memory free 4211.2 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3648.43 MB
GPU memory free 4205.63 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3653.75 MB
GPU memory free 4200.3 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3659.08 MB
GPU memory free 4194.98 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3664.41 MB
GPU memory free 4189.65 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3669.62 MB
GPU memory free 4184.44 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3674.95 MB
GPU memory free 4179.11 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3680.28 MB
GPU memory free 4173.78 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3685.64 MB
GPU memory free 4168.42 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3687.82 MB
GPU memory free 4166.24 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3690.24 MB
GPU memory free 4163.82 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3692.18 MB
GPU memory free 4161.88 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3694.12 MB
GPU memory free 4159.94 MB
GPU memory total 7854.06 MB
Memory need is 2 MB
GPU is 0
GPU memory used 3694.12 MB
GPU memory free 4159.94 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3701.38 MB
GPU memory free 4152.68 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3707.92 MB
GPU memory free 4146.14 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3713.61 MB
GPU memory free 4140.45 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3718.58 MB
GPU memory free 4135.48 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3723.82 MB
GPU memory free 4130.23 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3728.91 MB
GPU memory free 4125.15 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3734.12 MB
GPU memory free 4119.94 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3739.46 MB
GPU memory free 4114.6 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3746.24 MB
GPU memory free 4107.82 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3751.57 MB
GPU memory free 4102.49 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3753.63 MB
GPU memory free 4100.43 MB
GPU memory total 7854.06 MB
Memory need is 1 MB
GPU is 0
GPU memory used 3755.69 MB
GPU memory free 4098.37 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3757.62 MB
GPU memory free 4096.43 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3757.62 MB
GPU memory free 4096.43 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3757.62 MB
GPU memory free 4096.43 MB
GPU memory total 7854.06 MB
Memory need is 52 MB
GPU is 0
GPU memory used 3759.6 MB
GPU memory free 4094.46 MB
GPU memory total 7854.06 MB
Memory need is 0 MB
GPU is 0
GPU memory used 3812.4 MB
GPU memory free 4041.66 MB
GPU memory total 7854.06 MB
Memory need is 52 MB
GPU is 0
GPU memory used 3812.4 MB
GPU memory free 4041.66 MB
GPU memory total 7854.06 MB
F0623 19:30:59.064242 1584 syncedmem.cpp:88] Check failed: error == cudaSuccess (2 vs. 0) out of memory

To be clearer: I print the “Memory need” value first, then the memory information, before calling cudaMalloc.
But the reported memory usage changes slightly later than the cudaMalloc call itself.
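That one-step lag can be seen by differencing consecutive “used” values in the log. A small pure-Python sketch over the first few snapshots above (each delta reflects the previous request plus any runtime overhead):

```python
def used_deltas(lines):
    """Change in the 'used' counter between consecutive log snapshots."""
    used = [float(l.split()[3]) for l in lines
            if l.startswith("GPU memory used")]
    return [round(b - a, 2) for a, b in zip(used, used[1:])]

log = [
    "Memory need is 2 MB",  "GPU memory used 3035.39 MB",
    "Memory need is 58 MB", "GPU memory used 3045.33 MB",
    "Memory need is 58 MB", "GPU memory used 3118.68 MB",
]
print(used_deltas(log))  # [9.94, 73.35]
```

The 2 MB request shows up as a ~10 MB jump one snapshot later, and the 58 MB request as a ~73 MB jump, consistent with the reporting lag described above.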

Hi,

  1. Just confirmed that a cudaMalloc() allocation can reach ~7G.
    Source can be found here:
    https://devtalk.nvidia.com/default/topic/1013464/jetson-tx2/gpu-out-of-memory-when-the-total-ram-usage-is-2-8g/post/5168834/#5168834

  2. There is also a ‘CaffeMallocHost’ function in syncedmem.cpp.
    Could you check which allocation type causes the out-of-memory error?

Thanks.

Could zero-copy memory allocation help resolve the issue, by replacing the default cudaMalloc() with cudaHostAlloc(), without waiting for the new release?

reference: https://arrayfire.com/zero-copy-on-tegra-k1/

Hi,

Zero-copy allocates memory with cudaHostAlloc(), which hits the same half-size limitation.

Please check here for details:
https://devtalk.nvidia.com/default/topic/1013464/jetson-tx2/gpu-out-of-memory-when-the-total-ram-usage-is-2-8g/post/5172688/#5172688