Increase TX1 Free Memory

Hi,

I’m running a neural-network-based object detection algorithm with TensorFlow on the TX1, and I’d like to increase the amount of free memory in the hope of improving the network’s performance.

When I run the program, I get the following message from TensorFlow when it initializes the graph:

W tensorflow/core/common_runtime/bfc_allocator.cc:213] Ran out of memory trying to allocate 3.12GiB. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory is available.

Prior to that, TensorFlow reports the following:

name: NVIDIA Tegra X1
major: 5 minor: 3 memoryClockRate (GHz) 0.072
pciBusID 0000:00:00.0
Total memory: 3.90GiB
Free memory: 2.01GiB

I assume that if I can increase the free memory to more than 3.12 GiB, I might avoid the OOM warning and improve the performance of the program. Right now it executes correctly, but it is about 5x slower than it should be. Any help alleviating this problem would be appreciated.
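To make the numbers concrete, here is a small sketch of the arithmetic using only the three figures from the log above. The function name `memory_budget` is made up for illustration; nothing here talks to TensorFlow or the driver.

```python
# Figures copied from the TensorFlow log above, in GiB.
TOTAL_GIB = 3.90      # "Total memory"
FREE_GIB = 2.01       # "Free memory"
REQUESTED_GIB = 3.12  # size of the allocation that triggered the warning


def memory_budget(total, free, requested):
    """Return (in_use, shortfall, fits_in_total) for one allocation request.

    in_use        -- memory already taken by something else
    shortfall     -- how much more would have to be freed for the request
    fits_in_total -- whether the request could fit even on an idle system
    """
    in_use = total - free
    shortfall = max(0.0, requested - free)
    fits_in_total = requested <= total
    return in_use, shortfall, fits_in_total


in_use, shortfall, fits = memory_budget(TOTAL_GIB, FREE_GIB, REQUESTED_GIB)
print("already in use:          %.2f GiB" % in_use)
print("would need to be freed:  %.2f GiB" % shortfall)
print("fits in total memory:    %s" % fits)
```

So roughly 1.11 GiB of the 1.89 GiB currently in use would have to be released before a 3.12 GiB allocation could succeed.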

You could add a few gigabytes of swap on an external device (USB or SATA disk, SD card), either as a swap partition or as a swap file, enabled with swapon.

Interesting. I actually have several GB of swap allocated on an external SSD. The TX1 definitely uses it for memory-intensive tasks, but it doesn’t seem to be visible to TensorFlow (which is maybe just using GPU memory?). I’m not sure swap would be fast enough to significantly speed things up, but I’d be willing to try. Any idea how to make the swap visible to TensorFlow?

My guess is that memory accessed directly by the GPU has to be actual physical memory. Swap would let other, competing tasks be paged out, so they put less pressure on physical memory, but the GPU would not use the swap space directly.

Yeah, that makes sense to me. My sense is that, even if the GPU could access swap directly, the latency of accessing it would be a serious performance issue for something as memory-intensive as a neural network.

Any thoughts on how to increase free physical memory, or on what is using the unavailable 1.9 GiB?
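One way to start answering that question is to look at the kernel's own accounting in /proc/meminfo. The sketch below is only an illustration: the helper names `parse_meminfo` and `kib_to_gib` are made up, and `MemAvailable` may be missing on older kernels, so the script simply skips any field it cannot find.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field name: integer value}.

    Most values are in KiB (lines ending in "kB"); a few, such as the
    HugePages_* counters, are plain counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def kib_to_gib(kib):
    """Convert KiB to GiB."""
    return kib / (1024.0 * 1024.0)


MEMINFO_PATH = "/proc/meminfo"
WANTED = ("MemTotal", "MemFree", "MemAvailable", "Buffers", "Cached",
          "SwapTotal", "SwapFree")

# Only read the file where it exists (i.e. on Linux).
if os.path.exists(MEMINFO_PATH):
    with open(MEMINFO_PATH) as f:
        info = parse_meminfo(f.read())
    for key in WANTED:
        if key in info:
            print("%-13s %6.2f GiB" % (key, kib_to_gib(info[key])))
```

Running it before and after stopping a service shows how much that service was actually holding.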

I’d run htop and watch as it runs out of memory. Keep an eye on the “RES” and “SHR” columns and see if something grows (“VIRT” isn’t actual used memory; e.g., Xorg will show what it is capable of using, not what it actually uses). I don’t know anything about the internals of the GPU driver. Maybe someone could comment on whether vmalloc might reserve memory for its purposes, or whether vmalloc is irrelevant to the GPU (perhaps the driver requires contiguous physical memory).
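If htop isn't handy, roughly the same information as its “RES” column can be pulled from the VmRSS line of /proc/<pid>/status. The following is a minimal sketch, not a replacement for htop: `parse_status` and `snapshot` are illustrative names, it takes a single snapshot rather than updating live, and it does not attempt to show anything the GPU driver may hold outside of process memory.

```python
import os


def parse_status(text):
    """Extract (process name, resident set size in KiB) from /proc/<pid>/status text.

    Kernel threads have no VmRSS line, so the size is None for them.
    """
    name, rss_kib = None, None
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss_kib = int(line.split()[1])
    return name, rss_kib


def snapshot(proc_root="/proc"):
    """Return [(rss_kib, pid, name)] for every process reporting VmRSS, largest first."""
    rows = []
    if not os.path.isdir(proc_root):
        return rows
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                name, rss_kib = parse_status(f.read())
        except (IOError, OSError, ValueError):
            continue  # process exited, is unreadable, or has odd contents
        if rss_kib is not None:
            rows.append((rss_kib, int(entry), name))
    return sorted(rows, reverse=True)


# Print the ten processes with the largest resident memory.
for rss_kib, pid, name in snapshot()[:10]:
    print("%8.1f MiB  pid %-6d %s" % (rss_kib / 1024.0, pid, name))
```

Running this a few times while the network initializes should show which processes grow and which ones (Xorg, for example) are simply sitting on memory.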

Hi iandreariley:

Have you figured out how to fix this?

I got more than 3.7 GB of RAM available on a TX1 (on R23 or R24) by disabling lightdm (so no X server running) and the CUPS-related services.
Performance was still not very good for my CUDA application, though.