GPU and RAM usage on Jetson TX2

Hi,

I am running the Darkflow version of the YOLOv2 object detection algorithm on a Jetson TX2. Below is the code for that.

import cv2
from darkflow.net.build import TFNet
import numpy as np
import time
import os
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

options = {
    'model': 'cfg/yolov2-tiny.cfg',
    'load': 'bin/yolov2-custom-tiny_280000.weights',
    'threshold': 0.5,
    'gpu': 0.5
}
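A side note on the snippet above: as far as I know, Darkflow's `gpu` option is applied as TensorFlow's `per_process_gpu_memory_fraction` inside the session that `TFNet` creates for itself, so the standalone `sess` built here with `allow_growth` does not affect Darkflow's own session. A minimal sketch of the two TF 1.x memory policies (assumes TensorFlow 1.x; `tf.ConfigProto` and `tf.Session` were removed from the top-level namespace in 2.x):

```python
# Sketch, assuming TensorFlow 1.x.
import tensorflow as tf

config = tf.ConfigProto()

# Option A: hard-cap this process at ~50% of GPU memory.
# (This is roughly what Darkflow's 'gpu': 0.5 option amounts to.)
config.gpu_options.per_process_gpu_memory_fraction = 0.5

# Option B (alternative): allocate GPU memory lazily, on demand,
# instead of reserving a fixed fraction up front.
# config.gpu_options.allow_growth = True

sess = tf.Session(config=config)
```

On the TX2 the distinction matters less than on a desktop, because the GPU has no dedicated VRAM: any memory the GPU fraction reserves comes out of the same 8 GB the CPU side uses.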

When I run the custom model, I get the following logs.

GPU mode with 0.5 usage
2020-09-08 14:20:55.360891: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-09-08 14:20:55.361222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3
pciBusID: 0000:00:00.0
2020-09-08 14:20:55.361332: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-09-08 14:20:55.361410: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-09-08 14:20:55.361477: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-09-08 14:20:55.361642: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-09-08 14:20:55.361749: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-09-08 14:20:55.361844: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-09-08 14:20:55.361939: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-09-08 14:20:55.362442: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-09-08 14:20:55.363104: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-09-08 14:20:55.363330: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2020-09-08 14:20:55.363488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-08 14:20:55.363544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-09-08 14:20:55.363589: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-09-08 14:20:55.364088: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-09-08 14:20:55.364683: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2020-09-08 14:20:55.364980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4385 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
Finished in 6.266222715377808s

2020-09-08 14:20:58.741729: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2020-09-08 14:21:06.414605: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.13GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.
2020-09-08 14:21:06.741963: W tensorflow/core/common_runtime/bfc_allocator.cc:237] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.13GiB with freed_by_count=0. The caller indicates that this is not a failure, but may mean that there could be performance gains if more memory were available.

I have the following components installed on my TX2:

  1. tensorflow-gpu version 1.14.0+nv19.10
  2. cuda 10.0
  3. libcudnn7
  4. libcudnn7-dev
  5. libopencv 3.3.1

Even though the TX2 has 8 GB of RAM, all of it gets exhausted; usage reaches around 99%, even though I have specified only 50% GPU usage in my code. What could be the reason for this? Do I need to do any configuration while using TensorFlow?
Does anybody have an idea how to optimise the memory and GPU usage?

Thanks in advance

One of the limitations of using a GPU (or most any physical device) is that its memory has to be contiguous physical memory; virtually remapped memory will not work for that. I do not know enough about it to improve the situation, but it comes down to this: if other processes do not use physical RAM, or swap out, then more is available to the GPU. Additionally, if a lot of programs take up small amounts of memory at different addresses, then there may be plenty of "spare" memory that is nonetheless fragmented, such that the contiguous-memory requirement cannot be met.

Someone else may be able to tell you how to reserve memory for your program during boot. Or if you simply run this program prior to other software running, then it might work without any extra effort.

EDIT: Adding swap does not directly help with GPU memory, but it means ordinary programs can swap out and leave more contiguous physical address space for the GPU.

Hi @linuxdev
Thanks for your reply.
I am not running any other software or applications, only this YOLO model. If I run the same model on a desktop (x64) machine with Ubuntu 18.04, I do not face this issue. However, on the desktop I have installed the tensorflow-cpu version, whereas I have installed tensorflow-gpu on the Jetson TX2.

Also consider that on the desktop the GPU has its own RAM, which is not shared with the rest of the system (but of course running a CPU-only version does not use much GPU RAM :P ).

Someone else will probably be able to suggest how to reserve RAM (at boot) for the GPU. I’m pretty sure it can be done with nothing more than kernel command line arguments, but I do not know the specifics.
