Python code using TensorFlow and CUDA on a Jetson TX2 is getting killed (logs below)

Could you please recommend what should be done based on the logs below:

2019-08-02 14:09:08.128243: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
WARNING: Logging before flag parsing goes to stderr.
2019-08-02 14:09:27.986443: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-08-02 14:09:28.036358: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:28.036555: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.02
pciBusID: 0000:00:00.0
2019-08-02 14:09:28.036653: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-02 14:09:28.036792: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-02 14:09:28.036910: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-08-02 14:09:28.073906: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-08-02 14:09:28.113267: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-08-02 14:09:28.139226: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-08-02 14:09:28.217861: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-08-02 14:09:28.218581: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:28.219103: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:28.219248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-02 14:09:28.241159: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2019-08-02 14:09:28.242110: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x8efae50 executing computations on platform Host. Devices:
2019-08-02 14:09:28.242170: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2019-08-02 14:09:28.335767: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:28.336613: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x8eefe50 executing computations on platform CUDA. Devices:
2019-08-02 14:09:28.337567: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): NVIDIA Tegra X2, Compute Capability 6.2
2019-08-02 14:09:28.344185: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:28.345441: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.02
pciBusID: 0000:00:00.0
2019-08-02 14:09:28.345867: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-02 14:09:28.346247: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-08-02 14:09:28.346485: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-08-02 14:09:28.346846: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-08-02 14:09:28.347169: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-08-02 14:09:28.347463: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-08-02 14:09:28.347712: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-08-02 14:09:28.349084: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:28.350704: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:28.351229: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-08-02 14:09:28.351946: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-08-02 14:09:33.653791: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-08-02 14:09:33.653950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-08-02 14:09:33.654000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-08-02 14:09:33.655143: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:33.656047: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:972] ARM64 does not support NUMA - returning NUMA node zero
2019-08-02 14:09:33.656393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1700 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
W0802 14:09:34.108056 548108099600 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/array_ops.py:1354: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
Killed

FYI:
CUDA test passed (version 10.0) using the samples
TensorFlow version 1.14.0
OpenCV version 3.4.0
Total available space locally on the Jetson = 1.4-1.8 GB
The video I am working on is 3.6 GB and is placed on an external HDD

That's the out-of-memory (OOM) kill. Even if you have swap, some operations require physical RAM, but swap may still help, since the operations that don't need physical RAM can be swapped out. I couldn't tell you how to change the number of threads in Python, but using fewer threads often means less memory.

To confirm this, you could install htop ("sudo apt-get install htop") and watch memory as your program runs. You'll see memory use climbing until the kill hits near the limit.
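
If you'd rather log it from inside the script, here is a minimal sketch (my own illustration, assuming a Linux /proc filesystem as on the Jetson) that prints the process's resident memory; the function name is just for illustration, call it periodically from your main loop:

# Minimal sketch: print this process's resident memory (VmRSS) from /proc.
# Linux-only, which is fine for Jetson/L4T.
def print_rss():
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS"):
                print(line.strip())  # e.g. "VmRSS:  1234567 kB"
                break

print_rss()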

Yes, I already did that. The GPU memory is overflowing, hence the kill.
I would like to know how I can reduce the number of threads on my system for this code.

I doubt the number of threads is the issue though…
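
If you do still want to experiment with it, TensorFlow 1.x exposes its thread-pool sizes through ConfigProto. A minimal sketch (my illustration; I haven't verified it makes a difference on the TX2):

import tensorflow as tf

# Cap TensorFlow's internal thread pools (TF 1.x API).
config = tf.ConfigProto(
    intra_op_parallelism_threads=2,  # threads used within a single op
    inter_op_parallelism_threads=2)  # threads used to run independent ops
session = tf.Session(config=config)

This mainly affects CPU-side memory, so it may not help much with a GPU OOM on its own.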

Without seeing the code it's really hard to say, but I would start by looking at the batch size if this is a DL model. That is typically the #1 cause of OOM kills.
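
For illustration only (hypothetical, since the code hasn't been shared): in a Keras-style setup the batch size is just the batch_size argument to fit(), and lowering it directly lowers per-step memory. The model and data below are stand-ins:

import numpy as np
import tensorflow as tf

# Stand-in data and model; replace with your own. The point is only that
# batch_size is the knob that controls how much memory each step needs.
x = np.random.rand(256, 32).astype("float32")
y = np.random.randint(0, 2, size=(256,))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(32,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Try e.g. 8 instead of 32/64; smaller batches use less GPU memory per step.
model.fit(x, y, batch_size=8, epochs=1)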

Yes, it is a deep learning model, and when I checked in htop it showed all 6 GPUs being filled and the memory crossing the maximum limit.
Can you guide me through how I can reduce the batch size of my DL model?

Hi,

The TX2 only has one GPU.
To reduce memory usage, you can try setting the per_process_gpu_memory_fraction option:

import tensorflow as tf

# Cap the fraction of GPU memory TensorFlow is allowed to allocate.
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config, ...)

Thanks.

Hi,
I'm having the same problem. I'm trying to convert a model trained with TensorFlow to ONNX with tf2onnx, and then convert it to TensorRT. Excuse me, but I'm not an expert; I don't know where I have to put this:
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.4
session = tf.Session(config=config, …)