TF-TRT conversion gets killed

Environment

TensorRT Version: 7.1.3
GPU Type: AGX Xavier
Nvidia Driver Version:
CUDA Version: 10.2
CUDNN Version: 8.0
Operating System + Version: Ubuntu 18.04
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable): 2.4.0
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): JETPACK 4.5

I am using TF-TRT on the Xavier to convert my SavedModel (ssd_resnet50), which was trained on the host machine, into a TensorRT engine. I am using the attached script: trtop.py (576 Bytes)
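For readers without access to the attachment, a TF-TRT conversion script for TensorFlow 2.x typically looks something like the sketch below. This is not the contents of trtop.py. The function names, the FP16 precision choice and the 1 GiB workspace size are assumptions made for illustration. TensorFlow is imported inside the function so the file can be loaded on a machine without it.

```python
def mib_to_bytes(mib):
    """Convert mebibytes to bytes for max_workspace_size_bytes."""
    return mib * 1024 * 1024


def convert_saved_model(input_dir, output_dir, workspace_mib=1024, precision="FP16"):
    """Convert a SavedModel with TF-TRT and write the result to output_dir.

    Hypothetical helper: the defaults are illustrative, not recommendations.
    """
    # Imported lazily: requires a TensorFlow build with TensorRT support.
    from tensorflow.python.compiler.tensorrt import trt_convert as trt

    params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
        precision_mode=precision,
        # Upper bound on the scratch memory TensorRT may use while building.
        max_workspace_size_bytes=mib_to_bytes(workspace_mib),
    )
    converter = trt.TrtGraphConverterV2(
        input_saved_model_dir=input_dir,
        conversion_params=params,
    )
    converter.convert()         # segments the graph and inserts TRT ops
    converter.save(output_dir)  # writes the converted SavedModel
```

It would be called as convert_saved_model("saved_model", "saved_model_trt"), where both directory names are placeholders.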

After running python3 trtop.py, I got the following output:
2021-03-19 22:11:20.936508: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-19 22:11:30.623547: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libnvinfer.so.7
2021-03-21 21:03:25.999071: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-21 21:03:25.999438: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-03-21 21:03:26.149679: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:26.149923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-21 21:03:26.149992: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-21 21:03:26.150112: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-21 21:03:26.150184: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-21 21:03:26.237558: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-21 21:03:26.272863: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-21 21:03:26.318791: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-21 21:03:26.381611: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-21 21:03:26.381943: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-21 21:03:26.382198: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:26.382506: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:26.382651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-21 21:03:26.384943: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-21 21:03:26.385234: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:26.385436: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-21 21:03:26.385524: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-21 21:03:26.385593: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-21 21:03:26.385639: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-21 21:03:26.385684: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-21 21:03:26.385731: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-21 21:03:26.385773: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-21 21:03:26.385818: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-21 21:03:26.385860: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-21 21:03:26.386048: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:26.386255: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:26.386327: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-21 21:03:26.386441: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-21 21:03:32.315554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1287] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-21 21:03:32.315651: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1293] 0
2021-03-21 21:03:32.315686: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0: N
2021-03-21 21:03:32.315977: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:32.316317: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:32.316476: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:03:32.316612: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9990 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2021-03-21 21:05:19.370619: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:19.371168: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2021-03-21 21:05:19.371768: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2021-03-21 21:05:19.372384: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-21 21:05:19.372707: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:19.372941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties:
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-21 21:05:19.373201: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-21 21:05:19.373373: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-21 21:05:19.373459: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-21 21:05:19.373564: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-21 21:05:19.373665: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-21 21:05:19.373738: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-21 21:05:19.373802: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-21 21:05:19.373860: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-21 21:05:19.374068: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:19.374212: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:19.374316: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-21 21:05:19.374470: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1287] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-21 21:05:19.374503: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1293] 0
2021-03-21 21:05:19.374530: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0: N
2021-03-21 21:05:19.374680: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:19.374857: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:19.375005: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9990 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2021-03-21 21:05:19.376628: W tensorflow/core/platform/profile_utils/cpu_utils.cc:116] Failed to find bogomips or clock in /proc/cpuinfo; cannot determine CPU frequency
2021-03-21 21:05:22.152016: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:954] Optimization results for grappler item: graph_to_optimize
function_optimizer: Graph size after: 4997 nodes (4509), 10438 edges (9943), time = 538.579ms.
function_optimizer: function_optimizer did nothing. time = 5.175ms.

2021-03-21 21:05:57.064943: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:57.087998: I tensorflow/core/grappler/devices.cc:69] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 1
2021-03-21 21:05:57.103567: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2021-03-21 21:05:57.158908: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-03-21 21:05:57.159736: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:57.168780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Found device 0 with properties: 
pciBusID: 0000:00:00.0 name: Xavier computeCapability: 7.2
coreClock: 1.377GHz coreCount: 8 deviceMemorySize: 15.45GiB deviceMemoryBandwidth: 82.08GiB/s
2021-03-21 21:05:57.190819: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.10.2
2021-03-21 21:05:57.292866: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.10
2021-03-21 21:05:57.338429: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.10
2021-03-21 21:05:57.386678: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-03-21 21:05:57.391626: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-03-21 21:05:57.405938: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-03-21 21:05:57.420817: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.10
2021-03-21 21:05:57.435876: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-03-21 21:05:57.436637: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:57.437033: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:57.449504: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1888] Adding visible gpu devices: 0
2021-03-21 21:05:57.472591: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1287] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-03-21 21:05:57.472731: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1293]      0 
2021-03-21 21:05:57.472847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306] 0:   N 
2021-03-21 21:05:57.490212: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:57.490994: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] ARM64 does not support NUMA - returning NUMA node zero
2021-03-21 21:05:57.491447: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9990 MB memory) -> physical GPU (device: 0, name: Xavier, pci bus id: 0000:00:00.0, compute capability: 7.2)
2021-03-21 21:06:21.153449: I tensorflow/compiler/tf2tensorrt/segment/segment.cc:790] There are 619 ops of 52 different types in the graph that are not converted to TensorRT: Sum, GreaterEqual, Where, Reciprocal, ResizeBilinear, Split, Cast, StopGradient, Range, Less, Merge, TensorListGetItem, Pad, Slice, LogicalAnd, Mul, NextIteration, Switch, Select, Exit, LoopCond, Pack, NoOp, Size, Greater, GatherV2, ExpandDims, Identity, Assert, NonMaxSuppressionV5, Squeeze, Enter, TensorListFromTensor, AddV2, TensorListSetItem, Placeholder, TensorListStack, TensorListReserve, Const, Sub, Reshape, Transpose, Minimum, Shape, Maximum, StridedSlice, Fill, Unpack, ConcatV2, Exp, Equal, TopKV2, (For more information see https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#supported-ops).
2021-03-21 21:06:23.661063: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:757] Number of TensorRT candidate segments: 4
Killed
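
A bare "Killed" with no Python traceback typically means the process received SIGKILL, and on Linux the out-of-memory killer is a common cause. On Jetson boards the CPU and GPU share the same physical memory, so it can help to check how much memory and swap are free just before the conversion starts. The following is a minimal sketch that reads /proc/meminfo; the function names are hypothetical and it only reports numbers, it does not confirm that the OOM killer was responsible.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def headroom_mib(info):
    """Available RAM plus free swap, in MiB (missing fields count as 0)."""
    return (info.get("MemAvailable", 0) + info.get("SwapFree", 0)) // 1024


def read_headroom_mib(path="/proc/meminfo"):
    """Read the live values on a Linux system."""
    with open(path) as f:
        return headroom_mib(parse_meminfo(f.read()))
```

Calling read_headroom_mib() right before converter.convert() gives a rough idea of how much room the TensorRT builder has. Watching the kernel log (for example with dmesg) after the crash is another way to see whether the OOM killer was involved.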

I think the process was killed because it ran out of memory (15.4 GB). If that is the cause, how can I allocate memory to avoid the problem? If not, any other ideas? Thanks.
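
One thing the log above suggests: TensorFlow created its GPU device with 9990 MB out of the 15.45 GiB that the Xavier shares between CPU and GPU, which leaves limited room for the TensorRT builder and the rest of the process. A possible mitigation, offered as a sketch rather than a confirmed fix for this model, is to let TensorFlow allocate GPU memory on demand instead of reserving it up front. The helper names below are hypothetical, and the arithmetic treats TensorFlow's "MB" as MiB.

```python
def remaining_mib(total_gib, reserved_mib):
    """Memory left after TensorFlow's up-front reservation, in MiB."""
    return total_gib * 1024 - reserved_mib


def enable_gpu_memory_growth():
    """Ask TensorFlow to grow GPU memory on demand.

    Must run before any GPU has been initialized (for example, before
    loading the model); otherwise TensorFlow raises a RuntimeError.
    """
    import tensorflow as tf  # imported lazily; requires TensorFlow 2.x

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

With the figures from the log, remaining_mib(15.45, 9990) comes to roughly 5830 MiB for everything else on the board. Adding swap space on the Xavier, or lowering max_workspace_size_bytes in the conversion parameters, are other options that may be worth trying.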

Hi,
We recommend checking the sample links below, as they might answer your question:
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#samples
https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-722/quick-start-guide/index.html#framework-integration
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#integrate-ovr
https://docs.nvidia.com/deeplearning/frameworks/tf-trt-user-guide/index.html#usingtftrt

If the issue persists, please share the model and script so that we can try to reproduce it on our end.
Thanks!

Hi, I followed the steps above when learning to use tf-trt. Here are my script and model. Thanks!
trtop.py (578 Bytes)
Model: my_model.zip - Google Drive