Thanks for your answer. It seems that using TensorFlow on Jetson boards requires too much GPU memory.
This is what I see when I run tegrastats while my program is executing; the RAM usage keeps increasing:
RAM 1883/31927MB (lfb 7202x4MB) SWAP 0/15964MB (cached 0MB) CPU [12%@2188,6%@2188,33%@2188,15%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 6% AO@45C GPU@45.5C Tdiode@49.5C PMIC@100C AUX@45C CPU@46C thermal@45.45C Tboard@44C
RAM 2034/31927MB (lfb 7141x4MB) SWAP 0/15964MB (cached 0MB) CPU [28%@2188,23%@2188,47%@2188,22%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45C GPU@46C Tdiode@49.75C PMIC@100C AUX@45C CPU@46.5C thermal@45.6C Tboard@44C
RAM 2222/31927MB (lfb 7076x4MB) SWAP 0/15964MB (cached 0MB) CPU [12%@2188,13%@2188,52%@2188,24%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 13% AO@45C GPU@46C Tdiode@49.5C PMIC@100C AUX@45C CPU@46.5C thermal@45.75C Tboard@44C
RAM 2439/31927MB (lfb 7007x4MB) SWAP 0/15964MB (cached 0MB) CPU [8%@2188,7%@2188,55%@2188,19%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45C GPU@46C Tdiode@49.75C PMIC@100C AUX@44.5C CPU@46.5C thermal@45.6C Tboard@44C
RAM 2803/31927MB (lfb 6858x4MB) SWAP 0/15964MB (cached 0MB) CPU [4%@2188,4%@2188,40%@2188,4%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45C GPU@45.5C Tdiode@49.5C PMIC@100C AUX@44.5C CPU@46.5C thermal@45.25C Tboard@44C
RAM 3550/31927MB (lfb 6671x4MB) SWAP 0/15964MB (cached 0MB) CPU [10%@2188,6%@2188,100%@2188,6%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45C GPU@45.5C Tdiode@49.75C PMIC@100C AUX@44.5C CPU@46.5C thermal@45.25C Tboard@44C
RAM 3166/31927MB (lfb 6766x4MB) SWAP 0/15964MB (cached 0MB) CPU [6%@2188,3%@2188,98%@2188,2%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45C GPU@45.5C Tdiode@49.5C PMIC@100C AUX@45C CPU@46.5C thermal@45.6C Tboard@44C
RAM 4609/31927MB (lfb 6406x4MB) SWAP 0/15964MB (cached 0MB) CPU [1%@2188,3%@2188,100%@2188,0%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@46.5C Tdiode@49.5C PMIC@100C AUX@45C CPU@46.5C thermal@45.75C Tboard@44C
RAM 5102/31927MB (lfb 6282x4MB) SWAP 0/15964MB (cached 0MB) CPU [35%@2188,21%@2188,77%@2188,20%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45C GPU@46C Tdiode@49.75C PMIC@100C AUX@45C CPU@47.5C thermal@45.75C Tboard@44C
RAM 5683/31927MB (lfb 6134x4MB) SWAP 0/15964MB (cached 0MB) CPU [63%@2188,36%@2188,100%@2188,32%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45C GPU@46.5C Tdiode@49.75C PMIC@100C AUX@45.5C CPU@48C thermal@46.2C Tboard@44C
RAM 6495/31927MB (lfb 5915x4MB) SWAP 0/15964MB (cached 0MB) CPU [100%@2188,62%@2188,75%@2188,37%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@46C Tdiode@49.75C PMIC@100C AUX@45.5C CPU@48.5C thermal@46.55C Tboard@44C
RAM 6571/31927MB (lfb 5895x4MB) SWAP 0/15964MB (cached 0MB) CPU [94%@2188,65%@2188,98%@2188,61%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 5% AO@45.5C GPU@46C Tdiode@49.75C PMIC@100C AUX@45.5C CPU@48C thermal@46.55C Tboard@44C
RAM 6747/31927MB (lfb 5830x4MB) SWAP 0/15964MB (cached 0MB) CPU [57%@2188,72%@2188,85%@2188,64%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 2% AO@45.5C GPU@46C Tdiode@49.75C PMIC@100C AUX@45.5C CPU@48C thermal@46.55C Tboard@44C
RAM 6990/31927MB (lfb 5752x4MB) SWAP 0/15964MB (cached 0MB) CPU [30%@2188,100%@2188,40%@2188,45%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 20% AO@45.5C GPU@46.5C Tdiode@49.75C PMIC@100C AUX@45.5C CPU@48C thermal@46.4C Tboard@44C
RAM 8315/31927MB (lfb 5411x4MB) SWAP 0/15964MB (cached 0MB) CPU [15%@2188,100%@2188,50%@2188,78%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@46C Tdiode@49.75C PMIC@100C AUX@45.5C CPU@48C thermal@46.4C Tboard@44C
RAM 12813/31927MB (lfb 4284x4MB) SWAP 0/15964MB (cached 0MB) CPU [14%@2188,69%@2188,34%@2188,100%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@46C Tdiode@50C PMIC@100C AUX@45.5C CPU@48C thermal@46.4C Tboard@44C
RAM 17146/31927MB (lfb 3201x4MB) SWAP 0/15964MB (cached 0MB) CPU [83%@2188,91%@2188,77%@2188,100%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@48.5C Tdiode@49.75C PMIC@100C AUX@45.5C CPU@48.5C thermal@46.4C Tboard@44C
RAM 21701/31927MB (lfb 2062x4MB) SWAP 0/15964MB (cached 0MB) CPU [46%@2188,100%@2188,23%@2188,100%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@47.5C Tdiode@50C PMIC@100C AUX@45.5C CPU@48C thermal@46.55C Tboard@44C
RAM 26288/31927MB (lfb 914x4MB) SWAP 0/15964MB (cached 0MB) CPU [54%@2188,31%@2188,66%@2188,64%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@48C Tdiode@50.25C PMIC@100C AUX@45.5C CPU@48C thermal@46.4C Tboard@44C
RAM 29597/31927MB (lfb 190x4MB) SWAP 1/15964MB (cached 0MB) CPU [100%@2188,23%@2188,100%@2188,41%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@46C GPU@46.5C Tdiode@50C PMIC@100C AUX@45.5C CPU@48.5C thermal@46.4C Tboard@44C
RAM 30805/31927MB (lfb 190x4MB) SWAP 15/15964MB (cached 0MB) CPU [100%@2188,95%@2188,74%@2188,68%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@46C Tdiode@50C PMIC@100C AUX@45.5C CPU@48.5C thermal@46.7C Tboard@44C
RAM 31115/31927MB (lfb 177x4MB) SWAP 186/15964MB (cached 0MB) CPU [100%@2188,84%@2188,73%@2188,100%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@46C GPU@48.5C Tdiode@50C PMIC@100C AUX@45.5C CPU@48.5C thermal@46.7C Tboard@44C
RAM 31171/31927MB (lfb 163x4MB) SWAP 434/15964MB (cached 1MB) CPU [94%@2188,74%@2188,69%@2188,100%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@45.5C GPU@46.5C Tdiode@50.25C PMIC@100C AUX@46C CPU@48.5C thermal@46.7C Tboard@44C
RAM 31285/31927MB (lfb 136x4MB) SWAP 717/15964MB (cached 12MB) CPU [71%@2188,100%@2188,100%@2188,89%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@46C GPU@48.5C Tdiode@50.5C PMIC@100C AUX@45.5C CPU@48.5C thermal@47.3C Tboard@44C
RAM 31411/31927MB (lfb 108x4MB) SWAP 1025/15964MB (cached 12MB) CPU [99%@2188,100%@2188,100%@2188,99%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@46C GPU@49C Tdiode@50C PMIC@100C AUX@45.5C CPU@49C thermal@47.3C Tboard@44C
RAM 31585/31927MB (lfb 68x4MB) SWAP 1392/15964MB (cached 12MB) CPU [99%@2188,95%@2188,99%@2188,98%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@46C GPU@46.5C Tdiode@50.25C PMIC@100C AUX@45.5C CPU@48.5C thermal@46.7C Tboard@44C
RAM 31719/31927MB (lfb 29x4MB) SWAP 2037/15964MB (cached 12MB) CPU [80%@2188,96%@2188,95%@2188,82%@2188,off,off,off,off] EMC_FREQ 0% GR3D_FREQ 0% AO@46C GPU@46.5C Tdiode@50.5C PMIC@100C AUX@46C CPU@48.5C thermal@46.85C Tboard@44C
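To put a number on the growth shown in the tegrastats output, here is a minimal Python sketch that extracts the RAM and SWAP fields from such lines. The function name `parse_tegrastats` and the regular expression are my own illustration and not part of tegrastats; the two sample lines are the first and last samples from the log above, with the trailing fields trimmed.

```python
import re

# Matches the "RAM used/total" and "SWAP used/total" fields of a tegrastats line.
# Assumption: the fields appear in this order, as in the log above.
PATTERN = re.compile(r"RAM (\d+)/(\d+)MB .*?SWAP (\d+)/(\d+)MB")


def parse_tegrastats(line):
    """Return (ram_used, ram_total, swap_used, swap_total) in MB, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


# First and last samples from the log (trailing fields trimmed).
samples = [
    "RAM 1883/31927MB (lfb 7202x4MB) SWAP 0/15964MB (cached 0MB)",
    "RAM 31719/31927MB (lfb 29x4MB) SWAP 2037/15964MB (cached 12MB)",
]

first = parse_tegrastats(samples[0])
last = parse_tegrastats(samples[1])

# Difference in used RAM between the first and last sample.
ram_growth_mb = last[0] - first[0]
print(f"RAM grew by {ram_growth_mb} MB; swap in use at the end: {last[2]} MB")
```

Between those two samples the used RAM goes from 1883 MB to 31719 MB, and swap starts being used once RAM is nearly full.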
I'm using a 32 GB AGX Xavier, and I tried to free memory with this command:
free -h && sudo sysctl vm.drop_caches=3 && free -h
And this is the output I get:
2021-04-22 17:00:38.831465: I tensorflow/core/common_runtime/bfc_allocator.cc:872] Bin (134217728): Total Chunks: 0, Chunks in use: 0. 0B allocated for chunks. 0B in use in bin. 0B client-requested in use in bin.
2021-04-22 17:00:38.831566: I tensorflow/core/common_runtime/bfc_allocator.cc:872] Bin (268435456): Total Chunks: 1, Chunks in use: 1. 509.19MiB allocated for chunks. 509.19MiB in use in bin. 412.30MiB client-requested in use in bin.
2021-04-22 17:00:38.831652: I tensorflow/core/common_runtime/bfc_allocator.cc:888] Bin for 1.98MiB was 1.00MiB, Chunk State:
2021-04-22 17:00:38.831694: I tensorflow/core/common_runtime/bfc_allocator.cc:901] Next region of size 533921792
2021-04-22 17:00:38.831746: I tensorflow/core/common_runtime/bfc_allocator.cc:908] InUse at 0x218a7a000 next 18446744073709551615 of size 533921792
2021-04-22 17:00:38.831787: I tensorflow/core/common_runtime/bfc_allocator.cc:917] Summary of in-use Chunks by size:
2021-04-22 17:00:38.831847: I tensorflow/core/common_runtime/bfc_allocator.cc:920] 1 Chunks of size 533921792 totalling 509.19MiB
2021-04-22 17:00:38.831895: I tensorflow/core/common_runtime/bfc_allocator.cc:924] Sum Total of in-use chunks: 509.19MiB
2021-04-22 17:00:38.831934: I tensorflow/core/common_runtime/bfc_allocator.cc:926] total_region_allocated_bytes_: 533921792 memory_limit_: 533921792 available bytes: 0 curr_region_allocation_bytes_: 1067843584
2021-04-22 17:00:38.832132: I tensorflow/core/common_runtime/bfc_allocator.cc:932] Stats:
Limit: 533921792
InUse: 533921792
MaxInUse: 533921792
NumAllocs: 1
MaxAllocSize: 533921792
2021-04-22 17:00:38.832198: W tensorflow/core/common_runtime/bfc_allocator.cc:427] *********************************************************************************xxxxxxxxxxxxxxxxxxx
2021-04-22 17:00:38.832288: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:41] DefaultLogger Requested amount of GPU memory (2076672 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
2021-04-22 17:00:38.832357: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:41] DefaultLogger /home/jenkins/workspace/TensorRT/helpers/rel-7.1/L1_Nightly_Internal/build/source/rtSafe/resources.h (181) - OutOfMemory Error in GpuMemory: 0
2021-04-22 17:00:39.150223: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:41] DefaultLogger Out of memory error during getBestTactic: (Unnamed Layer* 0) [Constant] + (Unnamed Layer* 1) [ElementWise]
2021-04-22 17:00:39.150387: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:41] DefaultLogger Try increasing the workspace size with IBuilderConfig::setMaxWorkspaceSize() if using IBuilder::buildEngineWithConfig, or IBuilder::setMaxWorkspaceSize() if using IBuilder::buildCudaEngine.
2021-04-22 17:00:39.150523: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:41] DefaultLogger ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node (Unnamed Layer* 0) [Constant] + (Unnamed Layer* 1) [ElementWise].)
2021-04-22 17:00:39.157632: E tensorflow/compiler/tf2tensorrt/utils/trt_logger.cc:41] DefaultLogger ../builder/tacticOptimizer.cpp (1715) - TRTInternal Error in computeCosts: 0 (Could not find any implementation for node (Unnamed Layer* 0) [Constant] + (Unnamed Layer* 1) [ElementWise].)
2021-04-22 17:00:39.160786: W tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:751] TensorRT node TRTEngineOp_0 added for segment 0 consisting of 397 nodes failed: Internal: Failed to build TensorRT engine. Fallback to TF...
2021-04-22 17:00:57.155215: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:748] TensorRT node detector/yolo-v3/TRTEngineOp_1 added for segment 1 consisting of 52 nodes succeeded.
2021-04-22 17:01:01.260716: I tensorflow/compiler/tf2tensorrt/convert/convert_graph.cc:748] TensorRT node detector/yolo-v3/TRTEngineOp_2 added for segment 2 consisting of 44 nodes succeeded.
2021-04-22 17:01:01.902294: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-04-22 17:01:01.983073: W tensorflow/compiler/tf2tensorrt/convert/trt_optimization_pass.cc:183] TensorRTOptimizer is probably called on funcdef! This optimizer must *NOT* be called on function objects.
2021-04-22 17:01:02.011498: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:821] Optimization results for grappler item: tf_graph
2021-04-22 17:01:02.011629: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 484 nodes (0), 656 edges (0), time = 1292.104ms.
2021-04-22 17:01:02.011722: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] layout: Graph size after: 507 nodes (23), 679 edges (23), time = 677.796ms.
2021-04-22 17:01:02.011789: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 505 nodes (-2), 672 edges (-7), time = 489.878ms.
2021-04-22 17:01:02.011825: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] TensorRTOptimizer: Graph size after: 411 nodes (-94), 552 edges (-120), time = 38363.5938ms.
2021-04-22 17:01:02.011869: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 411 nodes (0), 552 edges (0), time = 299.142ms.
2021-04-22 17:01:02.011896: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:821] Optimization results for grappler item: detector/yolo-v3/TRTEngineOp_1_native_segment
2021-04-22 17:01:02.011920: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 57 nodes (0), 69 edges (0), time = 25.955ms.
2021-04-22 17:01:02.011939: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] layout: Graph size after: 57 nodes (0), 69 edges (0), time = 30.07ms.
2021-04-22 17:01:02.011955: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 57 nodes (0), 69 edges (0), time = 26.896ms.
2021-04-22 17:01:02.011992: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] TensorRTOptimizer: Graph size after: 57 nodes (0), 69 edges (0), time = 3.295ms.
2021-04-22 17:01:02.012017: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 57 nodes (0), 69 edges (0), time = 27.544ms.
2021-04-22 17:01:02.012038: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:821] Optimization results for grappler item: detector/yolo-v3/TRTEngineOp_2_native_segment
2021-04-22 17:01:02.012058: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 48 nodes (0), 58 edges (0), time = 10.13ms.
2021-04-22 17:01:02.012080: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] layout: Graph size after: 48 nodes (0), 58 edges (0), time = 10.448ms.
2021-04-22 17:01:02.012100: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 48 nodes (0), 58 edges (0), time = 9.681ms.
2021-04-22 17:01:02.012119: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] TensorRTOptimizer: Graph size after: 48 nodes (0), 58 edges (0), time = 1.092ms.
2021-04-22 17:01:02.012138: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:823] constant_folding: Graph size after: 48 nodes (0), 58 edges (0), time = 9.298ms.
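As a quick sanity check on the allocator numbers in the log above, here is a small Python sketch that converts the byte counts to MiB. The values are copied from the `bfc_allocator` Stats block and from the TensorRT error line; the variable names are my own. My reading, which is an assumption and not something the log states directly, is that the TensorFlow GPU allocator's own limit was already fully used when TensorRT asked for more memory.

```python
MIB = 1024 * 1024

# Values copied from the log above.
memory_limit_bytes = 533921792  # "Limit" / memory_limit_ in the bfc_allocator stats
in_use_bytes = 533921792        # "InUse" in the bfc_allocator stats
requested_bytes = 2076672       # allocation that TensorRT reported as failed


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / MIB


# Bytes still available under the allocator's limit.
free_bytes = memory_limit_bytes - in_use_bytes

print(f"allocator limit: {to_mib(memory_limit_bytes):.2f} MiB")
print(f"in use:          {to_mib(in_use_bytes):.2f} MiB")
print(f"free:            {to_mib(free_bytes):.2f} MiB")
print(f"requested:       {to_mib(requested_bytes):.2f} MiB")
```

The limit works out to about 509.19 MiB, all of it held by a single chunk, so even the roughly 1.98 MiB request could not be satisfied within that limit.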
Is there any other way to convert my frozen graph with TF-TRT while avoiding the out-of-memory error? I followed the steps at the URL you provided, and this memory allocation problem is blocking the conversion.
Thanks again.