New TensorFlow/TensorRT apis not working

I’m trying to follow the directions given at:

I downloaded the code examples at:
https://developer.download.nvidia.com/devblogs/tftrt_sample.tar.xz

and ran ‘run_all.sh’, only to find it spitting out invalid-pointer errors.
Any idea what might be wrong? I’m on CUDA 9.0, TensorFlow 1.7, TensorRT 4.0.0.3, Ubuntu 16.04, all standard stuff, on an i7 with a GTX 1080, nothing weird.

:) ./run_all.sh 
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Namespace(FP16=True, FP32=True, INT8=True, batch_size=4, dump_diff=False, native=True, num_loops=10, topN=5, update_graphdef=False, with_timeline=False, workspace_size=2048)
Starting at 2018-04-04 23:16:11.195373
2018-04-04 23:16:11.209274: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-04 23:16:11.345285: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-04 23:16:11.345693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.05GiB
2018-04-04 23:16:11.345707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-04 23:16:11.615095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-04 23:16:11.615127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-04-04 23:16:11.615132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-04-04 23:16:11.615307: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4055 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
INFO:tensorflow:Starting execution
2018-04-04 23:16:12.494643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-04 23:16:12.494680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-04 23:16:12.494685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-04-04 23:16:12.494689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-04-04 23:16:12.494825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4055 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
INFO:tensorflow:Starting Warmup cycle
INFO:tensorflow:Warmup done. Starting real timing
iter  0   0.0131747817993
iter  1   0.0132179403305
iter  2   0.0133423995972
iter  3   0.0131955814362
iter  4   0.0132367992401
iter  5   0.0132048225403
iter  6   0.013346657753
iter  7   0.0132423591614
iter  8   0.013192281723
iter  9   0.0131863164902
Comparison= True
INFO:tensorflow:Timing loop done!
images/s : 302.3 +/- 1.3, s/batch: 0.01323 +/- 0.00006
RES, Native, 4, 302.25, 1.34, 0.01323, 0.00006
2018-04-04 23:16:21.649057: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 1
2018-04-04 23:16:22.552676: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2624] Max batch size= 4 max workspace size= 2147483648
2018-04-04 23:16:22.552717: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2630] starting build engine
*** Error in `python': munmap_chunk(): invalid pointer: 0x00007ffd7180fee0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f7ab06c07e5]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x1a8)[0x7f7ab06cd698]
/usr/local/lib/python2.7/dist-packages/tensorflow/python/../libtensorflow_framework.so(_ZNSt10_HashtableISsSsSaISsENSt8__detail9_IdentityESt8equal_toISsESt4hashISsENS1_18_Mod_range_hashingENS1_20_Default_ranged_hashENS1_20_Prime_rehash_policyENS1_17_Hashtable_traitsILb1ELb1ELb1EEEE21_M_insert_unique_nodeEmmPNS1_10_Hash_nodeISsLb1EEE+0xfc)[0x7f7a747a8a0c]
/usr/lib/x86_64-linux-gnu/libnvinfer.so.4(_ZNSt10_HashtableISsSsSaISsENSt8__detail9_IdentityESt8equal_toISsESt4hashISsENS1_18_Mod_range_hashingENS1_20_Default_ranged_hashENS1_20_Prime_rehash_policyENS1_17_Hashtable_traitsILb1ELb1ELb1EEEE9_M_insertIRKSsNS1_10_AllocNodeISaINS1_10_Hash_nodeISsLb1EEEEEEEESt4pairINS1_14_Node_iteratorISsLb1ELb1EEEbEOT_RKT0_St17integral_constantIbLb1EE+0x96)[0x7f7a2be81d46]
/usr/lib/x86_64-linux-gnu/libnvinfer.so.4(_ZNK8nvinfer17Network8validateERKNS_5cudnn15HardwareContextEbbi+0x131)[0x7f7a2be7dab1]
/usr/lib/x86_64-linux-gnu/libnvinfer.so.4(_ZN8nvinfer17builder11buildEngineERNS_21CudaEngineBuildConfigERKNS_5cudnn15HardwareContextERKNS_7NetworkE+0x46)[0x7f7a2be66606]
/usr/lib/x86_64-linux-gnu/libnvinfer.so.4(+0x481ce1)[0x7f7a2beb7ce1]
/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/_wrap_conversion.so(_ZN10tensorflow8tensorrt7convert32ConvertSubGraphToTensorRTNodeDefERNS1_14SubGraphParamsE+0x2020)[0x7f7a2b63a160]
/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/_wrap_conversion.so(_ZN10tensorflow8tensorrt7convert25ConvertGraphDefToTensorRTERKNS_8GraphDefERKSt6vectorISsSaISsEEmmPS2_ii+0x200b)[0x7f7a2b619c5b]
/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/_wrap_conversion.so(+0x4e25f)[0x7f7a2b61125f]
/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/_wrap_conversion.so(+0x4e8ea)[0x7f7a2b6118ea]
python(PyEval_EvalFrameEx+0x5ca)[0x4bc3fa]
python(PyEval_EvalCodeEx+0x306)[0x4b9ab6]
python(PyEval_EvalFrameEx+0x58b7)[0x4c16e7]
python(PyEval_EvalCodeEx+0x306)[0x4b9ab6]
python(PyEval_EvalFrameEx+0x58b7)[0x4c16e7]
python(PyEval_EvalCodeEx+0x306)[0x4b9ab6]
python[0x4eb30f]
python(PyRun_FileExFlags+0x82)[0x4e5422]
python(PyRun_SimpleFileExFlags+0x186)[0x4e3cd6]
python(Py_Main+0x612)[0x493ae2]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7ab0669830]
python(_start+0x29)[0x4933e9]
======= Memory map: ========
00400000-006de000 r-xp 00000000 08:02 40110610                           /usr/bin/python2.7
008dd000-008de000 r--p 002dd000 08:02 40110610                           /usr/bin/python2.7
008de000-00955000 rw-p 002de000 08:02 40110610                           /usr/bin/python2.7
00955000-00978000 rw-p 00000000 00:00 0 
01846000-4c531000 rw-p 00000000 00:00 0                                  [heap]
200000000-200200000 rw-s 00000000 00:06 499                              /dev/nvidiactl
200200000-200400000 ---p 00000000 00:00 0 
200400000-200404000 rw-s 00000000 00:06 499                              /dev/nvidiactl
200404000-200600000 ---p 00000000 00:00 0 
200600000-200a00000 rw-s 00000000 00:06 499                              /dev/nvidiactl
200a00000-201600000 ---p 00000000 00:00 0 
201600000-201604000 rw-s 00000000 00:06 499                              /dev/nvidiactl
201604000-201800000 ---p 00000000 00:00 0 
201800000-201c00000 rw-s 00000000 00:06 499                              /dev/nvidiactl
201c00000-202800000 ---p 00000000 00:00 0 
202800000-202804000 rw-s 00000000 00:06 499                              /dev/nvidiactl
202804000-202a00000 ---p 00000000 00:00 0 
202a00000-202e00000 rw-s 00000000 00:06 499                              /dev/nvidiactl

Hi wuxiekeij, have you tried following this GitHub repo and webinar, which were tailored for deployment on Jetson?

If you are running on a discrete GPU rather than a Jetson TX1/TX2, you may want to try posting in the GPU-Accelerated Libraries forum.