New TensorFlow/TensorRT apis not working

I’m trying to follow the directions given at:

I downloaded the code examples at:
https://developer.download.nvidia.com/devblogs/tftrt_sample.tar.xz

and ran ‘run_all.sh’, only to find it spitting out invalid-pointer errors.
Any idea what might be wrong? I’m on CUDA 9.0, TensorFlow 1.7, TensorRT 4.0.0.3, Ubuntu 16.04, all standard stuff, on an i7 with a GTX 1080, nothing weird.

:) ./run_all.sh 
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Namespace(FP16=True, FP32=True, INT8=True, batch_size=4, dump_diff=False, native=True, num_loops=10, topN=5, update_graphdef=False, with_timeline=False, workspace_size=2048)
Starting at 2018-04-04 23:16:11.195373
2018-04-04 23:16:11.209274: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-04-04 23:16:11.345285: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-04-04 23:16:11.345693: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1344] Found device 0 with properties: 
name: GeForce GTX 1080 major: 6 minor: 1 memoryClockRate(GHz): 1.7335
pciBusID: 0000:01:00.0
totalMemory: 7.92GiB freeMemory: 7.05GiB
2018-04-04 23:16:11.345707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-04 23:16:11.615095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-04 23:16:11.615127: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-04-04 23:16:11.615132: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-04-04 23:16:11.615307: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4055 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
INFO:tensorflow:Starting execution
2018-04-04 23:16:12.494643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1423] Adding visible gpu devices: 0
2018-04-04 23:16:12.494680: I tensorflow/core/common_runtime/gpu/gpu_device.cc:911] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-04-04 23:16:12.494685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:917]      0 
2018-04-04 23:16:12.494689: I tensorflow/core/common_runtime/gpu/gpu_device.cc:930] 0:   N 
2018-04-04 23:16:12.494825: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4055 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
INFO:tensorflow:Starting Warmup cycle
INFO:tensorflow:Warmup done. Starting real timing
iter  0   0.0131747817993
iter  1   0.0132179403305
iter  2   0.0133423995972
iter  3   0.0131955814362
iter  4   0.0132367992401
iter  5   0.0132048225403
iter  6   0.013346657753
iter  7   0.0132423591614
iter  8   0.013192281723
iter  9   0.0131863164902
Comparison= True
INFO:tensorflow:Timing loop done!
images/s : 302.3 +/- 1.3, s/batch: 0.01323 +/- 0.00006
RES, Native, 4, 302.25, 1.34, 0.01323, 0.00006
2018-04-04 23:16:21.649057: I tensorflow/core/grappler/devices.cc:51] Number of eligible GPUs (core count >= 8): 1
2018-04-04 23:16:22.552676: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2624] Max batch size= 4 max workspace size= 2147483648
2018-04-04 23:16:22.552717: I tensorflow/contrib/tensorrt/convert/convert_nodes.cc:2630] starting build engine
*** Error in `python': munmap_chunk(): invalid pointer: 0x00007ffd7180fee0 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f7ab06c07e5]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x1a8)[0x7f7ab06cd698]
/usr/local/lib/python2.7/dist-packages/tensorflow/python/../libtensorflow_framework.so(_ZNSt10_HashtableISsSsSaISsENSt8__detail9_IdentityESt8equal_toISsESt4hashISsENS1_18_Mod_range_hashingENS1_20_Default_ranged_hashENS1_20_Prime_rehash_policyENS1_17_Hashtable_traitsILb1ELb1ELb1EEEE21_M_insert_unique_nodeEmmPNS1_10_Hash_nodeISsLb1EEE+0xfc)[0x7f7a747a8a0c]
/usr/lib/x86_64-linux-gnu/libnvinfer.so.4(_ZNSt10_HashtableISsSsSaISsENSt8__detail9_IdentityESt8equal_toISsESt4hashISsENS1_18_Mod_range_hashingENS1_20_Default_ranged_hashENS1_20_Prime_rehash_policyENS1_17_Hashtable_traitsILb1ELb1ELb1EEEE9_M_insertIRKSsNS1_10_AllocNodeISaINS1_10_Hash_nodeISsLb1EEEEEEEESt4pairINS1_14_Node_iteratorISsLb1ELb1EEEbEOT_RKT0_St17integral_constantIbLb1EE+0x96)[0x7f7a2be81d46]
/usr/lib/x86_64-linux-gnu/libnvinfer.so.4(_ZNK8nvinfer17Network8validateERKNS_5cudnn15HardwareContextEbbi+0x131)[0x7f7a2be7dab1]
/usr/lib/x86_64-linux-gnu/libnvinfer.so.4(_ZN8nvinfer17builder11buildEngineERNS_21CudaEngineBuildConfigERKNS_5cudnn15HardwareContextERKNS_7NetworkE+0x46)[0x7f7a2be66606]
/usr/lib/x86_64-linux-gnu/libnvinfer.so.4(+0x481ce1)[0x7f7a2beb7ce1]
/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/_wrap_conversion.so(_ZN10tensorflow8tensorrt7convert32ConvertSubGraphToTensorRTNodeDefERNS1_14SubGraphParamsE+0x2020)[0x7f7a2b63a160]
/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/_wrap_conversion.so(_ZN10tensorflow8tensorrt7convert25ConvertGraphDefToTensorRTERKNS_8GraphDefERKSt6vectorISsSaISsEEmmPS2_ii+0x200b)[0x7f7a2b619c5b]
/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/_wrap_conversion.so(+0x4e25f)[0x7f7a2b61125f]
/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/tensorrt/_wrap_conversion.so(+0x4e8ea)[0x7f7a2b6118ea]
python(PyEval_EvalFrameEx+0x5ca)[0x4bc3fa]
python(PyEval_EvalCodeEx+0x306)[0x4b9ab6]
python(PyEval_EvalFrameEx+0x58b7)[0x4c16e7]
python(PyEval_EvalCodeEx+0x306)[0x4b9ab6]
python(PyEval_EvalFrameEx+0x58b7)[0x4c16e7]
python(PyEval_EvalCodeEx+0x306)[0x4b9ab6]
python[0x4eb30f]
python(PyRun_FileExFlags+0x82)[0x4e5422]
python(PyRun_SimpleFileExFlags+0x186)[0x4e3cd6]
python(Py_Main+0x612)[0x493ae2]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7ab0669830]
python(_start+0x29)[0x4933e9]
======= Memory map: ========
00400000-006de000 r-xp 00000000 08:02 40110610                           /usr/bin/python2.7
008dd000-008de000 r--p 002dd000 08:02 40110610                           /usr/bin/python2.7
008de000-00955000 rw-p 002de000 08:02 40110610                           /usr/bin/python2.7
00955000-00978000 rw-p 00000000 00:00 0 
01846000-4c531000 rw-p 00000000 00:00 0                                  [heap]
200000000-200200000 rw-s 00000000 00:06 499                              /dev/nvidiactl
200200000-200400000 ---p 00000000 00:00 0 
200400000-200404000 rw-s 00000000 00:06 499                              /dev/nvidiactl
200404000-200600000 ---p 00000000 00:00 0 
200600000-200a00000 rw-s 00000000 00:06 499                              /dev/nvidiactl
200a00000-201600000 ---p 00000000 00:00 0 
201600000-201604000 rw-s 00000000 00:06 499                              /dev/nvidiactl
201604000-201800000 ---p 00000000 00:00 0 
201800000-201c00000 rw-s 00000000 00:06 499                              /dev/nvidiactl
201c00000-202800000 ---p 00000000 00:00 0 
202800000-202804000 rw-s 00000000 00:06 499                              /dev/nvidiactl
202804000-202a00000 ---p 00000000 00:00 0 
202a00000-202e00000 rw-s 00000000 00:06 499                              /dev/nvidiactl

Hi wuxiekeij, have you tried following this GitHub repo and webinar, which were tailored for deployment on Jetson?

If you are running on a discrete GPU rather than a Jetson TX1/TX2, you may want to try posting in the GPU-Accelerated Libraries forum.