Data type for TensorRT engine created from UFF model with DataType.HALF


I use python 3.5.2 to work with TensorRT.
Modules installed are:
-tensorrt (3.0.4)
-pycuda (2017.1.1)
-uff (0.2.0)
-tensorflow (1.6.0rc1)

The problem I have is that even though I create the TensorRT engine with DataType.HALF, I must still provide the input in float32. If I provide the input in float16, I get the following error during context.execute(1, self._device_bindings):
[TensorRT] ERROR: cudnnEngine.cpp (420) - Cuda Error in execute: 74
[TensorRT] ERROR: cudnnEngine.cpp (420) - Cuda Error in execute: 74
pycuda._driver.LogicError: cuFuncSetBlockShape failed: misaligned address
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuStreamDestroy failed: misaligned address
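The "misaligned address" error is consistent with the device buffers being sized for 2-byte float16 elements while the engine reads and writes 4-byte float32 bindings, so the buffers are half the required size. A quick numpy sketch of the mismatch (the binding shape here is hypothetical, for illustration only):

```python
import numpy as np

# Hypothetical binding shape (C, H, W) for illustration only.
c, h, w = 3, 224, 224
elt_count = c * h * w

# Bytes allocated if the buffer is sized for float16 elements ...
fp16_bytes = elt_count * np.float16(0).itemsize  # 2 bytes per element
# ... versus the bytes the engine touches if its binding is float32.
fp32_bytes = elt_count * np.float32(0).itemsize  # 4 bytes per element

print(fp16_bytes, fp32_bytes)  # the fp16 buffer is half the required size
```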

I create the TensorRT engine via a call to trt.utils.uff_to_trt_engine as follows:

uff_model = conversion_helpers.from_tensorflow(output_graph_def, "output layers names go here")
parser = uffparser.create_uff_parser()
# register some inputs on parser
# register some outputs on parser
_trt_engine = trt.utils.uff_to_trt_engine(G_LOGGER,
                                          uff_model,
                                          parser,
                                          1,        # max batch size
                                          1 << 20,  # max workspace size
                                          trt.infer.DataType.HALF)

then I create the array of input/output bindings by doing:

for i in range(trt_engine.get_nb_bindings()):
    dims = trt_engine.get_binding_dimensions(i).to_DimsCHW()
    elt_count = dims.C() * dims.H() * dims.W()
    device_bindings.append(cuda.mem_alloc(elt_count * np.float16(0).itemsize))

and then

context.execute(1, device_bindings)
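In TensorRT 3, an engine built with DataType.HALF typically still exposes its input and output bindings as float32; only the internal computation runs in half precision. A hedged sketch of sizing the bindings for float32 instead (the helper name and binding shapes are hypothetical; the cuda.mem_alloc step is shown as a comment since it needs a GPU):

```python
import numpy as np

def binding_nbytes(dims_chw, dtype=np.float32):
    """Byte count for one binding, assuming the engine exposes
    float32 bindings even when built with DataType.HALF."""
    c, h, w = dims_chw
    return c * h * w * np.dtype(dtype).itemsize

# Hypothetical binding shapes for illustration.
shapes = [(3, 224, 224), (1000, 1, 1)]
sizes = [binding_nbytes(s) for s in shapes]

# Each entry would then be allocated on the device, e.g.:
#   device_bindings.append(cuda.mem_alloc(nbytes))
print(sizes)
```

The host-side input array should likewise be cast to float32 (e.g. np.ascontiguousarray(inp, dtype=np.float32)) before copying it to the device.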

Any help would be highly appreciated.

We created a new “Deep Learning Training and Inference” section in Devtalk to improve the experience for deep learning, accelerated computing, and HPC users:

We are moving active deep learning threads to the new section.

URLs for topics will not change with the re-categorization, so your bookmarks and links will continue to work as before.


Please file a bug here:
Please include the steps/files used to reproduce the problem along with the output of infer_device.