Hi,
I am using Python 3.5.2 with TensorRT.
The installed modules are:
-tensorrt (3.0.4)
-pycuda (2017.1.1)
-uff (0.2.0)
-tensorflow (1.6.0rc1)
The problem is that even though I create the TensorRT engine with DataType.HALF, I still have to provide the input as float32. If I provide the input as float16, I get the following error during context.execute(1, self._device_bindings):
[TensorRT] ERROR: cudnnEngine.cpp (420) - Cuda Error in execute: 74
[TensorRT] ERROR: cudnnEngine.cpp (420) - Cuda Error in execute: 74
pycuda._driver.LogicError: cuFuncSetBlockShape failed: misaligned address
PyCUDA WARNING: a clean-up operation failed (dead context maybe?)
cuStreamDestroy failed: misaligned address
I create the TensorRT engine with a call to trt.utils.uff_to_trt_engine as follows:
uff_model = conversion_helpers.from_tensorflow(output_graph_def, "output layers names go here")
parser = uffparser.create_uff_parser()
#register some inputs on parser
#register some outputs on parser
_trt_engine = trt.utils.uff_to_trt_engine(G_LOGGER,
                                          uff_model,
                                          parser,
                                          1,        # max batch size
                                          1 << 20,  # max workspace size in bytes
                                          trt.infer.DataType.HALF)
Then I create an array of input/output bindings like this:
device_bindings = []
for i in range(trt_engine.get_nb_bindings()):
    dims = trt_engine.get_binding_dimensions(i).to_DimsCHW()
    elt_count = dims.C() * dims.H() * dims.W()
    # allocate each buffer sized for float16 elements (2 bytes each)
    device_bindings.append(cuda.mem_alloc(elt_count * np.float16(0).itemsize))
and then
context.execute(1, device_bindings)
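To make the allocation sizes in the binding loop concrete, here is a minimal standalone sketch (standard library only, no TensorRT or PyCUDA) of how many bytes one binding gets when sized for float16 versus float32. The helper name binding_nbytes and the 3x224x224 shape are hypothetical and only for illustration, not taken from my model. If the engine still expects float32 bindings (an assumption I have not verified), the float16-sized buffers would be half as large as what it reads and writes.

```python
from functools import reduce
from operator import mul

# Bytes per element for the two dtypes discussed above.
ITEMSIZE = {"float16": 2, "float32": 4}


def binding_nbytes(dims_chw, dtype, batch_size=1):
    """Return the byte size of one binding buffer for a (C, H, W) shape."""
    elt_count = reduce(mul, dims_chw, 1) * batch_size
    return elt_count * ITEMSIZE[dtype]


# Hypothetical binding shape (C, H, W); not the real model's dimensions.
example_dims = (3, 224, 224)
half_bytes = binding_nbytes(example_dims, "float16")
full_bytes = binding_nbytes(example_dims, "float32")
print("float16 buffer:", half_bytes, "bytes")
print("float32 buffer:", full_bytes, "bytes")
```

The float32 size is always exactly twice the float16 size for the same shape, so the two allocations are easy to tell apart when checking what was actually passed to cuda.mem_alloc.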
Any help would be highly appreciated.