Linux distro and version - Ubuntu 16.04
GPU type - Tesla K80
NVIDIA driver version - 390.46
CUDA version - 9.0
cuDNN version - 7.3
Python version [if using Python] - 3.5
TensorFlow version - 1.7.0
TensorRT version - 5
Describe the problem
Conversion to a TensorRT engine was successful, and I wrote it to an engine file:
import tensorrt as trt
import uff

# Convert the frozen TensorFlow graph to UFF; measure() is a local helper (not shown)
uff_model = measure(lambda: uff.from_tensorflow_frozen_model(frozen_graph_filename, output_names), 'uff.from_tensorflow')

builder = trt.Builder(G_LOGGER)
builder.max_batch_size = 1
builder.max_workspace_size = 1 << 30  # 1 GiB of builder workspace

# Parse the UFF model into a TensorRT network definition
network = builder.create_network()
parser2 = trt.UffParser()
parser2.register_input(input_names[0], (channel, height, width))
parser2.parse_buffer(uff_model, network)

# Build the engine and serialize it to disk
engine2 = builder.build_cuda_engine(network)
with open("new_engine_1.engine", "wb") as f:
    f.write(engine2.serialize())
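One note on the snippet above: parse_buffer returns a bool and build_cuda_engine returns None on failure, so writing the file is not by itself proof that the engine is valid. A minimal sketch of the same two calls with explicit failure checks:

# Sketch: same parse/build calls as above, but failing loudly
if not parser2.parse_buffer(uff_model, network):
    raise RuntimeError("UFF parse failed - check the logger output")
engine2 = builder.build_cuda_engine(network)
if engine2 is None:
    raise RuntimeError("Engine build failed - check the logger output")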
While loading the file for inference:
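The snippets below assume roughly this setup; pycuda.autoinit is one plausible way the CUDA context gets created (the actual initialization is not shown here):

import numpy as np
import pycuda.autoinit  # assumed: creates and activates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)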
def get_engine(engine_file):
    # Deserialize a previously built engine from disk
    with open(engine_file, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        print("Engine Loaded")
        return runtime.deserialize_cuda_engine(f.read())
def infer(engine, x, batch_size):
    n = engine.num_bindings
    print('%d bindings' % n)
    print(x.shape)

    # Pinned host buffers and device buffers for the two bindings
    h_input = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(0)), dtype=trt.nptype(trt.float32))
    h_output = cuda.pagelocked_empty(trt.volume(engine.get_binding_shape(1)), dtype=trt.nptype(trt.float32))
    d_input = cuda.mem_alloc(h_input.nbytes)
    d_output = cuda.mem_alloc(h_output.nbytes)
    stream = cuda.Stream()
    context = engine.create_execution_context()

    # Copy the input array into the pinned host buffer before the upload
    np.copyto(h_input, x.ravel())

    cuda.memcpy_htod_async(d_input, h_input, stream)
    context.execute_async(batch_size, bindings=[int(d_input), int(d_output)], stream_handle=stream.handle)
    cuda.memcpy_dtoh_async(h_output, d_output, stream)
    stream.synchronize()
    return h_output
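A sketch of the call site, with a random array standing in for the real preprocessed input:

engine = get_engine("new_engine_1.engine")
x = np.random.rand(3, 368, 368).astype(np.float32)  # placeholder input
out = infer(engine, x, 1)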
The output is:
2 bindings
(3, 368, 368)
[TensorRT] ERROR: cuda/cudaConvolutionLayer.cpp (163) - Cudnn Error in execute: 7
[TensorRT] ERROR: cuda/cudaConvolutionLayer.cpp (163) - Cudnn Error in execute: 7
The engine fails at execute() with this cuDNN error.