Hi everyone, can someone help me? I'm hitting an illegal memory access error when running inference through the TensorRT Python API.
I also tried using DeviceMemoryPool() instead of cuda.mem_alloc(), but it still fails the same way.
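For reference, HostDeviceMem in my code is the usual small helper from the TensorRT Python samples; my version is roughly:

```python
class HostDeviceMem:
    """Pairs a pagelocked host buffer with its device allocation."""

    def __init__(self, host_mem, device_mem):
        self.host = host_mem      # numpy array in pinned host memory
        self.device = device_mem  # device allocation (pointer-like object)

    def __repr__(self):
        return "Host:\n{}\nDevice:\n{}".format(self.host, self.device)
```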
import pycuda.driver as cuda
import pycuda.autoinit  # creates and activates a CUDA context
from pycuda.tools import DeviceMemoryPool
import tensorrt as trt

def allocate_buffers(engine):
    inputs = []
    outputs = []
    bindings = []
    stream = cuda.Stream()
    mem_pool = DeviceMemoryPool()
    for binding in engine:
        size = trt.volume(engine.get_binding_shape(binding))
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers.
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = mem_pool.allocate(host_mem.nbytes)
        # Append the device buffer address to device bindings.
        bindings.append(int(device_mem))
        # Append to the appropriate input/output list.
        if engine.binding_is_input(binding):
            inputs.append(HostDeviceMem(host_mem, device_mem))
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))
    return inputs, outputs, bindings, stream, mem_pool
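One thing I sanity-checked is the buffer sizing: trt.volume is just the product of the binding shape, so host_mem.nbytes should come out to volume × dtype itemsize. A quick check of that arithmetic without TensorRT (the shape and dtype here are just example values, not my actual bindings):

```python
import numpy as np

def volume(shape):
    # Same idea as trt.volume: product of all dimensions.
    out = 1
    for d in shape:
        out *= d
    return out

shape = (1, 3, 224, 224)  # example input binding shape
dtype = np.float32        # example binding dtype

# Stand-in for cuda.pagelocked_empty(size, dtype).
host_mem = np.empty(volume(shape), dtype)
print(host_mem.nbytes)  # 1 * 3 * 224 * 224 * 4 = 602112 bytes
```

Note that if the engine were built with dynamic shapes, get_binding_shape could contain -1, which would make this product negative and the allocation bogus, but as far as I can tell my shapes are fully static.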
The inference script is essentially the do_inference() from the samples:
def do_inference(engine, bindings, inputs, outputs, stream, batch_size=1):
    with engine.create_execution_context() as context:
        # Transfer input data to the GPU.
        [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
        # Run inference.
        context.execute_async(batch_size=batch_size, bindings=bindings,
                              stream_handle=stream.handle)
        # Transfer predictions back from the GPU.
        [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
        # Synchronize the stream.
        stream.synchronize()
        # Return only the host outputs.
        return [out.host for out in outputs]
I'm not sure what's wrong, but I suspect it has something to do with the memcpy_htod_async and memcpy_dtoh_async calls. The full error output:
[TensorRT] ERROR: engine.cpp (169) - Cuda Error in ~ExecutionContext: 700 (an illegal memory access was encountered)
[TensorRT] ERROR: INTERNAL_ERROR: std::exception
[TensorRT] ERROR: Parameter check failed at: safeContext.cpp::terminateCommonContext::216, condition: cudnnDestroy(context.cudnn) failure.
[TensorRT] ERROR: Parameter check failed at: safeContext.cpp::terminateCommonContext::221, condition: cudaEventDestroy(context.start) failure.
[TensorRT] ERROR: Parameter check failed at: safeContext.cpp::terminateCommonContext::226, condition: cudaEventDestroy(context.stop) failure.
[TensorRT] ERROR: ../rtSafe/safeRuntime.cpp (32) - Cuda Error in free: 700 (an illegal memory access was encountered)
terminate called after throwing an instance of 'nvinfer1::CudaError'
what(): std::exception
I've also attached the nvidia-bug-report: nvidia-bug-report.log.gz (1.4 MB)