CUDA Illegal Memory Access using PyCUDA and TensorRT Inference

Hi, can someone help me? I'm hitting an illegal memory access error when running inference with the TensorRT Python API.

I also tried using pycuda.tools.DeviceMemoryPool() instead of cuda.mem_alloc(), but it still fails the same way:

import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # creates the CUDA context
from pycuda.tools import DeviceMemoryPool

def allocate_buffers(engine):
    inputs = []
    outputs = []
    bindings = []
    stream = cuda.Stream()
    mem_pool = DeviceMemoryPool()

    for binding in engine:
        size = trt.volume(engine.get_binding_shape(binding))
        dtype = trt.nptype(engine.get_binding_dtype(binding))

        # Allocate host (pagelocked) and device buffers.
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = mem_pool.allocate(host_mem.nbytes)

        # Append the device buffer address to the bindings list.
        bindings.append(int(device_mem))

        # Append to the appropriate list.
        if engine.binding_is_input(binding):
            inputs.append(HostDeviceMem(host_mem, device_mem))
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))

    return inputs, outputs, bindings, stream, mem_pool
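For reference, HostDeviceMem above is the small host/device holder class along the lines of the one in the TensorRT Python samples' common.py (reproduced here so the snippet is self-contained):

```python
class HostDeviceMem:
    """Pairs a pagelocked host buffer with its device allocation."""

    def __init__(self, host_mem, device_mem):
        self.host = host_mem      # numpy array in pagelocked memory
        self.device = device_mem  # device allocation (or pooled allocation)

    def __str__(self):
        return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)
```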

The inference function follows do_inference() from the TensorRT samples:

def do_inference(engine, bindings, inputs, outputs, stream, batch_size=1):
    with engine.create_execution_context() as context:
        # Transfer input data to the GPU.
        [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]

        # Run inference.
        context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)

        # Transfer predictions back from the GPU.
        [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
       
        # Synchronize the stream
        stream.synchronize()

    # Return only the host outputs.
    return [out.host for out in outputs]

I'm not sure what's wrong, but I think it has something to do with memcpy_htod_async and memcpy_dtoh_async. This is the error output:

[TensorRT] ERROR: engine.cpp (169) - Cuda Error in ~ExecutionContext: 700 (an illegal memory access was encountered)
[TensorRT] ERROR: INTERNAL_ERROR: std::exception
[TensorRT] ERROR: Parameter check failed at: safeContext.cpp::terminateCommonContext::216, condition: cudnnDestroy(context.cudnn) failure.
[TensorRT] ERROR: Parameter check failed at: safeContext.cpp::terminateCommonContext::221, condition: cudaEventDestroy(context.start) failure.
[TensorRT] ERROR: Parameter check failed at: safeContext.cpp::terminateCommonContext::226, condition: cudaEventDestroy(context.stop) failure.
[TensorRT] ERROR: ../rtSafe/safeRuntime.cpp (32) - Cuda Error in free: 700 (an illegal memory access was encountered)
terminate called after throwing an instance of 'nvinfer1::CudaError'
  what():  std::exception

I've also attached the nvidia-bug-report: nvidia-bug-report.log.gz (1.4 MB)
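One more thing I wondered about: could the binding shapes be the problem? If the engine reports a dynamic axis, get_binding_shape contains a -1 and trt.volume of that shape is negative, so the host/device buffers end up sized from garbage. A minimal illustration of the arithmetic (volume re-implemented locally here, not the real trt.volume):

```python
import numpy as np

def volume(shape):
    # Same arithmetic as trt.volume: the product of all dimensions.
    return int(np.prod(shape, dtype=np.int64))

# A fully specified shape gives the real element count...
print(volume((1, 3, 224, 224)))   # 150528
# ...but a dynamic axis (-1) flips the sign, so any buffer
# sized from it is wrong before inference even starts.
print(volume((-1, 3, 224, 224)))  # -150528
```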