ERROR: 1: [gemmBaseRunner.cpp::allocateContextResources::236] Error Code 1: Cuda Runtime (invalid argument)

Description

I’m performing classification with a pre-serialized engine file.
The input dimensions are 1 x 3 x 224 x 224 and the output dimensions are 1 x 3.
In the inference step, the setTensorAddress call in the code below results in the error shown after it.

    // Managing additional state for intermediate activations
    context = engine->createExecutionContext();
    
    // Get the tensor shape of input and output tensors
    auto idims = engine->getTensorShape("input");
    auto odims = engine->getTensorShape("output");
    
    // Allocate GPU buffers for input
    size_t inputBufferSize = idims.d[0] * idims.d[1] * idims.d[2] * idims.d[3];
    size_t outputBufferSize = odims.d[0] * odims.d[1];
    void* inputBuffer;
    void* outputBuffer;
    cudaMalloc(&inputBuffer, inputBufferSize);
    cudaMalloc(&outputBuffer, outputBufferSize);
    
    context->setTensorAddress("input", inputBuffer);
    context->setTensorAddress("output", outputBuffer); 
ERROR: 1: [gemmBaseRunner.cpp::allocateContextResources::236] Error Code 1: Cuda Runtime (invalid argument)
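
For comparison, here is a minimal sketch of the same inference step using byte-sized buffers and an explicit enqueueV3 call. It assumes FP32 "input"/"output" tensors and an already-deserialized engine; the runInference wrapper, host buffers, and stream handling here are illustrative assumptions, not taken from the original code.

#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <vector>

void runInference(nvinfer1::ICudaEngine* engine, const float* hostInput)
{
    nvinfer1::IExecutionContext* context = engine->createExecutionContext();

    auto idims = engine->getTensorShape("input");   // e.g. 1 x 3 x 224 x 224
    auto odims = engine->getTensorShape("output");  // e.g. 1 x 3

    // Element counts first, then convert to bytes (assumes FP32 tensors)
    size_t inputCount = 1, outputCount = 1;
    for (int i = 0; i < idims.nbDims; ++i) inputCount *= idims.d[i];
    for (int i = 0; i < odims.nbDims; ++i) outputCount *= odims.d[i];
    size_t inputBytes  = inputCount  * sizeof(float);
    size_t outputBytes = outputCount * sizeof(float);

    void* inputBuffer  = nullptr;
    void* outputBuffer = nullptr;
    cudaMalloc(&inputBuffer, inputBytes);
    cudaMalloc(&outputBuffer, outputBytes);

    context->setTensorAddress("input", inputBuffer);
    context->setTensorAddress("output", outputBuffer);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy input to the device, run inference, copy the result back
    cudaMemcpyAsync(inputBuffer, hostInput, inputBytes, cudaMemcpyHostToDevice, stream);
    context->enqueueV3(stream);

    std::vector<float> hostOutput(outputCount);
    cudaMemcpyAsync(hostOutput.data(), outputBuffer, outputBytes, cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFree(inputBuffer);
    cudaFree(outputBuffer);
    delete context;  // destroy() is deprecated in TensorRT 8.x
}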

Environment

TensorRT Version: 8.6.1
GPU Type: NVIDIA RTX A2000 Laptop GPU
Nvidia Driver Version: 535.129.03
CUDA Version: 12.2
CUDNN Version: 8.9.7
Operating System + Version: Ubuntu 20.04

Hi @jeonghyunjason,
Can you please share your ONNX model and repro scripts with us?

Thanks
