Cuda Error in NCHWToNCHHW2: 1 (invalid argument)

I am trying to run TensorRT inference in C++. I have declared void *buffers[2] and allocated device memory for each entry with cudaMalloc. I then copy data from an image buffer with cudaMemcpy(buffers[0], (void *) cpu_buffer, MODEL_INPUT_CHANNEL_SIZE * sizeof(float), cudaMemcpyDeviceToHost) and call context->enqueue(BATCH_SIZE, &buffers[0], 0, nullptr), but I get this error: ../rtSafe/cuda/ (925) - Cuda Error in NCHWToNCHHW2: 1 (invalid argument). I have not been able to find out why this error occurs.
Thank you!

Hi @ChocolateTidePods,

The buffers need to be allocated with size BATCH_SIZE * xx. We cannot identify any other issue from the error message alone. Could you please share the full error logs, a minimal code sample that reproduces the issue, and the model so that we can debug further?
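
As a rough illustration of the sizing point, here is a minimal sketch of how the byte count for one binding could be computed so that it covers the whole batch. TensorShape, elementsPerSample and bindingBytes are hypothetical names introduced for this example, and it assumes float32 data in a channels/height/width layout, which may not match your model. The CUDA calls are shown only in comments. It is also worth checking the copy direction: the call in the post copies from a host buffer into device memory, and for that the CUDA runtime flag is cudaMemcpyHostToDevice rather than cudaMemcpyDeviceToHost.

```cpp
#include <cstddef>

// Hypothetical shape description; the field names are not from the original post.
struct TensorShape {
    std::size_t channels;
    std::size_t height;
    std::size_t width;
};

// Number of elements in a single sample (C * H * W).
constexpr std::size_t elementsPerSample(const TensorShape& s) {
    return s.channels * s.height * s.width;
}

// Bytes needed for one binding holding a full batch of float32 data.
constexpr std::size_t bindingBytes(std::size_t batchSize, const TensorShape& s) {
    return batchSize * elementsPerSample(s) * sizeof(float);
}

// Intended use with the CUDA runtime (assumed, not compiled here):
//   const std::size_t inputBytes = bindingBytes(BATCH_SIZE, inputShape);
//   cudaMalloc(&buffers[0], inputBytes);
//   cudaMemcpy(buffers[0], cpu_buffer, inputBytes, cudaMemcpyHostToDevice);
```

Using the same byte count for both cudaMalloc and cudaMemcpy keeps the allocation and the copy consistent with the batch size passed to enqueue.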

Thank you.