Cuda Error in nhwcTonchwLaunch

../rtSafe/cuda/reformat.cu (1362) - Cuda Error in nhwcTonchwLaunch: 700 (an illegal memory access was encountered) FAILED_EXECUTION: std::exception
I get the error above when I try to run my TensorRT engine as follows:

at::Tensor spatial_features = torch::zeros({1, Config::num_input_channels, Config::grid_size_y, Config::grid_size_x}, 
                              torch::TensorOptions().device(device_).dtype(torch::kFloat));

  if (network_trt_ptr && network_trt_ptr->context_) {
    std::vector<void *> buffers = {spatial_features.data_ptr(),  output_center_t_.data_ptr(), 
                                    output_center_z_t_.data_ptr(), output_dim_t_.data_ptr(),
                                    output_rot_t_.data_ptr(), output_heatmap_t_.data_ptr()
                                  };
    bool ret = network_trt_ptr->context_->executeV2(buffers.data());
  }

Here is the ONNX file I used to create my engine. Additionally, I can run this engine with trtexec, and it passes the test there.
livox_centerpoint_backbone_head.onnx (4.7 MB)

Hi,

Based on the error, it looks like the code is accessing an invalid CUDA stream or CUDA event.
Please refer to the following doc and samples, and make sure your inference code is correct.

If you still face this issue, could you please share a minimal repro with us?

Thank you.

Instead of having the engine fill its outputs into libtorch tensors, I used CUDA memory directly and then created libtorch tensors from those memory chunks. Somehow this solved the problem, and I'm now able to get reasonable outputs. I still don't know why this solved the issue, though.
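
For anyone trying the same workaround, here is a minimal sketch of the size calculation it needs, assuming static float32 binding shapes. `bindingBytes` is a hypothetical helper, not a TensorRT or libtorch function, and the commented lines only indicate where `cudaMalloc` and `torch::from_blob` would roughly fit. This is not the exact code from my project.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of TensorRT or libtorch): number of bytes
// needed for a dense tensor with the given shape and element size.
// Returns 0 if any dimension is not positive (e.g. a dynamic -1 dimension),
// since the size cannot be determined from the shape alone in that case.
std::size_t bindingBytes(const std::vector<std::int64_t>& shape,
                         std::size_t elemSize) {
    std::size_t count = 1;
    for (std::int64_t d : shape) {
        if (d <= 0) {
            return 0;
        }
        count *= static_cast<std::size_t>(d);
    }
    return count * elemSize;
}

// Rough outline of how this would be used (assumed, shown as comments so the
// block compiles without CUDA or libtorch):
//
//   void* dev = nullptr;
//   cudaMalloc(&dev, bindingBytes(shape, sizeof(float)));
//   buffers[i] = dev;                      // one entry per engine binding
//   context->executeV2(buffers.data());
//   at::Tensor out = torch::from_blob(
//       dev, shape,
//       torch::TensorOptions().device(torch::kCUDA).dtype(torch::kFloat));
//
// torch::from_blob does not take ownership of the memory, so the allocation
// has to stay alive for as long as the tensor is used.
```

Note that the order of the entries in `buffers` has to match the engine's binding order, and each allocation has to be at least as large as the engine expects for that binding.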