CUDA error on a T4 GPU: an illegal memory access was encountered

I deployed PointPillars on an RTX 30-series graphics card. The post-processing NMS code is as follows:
std::vector<uint64_t> host_mask(host_filter_count * col_blocks);
GPU_CHECK(cudaMemcpy(&host_mask[0], dev_mask,
                     sizeof(uint64_t) * host_filter_count * col_blocks,
                     cudaMemcpyDeviceToHost));
It runs correctly on RTX 30-series cards, but on a T4 this part reports an error: an illegal memory access was encountered. Has anyone encountered this problem?

Compiling the code with -lineinfo and running it under compute-sanitizer should give you the exact location where the illegal memory access occurs.
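Note that the cudaMemcpy in the question may only be where the error is reported, not where it happens: kernel launches are asynchronous, and an illegal access inside an earlier kernel can surface at the next synchronizing call. The following is a minimal sketch of checking right after a launch. The kernel nms_kernel_stub and the helper check() are hypothetical names for illustration, not part of the original project.

```cuda
#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real NMS kernel; it only zeroes the mask.
__global__ void nms_kernel_stub(uint64_t* mask, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) mask[i] = 0;  // bounds check keeps every access inside the buffer
}

// Returns true if err is cudaSuccess; otherwise prints where it was seen.
static bool check(cudaError_t err, const char* what) {
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main() {
    const int n = 1024;
    uint64_t* dev_mask = nullptr;
    if (!check(cudaMalloc(&dev_mask, n * sizeof(uint64_t)), "cudaMalloc")) return 0;

    nms_kernel_stub<<<(n + 255) / 256, 256>>>(dev_mask, n);
    // Errors in the launch configuration are reported here.
    check(cudaGetLastError(), "kernel launch");
    // Errors raised while the kernel runs are reported at the next sync point.
    check(cudaDeviceSynchronize(), "kernel execution");

    cudaFree(dev_mask);
    return 0;
}
```

If the synchronize call after a particular kernel is the first one to fail, that kernel is the likely culprit, and compute-sanitizer can then narrow it down to a source line.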

My project has many source files, and I want to check the executable generated by CMake directly. How do I use this tool? I cannot compile the entire project by invoking nvcc directly.

Add -lineinfo to the CMake build settings (the NVCC flags, or something like that).
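As a rough sketch of what that could look like: the right place depends on how the project sets up CUDA, which is not shown in this thread. The target name my_executable is a placeholder, and only one of the variants below should be needed.

```cmake
# Variant 1: project uses CMake's native CUDA language support,
# e.g. project(... LANGUAGES CXX CUDA). Applies to one target only.
target_compile_options(my_executable PRIVATE
    $<$<COMPILE_LANGUAGE:CUDA>:-lineinfo>)

# Variant 2: same setup, but applied globally to all CUDA sources.
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -lineinfo")

# Variant 3: project uses the older FindCUDA module (cuda_add_executable).
list(APPEND CUDA_NVCC_FLAGS -lineinfo)
```

After changing the flags, re-run CMake and rebuild so that the CUDA sources are actually recompiled with the new option.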

Then let’s say your compiled executable is my_executable.

To use compute-sanitizer, you would do:

compute-sanitizer ./my_executable

There is documentation for compute-sanitizer.

This writeup may also be of interest.

Thanks for your advice.

compute-sanitizer --tool memcheck ./dbg_test01
========= COMPUTE-SANITIZER
========= Error: process didn’t terminate successfully
========= The application may have hit an error when dereferencing Unified Memory from the host. Please rerun the application under cuda-gdb or a host debugger to catch host side errors.
========= Target application returned an error
========= ERROR SUMMARY: 0 errors
When I run it as suggested, I get this error no matter which program I run.

You’ll have to fix that first before trying to use compute-sanitizer. That is a general application debugging issue. It’s hard for me to give advice without knowing anything about the error output from the application.

Make sure the application does a

 return 0;

If it is not doing that, compute-sanitizer won’t work. It also won’t work if you have a seg fault or other host-code detectable error.

Basically you have an error that is detectable in host code.

It might be that your error-handling routine/macro (GPU_CHECK()) returns non-zero when you hit this CUDA error. In that case you will have to fix that (if you want to use compute-sanitizer).
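The definition of GPU_CHECK() is not shown in this thread, so the following is only a sketch of one way to make such a macro report the error without terminating the process with a non-zero status. The macro name GPU_CHECK_SOFT and the counter g_cuda_error_count are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical counter of CUDA errors seen so far.
static int g_cuda_error_count = 0;

// Hypothetical variant of GPU_CHECK: logs the failure and counts it
// instead of calling exit() with a non-zero status.
#define GPU_CHECK_SOFT(call)                                            \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            ++g_cuda_error_count;                                       \
            std::fprintf(stderr, "CUDA error at %s:%d: %s\n",           \
                         __FILE__, __LINE__, cudaGetErrorString(err_)); \
        }                                                               \
    } while (0)

int main() {
    void* buf = nullptr;
    GPU_CHECK_SOFT(cudaMalloc(&buf, 256));
    GPU_CHECK_SOFT(cudaMemset(buf, 0, 256));
    GPU_CHECK_SOFT(cudaDeviceSynchronize());
    GPU_CHECK_SOFT(cudaFree(buf));

    std::fprintf(stderr, "CUDA errors seen: %d\n", g_cuda_error_count);
    return 0;  // exit normally even if CUDA errors were logged
}
```

This is meant as a temporary debugging aid: once the faulty access has been located, the original fail-fast behavior is usually the safer default.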