Program hit cudaErrorIllegalAddress (error 700) [...] on CUDA API call to cudaDeviceSynchronize

Hi there

My CUDA program crashes consistently for large inputs and occasionally for small ones.
I used CUDA-MEMCHECK to look for out-of-bounds memory accesses and fixed the ones I found.
I am still getting crashes, however: CUDA-MEMCHECK reports them occurring inside cudaDeviceSynchronize, and Nsight reports the same error in cuCtxSynchronize.
I’ve run out of debugging options, so I’d be very happy for any advice on how to debug this.
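
For context, a minimal sketch of the kind of per-call error checking that can help narrow this down. This is not the actual genasm-gpu code: the `CUDA_CHECK` macro, the `touch` kernel, and `run_checked` are hypothetical placeholders, and the block size of 256 is an arbitrary assumption.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: print the failing call with file and line,
// then return false from the enclosing function.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s failed: %s\n", __FILE__,  \
                         __LINE__, #call, cudaGetErrorString(err_));  \
            return false;                                             \
        }                                                             \
    } while (0)

// Placeholder kernel: the bounds guard keeps every access inside
// the n-element allocation.
__global__ void touch(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = i;
}

// Launches the placeholder kernel and checks every runtime call.
bool run_checked(int n) {
    int *d = nullptr;
    CUDA_CHECK(cudaMalloc(&d, n * sizeof(int)));
    touch<<<(n + 255) / 256, 256>>>(d, n);
    CUDA_CHECK(cudaGetLastError());       // invalid launch configuration etc.
    CUDA_CHECK(cudaDeviceSynchronize());  // faults raised while the kernel ran
    CUDA_CHECK(cudaFree(d));
    return true;
}
```

Kernel launches are asynchronous, so an illegal access made inside a kernel is only reported by a later synchronizing call. That would be consistent with the error showing up in cudaDeviceSynchronize rather than at the launch itself. Compiling with `nvcc -lineinfo` should also let CUDA-MEMCHECK attribute a faulting device access to a source line.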

Thanks,
Joel

Full CUDA-MEMCHECK output:
========= CUDA-MEMCHECK
PASSED ebs_copy_test
PASSED ebs_num_test
Allocating Memory…
Initializing Reference Sequence…
Allocating Memory (171B) for 9 Reads
Initializing Reads…
Starting Kernel…
========= Error: process didn’t terminate successfully
========= The application may have hit an error when dereferencing Unified Memory from the host. Please rerun the application under a host debugger to catch such errors.
========= Program hit cudaErrorIllegalAddress (error 700) due to “an illegal memory access was encountered” on CUDA API call to cudaDeviceSynchronize.
========= Saved host backtrace up to driver entry point at error
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nvmdi.inf_amd64_b5c7e9f1cc7d29c6\nvcuda64.dll (cuProfilerStop + 0x9da58) [0x2ccdb8]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nvmdi.inf_amd64_b5c7e9f1cc7d29c6\nvcuda64.dll (cuProfilerStop + 0xa011a) [0x2cf47a]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nvmdi.inf_amd64_b5c7e9f1cc7d29c6\nvcuda64.dll [0x8035e]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nvmdi.inf_amd64_b5c7e9f1cc7d29c6\nvcuda64.dll (cuProfilerStop + 0x1229fa) [0x351d5a]
========= Host Frame:C:\Windows\system32\DriverStore\FileRepository\nvmdi.inf_amd64_b5c7e9f1cc7d29c6\nvcuda64.dll (cuProfilerStop + 0x13db82) [0x36cee2]
========= Host Frame:C:\Users\joel\source\repos\genasm-gpu\genasm_gpu.exe (cudart::cudaApiChooseDevice + 0x41) [0x18e1]
========= Host Frame:C:\Users\joel\source\repos\genasm-gpu\genasm_gpu.exe (cudart::cudaApiStreamEndCapture_ptsz + 0x33) [0x10703]
========= Host Frame:C:\Users\joel\source\repos\genasm-gpu\genasm_gpu.exe (cudaGetErrorName + 0x15) [0x18305]
========= Host Frame:C:\Users\joel\source\repos\genasm-gpu\genasm_gpu.exe (cudaGraphExecKernelNodeSetParams + 0x3) [0x1c2d3]
========= Host Frame:C:\Users\joel\source\repos\genasm-gpu\genasm_gpu.exe (cudaHostAlloc + 0x124) [0x20514]
========= Host Frame:C:\Windows\System32\KERNEL32.DLL (BaseThreadInitThunk + 0x14) [0x17034]
========= Host Frame:C:\Windows\SYSTEM32\ntdll.dll (RtlUserThreadStart + 0x21) [0x52651]
=========
========= No CUDA-MEMCHECK results found