I’ve been seeing errors in my (quite complex) commercial CUDA app that pop up at random times - usually “invalid argument” from cudaMemset. I’ve been working on this bug for a few weeks now. I’ve gotten to the point where I’m now using cudaPointerGetAttributes to test all my GPU buffers after every CUDA call (as well as synchronizing before and after each), and I’ve narrowed it down to cufftDestroy(). Before that call, all buffers are valid; after it, cudaPointerGetAttributes still returns success for every one of my buffers, but reports a devicePointer address of 0! From that point on, any cudaMemset on those buffers fails.
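For reference, the per-buffer check I’m running looks roughly like this (a simplified sketch; checkBuffer and the printed messages are stand-ins for the real validation code in the app):

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sketch of the validation pass: after synchronizing, ask the runtime
// what it knows about each device buffer. A healthy buffer comes back
// with a non-null devicePointer; after cufftDestroy I instead see the
// call succeed but devicePointer come back as 0.
static bool checkBuffer(const void* ptr, const char* name)
{
    cudaPointerAttributes attr{};
    cudaError_t err = cudaPointerGetAttributes(&attr, ptr);
    if (err != cudaSuccess) {
        std::printf("%s: cudaPointerGetAttributes failed: %s\n",
                    name, cudaGetErrorString(err));
        return false;
    }
    if (attr.devicePointer == nullptr) {
        std::printf("%s: attributes query succeeded but devicePointer is 0!\n",
                    name);
        return false;
    }
    return true;
}
```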
This only happens on CUDA 11.2 and later, apparently. And I think only on Windows.
I’m definitely passing a valid plan handle to cufftDestroy (1, as returned from cufftPlan2d), and cufftDestroy itself returns success.
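For context, the cuFFT usage is just the standard create/execute/destroy lifecycle, roughly like this (the 512x512 size is a placeholder, not my real dimensions):

```cuda
#include <cstdio>
#include <cufft.h>

// The handle returned by cufftPlan2d is the same one later passed to
// cufftDestroy; both calls report success, yet afterwards unrelated
// device buffers start reporting devicePointer == 0.
void planLifecycle()
{
    cufftHandle plan = 0;
    cufftResult r = cufftPlan2d(&plan, 512, 512, CUFFT_C2C);
    std::printf("plan handle %d, create result %d\n", plan, r);
    // ... cufftExecC2C(plan, in, out, CUFFT_FORWARD) in between ...
    r = cufftDestroy(plan);
    std::printf("destroy result %d\n", r);
}
```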
I’m pretty sure I have no host stack/heap corruption; I’m using a very careful host allocator with bounds checking and the (very large) app is otherwise behaving properly. Also, cuda-memcheck doesn’t report any issues.
I’ve written a small reproducer that mimics the order and size of the app’s CUDA mallocs/frees and cuFFT calls, but of course everything works fine there… Are there any known issues with cufftDestroy? Is there any possible way for it to trash the CUDA heap in some unusual circumstance (presumably based on some odd situation I’ve created)? I wish I could peek into its source.
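The reproducer is essentially this shape (buffer sizes and call order here are illustrative, not the exact sequence from the app):

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

int main()
{
    // Allocate some unrelated device buffers, as the app does.
    void* bufA = nullptr;
    void* bufB = nullptr;
    cudaMalloc(&bufA, 1 << 20);
    cudaMalloc(&bufB, 1 << 22);

    // Create and destroy a 2D plan between other work.
    cufftHandle plan = 0;
    cufftPlan2d(&plan, 512, 512, CUFFT_C2C);
    cufftDestroy(plan);
    cudaDeviceSynchronize();

    // In the big app this memset now fails with "invalid argument";
    // in this standalone reproducer it succeeds.
    cudaError_t err = cudaMemset(bufA, 0, 1 << 20);
    std::printf("cudaMemset after cufftDestroy: %s\n",
                cudaGetErrorString(err));

    cudaFree(bufA);
    cudaFree(bufB);
    return 0;
}
```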