I tried to compile the tcufft2dc3 sample with the HPC SDK 20.7 and I get a segmentation fault when running the code. This occurs on Ubuntu Linux x86_64 18.06 and CUDA 10.2. These are the compilation flags:
I tried recreating the issue here, but it works fine for me. I tried various compiler versions, GPUs, CUDA versions, etc., including your specifics, but all ran correctly. Hence the issue is likely something to do with your system or environment.
Can you compile with debugging enabled (i.e. add “-g”) and run it through gdb to see where the segv is coming from? Note that a segv occurs in host code, so a host debugger like gdb should be able to catch it.
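If it helps, here is a rough sketch of getting a backtrace out of gdb non-interactively. It assumes you have already rebuilt with your original flags plus “-g”; the binary name `./a.out` is just a placeholder for whatever your build produces.

```shell
# Placeholder name: point this at the executable from your "-g" build.
exe=./a.out

if command -v gdb >/dev/null 2>&1 && [ -x "$exe" ]; then
  # Run the program under gdb; when it stops on the SIGSEGV, print the backtrace and exit.
  gdb -batch -ex run -ex bt "$exe"
else
  echo "gdb or $exe not found; adjust the path above"
fi
```

The top frames of the backtrace should show which host routine the fault is coming from.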
Could this be a stack overflow? i.e. is your environment’s stacksize limit set too small? The limit can be seen either via “ulimit -s” (bash) or “limit” (csh).
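For reference, a small sketch of checking both the soft and hard limits from bash (the variable names are only for illustration):

```shell
# Soft limit: what the shell and its child processes actually get.
soft=$(ulimit -S -s)
# Hard limit: the ceiling the soft limit can be raised to without extra privileges.
hard=$(ulimit -H -s)

# bash reports both values in KB, or "unlimited".
echo "soft stack limit: $soft"
echo "hard stack limit: $hard"
```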
This is definitely a stack overflow, given that these are statically sized arrays and are placed on the stack.
8192 (KB, i.e. 8 MB) is relatively small. Try increasing your stack size to at least 16,384. I typically set it to unlimited in my .bashrc/.cshrc files so it’s always set when I launch a new shell.
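A minimal sketch of raising the limit in bash before rerunning. It assumes your hard limit allows the increase; the fallback value and the `new_limit` variable are just for illustration.

```shell
# Raise the soft stack limit for this shell and anything launched from it.
# Try unlimited first, then fall back to 16384 KB if that is refused.
ulimit -S -s unlimited 2>/dev/null \
  || ulimit -S -s 16384 2>/dev/null \
  || echo "could not raise the stack limit (hard limit too low?)" >&2

# Confirm what the shell ended up with.
new_limit=$(ulimit -S -s)
echo "stack limit now: $new_limit"
```

Run the sample from the same shell afterwards, since the limit only applies to that shell and its children. To make it permanent, the `ulimit` line can go in .bashrc; in csh/tcsh the equivalent should be `limit stacksize unlimited` in .cshrc.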