Hi,
I’m having a problem with my CUDA Fortran program: it runs fine on a GTX 1050 but fails on a Tesla V100. The program launches two kernels with the following thread configurations:
- 31x64 threads for the first kernel
- 64x64 threads for the second kernel
I’ve attached the error message and a snippet of the code.
Could this issue be related to the Tesla V100’s hardware or resource limitations? Any suggestions would be appreciated!
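To help narrow down whether a resource limit is involved, here is a minimal sketch of how the per-device limits and the kernel launch status could be checked on both GPUs. It is not the code from 3.for: the module `demo_kernels`, the kernel `scale`, and the array `a_d` are hypothetical placeholders, and the 8x8 grid of 8x8 blocks is simply copied from the debugger output below. It assumes the NVIDIA HPC SDK compiler (for example `nvfortran -cuda`).

```fortran
module demo_kernels
  use cudafor
  implicit none
contains
  ! Hypothetical stand-in kernel: doubles every element of an n x n array.
  attributes(global) subroutine scale(a, n)
    integer, value :: n
    real :: a(n, n)
    integer :: i, j
    ! CUDA Fortran block and thread indices are 1-based.
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    j = (blockIdx%y - 1) * blockDim%y + threadIdx%y
    ! Guard against threads that fall outside the array bounds.
    if (i <= n .and. j <= n) a(i, j) = 2.0 * a(i, j)
  end subroutine scale
end module demo_kernels

program launch_check
  use cudafor
  use demo_kernels
  implicit none
  integer, parameter :: n = 64
  real, device, allocatable :: a_d(:, :)
  type(cudaDeviceProp) :: prop
  type(dim3) :: grid, tblock
  integer :: istat

  ! Print the limits of device 0 so the two GPUs can be compared.
  istat = cudaGetDeviceProperties(prop, 0)
  print *, 'device:                ', trim(prop%name)
  print *, 'compute capability:    ', prop%major, prop%minor
  print *, 'max threads per block: ', prop%maxThreadsPerBlock
  print *, 'shared mem per block:  ', prop%sharedMemPerBlock
  print *, 'registers per block:   ', prop%regsPerBlock

  allocate(a_d(n, n))
  a_d = 1.0

  ! Launch configuration assumed from the debugger output: 8x8 blocks of 8x8 threads.
  grid   = dim3(8, 8, 1)
  tblock = dim3(8, 8, 1)
  call scale<<<grid, tblock>>>(a_d, n)

  ! Errors in the launch configuration itself are reported here.
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) print *, 'launch error: ', trim(cudaGetErrorString(istat))

  ! Errors raised while the kernel runs (e.g. invalid memory access) are reported here.
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) print *, 'execution error: ', trim(cudaGetErrorString(istat))

  deallocate(a_d)
end program launch_check
```

Running the same check after each of the two real kernels on both cards should show whether the launch itself is rejected or the kernel fails (or hangs) during execution.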
Error message:
^C
Thread 1 “1” received signal SIGINT, Interrupt.
[Switching focus to CUDA kernel 0, grid 2, block (7,0,0), thread (0,2,0), device 0, sm 14, warp 2, lane 16]
flux::reconstruction_x<<<(8,8,1),(8,8,1)>>> (
aadens=<error reading variable: Cannot access memory at address 0x5>,
aaxmom=<error reading variable: Cannot access memory at address 0x0>,
aaymom=<error reading variable: Cannot access memory at address 0x0>,
aaener=<error reading variable: Cannot access memory at address 0x0>,
aad=<error reading variable: Cannot access memory at address 0x0>,
aax=<error reading variable: Cannot access memory at address 0x0>,
aay=<error reading variable: Cannot access memory at address 0x0>,
aae=<error reading variable: Cannot access memory at address 0x0>)
at 3.for:155
Thanks!