I’m kinda new to CUDA Fortran and I’m having trouble with my code when certain variables get larger.
With small inputs, the function runs correctly, but with bigger inputs it crashes with the following message:
FATAL ERROR: FORTRAN AUTO ALLOCATION FAILED
...
FATAL ERROR: FORTRAN AUTO ALLOCATION FAILED
0: copyout Memcpy (host=0xe80ca80, dev=0x2b59cccb1e00, size=728428) FAILED: 719(unspecified launch failure)
The error seems to occur when I transfer the variable from the device to the host (Y1=Y_d, line 142 of the file Test_Cuda_fct.cuf).
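Kernel launches are asynchronous, so a fault inside a kernel may only be reported at the next blocking call, such as the device-to-host copy. If it helps to narrow things down, here is a minimal sketch of the check I could add between the kernel launch and the copy. The names `my_kernel`, `a_d` and `n` are placeholders made up for this example and are not taken from Test_Cuda_fct.cuf.

```fortran
! Hypothetical, self-contained sketch: my_kernel, a_d and n are placeholder
! names for illustration only, not the code from Test_Cuda_fct.cuf.
module kernels_m
  use cudafor
  implicit none
contains
  attributes(global) subroutine my_kernel(a, n)
    real, device :: a(*)
    integer, value :: n
    integer :: i
    i = (blockIdx%x - 1) * blockDim%x + threadIdx%x
    ! Bounds guard: threads past the end of the array must not touch memory.
    if (i <= n) a(i) = 2.0 * a(i)
  end subroutine my_kernel
end module kernels_m

program check_launch
  use cudafor
  use kernels_m
  implicit none
  integer, parameter :: n = 1000
  real :: a(n)
  real, device :: a_d(n)
  integer :: istat

  a = 1.0
  a_d = a                                   ! host-to-device copy
  call my_kernel<<<(n + 255) / 256, 256>>>(a_d, n)

  ! Errors in the launch configuration show up here.
  istat = cudaGetLastError()
  if (istat /= cudaSuccess) print *, 'launch error: ', trim(cudaGetErrorString(istat))

  ! Errors raised while the kernel runs show up once the device is synchronized.
  istat = cudaDeviceSynchronize()
  if (istat /= cudaSuccess) print *, 'kernel error: ', trim(cudaGetErrorString(istat))

  a = a_d                                   ! device-to-host copy
  print *, 'max value: ', maxval(a)
end program check_launch
```

If the synchronize call already reports the failure, the problem would be in the kernel itself rather than in the Y1=Y_d copy.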
You can find the program file and the output message in the attachment.
The medium_input folder is not included because of its large size (750 MB).
I compile and run my program with the following commands:
nvfortran -c MOD_deviceQuery.cuf
nvfortran Test_Cuda_fct.cuf MOD_deviceQuery.o -o Test_Cuda_fct.x
nvprof ./Test_Cuda_fct.x
Test_Cuda_fct.zip (112.2 KB)
Thank you in advance for your help,