I have a CUDA Fortran routine inside a large Fortran code that uses a significant amount of CPU memory. When the array size increases, the CUDA memory allocation fails with this error message:
1040 bytes requested; not enough memory 2(out of memory)
With a simpler driver that uses less CPU memory, the same CUDA Fortran code runs with a much larger array size. So I am wondering whether memory usage on the CPU side is causing the problem. Any suggestions?