I am trying to learn CUDA's zero-copy feature, where the device accesses mapped host memory directly.
My main program is in Fortran. On the C side, I allocated mapped host memory and kept it in a pointer that is global to all my C functions. The Fortran code then assigned values through that pointer (technically, through the array declared in Fortran). In a later C function, I called cudaHostGetDevicePointer to map the host pointer to the device. CUDA does not seem to accept this: cudaHostGetDevicePointer failed and returned error code 11.
I think this is understandable, since the same memory was also declared by the Fortran code, which might be the problem. So it seems that, for the memory mapping to work, I have to keep all my C pointers hidden from the Fortran code. I tried that, and it worked properly.
I am just curious whether there is some clever way to trick either the Fortran or the C code so that the CUDA device pointer can map the array declared in Fortran directly. Thanks!