I am trying to see if I can port a Fortran subroutine to CUDA and call it from Fortran itself. Is this possible?
Obviously I will have to allocate device memory and copy the data to the device and back to the host (cudaMemcpy) each time the function is called.
I am using Intel Compiler 11 (ifort) to compile the Fortran code. My approach is to link the Fortran object files with the CUDA code compiled by nvcc, so that I can compare the CUDA kernel against the Fortran subroutine. Any pointers or references?
Sorry to be negative, but using Fortran with CUDA really is quite tedious. I use Fortran for my CPU codes, but I found it easier to use CUDA C for NVIDIA GPUs, even though my only C training was a short course 12 years ago.
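That said, the approach described in the question (a CUDA C wrapper called from Fortran that allocates, copies in, launches, and copies back on every call) can work. Below is a minimal sketch, not a definitive implementation. The kernel, the name saxpy_gpu_ and the y = a*x + y operation are all hypothetical stand-ins for the real subroutine. It also assumes ifort on Linux with default settings, where external names are lowercased and get a trailing underscore, and where arguments are passed by reference.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

/* Hypothetical kernel standing in for the ported subroutine: y = a*x + y. */
__global__ void saxpy_kernel(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

/* Wrapper intended to be called from Fortran as
 *     call saxpy_gpu(n, a, x, y)
 * Assumption: the Fortran compiler lowercases the name and appends "_",
 * and passes every argument by reference (hence the pointers).
 * extern "C" stops nvcc (a C++ compiler) from mangling the symbol name. */
extern "C" void saxpy_gpu_(const int *n, const float *a,
                           const float *x, float *y)
{
    if (*n <= 0)
        return;

    size_t bytes = (size_t)(*n) * sizeof(float);
    float *d_x = NULL, *d_y = NULL;

    /* Allocate device buffers on every call, as described in the question. */
    if (cudaMalloc((void **)&d_x, bytes) != cudaSuccess ||
        cudaMalloc((void **)&d_y, bytes) != cudaSuccess) {
        fprintf(stderr, "saxpy_gpu: device allocation failed\n");
        cudaFree(d_x);   /* cudaFree(NULL) is a harmless no-op */
        cudaFree(d_y);
        return;
    }

    /* Host -> device copies of the inputs. */
    cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y, bytes, cudaMemcpyHostToDevice);

    /* One thread per element, rounded up to whole blocks. */
    int threads = 256;
    int blocks = (*n + threads - 1) / threads;
    saxpy_kernel<<<blocks, threads>>>(*n, *a, d_x, d_y);

    /* Device -> host copy of the result; this blocking copy also waits
     * for the kernel to finish and reports any earlier failure. */
    cudaError_t err = cudaMemcpy(y, d_y, bytes, cudaMemcpyDeviceToHost);
    if (err != cudaSuccess)
        fprintf(stderr, "saxpy_gpu: %s\n", cudaGetErrorString(err));

    cudaFree(d_x);
    cudaFree(d_y);
}
```

On the build side, something along the lines of compiling this file with nvcc -c and then linking the resulting object with ifort plus the CUDA runtime (-lcudart, and possibly -lstdc++) should be close, though the exact library paths depend on the installation. Because the wrapper overwrites y, comparing against the Fortran subroutine means running both on separate copies of the same input. The per-call allocation and copies will also count against the GPU in any timing comparison.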