AFAIK, in CUDA 2.3 the kernel has to be in .ptx or .cubin format, so you cannot compile it with gcc, as gcc knows nothing about these (Nvidia?) formats.
I got my test working on win64 with mingw-w64 by using the driver API and a 64-bit Linux nvcc to compile the kernel to .cubin (and .ptx). What is kind of annoying is that I cannot compile the 64-bit and 32-bit versions with the same nvcc, as it seems to support only native compilation. Also, if you try to load a 32-bit .cubin into a 64-bit CUDA program, the only message you get is a ‘cannot find function’ error.
Steps to compile a program using the driver API:
- create a .def file for nvcuda.dll, which can be found in the system32 directory (for example with the mingw-w64 gendef tool)
- create the .a gcc library file from it (dlltool)
- compile your program with gcc and link it against that library
- compile your kernel on some OS with the same bitness (a 64-bit kernel compiled on Linux worked with the w64 .exe, and a 32-bit one with the w32 .exe)
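To make the driver-API side more concrete, here is a minimal sketch in C of loading a .cubin and looking up a kernel in it. It is not the exact program from the steps above: so that it builds without cuda.h or an import library, it resolves the driver entry points at run time (LoadLibrary/dlopen) instead of linking against the .a file made with dlltool. The type names, the `find_kernel` helper and its return codes are made up for illustration; the entry points (`cuInit`, `cuDeviceGet`, `cuCtxCreate`, `cuCtxDestroy`, `cuModuleLoad`, `cuModuleGetFunction`) are the unversioned driver API names, and 500 is `CUDA_ERROR_NOT_FOUND`, which I assume is what shows up as the ‘cannot find function’ error.

```c
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
typedef HMODULE lib_t;
#define LIB_OPEN(name)     LoadLibraryA(name)
#define LIB_SYM(lib, name) ((void *)GetProcAddress((lib), (name)))
#define LIB_CLOSE(lib)     FreeLibrary(lib)
#define CUDAAPI __stdcall  /* only matters on 32-bit Windows */
#else
#include <dlfcn.h>
typedef void *lib_t;
#define LIB_OPEN(name)     dlopen((name), RTLD_NOW)
#define LIB_SYM(lib, name) dlsym((lib), (name))
#define LIB_CLOSE(lib)     dlclose(lib)
#define CUDAAPI
#endif

/* Hand-written stand-ins for the cuda.h types (assumption: CUresult and
   CUdevice are ints, the context/module/function handles are pointers). */
typedef int   cu_result;
typedef int   cu_device;
typedef void *cu_handle;

typedef cu_result (CUDAAPI *init_fn)(unsigned int flags);
typedef cu_result (CUDAAPI *device_get_fn)(cu_device *dev, int ordinal);
typedef cu_result (CUDAAPI *ctx_create_fn)(cu_handle *ctx, unsigned int flags, cu_device dev);
typedef cu_result (CUDAAPI *ctx_destroy_fn)(cu_handle ctx);
typedef cu_result (CUDAAPI *module_load_fn)(cu_handle *mod, const char *path);
typedef cu_result (CUDAAPI *module_get_function_fn)(cu_handle *fn, cu_handle mod, const char *name);

/* Hypothetical return codes of this sketch; positive values are CUresult codes. */
enum { KERNEL_FOUND = 0, NO_DRIVER_LIB = -1, NO_ENTRY_POINT = -2 };
#define CU_NOT_FOUND 500 /* CUDA_ERROR_NOT_FOUND in cuda.h */

int find_kernel(const char *driver_lib, const char *cubin_path, const char *kernel_name)
{
    lib_t lib = LIB_OPEN(driver_lib);
    if (!lib)
        return NO_DRIVER_LIB;

    init_fn init = (init_fn)LIB_SYM(lib, "cuInit");
    device_get_fn device_get = (device_get_fn)LIB_SYM(lib, "cuDeviceGet");
    ctx_create_fn ctx_create = (ctx_create_fn)LIB_SYM(lib, "cuCtxCreate");
    ctx_destroy_fn ctx_destroy = (ctx_destroy_fn)LIB_SYM(lib, "cuCtxDestroy");
    module_load_fn module_load = (module_load_fn)LIB_SYM(lib, "cuModuleLoad");
    module_get_function_fn module_get_function =
        (module_get_function_fn)LIB_SYM(lib, "cuModuleGetFunction");

    if (!init || !device_get || !ctx_create || !ctx_destroy ||
        !module_load || !module_get_function) {
        LIB_CLOSE(lib);
        return NO_ENTRY_POINT;
    }

    cu_device dev = 0;
    cu_handle ctx = NULL, mod = NULL, fn = NULL;

    /* Each step runs only if the previous one returned 0 (CUDA_SUCCESS). */
    cu_result rc = init(0);
    if (rc == 0) rc = device_get(&dev, 0);
    if (rc == 0) rc = ctx_create(&ctx, 0, dev);
    if (rc == 0) rc = module_load(&mod, cubin_path);
    if (rc == 0) rc = module_get_function(&fn, mod, kernel_name);

    if (rc == CU_NOT_FOUND)
        fprintf(stderr, "kernel '%s' not found in %s; check that the .cubin "
                        "matches the bitness of this program\n",
                kernel_name, cubin_path);

    if (ctx)
        ctx_destroy(ctx); /* destroying the context also unloads the module */
    LIB_CLOSE(lib);
    return rc;
}
```

On Windows you would call it as something like `find_kernel("nvcuda.dll", "kernel.cubin", "my_kernel")`, where the file and kernel names are placeholders; on Linux the driver library is usually `libcuda.so.1`, and with an older glibc you need to link with `-ldl`.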
I have not tested whether it is possible to install both the 32-bit and the 64-bit nvcc on a 64-bit OS (in a chroot, at least, it should be).
I would love to use the runtime api.
Some posts say that one could use the “–foreign” option to select the compiler, but in nvcc 2.3 it is not documented.
I also think I saw a post (I cannot find it anymore!) suggesting writing a wrapper for gcc that converts the input parameters from cl.exe to gcc, but I have my doubts about that, since the output of these two compilers is (??) not the same.
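Just to illustrate what such a wrapper would have to do on the argument side, here is a rough sketch in C that rewrites a handful of common cl.exe options into their gcc spellings. The `translate_cl_arg` function is hypothetical and the option list is nowhere near what nvcc actually passes to cl.exe; it also does nothing about the doubt above, i.e. whether the objects and libraries the two compilers produce are compatible.

```c
#include <stdio.h>
#include <string.h>

/* Translate one cl.exe-style argument into a gcc-style one.
   Returns 1 if the argument was rewritten, 0 if it was copied unchanged. */
int translate_cl_arg(const char *arg, char *out, size_t out_size)
{
    /* Prefix options: cl.exe spelling on the left, gcc spelling on the right. */
    static const char *const prefix_map[][2] = {
        { "/Fo", "-o" }, /* output object file name */
        { "/I",  "-I" }, /* include directory */
        { "/D",  "-D" }, /* preprocessor define */
    };
    size_t i;

    /* Options that must match exactly. */
    if (strcmp(arg, "/c") == 0) {
        snprintf(out, out_size, "-c");
        return 1;
    }
    if (strcmp(arg, "/O2") == 0) {
        snprintf(out, out_size, "-O2");
        return 1;
    }

    for (i = 0; i < sizeof prefix_map / sizeof prefix_map[0]; i++) {
        size_t len = strlen(prefix_map[i][0]);
        if (strncmp(arg, prefix_map[i][0], len) == 0) {
            /* Keep the value glued to the option, which gcc accepts. */
            snprintf(out, out_size, "%s%s", prefix_map[i][1], arg + len);
            return 1;
        }
    }

    /* Anything else (source files, unknown options) passes through as-is. */
    snprintf(out, out_size, "%s", arg);
    return 0;
}
```

A real wrapper would run this over every element of argv and then exec gcc with the result; anything it does not recognise is passed through unchanged, which is probably where the trouble would start.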
So, has anyone got the runtime API working on Windows without using cl.exe?