How to use the Intel C++ compiler to run CUDA code?

hl3381 · April 17, 2020, 5:04pm

Hi,
I am trying to combine OpenMP and CUDA to accelerate my program, but Visual Studio does not support OpenMP versions higher than 2.0, and I want to use some functions from at least OpenMP 3.1. So I installed Intel Parallel Studio, but I cannot get it to work.
My OS is Windows 10, with Visual Studio 2019, Intel Parallel Studio XE 2019 and CUDA Toolkit 10.2 installed.
I have tried several programs that run well with VS and CUDA, but all of them failed with the Intel compiler and CUDA. The same error is reported each time: "Invalid preprocessor command ‘include_next’". The ‘#include_next’ directive is in ‘math.h’ in the Intel C++ include directory. I have searched online for a long time and still have not found a way to solve this problem on Windows. Could someone give me some advice? I would really appreciate any comment. Thank you.

luis.leon · April 17, 2020, 6:33pm

Hi @hl3381

I have done something similar, but on Linux. Since nvcc is actually a wrapper, you can use nvcc only for the CUDA files and for linking, and compile the parts that are purely host code with the Intel compiler. The process is more or less:

Invoke icc or icpc to create the object files (.o) from each source file (.cpp).
Invoke nvcc to create the object files for the CUDA files.
Invoke nvcc to link.
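
The three steps above could be sketched as a small build script. This is only an illustration, not a tested build recipe: the file names main.cpp and kernel.cu, the output name app, and the helper linux_build_commands are made up for the example, and depending on what the host code uses, the link step may need extra libraries. It only prints the command lines so you can check them before running anything.

```python
from pathlib import Path


def linux_build_commands(cpp_sources, cu_sources, app="app"):
    """Return the command lines for the three steps as argument lists.

    Hypothetical helper: host files go to icpc, CUDA files go to nvcc,
    and nvcc performs the final link of all object files.
    Assumes host and CUDA files have different base names.
    """
    commands = []
    objects = []

    # Step 1: compile each host-only .cpp file with the Intel compiler.
    for src in cpp_sources:
        obj = Path(src).with_suffix(".o").name
        commands.append(["icpc", "-c", src, "-o", obj])
        objects.append(obj)

    # Step 2: compile each .cu file with nvcc.
    for src in cu_sources:
        obj = Path(src).with_suffix(".o").name
        commands.append(["nvcc", "-c", src, "-o", obj])
        objects.append(obj)

    # Step 3: let nvcc link everything into the executable.
    commands.append(["nvcc", *objects, "-o", app])
    return commands


if __name__ == "__main__":
    for cmd in linux_build_commands(["main.cpp"], ["kernel.cu"]):
        print(" ".join(cmd))
```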

You can find a bit of information here:

https://devblogs.nvidia.com/separate-compilation-linking-cuda-device-code/

and to specify within the wrapper which compiler:

Regards,
Leon

hl3381 · April 20, 2020, 5:39am

Hi Leon,

Thank you so much! It really did work. I am very sorry for replying so late; I was not familiar with the compiler directives and had to learn them from scratch.

On Windows, the main steps to compile a program with the Intel C++ compiler and nvcc are the same as on Linux, but there are some small differences.

Invoke icl to create an object file from each .cpp file. No CUDA API calls or kernels are allowed in the .cpp files. Add /Qopenmp to the command line to enable OpenMP.
Invoke nvcc to create the object files for the CUDA files.
Invoke nvcc to link and create the app file.
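
For anyone who wants to script the Windows steps above, here is a rough sketch. The file names main.cpp and kernel.cu, the output name app.exe, and the helpers windows_build_commands and build are made up for the example. It assumes icl and nvcc are both on the PATH (for example, in an Intel compiler command prompt); by default it only prints the commands and does not run them.

```python
import subprocess
from pathlib import Path


def windows_build_commands(cpp_sources, cu_sources, app="app.exe"):
    """Return the icl/nvcc command lines for the three steps (hypothetical helper).

    Assumes host and CUDA files have different base names.
    """
    commands = []
    objects = []

    # Step 1: host-only .cpp files, compiled by icl with OpenMP enabled.
    for src in cpp_sources:
        obj = Path(src).with_suffix(".obj").name
        commands.append(["icl", "/Qopenmp", "/c", src, "/Fo" + obj])
        objects.append(obj)

    # Step 2: CUDA files, compiled by nvcc.
    for src in cu_sources:
        obj = Path(src).with_suffix(".obj").name
        commands.append(["nvcc", "-c", src, "-o", obj])
        objects.append(obj)

    # Step 3: link with nvcc, not icl.
    commands.append(["nvcc", *objects, "-o", app])
    return commands


def build(commands, dry_run=True):
    """Print each command; run it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    build(windows_build_commands(["main.cpp"], ["kernel.cu"]))
```

Calling build(..., dry_run=False) would actually execute the commands and stop at the first one that fails.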

Note that we don’t need to follow all the steps in the blog post. There is no need to create gpuCode.obj, and the final step that links and creates the app is done with nvcc rather than icl (g++ in the blog post).

The Visual Profiler can also be used to analyze the app.

Thank you!

Best regards,
LIU He