Quick question, but first a little background. I’ve been using CUDA to speed up some data reduction codes for my institution. When I first started, to get things up and running quickly, I had all my source in one large .cu file. As the project grew, I split it into several source files, all saved as .cu so that everything could be compiled with nvcc in one go. The project is now fairly sizable, so I was wondering if there is any performance benefit to splitting the code into .cu files for the host code and kernels, and .c files for everything else (i.e. the proper way to do things, apparently). Since my all-.cu code is running nicely, is there any reason to worry about this? Thanks in advance for any advice!
Calling kernels with the triple angle bracket (chevron) notation <<< … >>> is not valid in C++. The C++ compiler will probably think that you are trying to do something funky with templates…
Also, none of the CUDA function specifiers (__global__, __device__, etc.) are valid in C++, so any code that uses them has to stay in files compiled by nvcc.
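To make the split concrete, here is a minimal sketch of one way to lay it out, not the only one. The kernel and its launch stay in a .cu file, and the rest of the project only sees an ordinary C function. The names scale.cu, scale_kernel and scale_on_gpu are made up for illustration; they are not from the original poster's project.

```cuda
// ---- scale.cu (hypothetical file name): compiled by nvcc ----
#include <cuda_runtime.h>
#include <cstddef>

// __global__ marks a kernel. A plain C or C++ compiler rejects this specifier.
__global__ void scale_kernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Wrapper with C linkage: the only symbol the non-CUDA sources need.
// Scales host_data in place on the GPU. Returns 0 on success, 1 on any CUDA error.
extern "C" int scale_on_gpu(float *host_data, float factor, int n)
{
    if (n <= 0) {
        return 0;  // nothing to do; also avoids launching a grid of 0 blocks
    }

    float *dev = nullptr;
    size_t bytes = (size_t)n * sizeof(float);

    if (cudaMalloc((void **)&dev, bytes) != cudaSuccess) {
        return 1;
    }
    if (cudaMemcpy(dev, host_data, bytes, cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return 1;
    }

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // The <<< >>> launch syntax is the part that must live in a .cu file.
    scale_kernel<<<blocks, threads>>>(dev, factor, n);

    // Catch launch errors, then copy back (the copy waits for the kernel to finish).
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaMemcpy(host_data, dev, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(dev);
    return err == cudaSuccess ? 0 : 1;
}

// ---- what a plain .c file would need (no CUDA syntax at all) ----
//   int scale_on_gpu(float *host_data, float factor, int n);
//   ...
//   if (scale_on_gpu(buffer, 2.0f, count) != 0) { /* handle error */ }
```

With this layout the .c files can be built with the regular host compiler and linked against the object file nvcc produces from scale.cu, plus the CUDA runtime library. Only the files that actually contain <<< >>> or the CUDA specifiers have to go through nvcc.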