I found a simple program on the internet and tried to run it. It compiles without any problems, but when I execute it the result is wrong. The program should square all values in my vector, but the values do not change at all. I am compiling with “nvcc -o r vecadd.cu” and running it with ./r. I think it is not even starting a thread, or the kernel is not running at all. But shouldn’t it at least tell me that an error occurred or that no threads have been started?
You probably want to take the first cudaMemcpy call out of the for loop. Also, the first parameter to cudaMemcpy is the destination and the second one is the source, not vice versa. With the two swapped, the input never reaches the device, so the values you read back are unchanged.
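As for the missing error message: CUDA runtime calls do not print anything on failure. They return a cudaError_t, and a failed kernel launch only shows up if you ask for it with cudaGetLastError() or cudaDeviceSynchronize(). Since the original program isn't shown here, the following is only a rough sketch of how the fixed version might look. The names CUDA_CHECK, square and square_on_gpu are made up for illustration and are not taken from your code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro (not part of CUDA): print the error and abort
// if a CUDA runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

// Assumed kernel: each thread squares one element of the array.
__global__ void square(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * data[i];
}

// Hypothetical wrapper: squares the n floats in host[] on the GPU, in place.
void square_on_gpu(float *host, int n) {
    const size_t bytes = n * sizeof(float);
    float *dev = nullptr;
    CUDA_CHECK(cudaMalloc(&dev, bytes));

    // Destination first, source second. Copied once, not inside a loop.
    CUDA_CHECK(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    square<<<blocks, threads>>>(dev, n);
    CUDA_CHECK(cudaGetLastError());       // reports launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // reports errors during execution

    // Copy the result back: host is now the destination.
    CUDA_CHECK(cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dev));
}

int main() {
    const int n = 10;
    float values[n];
    for (int i = 0; i < n; ++i) values[i] = (float)i;

    square_on_gpu(values, n);

    for (int i = 0; i < n; ++i) printf("%d: %g\n", i, values[i]);
    return 0;
}
```

This should build the same way as before (nvcc -o r vecadd.cu). If something goes wrong, for example no usable device or a bad launch configuration, the macro prints the runtime's error string instead of silently leaving the vector untouched.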