We are working on optimizing some algorithms using CUDA.
To compare GPU speed against CPU speed, I initially used two separate files in a VC++ project solution: a .cpp file containing the sequential algorithm (which runs on the CPU) and a .cu file containing the parallel algorithm (which runs on the GPU).
Later, I merged the sequential and parallel algorithms into a single .cu file.
When I profiled the two programs, I got two different results for the GPU speedup. When I checked why, I found that the difference is in the CPU speed, whereas the GPU performs consistently. Can anyone explain why this is happening? I am also working to find out more on this...
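
For reference, the CPU-side timing can be sketched roughly as follows. This is a minimal illustration, not our actual code: `sequentialSum` and `timeMs` are hypothetical names, and the summation is only a placeholder for the real sequential algorithm. The same function could be built once from a .cpp file and once from a .cu file to compare the host timings of the two setups.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the sequential (CPU) algorithm.
double sequentialSum(const std::vector<double>& data) {
    double total = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        total += data[i];
    }
    return total;
}

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double timeMs(Fn fn) {
    auto start = std::chrono::steady_clock::now();
    fn();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as `timeMs([&] { result = sequentialSum(v); })` returns the elapsed time. Using `result` afterwards helps keep the compiler from optimizing the call away in release builds.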