Sorry for this newbie question, but I am confused. I am just getting started with CUDA and wrote the code below. What I want is to add a constant float to each element in a vector. However, if I call the kernel with val = 1.0 and pSrcDst filled with 1.0's, it returns all 1025.0's. This looks like things got bit-shifted. When I change the second kernel line to
pSrcDst[idx] = val;
it does replace all values with 1.0.
It is probably quite trivial, but I am a bit confused at the moment.
Well, what do you pass in for pSrcDst?
There is however a mistake in the calculation of numBlocks, which leads to your kernel trying to access len*len floats instead of just len values, which probably makes your kernel fail at some point (at least for larger values of len).
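For reference, here is a minimal sketch of how the grid size is usually derived so that the launch covers len elements and no more. The original kernel is not reproduced here, so the kernel name addConstant, the helper blocksFor, the wrapper launchAddConstant, and the block size of 256 are all assumptions made for illustration; only pSrcDst, val, len, and idx come from the thread.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel (not the original poster's code): adds val to each
// of the len elements of pSrcDst in place.
__global__ void addConstant(float *pSrcDst, float val, int len)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < len)            // the last block may be only partially used
        pSrcDst[idx] += val;
}

// Hypothetical helper: ceiling division, so that
// blocks * threadsPerBlock >= len with at most one partial block.
inline int blocksFor(int len, int threadsPerBlock)
{
    return (len + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical host-side launcher; d_data must be a device pointer
// to at least len floats.
void launchAddConstant(float *d_data, float val, int len)
{
    if (len <= 0) return;                 // a grid of 0 blocks is an invalid launch
    const int threadsPerBlock = 256;      // assumed block size
    const int numBlocks = blocksFor(len, threadsPerBlock);
    addConstant<<<numBlocks, threadsPerBlock>>>(d_data, val, len);
}
```

With this calculation, len = 1024 and 256 threads per block give 4 blocks, so exactly 1024 threads run. The idx < len guard keeps the kernel safe when len is not a multiple of the block size.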
To catch these kinds of problems, always check return codes of CUDA function calls and test your program under cuda-memcheck.
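As a rough sketch of what checking return codes can look like, the code below wraps each runtime call in a small helper. The names checkCuda, addConstant, and addConstantOnDevice are made up for this example and are not from the original post; the runtime calls themselves (cudaMalloc, cudaMemcpy, cudaGetLastError, cudaDeviceSynchronize, cudaGetErrorString, cudaFree) are standard CUDA runtime API functions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: reports a failed CUDA call on stderr and returns false.
inline bool checkCuda(cudaError_t err, const char *what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Placeholder kernel standing in for the one discussed in this thread.
__global__ void addConstant(float *pSrcDst, float val, int len)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < len)
        pSrcDst[idx] += val;
}

// Hypothetical host wrapper: copies h_data to the device, runs the kernel,
// copies the result back, and checks the return code of every step.
bool addConstantOnDevice(float *h_data, float val, int len)
{
    if (len <= 0) return true;            // nothing to do
    float *d_data = nullptr;
    const size_t bytes = static_cast<size_t>(len) * sizeof(float);
    if (!checkCuda(cudaMalloc(&d_data, bytes), "cudaMalloc")) return false;

    bool ok = checkCuda(cudaMemcpy(d_data, h_data, bytes, cudaMemcpyHostToDevice),
                        "cudaMemcpy (host to device)");
    if (ok) {
        const int threads = 256;          // assumed block size
        const int blocks = (len + threads - 1) / threads;
        addConstant<<<blocks, threads>>>(d_data, val, len);
        // Launch errors show up in cudaGetLastError(); errors during
        // execution show up once the device has been synchronized.
        ok = checkCuda(cudaGetLastError(), "kernel launch") &&
             checkCuda(cudaDeviceSynchronize(), "kernel execution");
    }
    if (ok)
        ok = checkCuda(cudaMemcpy(h_data, d_data, bytes, cudaMemcpyDeviceToHost),
                       "cudaMemcpy (device to host)");
    cudaFree(d_data);
    return ok;
}
```

To run under the memory checker, pass the executable to the tool, for example cuda-memcheck ./myprog (the program name here is a placeholder). In recent CUDA toolkits, compute-sanitizer replaces cuda-memcheck.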
Update: the faulty numBlocks calculation was indeed responsible for the behavior. It was a sloppy copy/paste job, I'm afraid. Thanks again for pointing it out.
Thanks for the tip. I’ll look into that. What I pass in as pSrcDst is an array of floats of length len.