I want to create a random matrix in my CUDA program, something like the randn function in MATLAB. I added ONLY one line to my code, the include of curand_kernel.h,
and my program became about 16x slower (from 6 s to 100 s), just from including that header file!
I’m using CUDA VS Wizard, GPU architecture sm_13, a GTX 470, and CUDA 3.2.