I’m somewhat new to CUDA, and I’m working on an FCT implementation on the GPU. I just noticed that cuInit is taking about a second to run.
I use the following snippet to time cuInit:
CUresult result = cuInit(0);
printTimeDiff("Choose Device", tm1, tm0);
where printTimeDiff prints the difference from tm0 to tm1 in seconds.
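For context, here is a minimal sketch of what a helper like printTimeDiff might look like. It assumes tm0 and tm1 are POSIX struct timespec values taken with clock_gettime immediately before and after the cuInit call. The diffSeconds helper, the use of CLOCK_MONOTONIC, and the double return value are assumptions made for illustration and may not match the actual code.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: elapsed time from start to end, in seconds. */
double diffSeconds(struct timespec end, struct timespec start)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Assumed shape of printTimeDiff: prints "<label> <seconds>" and
   also returns the elapsed seconds so callers can check the value. */
double printTimeDiff(const char *label, struct timespec tm1, struct timespec tm0)
{
    double elapsed = diffSeconds(tm1, tm0);
    printf("%s %f\n", label, elapsed);
    return elapsed;
}

/* Intended usage around the call being timed (needs cuda.h and the
   CUDA driver library, so it is left as a comment here):

     struct timespec tm0, tm1;
     clock_gettime(CLOCK_MONOTONIC, &tm0);
     CUresult result = cuInit(0);
     clock_gettime(CLOCK_MONOTONIC, &tm1);
     printTimeDiff("Choose Device", tm1, tm0);
*/
```

Taking both timestamps directly around the single call keeps unrelated work out of the measurement, and a monotonic clock avoids jumps if the system time is adjusted during the run.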
I get the following output:
Choose Device 1.269787
Does cuInit always take this long, or is there some sort of error?
Also, I’ve noticed that if I skip cuInit and call cudaMallocPitch directly, I don’t get the CUDA_ERROR_NOT_INITIALIZED error I expected. Instead, the cudaMallocPitch call itself takes over a second.