I’m trying to use cudaMallocHost, but it doesn’t work so far. Maybe I’m doing something wrong, and you may be able to help me.
I declare a pointer as a member of a C++ class:
float * array;
Then, from my class, I call a function to initialize all the arrays on the GPU, passing it this pointer.
In InitGPU (which is in a .cu file, so compiled with nvcc) I allocate the memory with cudaMallocHost.
Then, if I try to access this array from a function in my C++ class (e.g. array[i] = something), I get a segmentation fault. The same thing happens with memset, and I’m not making any mistake with the size of the array. Did I do something wrong? I looked at the bandwidth test example from the SDK, but there everything is done from the GPU side (no interaction with a C++ class) and only memcpy is used. Or do we have to use memcpy only with this kind of memory?