May I malloc memory on the device?

Not with ‘malloc’ as such. Use ‘cudaMalloc’ instead, which allocates memory that resides on the device.

For example:

float* device_buffer;

CUDA_SAFE_CALL(cudaMalloc((void**)&device_buffer,number_of_elements*sizeof(float)));

Remember, you cannot directly access this memory from the host; you have to copy it over to host memory first (e.g. with ‘cudaMemcpy’).
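To make the allocate-then-copy pattern concrete, here is a minimal sketch of a round trip through device memory. The helper name round_trip_matches is made up for illustration, and it checks return codes directly rather than using CUDA_SAFE_CALL, on the assumption that the SDK utility headers may not be available.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): allocates a device buffer,
// copies host data into it and back out, and reports whether the
// values survived the round trip unchanged.
bool round_trip_matches(size_t number_of_elements)
{
    std::vector<float> host_in(number_of_elements);
    std::vector<float> host_out(number_of_elements, 0.0f);
    for (size_t i = 0; i < number_of_elements; ++i)
        host_in[i] = (float)i;

    const size_t bytes = number_of_elements * sizeof(float);
    float* device_buffer = NULL;

    // Allocate memory that lives on the device.
    if (cudaMalloc((void**)&device_buffer, bytes) != cudaSuccess)
        return false;

    // The host cannot dereference device_buffer; data moves via cudaMemcpy.
    bool ok =
        cudaMemcpy(device_buffer, host_in.data(), bytes,
                   cudaMemcpyHostToDevice) == cudaSuccess &&
        cudaMemcpy(host_out.data(), device_buffer, bytes,
                   cudaMemcpyDeviceToHost) == cudaSuccess;

    // Release the device allocation whether or not the copies succeeded.
    cudaFree(device_buffer);

    return ok && host_in == host_out;
}
```

Calling round_trip_matches(1024) from main and compiling with nvcc should return true on a machine with a working CUDA device. Anything that reads host_out before the second cudaMemcpy would only see the zeros it was initialised with.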