Minimizing data transfers from CPU to GPU

I want to copy an array to the GPU the first time a kernel is called, and from then on I just want to change one element of the array (from the host) every time the kernel is called. I am new to CUDA C. Can you please suggest a way to do this?

You can use mapped (zero-copy) memory so that the host and the GPU operate directly on the same array, without having to copy it back and forth. Something like this:

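cudaSetDeviceFlags(cudaDeviceMapHost); // enable mapped memory; call this before the CUDA context is created

var_type *ptr_array_on_host;
var_type *ptr_array_on_GPU;

// allocates pinned memory on the host side that is visible to the GPU, replacing a malloc or calloc on the host side
cudaHostAlloc((void **)&ptr_array_on_host, sizeof_array, cudaHostAllocMapped);

// gets the GPU pointer to the same host-side memory
cudaHostGetDevicePointer((void **)&ptr_array_on_GPU, (void *)ptr_array_on_host, 0);

initialize(ptr_array_on_host);

// call the kernel repeatedly; the host edits the array between launches
while (something) {
    kernel<<<grid, threads>>>(ptr_array_on_GPU);
    cudaDeviceSynchronize(); // wait for the kernel before the host touches the array (cudaThreadSynchronize() in older toolkits)
    operate_on_array(ptr_array_on_host); // e.g. change a single element
}

cudaFreeHost(ptr_array_on_host); // free using the host pointer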

That should do it. Note that you need a device capable of mapping host memory; reading the device properties (the canMapHostMemory field of cudaDeviceProp) will show you whether it is supported. A good next step is to browse through Dr. Dobb's tutorials on CUDA, in particular the one on mapped memory.
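To make the steps above concrete, here is a minimal self-contained sketch of the same idea. The kernel add_one, the helper device_can_map_host_memory, the array size, and the loop count are all made up for illustration and are not from the original question; only the CUDA runtime calls are real. It checks the device property first, then lets the host change one element before each launch without any cudaMemcpy.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: adds 1 to every element in place.
__global__ void add_one(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Hypothetical helper: true if the device reports support for mapped host memory.
// Returns false if the device properties cannot be read (e.g. invalid device id).
bool device_can_map_host_memory(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess) return false;
    return prop.canMapHostMemory != 0;
}

int main()
{
    const int n = 256; // assumed array size

    if (!device_can_map_host_memory(0)) {
        fprintf(stderr, "device 0 cannot map host memory\n");
        return 1;
    }
    cudaSetDeviceFlags(cudaDeviceMapHost);

    float *host_array = nullptr; // host view of the array
    float *gpu_array = nullptr;  // device view of the same memory

    if (cudaHostAlloc((void **)&host_array, n * sizeof(float),
                      cudaHostAllocMapped) != cudaSuccess) {
        fprintf(stderr, "cudaHostAlloc failed\n");
        return 1;
    }
    cudaHostGetDevicePointer((void **)&gpu_array, host_array, 0);

    for (int i = 0; i < n; ++i) host_array[i] = 0.0f;

    for (int step = 0; step < 4; ++step) {
        host_array[step] = 100.0f; // host changes a single element, no copy call
        add_one<<<(n + 127) / 128, 128>>>(gpu_array, n);
        if (cudaDeviceSynchronize() != cudaSuccess) { // wait before the host reads or writes again
            fprintf(stderr, "kernel failed\n");
            cudaFreeHost(host_array);
            return 1;
        }
    }

    printf("host_array[0] = %f, host_array[1] = %f\n", host_array[0], host_array[1]);
    cudaFreeHost(host_array);
    return 0;
}
```

This should build with something like nvcc example.cu -o example. One thing to keep in mind: on a discrete GPU, kernel accesses to mapped memory travel over the bus, so this tends to suit arrays that are read or written roughly once per launch; whether it beats an explicit copy for your case is worth measuring.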