I have a very large set of very large arrays stored on the host. I want to process them by partitioning each array into blocks and copying the blocks onto the device one at a time so that kernels can work on them, but I’m having trouble striding along the host arrays.
When I use a stride counter, as below, to step along the host arrays, I get an “invalid device pointer” run-time error, i.e. the compiler accepts the syntax:
[codebox]cudaMemcpy(d_xi, h_x[index], DATABLOCKSIZE*sizeof(float4), cudaMemcpyHostToDevice);[/codebox]
where index = blockid*DATABLOCKSIZE.
When I instead use the following addressing form (which has worked for me before, but in a different context), I get a “cannot convert float4 to const void*” compile-time error:
[codebox]cudaMemcpy(&d_xi, &h_x[index], DATABLOCKSIZE*sizeof(float4), cudaMemcpyHostToDevice);[/codebox]
So what is the precise syntax required?
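For reference, here is a minimal sketch of the striding pattern I am after. It assumes that h_x is a single contiguous host array of float4 and that d_xi is a float4* allocated with cudaMalloc; neither declaration is shown above, so both are assumptions, as are the NUMBLOCKS constant, the block_offset helper and the h_check buffer, which are made up for illustration. Under those assumptions the device pointer would be passed as-is (no &), and the host source would be the address of the block's first element, h_x + index (equivalently &h_x[index]):

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define DATABLOCKSIZE 256  // elements per block (hypothetical value)
#define NUMBLOCKS 4        // number of partitions (hypothetical value)

// Element offset of partition `blockid` within the host array.
static size_t block_offset(size_t blockid, size_t blockSize) {
    return blockid * blockSize;
}

int main() {
    const size_t n = (size_t)NUMBLOCKS * DATABLOCKSIZE;
    const size_t bytes = DATABLOCKSIZE * sizeof(float4);

    // Assumed layout: one flat host array of float4.
    float4 *h_x = (float4 *)malloc(n * sizeof(float4));
    float4 *h_check = (float4 *)malloc(bytes);  // scratch buffer for verification
    if (!h_x || !h_check) return 1;
    for (size_t i = 0; i < n; ++i)
        h_x[i] = make_float4((float)i, 0.0f, 0.0f, 0.0f);

    // Device buffer large enough for one block.
    float4 *d_xi = NULL;
    if (cudaMalloc((void **)&d_xi, bytes) != cudaSuccess) return 1;

    for (size_t blockid = 0; blockid < NUMBLOCKS; ++blockid) {
        size_t index = block_offset(blockid, DATABLOCKSIZE);

        // d_xi already holds a device address, so it is passed directly.
        // h_x + index is the host address of the first element of this block.
        cudaError_t err = cudaMemcpy(d_xi, h_x + index, bytes,
                                     cudaMemcpyHostToDevice);
        if (err != cudaSuccess) {
            printf("H2D copy failed: %s\n", cudaGetErrorString(err));
            return 1;
        }

        // ... a kernel working on d_xi would be launched here ...

        // Copy the block back and check that its first element is the expected one.
        err = cudaMemcpy(h_check, d_xi, bytes, cudaMemcpyDeviceToHost);
        if (err != cudaSuccess) {
            printf("D2H copy failed: %s\n", cudaGetErrorString(err));
            return 1;
        }
        if (h_check[0].x != (float)index) {
            printf("block %zu: unexpected data\n", blockid);
            return 1;
        }
    }

    cudaFree(d_xi);
    free(h_check);
    free(h_x);
    return 0;
}
```

If h_x is actually an array of pointers to separate arrays (one per data set), the source argument would presumably need a second level of indexing, which this sketch does not cover.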