How to copy to device memory with offset?

Hi colleagues,

I see the function cuMemcpyHtoD for copying to device memory, but I could not find an offset parameter that says where in device memory the data should be copied.

E.g. I mean something like the offset parameter in OpenCL:

cl_int clEnqueueWriteBuffer (cl_command_queue command_queue,
 	cl_mem buffer,
 	cl_bool blocking_write,
 	size_t offset,    /* the parameter I mean */
 	size_t cb,
 	const void *ptr,
 	cl_uint num_events_in_wait_list,
 	const cl_event *event_wait_list,
 	cl_event *event)

Is it possible to copy memory to a specific place in device memory, with an offset not equal to zero? I mean something like:

CUdeviceptr devMem;
int hostMem[42];

cuMemAlloc(&devMem, 42*sizeof(int));


someCudaCopyFunction(&devMem, sizeof(int)*10, hostMem+10, (42-10)*sizeof(int))

where “sizeof(int)*10” is my offset in device memory.

Thank you.

With cudaMemcpy and the like, you can use ordinary C-style pointer arithmetic to accomplish this:

#define DSIZE 1048576
int *h_data, *d_data;
h_data = (int *)malloc(DSIZE*sizeof(int));
cudaMalloc(&d_data, DSIZE*sizeof(int));
memset(h_data, 0, DSIZE*sizeof(int));
cudaMemcpy(d_data+(DSIZE/2), h_data, (DSIZE/2)*sizeof(int), cudaMemcpyHostToDevice);
// the above line will copy to d_data starting halfway through the buffer.

With the driver API, CUdeviceptr is an integer type rather than a pointer, so the offset is simply added in bytes (the casts through char* are not needed):
CUdeviceptr dst = devMem + sizeof(int) * 10;   // byte offset into the device buffer
const int* src = hostMem + 10;                 // element offset into the host array
size_t bytes = (42 - 10) * sizeof(int);        // remaining 32 ints
cuMemcpyHtoD(dst, src, bytes);
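
The thing that is easy to get wrong here is the units: an offset on a typed host pointer counts elements, while an offset on an integer device handle counts bytes. Below is a minimal host-only sketch of that arithmetic. It is not part of the CUDA API: `DevPtr` is a stand-in for CUdeviceptr (assumed to be a 64-bit unsigned integer, as on 64-bit builds) so that it compiles without the toolkit, and `CopyArgs` and `tail_copy_args` are hypothetical names made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for CUdeviceptr so this compiles without the CUDA toolkit.
// Assumption: 64-bit build, where the driver handle is an unsigned 64-bit integer.
using DevPtr = std::uint64_t;

// Hypothetical bundle of the three arguments a host-to-device copy needs.
struct CopyArgs {
    DevPtr dst;         // device address, already offset in bytes
    const void* src;    // host address, already offset in elements
    std::size_t bytes;  // number of bytes to copy
};

// Hypothetical helper: arguments for copying elements [skip, total)
// of a host array to the same position in a device buffer.
template <typename T>
CopyArgs tail_copy_args(DevPtr devBase, const T* hostBase,
                        std::size_t total, std::size_t skip) {
    assert(skip <= total);
    CopyArgs args;
    // Integer handle: the offset must be converted to bytes explicitly.
    args.dst = devBase + static_cast<DevPtr>(skip * sizeof(T));
    // Typed pointer: + skip already advances by skip elements.
    args.src = hostBase + skip;
    args.bytes = (total - skip) * sizeof(T);
    return args;
}
```

For the example in the question this would be called as `tail_copy_args<int>(devMem, hostMem, 42, 10)`, and with the real driver API the result would then be passed on as `cuMemcpyHtoD(a.dst, a.src, a.bytes)`.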

This doesn’t work for driver API memory copy.