cudaMemcpyFromSymbol painful problem

Hi everyone.

Here’s my problem, which I’ve been stuck on for the last two days: in short, I need to copy a large amount of data as a 1D array to GPU memory, and from there I need to take chunks out of that array and put them into other, smaller arrays, still on the GPU.

I’ve been trying to solve this with cudaMemcpyFrom/ToSymbol, and although I don’t get any errors, I can’t modify the data with these functions.

As a small example of what I’m trying to do:

[codebox]
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// set every element of 'data' to zero
__global__ void setZero(int* data, int length)
{
	int poz = blockIdx.x * blockDim.x + threadIdx.x;
	if (poz >= length)
		return;
	data[poz] = 0;
}

// add 5 to every element of 'data'
__global__ void addFive(int* data, int length)
{
	int poz = blockIdx.x * blockDim.x + threadIdx.x;
	if (poz >= length)
		return;
	data[poz] += 5;
}

__device__ int *beta, *gamma;

int *alpha;

int main(int argc, char** argv)
{
	// host data
	alpha = (int*)malloc(100 * sizeof(int));

	// device data
	cudaMalloc((void**)&beta, 100 * sizeof(int));

	// device small data chunk
	cudaMalloc((void**)&gamma, 10 * sizeof(int));

	cudaMemcpy(beta, alpha, 100 * sizeof(int), cudaMemcpyHostToDevice);

	setZero<<<2, 50>>>(beta, 100);
	addFive<<<2, 50>>>(beta, 100);

	// ==== //
	// try to pull 10 ints out of 'beta' (starting at offset 20) into 'gamma',
	// modify them there, and put them back
	cudaMemcpyFromSymbol(gamma, "beta", 10 * sizeof(int), 20, cudaMemcpyDeviceToDevice);
	addFive<<<2, 50>>>(gamma, 100);
	cudaMemcpyToSymbol("beta", gamma, 10 * sizeof(int), 20, cudaMemcpyDeviceToDevice);
	// ==== //

	cudaMemcpy(alpha, beta, 100 * sizeof(int), cudaMemcpyDeviceToHost);

	for (int i = 0; i < 100; ++i)
		printf("%d = %d\n", i, alpha[i]);

	return 0;
}
[/codebox]

The example is just meant to show that I’m trying to move part of the device array ‘beta’ into ‘gamma’, modify it there, and move it back with cudaMemcpyTo/FromSymbol. The code between the // ==== // markers doesn’t affect the data at all: at the end I get an array full of 5’s, nothing extra.

Calling cudaGetErrorString(cudaGetLastError()) after each line of code only returns “no error”, and I should mention that I tried every example I found: with & and without it, with __constant__, copying the whole array or just the first position instead of a chunk from the middle, copying from host to device and back instead of device to device, etc.

It seems something’s wrong with the whole scenario, and I can’t figure out a way past this, since these functions seem to be the only way to copy part of one array into another on the CUDA device.

The constant memory space is read-only from device code. You can only modify the data from the host side.
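
For example, a __constant__ array can only be written from the host, through the symbol API; a rough sketch (the name lut and its size are just for illustration):

[codebox]
__constant__ int lut[16];   // constant memory: readable by kernels, writable only from the host

void uploadLut(const int* hostLut)
{
	// host-side update of the constant space via the symbol API
	cudaMemcpyToSymbol(lut, hostLut, 16 * sizeof(int));
}
[/codebox]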

So what else can I do in that case? I don’t really need to modify the constant memory space device-side, but I do need to use it as a buffer from which to take smaller pieces of arrays. It would be plain stupid to send the separate pieces directly from the host to the device, mainly because they overlap heavily, and this is also why I need functions similar to cudaMemcpyTo/FromSymbol, since they can copy parts of an array.

Maybe I don’t understand exactly what you are trying to do, but why can’t you just use cudaMemcpy exclusively and forget about constant memory? It can copy smaller pieces of an array…
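
For example, something like this (untested, using the beta and gamma pointers from your code) copies the first 10 ints of beta into gamma entirely on the device:

[codebox]
cudaMemcpy(gamma, beta, 10 * sizeof(int), cudaMemcpyDeviceToDevice);
[/codebox]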

Because I need to specify the offset from where to start copying, not just copy from position 0. While in C/C++ I’d probably do something like memcpy(dest, source + offset, size), I understood that working directly with pointers on the CUDA device is somewhere between “not recommended” and “plain forbidden”.

That’s the difference I noticed between the normal cudaMemcpy and cudaMemcpyTo/FromSymbol: being able to specify an offset for the copy.

The specification says that the cudaMemcpyTo/FromSymbol functions work with the cudaMemcpyDeviceToDevice flag, and I used the functions from host code, not device code, so technically, being the host, I can modify the constant space any way I like. For example, I already modify it in the first place by copying from host to device.
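
As far as I can tell from the manual, the symbol argument has to name a variable that itself lives in device (or constant) memory, and the offset is counted in bytes from the start of that variable’s storage; a rough sketch with a made-up statically sized __device__ array buf and a cudaMalloc’ed buffer d_chunk:

[codebox]
__device__ int buf[100];                 // the symbol itself owns the storage

// copy 10 ints out of buf, starting at element 20 (offset given in bytes),
// into the device buffer d_chunk
cudaMemcpyFromSymbol(d_chunk, buf, 10 * sizeof(int),
                     20 * sizeof(int), cudaMemcpyDeviceToDevice);

// ...modify d_chunk with a kernel, then write it back to the same position
cudaMemcpyToSymbol(buf, d_chunk, 10 * sizeof(int),
                   20 * sizeof(int), cudaMemcpyDeviceToDevice);
[/codebox]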

You can do that. You can’t dereference or otherwise manipulate device pointers in host code, but you can calculate an offset in linear device memory from a device pointer. You just have to make sure you understand padding and alignment words in GPU storage, if they exist.
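
For example, assuming beta and gamma are plain int* device pointers obtained from cudaMalloc (declared without __device__), the chunk copy from your example could look roughly like this:

[codebox]
// pull 10 ints out of beta, starting at element 20, into gamma (device to device)
cudaMemcpy(gamma, beta + 20, 10 * sizeof(int), cudaMemcpyDeviceToDevice);

addFive<<<1, 10>>>(gamma, 10);          // modify the chunk in place

// write the modified chunk back into beta at the same position
cudaMemcpy(beta + 20, gamma, 10 * sizeof(int), cudaMemcpyDeviceToDevice);
[/codebox]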

yepp