__device__ variables and arrays

a_goryachih · July 17, 2013, 11:30am

Hello,
I need to use variables and arrays throughout the program, in different device functions, so I decided to declare them with __device__.
But I get an error when I try to use a __device__ array in a __global__ function, for example:

#include <cstdio>
#include <cstdlib>

float* host_a;
__device__ float* dev_a;

__global__ void change()
{
    dev_a[threadIdx.x] *= -1;
}

int main()
{
    cudaError_t cuerr;

    host_a = (float*) malloc(sizeof(float)*10);
    for(int i = 0; i < 10; i++)
        host_a[i] = 0;

    // dev_a is a __device__ variable, but it is used here like a host pointer
    cuerr = cudaMalloc(&dev_a, 10*sizeof(float));

    cudaMemcpy(dev_a, host_a, 10*sizeof(float), cudaMemcpyHostToDevice);

    change<<<1,10>>>();

    cuerr = cudaMemcpy(host_a, dev_a, 10*sizeof(float), cudaMemcpyDeviceToHost);

    if(cuerr != cudaSuccess)
        printf("%s\n", cudaGetErrorString(cuerr));
}
After this I get “unknown error”.

cmaster.matso · July 17, 2013, 12:40pm

In your code sample, ‘dev_a’ is a device-side pointer to float. You cannot access it from host code.
Have you tried passing ‘dev_a’ as a kernel call parameter instead? Define it locally (in the main() function, for example), allocate it with cudaMalloc, and then pass it as a kernel parameter, like so:

__global__ void change(float *ptr) {
...
}

int main() {
...
float *dev_a;
cuerr= cudaMalloc(&dev_a,10*sizeof(float));
...
change<<<1,host_size>>>(dev_a);
...
}

Cheers,
MK

pasoleatis · July 17, 2013, 4:11pm

I think that if you define it outside of main(), you need to call a function to find out the address of the pointer.
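A minimal sketch of what such a lookup might look like with the runtime API, assuming a fixed-size __device__ array instead of a pointer. The names dev_arr, negate and negate_via_symbol are made up for illustration. cudaGetSymbolAddress returns an ordinary device pointer for the symbol, which can then be used with cudaMemcpy:

```cuda
#include <cuda_runtime.h>

#define N 10

// Statically allocated device array, visible to every kernel in this file.
__device__ float dev_arr[N];

// No parameters needed: the kernel reaches the array through its global name.
__global__ void negate()
{
    dev_arr[threadIdx.x] *= -1.0f;
}

// Hypothetical helper: copies N floats in, negates them on the GPU, copies them back.
// Returns the first CUDA error encountered, or cudaSuccess.
cudaError_t negate_via_symbol(float *host_a)
{
    float *dev_ptr = nullptr;

    // Ask the runtime where dev_arr lives in device memory.
    cudaError_t err = cudaGetSymbolAddress((void **)&dev_ptr, dev_arr);
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(dev_ptr, host_a, N * sizeof(float), cudaMemcpyHostToDevice);
    if (err != cudaSuccess) return err;

    negate<<<1, N>>>();
    err = cudaGetLastError();   // catches launch failures
    if (err != cudaSuccess) return err;

    // This blocking copy also waits for the kernel to finish.
    return cudaMemcpy(host_a, dev_ptr, N * sizeof(float), cudaMemcpyDeviceToHost);
}
```

Calling negate_via_symbol on a host array of N floats should leave every element with its sign flipped. The array size has to be a compile-time constant in this variant.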

a_goryachih · July 18, 2013, 4:22am

Thanks, cmaster.matso!
But I have more than 20 arrays and variables. Passing them all as kernel call parameters is not very convenient.
And what about constant memory? How can I define and use arrays in constant memory? I mean something like this:
float* host_a;
__constant__ float* const_a;

__global__ void change(float *array)
{
    array[threadIdx.x] = const_a[threadIdx.x];
}

int main()
{
    cudaError_t cuerr;

    float* dev_a;

    host_a = (float*) malloc(sizeof(float)*10);
    for(int i = 0; i < 10; i++)
        host_a[i] = 0;

    cuerr = cudaMalloc(&dev_a, 10*sizeof(float));
    cudaMemcpy(dev_a, host_a, 10*sizeof(float), cudaMemcpyHostToDevice);

    /* CUDA MAGIC
    copy data from host_a to const_a
    */
    change<<<1,10>>>(dev_a);
    cuerr = cudaMemcpy(host_a, dev_a, 10*sizeof(float), cudaMemcpyDeviceToHost);
}
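For the placeholder step in the code above, one possible approach, assuming the array size is known at compile time, is to declare const_a as a fixed-size __constant__ array (not a pointer, as in the question) and fill it with cudaMemcpyToSymbol. A minimal sketch; the helper name copy_through_constant is made up here:

```cuda
#include <cuda_runtime.h>

#define N 10

// Fixed-size array in constant memory; the size must be a compile-time constant.
__constant__ float const_a[N];

__global__ void change(float *array)
{
    array[threadIdx.x] = const_a[threadIdx.x];
}

// Hypothetical helper: fills const_a from src, then lets the kernel copy it into dst.
// Both src and dst are host arrays of N floats.
cudaError_t copy_through_constant(const float *src, float *dst)
{
    float *dev_a = nullptr;
    cudaError_t err = cudaMalloc((void **)&dev_a, N * sizeof(float));
    if (err != cudaSuccess) return err;

    // The "copy data from host_a to const_a" step (host to device by default).
    err = cudaMemcpyToSymbol(const_a, src, N * sizeof(float));
    if (err == cudaSuccess) {
        change<<<1, N>>>(dev_a);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(dst, dev_a, N * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dev_a);
    return err;
}
```

Constant memory is small (64 KB in total on current hardware), so this fits lookup tables and parameters better than large data arrays.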

cmaster.matso · July 18, 2013, 3:16pm

Say you have a variable ‘array’ allocated by ‘cudaMalloc’ in the host code. That means you have a pointer to a device memory block. What you need to do is find a symbol (it can be constant) in your device code, say ‘dev_a’, using ‘cudaGetSymbolAddress’. Next, copy the address of the allocated memory block (‘array’) to the address of the symbol you got from the ‘cudaGetSymbolAddress’ call, i.e. copy it into the memory location where the ‘dev_a’ pointer is stored. In serial C code it would look like this:

...
float *array = malloc(N*sizeof(float));
float *dev_a;
dev_a = array; // Don't copy the contents of the array; make both pointers point to the same memory block
...

To sum up:

* 'array' points to some device memory block,
* 'dev_a' is a device-side pointer, not initialized by default (pointing somewhere unknown),
* one needs to set up 'dev_a' to point to the memory block pointed to by 'array', hence the copy of the address itself.
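In CUDA runtime code, the same idea might look like the following minimal sketch. It uses cudaMemcpyToSymbol in place of the cudaGetSymbolAddress plus copy combination described above, which should have the same effect here. The helper name negate_through_pointer is invented for illustration:

```cuda
#include <cuda_runtime.h>

// Device-side pointer; it points nowhere useful until the host stores an address in it.
__device__ float *dev_a;

__global__ void change()
{
    dev_a[threadIdx.x] *= -1.0f;
}

// Hypothetical helper: negates n floats through the global pointer.
// Assumes 1 <= n <= 1024 so that a single block is enough.
cudaError_t negate_through_pointer(float *host_a, int n)
{
    float *array = nullptr;   // host variable that holds a device address
    cudaError_t err = cudaMalloc((void **)&array, n * sizeof(float));
    if (err != cudaSuccess) return err;

    err = cudaMemcpy(array, host_a, n * sizeof(float), cudaMemcpyHostToDevice);

    // Copy the address itself (sizeof(float *) bytes), not the array contents.
    if (err == cudaSuccess)
        err = cudaMemcpyToSymbol(dev_a, &array, sizeof(array));

    if (err == cudaSuccess) {
        change<<<1, n>>>();
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(host_a, array, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(array);
    return err;
}
```

The symbol is passed as the variable itself, not as a string. Passing a name string such as "dev_a" is not accepted by CUDA 5.0 and later, and is one possible cause of an “invalid device symbol” error.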

Cheers,
MK

a_goryachih · July 23, 2013, 7:41am

Thanks!
Can you give me an example of CUDA code? All of this is not easy for me. I tried to use cudaGetSymbolAddress and got the error “invalid device symbol”.

cmaster.matso · August 1, 2013, 12:11pm

I use the driver API functions ‘cuModuleGetGlobal’ (to get the address of a named, device-side variable), ‘cuMemGetAddressRange’ (to get the address of the allocated buffer) and ‘cuMemcpyHtoD’ (to finally copy the address).

cudafan · August 16, 2014, 12:42pm

I’m having the same issue: too many global device arrays to pass them as parameters to a kernel. I could not find any example of how to do it. Help!

Robert_Crovella · August 16, 2014, 2:45pm

1. Flatten all your arrays and concatenate them. Then pass a pointer to the single array and a pointer to an array that gives the offset of the start of each sub-array in the concatenated array. Maybe also a pointer to an array of the lengths of each sub-array.
2. Pass a pointer to an array of pointers, each of which is a pointer to a sub-array. You probably also need to pass an integral value that is the length of the array of pointers. Maybe also an array of the lengths of each sub-array.
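A minimal sketch of the flattened variant, assuming one thread block per sub-array. The kernel and helper names (scale_subarrays, scale_flattened) and the scaling operation are made up for illustration:

```cuda
#include <cuda_runtime.h>

// Block k works on sub-array k, which starts at flat[offsets[k]] and has lengths[k] elements.
__global__ void scale_subarrays(float *flat, const int *offsets,
                                const int *lengths, float factor)
{
    int start = offsets[blockIdx.x];
    for (int i = threadIdx.x; i < lengths[blockIdx.x]; i += blockDim.x)
        flat[start + i] *= factor;
}

// Hypothetical helper: flat holds all sub-arrays back to back (total elements);
// offsets and lengths each have count entries. Assumes count >= 1.
cudaError_t scale_flattened(float *flat, int total, const int *offsets,
                            const int *lengths, int count, float factor)
{
    float *d_flat = nullptr;
    int *d_off = nullptr, *d_len = nullptr;

    cudaError_t err = cudaMalloc((void **)&d_flat, total * sizeof(float));
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_off, count * sizeof(int));
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_len, count * sizeof(int));
    if (err == cudaSuccess)
        err = cudaMemcpy(d_flat, flat, total * sizeof(float), cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(d_off, offsets, count * sizeof(int), cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(d_len, lengths, count * sizeof(int), cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        // Only three pointers are passed, no matter how many sub-arrays there are.
        scale_subarrays<<<count, 64>>>(d_flat, d_off, d_len, factor);
        err = cudaGetLastError();
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(flat, d_flat, total * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_flat);   // cudaFree on a null pointer is a no-op
    cudaFree(d_off);
    cudaFree(d_len);
    return err;
}
```

This layout needs a single allocation for the data, at the cost of all sub-arrays sharing one element type.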

cublasSgetriBatched/cublasSgetrfBatched use the second method, for example, for the array of pointers to device matrices to be inverted.
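A minimal sketch of the array-of-pointers variant, again assuming one thread block per sub-array. The names fill_by_index and fill_arrays are hypothetical, and the kernel simply writes each sub-array's index into its elements:

```cuda
#include <vector>
#include <cuda_runtime.h>

// Block k fills sub-array k (reached through the pointer table) with the value k.
__global__ void fill_by_index(float **arrays, const int *lengths)
{
    float *a = arrays[blockIdx.x];
    for (int i = threadIdx.x; i < lengths[blockIdx.x]; i += blockDim.x)
        a[i] = (float)blockIdx.x;
}

// Hypothetical helper: host_arrays[k] is a host buffer of lengths[k] floats that
// receives the result for sub-array k. Assumes count >= 1.
cudaError_t fill_arrays(float *const *host_arrays, const int *lengths, int count)
{
    std::vector<float *> table(count, nullptr);   // host copy of the device pointers
    float **d_table = nullptr;                    // the same table, in device memory
    int *d_len = nullptr;
    cudaError_t err = cudaSuccess;

    // One separate device allocation per sub-array.
    for (int k = 0; k < count && err == cudaSuccess; ++k)
        err = cudaMalloc((void **)&table[k], lengths[k] * sizeof(float));

    if (err == cudaSuccess) err = cudaMalloc((void **)&d_table, count * sizeof(float *));
    if (err == cudaSuccess) err = cudaMalloc((void **)&d_len, count * sizeof(int));
    if (err == cudaSuccess)
        err = cudaMemcpy(d_table, table.data(), count * sizeof(float *),
                         cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(d_len, lengths, count * sizeof(int), cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        fill_by_index<<<count, 64>>>(d_table, d_len);
        err = cudaGetLastError();
    }
    for (int k = 0; k < count && err == cudaSuccess; ++k)
        err = cudaMemcpy(host_arrays[k], table[k], lengths[k] * sizeof(float),
                         cudaMemcpyDeviceToHost);

    for (int k = 0; k < count; ++k) cudaFree(table[k]);
    cudaFree(d_table);
    cudaFree(d_len);
    return err;
}
```

The pointer table has to live in device memory for the kernel to dereference it, which is why it is built on the host first and then copied over.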
