Passing variables into kernel over 256 bytes

ShingleServ · July 11, 2011, 7:52pm

Hi,

I’m trying to work out how to pass a large number of variables into a kernel.
By large I mean four 2D arrays of 10x100 each, plus around 32 other double variables.

I have seen the suggestion of using a struct, and I have tried using one to pass the arrays in.
However, each time I try this the kernel just crashes.

Another thought: once most of the variables are in the kernel they will not be modified, but they cannot be declared as constants.
Is there a solution using global memory that can be filled from .csv input files?

Basically: read in the files, set the data, and store it in global memory so that all threads can access it without it having to be passed to the kernel as parameters. I could really use some help with this. Any suggestions, thoughts, or tips are greatly appreciated.

Thank you.

Skybuck · July 11, 2011, 10:29pm

Simply pass 1 pointer as a parameter to the kernel.

The pointer points to CUDA device memory, which you allocate yourself. Use the copy functions to copy your data into that device memory.
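A minimal sketch of that suggestion, assuming the four 10x100 arrays from the question are packed back to back into one flat buffer. The kernel `sumArrays`, the helper `runFlatBufferDemo`, the `CHECK` macro, and the fill value are hypothetical names and data chosen for illustration; they are not from the thread.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-check macro for this sketch: report and bail out on failure.
// (Device buffers are not freed on the error path, to keep the sketch short.)
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "CUDA error: %s\n",                       \
                         cudaGetErrorString(err_));                        \
            return false;                                                  \
        }                                                                  \
    } while (0)

const int kArrays = 4;         // four 2D arrays, as in the question
const int kElems  = 10 * 100;  // each array is 10x100, stored row-major

// One pointer carries all four arrays; array a starts at data + a * kElems.
__global__ void sumArrays(const double *data, double *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= kElems) return;
    double s = 0.0;
    for (int a = 0; a < kArrays; ++a)
        s += data[a * kElems + i];
    out[i] = s;
}

// Returns true if every output element equals the expected sum.
bool runFlatBufferDemo()
{
    // Stand-in for values that would be read from the .csv files.
    std::vector<double> host(kArrays * kElems, 1.0);
    std::vector<double> result(kElems, 0.0);

    double *devData = nullptr;
    double *devOut  = nullptr;
    CHECK(cudaMalloc((void **)&devData, host.size() * sizeof(double)));
    CHECK(cudaMalloc((void **)&devOut, result.size() * sizeof(double)));
    CHECK(cudaMemcpy(devData, host.data(), host.size() * sizeof(double),
                     cudaMemcpyHostToDevice));

    sumArrays<<<(kElems + 255) / 256, 256>>>(devData, devOut);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(result.data(), devOut, result.size() * sizeof(double),
                     cudaMemcpyDeviceToHost));

    CHECK(cudaFree(devData));
    CHECK(cudaFree(devOut));

    for (double v : result)
        if (v != static_cast<double>(kArrays)) return false;
    return true;
}
```

The kernel's parameter list stays at two pointers no matter how much data the buffer holds, so the parameter size limit no longer depends on the amount of data.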

ShingleServ · July 12, 2011, 1:51pm

Thanks for the read.

I have been passing pointers.

double dXn = 0.25;    // delta X (change in X)
double *dev_dXn;

// dXn is a single double, so allocate and copy exactly one element.
CUDA_CALL(cudaMalloc((void**)&dev_dXn, sizeof(double)));
CUDA_CALL(cudaMemcpy(dev_dXn, &dXn, sizeof(double), cudaMemcpyHostToDevice));

Then I pass dev_dXn into the kernel. However, I have many variables I need to pass into the kernel, well over 32 doubles.

I’m looking for help with creating a struct to pass the variables, or alternative options for passing large numbers of variables into the kernel.

hyqneuron · July 12, 2011, 2:49pm

You certainly could wrap things in a struct, copy it to device memory, and then pass the address of the struct to your kernel.
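A rough sketch of the struct approach, assuming the sizes from the original question (four 10x100 arrays and 32 doubles). The struct `SimParams`, its field names, the kernel `useParams`, the helper `runStructDemo`, and the `CHECK` macro are all hypothetical and only serve the illustration.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-check macro for this sketch: report and bail out on failure.
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "CUDA error: %s\n",                       \
                         cudaGetErrorString(err_));                        \
            return false;                                                  \
        }                                                                  \
    } while (0)

// Hypothetical parameter block: plain data only, no host pointers inside,
// so a single cudaMemcpy moves everything to the device.
struct SimParams {
    double grid[4][10][100];  // the four 10x100 arrays
    double scalars[32];       // the other double variables
};

// The kernel receives only the device address of the struct (8 bytes).
__global__ void useParams(const SimParams *p, double *out)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= 10 * 100) return;
    int r = i / 100;
    int c = i % 100;
    out[i] = p->grid[0][r][c] * p->scalars[0];
}

// Returns true if the kernel read the struct contents correctly.
bool runStructDemo()
{
    const int n = 10 * 100;
    std::vector<SimParams> host(1);  // heap storage, zero-initialized
    host[0].scalars[0] = 0.25;       // e.g. dXn from the earlier post
    for (int r = 0; r < 10; ++r)
        for (int c = 0; c < 100; ++c)
            host[0].grid[0][r][c] = r * 100 + c;

    SimParams *devParams = nullptr;
    double *devOut = nullptr;
    CHECK(cudaMalloc((void **)&devParams, sizeof(SimParams)));
    CHECK(cudaMalloc((void **)&devOut, n * sizeof(double)));
    CHECK(cudaMemcpy(devParams, host.data(), sizeof(SimParams),
                     cudaMemcpyHostToDevice));

    useParams<<<(n + 255) / 256, 256>>>(devParams, devOut);
    CHECK(cudaGetLastError());

    std::vector<double> result(n);
    CHECK(cudaMemcpy(result.data(), devOut, n * sizeof(double),
                     cudaMemcpyDeviceToHost));
    CHECK(cudaFree(devParams));
    CHECK(cudaFree(devOut));

    for (int i = 0; i < n; ++i)
        if (result[i] != 0.25 * i) return false;
    return true;
}
```

One design point worth noting: if the struct held pointers to host arrays instead of the arrays themselves, copying the struct would copy only the pointer values, and dereferencing them on the device would fail. Embedding fixed-size arrays sidesteps that.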

Another way to do it is to copy things to a constant symbol, which can then be used in your kernel. Constant memory has a maximum size of 64 KB, and it is limited in the sense that threads will be serialized if they do not access the same address of the constant object at the same time. But there’s also an advantage: it doesn’t take up space in L1. It is cached separately and is just as fast as L1 when cached. The constant cache size of each MP is 8 KB.

Actually, the compiler itself uses constant memory to pass parameters. The size limit of 256 bytes is artificial.
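For illustration, here is a small sketch of the constant-symbol route, assuming the 32 scalar doubles from the question are the values placed in constant memory. The symbol `cScalars`, the kernel `scaleByFirst`, the helper `runConstantDemo`, and the `CHECK` macro are hypothetical names; `cudaMemcpyToSymbol` is the runtime call that fills the symbol from the host.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-check macro for this sketch: report and bail out on failure.
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            std::fprintf(stderr, "CUDA error: %s\n",                       \
                         cudaGetErrorString(err_));                        \
            return false;                                                  \
        }                                                                  \
    } while (0)

const int kNumScalars = 32;

// File-scope constant symbol: visible to every kernel in this translation
// unit without being passed as a kernel parameter.
__constant__ double cScalars[kNumScalars];

__global__ void scaleByFirst(const double *in, double *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // All threads read the same constant address, so the read is a broadcast.
    if (i < n) out[i] = in[i] * cScalars[0];
}

// Returns true if the kernel saw the value written by the host.
bool runConstantDemo()
{
    const int n = 1000;
    double hostScalars[kNumScalars] = {0.0};
    hostScalars[0] = 0.25;  // e.g. dXn, set at run time rather than compile time

    // Fill the constant symbol from host memory before launching.
    CHECK(cudaMemcpyToSymbol(cScalars, hostScalars, sizeof(hostScalars)));

    std::vector<double> in(n, 2.0);
    std::vector<double> out(n, 0.0);
    double *devIn = nullptr;
    double *devOut = nullptr;
    CHECK(cudaMalloc((void **)&devIn, n * sizeof(double)));
    CHECK(cudaMalloc((void **)&devOut, n * sizeof(double)));
    CHECK(cudaMemcpy(devIn, in.data(), n * sizeof(double),
                     cudaMemcpyHostToDevice));

    scaleByFirst<<<(n + 255) / 256, 256>>>(devIn, devOut, n);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(out.data(), devOut, n * sizeof(double),
                     cudaMemcpyDeviceToHost));
    CHECK(cudaFree(devIn));
    CHECK(cudaFree(devOut));

    for (double v : out)
        if (v != 0.5) return false;
    return true;
}
```

The four 10x100 arrays (32,000 bytes) would also fit within the 64 KB limit, but if each thread indexes a different element, the serialization described above applies, so global memory may be the better home for them.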

avidday · July 12, 2011, 2:53pm

That is true on Fermi. But on older hardware, kernel arguments are passed in shared memory and the size limit is very real.

ShingleServ · July 12, 2011, 2:54pm

Thank you hyqneuron.

I have just got my struct working (I was having issues with it, which is why I was looking for other options).
Thanks for the information, though. I’m sure it will be very useful further down the road for me.
