I’m currently trying to implement several image processing filters using CUDA. I’m relatively new to CUDA programming.
One problem I keep running into is allocating shared memory. For example, a median filter requires an array whose size depends on the filter radius, which is set by the user at runtime. Obviously I can't use this runtime value to set the size of a statically declared array. At present I allocate for a maximum radius (#defined), although this seems wasteful in the case of small radii.
What’s the best way to allocate variable amounts of shared memory?
Any call to a __global__ function must specify the execution configuration for that call.
The execution configuration defines the dimension of the grid and blocks that will be used to execute the function on the device. It is specified by inserting an expression of the form <<< Dg, Db, Ns >>> between the function name and the parenthesized argument list, where:
Dg is of type dim3 (see Section 4.3.1.2) and specifies the dimension and size of the grid, such that Dg.x * Dg.y equals the number of blocks being launched;
Db is of type dim3 (see Section 4.3.1.2) and specifies the dimension and size of each block, such that Db.x * Db.y * Db.z equals the number of threads per block;
Ns is of type size_t and specifies the number of bytes in shared memory that is dynamically allocated per block for this call in addition to the statically allocated memory; this dynamically allocated memory is used by any of the variables declared as an external array as mentioned in Section 4.2.2.3; Ns is an optional argument which defaults to 0.
The arguments to the execution configuration are evaluated before the actual function arguments.
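In practice, this means you declare the shared array inside (or above) the kernel with an unsized extern declaration and pass the required byte count, computed from the runtime radius, as the third launch-configuration argument (Ns). Here is a minimal sketch assuming a float image; the kernel name, block size, image dimensions, and the placeholder body (which stages only the centre pixel, not the halo, and does no actual median computation) are illustrative, not part of the question:

```cuda
#include <cuda_runtime.h>

// Unsized extern declaration: its actual size is whatever Ns bytes
// were passed in the execution configuration at launch time.
extern __shared__ float tile[];

__global__ void medianFilterSketch(const float *in, float *out,
                                   int width, int height, int radius)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    int tileW = blockDim.x + 2 * radius;   // tile width including halo

    // Each thread stages its own pixel into the shared tile; a real median
    // filter would also load the halo and then rank the neighbourhood.
    if (x < width && y < height)
        tile[(threadIdx.y + radius) * tileW + (threadIdx.x + radius)] =
            in[y * width + x];
    __syncthreads();

    // Placeholder: copy the staged pixel back out.
    if (x < width && y < height)
        out[y * width + x] =
            tile[(threadIdx.y + radius) * tileW + (threadIdx.x + radius)];
}

int main()
{
    int width = 512, height = 512;
    int radius = 3;                        // chosen by the user at runtime
    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);

    // Shared-memory size derived from the runtime radius, passed as Ns.
    size_t smemBytes = (block.x + 2 * radius) * (block.y + 2 * radius)
                       * sizeof(float);

    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in,  width * height * sizeof(float));
    cudaMalloc(&d_out, width * height * sizeof(float));

    medianFilterSketch<<<grid, block, smemBytes>>>(d_in, d_out,
                                                   width, height, radius);
    cudaDeviceSynchronize();

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

This way nothing is over-allocated for small radii: the block only receives exactly the bytes requested at launch. Note that if a kernel needs several dynamically sized arrays, they all alias the single extern allocation, so you have to pass the combined size and carve the buffer up with offsets yourself.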