Dynamically allocating a local buffer (working space) inside a CUDA kernel

Hi all

How can I dynamically allocate a local buffer inside a kernel? I need working space to process data within the kernel function, something like this:

__global__ void mykernel(float *data1, float *data2)
{
    int size = …;    // calculate buffer size
    float *buf = …;  // allocate based on that calculation,
                     // similar to buf = (float *)malloc(size * sizeof(float));
    …                // use the buffer
    free(buf);
}
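For what it's worth, device-side malloc()/free() called from kernel code are supported on compute capability 2.0 and later, so the shape above can actually work. A minimal sketch, assuming per-thread scratch space (the heap-limit call is only needed when the default device heap is too small for all allocating threads):

```cuda
#include <cstdlib>

__global__ void scratch_kernel(int n)
{
    // Per-thread scratch buffer taken from the device heap.
    float *buf = (float *)malloc(n * sizeof(float));
    if (buf == NULL)
        return;                  // device malloc can fail: always check

    for (int i = 0; i < n; ++i)
        buf[i] = i * 2.0f;       // use the buffer

    free(buf);                   // release before the thread exits
}

int main()
{
    // Optional: enlarge the device heap if many threads allocate at once.
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 32 * 1024 * 1024);

    scratch_kernel<<<1, 64>>>(256);
    cudaDeviceSynchronize();
    return 0;
}
```

Note that in-kernel malloc is comparatively slow and the allocations live in global memory; if you can bound the size on the host, a single cudaMalloc'd scratch area carved up per thread or per block is usually faster.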

My reading is that I could use shared memory, but a statically declared array needs its size at compile time, which is not dynamic:

__global__ void mykernel(float *data1, float *data2)
{
    extern __shared__ float buf[];  // size supplied at launch
    ...
}

When calling, the third launch-configuration parameter gives the shared-memory size in bytes:

int bufferSize = …;  // calculate size (in bytes) first
mykernel<<<grid, block, bufferSize>>>(data1, data2);
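That is indeed the standard pattern: the buffer size is fixed per launch (and the same for every block), but it can be computed at runtime on the host. A minimal sketch, assuming one shared buffer per block; the kernel body here (a per-block sum) is illustrative, not from the original post:

```cuda
__global__ void block_sum(const float *in, float *out, int n)
{
    // Size comes from the 3rd launch-config parameter, in bytes.
    extern __shared__ float buf[];

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    buf[threadIdx.x] = (i < n) ? in[i] : 0.0f;  // stage data in shared memory
    __syncthreads();

    if (threadIdx.x == 0) {
        float sum = 0.0f;
        for (int t = 0; t < blockDim.x; ++t)
            sum += buf[t];                      // use the buffer
        out[blockIdx.x] = sum;
    }
}

// Host side: compute the byte count at runtime, then launch.
// (d_in, d_out are device pointers allocated elsewhere with cudaMalloc.)
int threadsPerBlock = 128;
size_t bufferSize = threadsPerBlock * sizeof(float);  // bytes, not elements
block_sum<<<numBlocks, threadsPerBlock, bufferSize>>>(d_in, d_out, n);
```

Two caveats: the size must be in bytes (a common mistake is passing an element count), and a kernel gets only one dynamically sized extern __shared__ array; if you need several logical buffers, carve them manually out of that single allocation.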

Any suggestion?
Thanks
