Allocating memory during kernel execution

Hi people, I need to allocate memory inside the kernel code (and not in the host code with cudaMalloc, since the total size of the allocation is not known at compile time). In practice, my kernel has to build a vector of structs only in certain cases, so it must allocate one structure at a time, whenever it finds a new element to insert.

Code in the kernel:

if (massa > M0 - del && massa < M0 + del) { // Test
    Pi0[loc] = malloc(sizeof(...)); // ??? <- the line in question
}

What do I need to put on the line marked ??? to allocate a new slot in the vector?

P.S. The vector Pi0 is a pointer declared in the host code and passed to the kernel in the launch (kernel<<<…>>>(Pi0)).

Do you mean allocating a single object in the ‘Pi0’ table, like this:

type_of_single_pi0_table_object_name *Pi0;


Pi0[loc] = malloc(sizeof(type_of_single_pi0_table_object_name));


Or do you want to allocate ‘Pi0’ itself?

I want to allocate a single object in Pi0, one each time I find a new element to insert.

But it doesn’t work, since I have not previously allocated the vector in the host code. :(

The compilation error is: error: no operator “=” matches these operands

        operand types are: PI_0 = void *
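
For what it's worth, that message reads as if Pi0 is declared as PI_0 *, so Pi0[loc] is a PI_0 object rather than a pointer, while malloc returns void *. A minimal sketch of one way around it, assuming Pi0 becomes a table of pointers (PI_0 **) and that device-side malloc is available (compute capability 2.0 or later). The struct field and the kernel name store_one are made up for illustration and are not from the original code:

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real PI_0 struct; the field is an assumption.
struct PI_0 { float massa; };

// Pi0 is a table of pointers (PI_0 **), so Pi0[loc] has type PI_0 *.
__global__ void store_one(PI_0 **Pi0, int loc, float massa)
{
    // Device-side malloc returns void *; CUDA C++ needs an explicit cast.
    PI_0 *p = (PI_0 *)malloc(sizeof(PI_0));
    if (p != NULL) {      // NULL means the device heap is exhausted
        p->massa = massa;
    }
    Pi0[loc] = p;         // pointer-to-pointer assignment, types now match
}
```

The pointer table itself still has to be allocated on the host with cudaMalloc before the launch; only the individual objects come from the in-kernel malloc.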

That's a slightly tricky case.

On the host, you need to allocate an array of pointers:

d_pi0_table_object_name** Pi0;
cudaMalloc( (void**)&Pi0, sizeof(d_pi0_table_object_name*) * MAX_NO_OF_Pi0_YOU_WILL_EVER_NEED );

In the kernel: malloc each individual d_pi0_table_object_name as you need it (you are still bound by MAX_NO_OF_Pi0_YOU_WILL_EVER_NEED), do the work, then free each of the objects in the Pi0 array.

On the host: cudaFree(Pi0).

If the host knows the number of Pi0 objects you will ever use in the kernel, you can allocate the exact number of pointers.
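
Put together, the steps above could look roughly like the sketch below. It is only an illustration under a few assumptions: Pi0Object, fill, count_and_release and run_demo are hypothetical names, loc is taken to be the thread index (one slot per input value, so the maximum is simply n), and error handling is reduced to a single check at the end.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real Pi0 element type.
struct Pi0Object { float massa; };

// Allocate an object only for values that pass the mass-window test.
__global__ void fill(Pi0Object **Pi0, const float *masses, int n,
                     float M0, float del)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    Pi0[i] = NULL;                        // slots without a match stay empty
    float massa = masses[i];
    if (massa > M0 - del && massa < M0 + del) {
        Pi0Object *p = (Pi0Object *)malloc(sizeof(Pi0Object));
        if (p != NULL) p->massa = massa;  // NULL means the device heap is full
        Pi0[i] = p;
    }
}

// Count the allocated objects and free them; memory obtained from
// in-kernel malloc has to be released with in-kernel free.
__global__ void count_and_release(Pi0Object **Pi0, int n, int *count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (Pi0[i] != NULL) {
        atomicAdd(count, 1);
        free(Pi0[i]);
        Pi0[i] = NULL;
    }
}

// Host side: returns how many objects were allocated, or -1 on a CUDA error.
int run_demo(const float *h_masses, int n, float M0, float del)
{
    Pi0Object **d_Pi0 = NULL;
    float *d_masses = NULL;
    int *d_count = NULL;
    int h_count = 0;

    // Only the pointer table is sized for the maximum; objects come later.
    cudaMalloc((void **)&d_Pi0, n * sizeof(Pi0Object *));
    cudaMalloc((void **)&d_masses, n * sizeof(float));
    cudaMalloc((void **)&d_count, sizeof(int));
    cudaMemcpy(d_masses, h_masses, n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_count, 0, sizeof(int));

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    fill<<<blocks, threads>>>(d_Pi0, d_masses, n, M0, del);
    count_and_release<<<blocks, threads>>>(d_Pi0, n, d_count);

    // This blocking copy also reports errors from the launches above.
    cudaError_t err = cudaMemcpy(&h_count, d_count, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_Pi0);
    cudaFree(d_masses);
    cudaFree(d_count);
    return err == cudaSuccess ? h_count : -1;
}
```

Two things to keep in mind with this pattern: objects allocated with in-kernel malloc stay valid across kernel launches but cannot be passed to cudaFree, and the device heap they come from is fairly small by default (8 MB). It can be enlarged with cudaDeviceSetLimit(cudaLimitMallocHeapSize, ...) before the first kernel that calls malloc is launched.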

But doing it that way may allocate room for a maximum number of items that I will never use. I just wanted to minimize the memory allocated on the device.