Does your code even compile? Beyond that, isn’t it better to allocate global memory in the host code and then pass a pointer to it to both kernels?
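To make the suggestion concrete, here is a minimal sketch of what I mean. The kernel names `fill` and `addOne` and the helper `runHostAllocated` are made up for illustration; they are not from the original post. The buffer is allocated once with `cudaMalloc` on the host and the same pointer is handed to both kernels.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical first kernel: writes into the host-allocated buffer.
__global__ void fill(float *p, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] = 2.0f * i;
}

// Hypothetical second kernel: reads and updates the same buffer.
__global__ void addOne(float *p, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) p[i] += 1.0f;
}

// Allocates on the host side, runs both kernels, and returns the last
// element (or -1 on error).
float runHostAllocated(int n)
{
    float *d_p = nullptr;
    if (cudaMalloc(&d_p, n * sizeof(float)) != cudaSuccess) return -1.0f;

    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    fill<<<blocks, threads>>>(d_p, n);    // same pointer ...
    addOne<<<blocks, threads>>>(d_p, n);  // ... passed to both kernels

    float last = -1.0f;
    cudaError_t err = cudaMemcpy(&last, d_p + (n - 1), sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_p);
    return err == cudaSuccess ? last : -1.0f;
}

int main()
{
    printf("last element: %f\n", runHostAllocated(10));
    return 0;
}
```

Both launches go to the default stream, so the second kernel only starts after the first has finished writing.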
As far as I know, using ‘malloc’ in a kernel means dynamic allocation of local memory (in other words, memory private to a single thread). But please correct me if I’m wrong.
And what would be the benefit of using malloc inside a kernel versus declaring an array like this: float p[10];
Has anyone ever used malloc inside a kernel successfully? In the tutorials I have read, the examples never had malloc inside a kernel. This is why it just seems out of place.
In the example given in the first post, p1 is allocated in one kernel and then used in another, so the intent is to use it as a variable in global memory.
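For what it is worth, as far as I understand the CUDA C Programming Guide, in-kernel malloc takes memory from a fixed-size heap that lives in global memory, so a pointer obtained in one kernel stays valid in later kernels until it is released with the device-side free. Below is a rough sketch of that pattern on a device of compute capability 2.0 or higher. Only the name `p1` comes from the first post; the kernels `allocKernel`, `useKernel`, `freeKernel` and the helper `runKernelAllocated` are my own assumptions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device-side global variable that keeps the heap pointer between launches.
__device__ float *p1 = nullptr;

// Hypothetical kernel: a single thread allocates from the device heap.
__global__ void allocKernel(int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        p1 = (float *)malloc(n * sizeof(float));
        if (p1 != nullptr)
            for (int i = 0; i < n; ++i) p1[i] = (float)i;
    }
}

// Hypothetical kernel: a later launch reads the same allocation.
__global__ void useKernel(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (p1 != nullptr && i < n) out[i] = 2.0f * p1[i];
}

// Device-heap memory has to be released with the device-side free().
__global__ void freeKernel()
{
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        free(p1);
        p1 = nullptr;
    }
}

// Runs the three kernels in order and returns the last output element
// (or -1 on error).
float runKernelAllocated(int n)
{
    float *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemset(d_out, 0, n * sizeof(float));

    int threads = 128;
    allocKernel<<<1, 1>>>(n);
    useKernel<<<(n + threads - 1) / threads, threads>>>(d_out, n);
    freeKernel<<<1, 1>>>();

    float last = -1.0f;
    cudaError_t err = cudaMemcpy(&last, d_out + (n - 1), sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err == cudaSuccess ? last : -1.0f;
}

int main()
{
    printf("last element: %f\n", runKernelAllocated(10));
    return 0;
}
```

The results are copied out through a separate `cudaMalloc` buffer because, as far as I know, memory obtained from the device heap cannot be passed to runtime calls such as `cudaMemcpy` or `cudaFree`.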
pasoleatis: You are absolutely right. Calling ‘malloc’ in every thread can cause a large drop in your application’s performance. It is possible to use it (like the ‘new’ operator, on devices with compute capability >= 2.0), but it is not very efficient, I think.
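If in-kernel allocation really is needed, one way to cut down the number of malloc calls is to let a single thread per block allocate for the whole block. The following is only a sketch under that assumption; the kernel `perBlockScratch`, the helper `runPerBlock`, and the 16 MB heap size are illustrative choices of mine, and I have not measured how much it helps.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: thread 0 allocates one scratch buffer per block and
// every thread works in its own slice of it.
__global__ void perBlockScratch(int *result, int perThread)
{
    __shared__ int *scratch;  // one pointer shared by the whole block
    if (threadIdx.x == 0)
        scratch = (int *)malloc(blockDim.x * perThread * sizeof(int));
    __syncthreads();

    if (scratch != nullptr) {  // same value for every thread in the block
        int *mine = scratch + threadIdx.x * perThread;
        int sum = 0;
        for (int k = 0; k < perThread; ++k) {
            mine[k] = k;
            sum += mine[k];
        }
        result[blockIdx.x * blockDim.x + threadIdx.x] = sum;
    }
    __syncthreads();           // all threads are done with the buffer

    if (threadIdx.x == 0) free(scratch);
}

// Returns the value written by the last thread (or -1 on error).
int runPerBlock(int blocks, int threads, int perThread)
{
    // The device heap has a fixed size; it can be raised before the first
    // kernel that uses malloc is launched. 16 MB is an arbitrary choice.
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 16 * 1024 * 1024);

    int total = blocks * threads;
    int *d_result = nullptr;
    if (cudaMalloc(&d_result, total * sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(d_result, 0, total * sizeof(int));

    perBlockScratch<<<blocks, threads>>>(d_result, perThread);

    int last = -1;
    cudaError_t err = cudaMemcpy(&last, d_result + (total - 1), sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_result);
    return err == cudaSuccess ? last : -1;
}

int main()
{
    printf("sum in last thread: %d\n", runPerBlock(4, 64, 10));
    return 0;
}
```

For a small fixed size such as float p[10], a plain per-thread array avoids the heap entirely, so this only makes sense when the size is not known at compile time.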