Dynamic linear memory allocation? Need some tips.

Hello everyone,

As I am new to CUDA (and not a pro at C/C++), I am having trouble coming up with a solution to my problem.

I want to allocate an NxM array. The problem is that I cannot use cudaMallocPitch, because N does not stay constant: it shrinks after every M entries. For example, a fully unrolled array would look like this:

1024x1024 1024x1024 1024x1024 1024x1024 

   512x512    512x512    512x512    512x512

   256x256    256x256    256x256    256x256

I can efficiently create such a 2D array on the host in C++, for example:

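float* array[M];              // M row pointers; rows may differ in length

for (int i = 0; i < M; i++)
{
     array[i] = new float[N]; // row i gets the current N elements

     if (i % N == 0)
     {
          // reduce N here for the following rows
     }
}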

However, I cannot think of a way to do the same thing on the device with CUDA…
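
One idea that might work (a rough host-side sketch, not tried on the device): put all blocks into a single linear buffer and keep a small table of offsets, so that one plain allocation could replace the pitched one. The names FlatLayout, makeLayout and flatIndex are made up for illustration, and the halving from one group to the next is an assumption taken from the example above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from any library): describes one flat buffer that
// holds `levels` groups of `perLevel` square blocks. The block side starts at
// `side0` and is halved from one group to the next (assumption from the example).
struct FlatLayout {
    std::vector<std::size_t> offset; // start of each block, in elements
    std::vector<std::size_t> side;   // side length of each block
    std::size_t total = 0;           // total number of elements in the buffer
};

FlatLayout makeLayout(std::size_t side0, std::size_t levels, std::size_t perLevel)
{
    FlatLayout layout;
    std::size_t side = side0;
    for (std::size_t lvl = 0; lvl < levels; ++lvl) {
        for (std::size_t k = 0; k < perLevel; ++k) {
            layout.offset.push_back(layout.total); // block starts where the previous one ended
            layout.side.push_back(side);
            layout.total += side * side;
        }
        side /= 2; // next group uses half the side length
    }
    return layout;
}

// Linear index of element (row, col) inside block b, row-major within the block.
std::size_t flatIndex(const FlatLayout& layout, std::size_t b,
                      std::size_t row, std::size_t col)
{
    return layout.offset[b] + row * layout.side[b] + col;
}
```

For the example above, makeLayout(1024, 3, 4) describes 12 blocks with 5505024 floats in total (about 21 MiB). As far as I understand, that total could then be passed to a single cudaMalloc call, with the offset and side tables copied to the device so a kernel can locate each block.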

Any insights would be much appreciated :)

Best regards