I am trying to use a CUDA 2D array of structs:
cudaMallocPitch((void **)&d_vars, &pitch, M * sizeof(var), N);
It works, BUT I noticed that it runs about 2 to 3 times slower (compared to
a set of separate float arrays).
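For illustration, the two layouts being compared might look roughly like the sketch below. The fields of `var`, the kernel names `scale_aos` and `scale_soa`, and the sizes are all assumptions, since the real definitions are not shown here. It only contrasts how neighbouring threads address memory in each layout; it is not a benchmark.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical struct: the real `var` is not shown in the post.
struct var { float a, b, c; };

// Array of structs: neighbouring threads touch addresses sizeof(var) bytes apart.
__global__ void scale_aos(var *d_vars, size_t pitch, int M, int N) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < M && y < N) {
        // pitch is in bytes, so step rows through a char pointer
        var *row = (var *)((char *)d_vars + y * pitch);
        row[x].a *= 2.0f;
    }
}

// Separate float array for one field: neighbouring threads touch adjacent floats.
__global__ void scale_soa(float *d_a, size_t pitch, int M, int N) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < M && y < N) {
        float *row = (float *)((char *)d_a + y * pitch);
        row[x] *= 2.0f;
    }
}

int main() {
    const int M = 256, N = 64;  // assumed sizes (M elements per row, N rows)

    var *d_vars = nullptr;
    size_t pitch_aos = 0;
    cudaMallocPitch((void **)&d_vars, &pitch_aos, M * sizeof(var), N);
    cudaMemset2D(d_vars, pitch_aos, 0, M * sizeof(var), N);

    float *d_a = nullptr;
    size_t pitch_soa = 0;
    cudaMallocPitch((void **)&d_a, &pitch_soa, M * sizeof(float), N);
    cudaMemset2D(d_a, pitch_soa, 0, M * sizeof(float), N);

    dim3 block(16, 16);
    dim3 grid((M + block.x - 1) / block.x, (N + block.y - 1) / block.y);
    scale_aos<<<grid, block>>>(d_vars, pitch_aos, M, N);
    scale_soa<<<grid, block>>>(d_a, pitch_soa, M, N);

    cudaError_t err = cudaDeviceSynchronize();
    printf("status: %s\n", cudaGetErrorString(err));

    cudaFree(d_vars);
    cudaFree(d_a);
    return err == cudaSuccess ? 0 : 1;
}
```

In the struct version each thread's access is strided by `sizeof(var)`, whereas in the split version consecutive threads read consecutive floats, which is the access pattern usually described as coalesced.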
Has anybody experienced the same performance issue? Is there any way to improve performance?
Thanks in advance,