You can find this in README.txt in the CUDA 3.0 documentation:
" o Batched 2D & 3D transforms are now supported in CUFFT, using the new
cufftPlanMany() API. This is defined in cufft.h, as follows:
    cufftResult CUFFTAPI cufftPlanMany(cufftHandle *plan,
                                       int rank,
                                       int *n,
                                       int *inembed,  // Unused: pass NULL
                                       int istride,   // Unused: pass 1
                                       int idist,     // Unused: pass 0
                                       int *onembed,  // Unused: pass NULL
                                       int ostride,   // Unused: pass 1
                                       int odist,     // Unused: pass 0
                                       cufftType type,
                                       int batch);
The arguments are:
    *plan - The plan is returned here, as for other cufft calls
    rank  - The dimensionality of the transform (1, 2 or 3)
    *n    - An array of size [rank], describing the size of each
            dimension
    type  - Transform type (e.g. CUFFT_C2C), as per other cufft calls
    batch - Batch size for this transform
Return values are as for all other cufftPlanXxx functions. Thus to plan
a batch of 1000, 2D, double-precision, complex-to-complex transforms of
size (128, 256), you would do:
    cufftHandle *myplan;
    cufftPlanMany(myplan, 2, { 128, 256 }, NULL, 1, 0, NULL, 1, 0, CUFFT_Z2Z, 1000);
Note that for CUFFT 3.0, the layout of batched data must be side-by-side
and not interleaved. The inembed, istride, idist, onembed, ostride and
odist parameters are for enabling data windowing and interleaving in a
future version.
"
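
Note that the two-line example in the README does not compile as written: `myplan` is an uninitialized pointer, and a bare `{ 128, 256 }` cannot be passed as a function argument in C. Below is a minimal sketch of how I would write it, not an official sample. The helper names (`plan_batched_z2z`, `elems_per_transform`, `batched_index_2d`) and the `WITH_CUFFT` macro are my own and hypothetical; only `cufftPlanMany`, `cufftHandle`, `cufftResult` and `CUFFT_Z2Z` come from cufft.h. The index helper assumes the usual row-major layout with the last dimension varying fastest, and shows what "side-by-side" means: each transform occupies one contiguous block, one after the other.

```c
#include <assert.h>
#include <stddef.h>

#ifdef WITH_CUFFT  /* hypothetical macro: define it when building against cuFFT */
#include <cufft.h>

/* Plan `batch` double-precision complex 2D transforms of size nx x ny.
   The handle is a plain object owned by the caller and passed by address. */
cufftResult plan_batched_z2z(cufftHandle *plan, int nx, int ny, int batch)
{
    int n[2] = { nx, ny };  /* sizes go in a named array of length rank */
    /* The embed/stride/dist arguments take the "unused" values from the README. */
    return cufftPlanMany(plan, 2, n, NULL, 1, 0, NULL, 1, 0, CUFFT_Z2Z, batch);
}
#endif

/* Number of complex elements in one transform of the batch. */
size_t elems_per_transform(const int *n, int rank)
{
    assert(rank >= 1 && rank <= 3);
    size_t count = 1;
    for (int i = 0; i < rank; ++i)
        count *= (size_t)n[i];
    return count;
}

/* Offset of element (x, y) of transform b in a side-by-side 2D batch
   (assumes row-major storage, n[1] being the fastest-varying dimension). */
size_t batched_index_2d(const int *n, int b, int x, int y)
{
    return (size_t)b * elems_per_transform(n, 2)
         + (size_t)x * (size_t)n[1]
         + (size_t)y;
}
```

With this, the README case becomes `cufftHandle plan; plan_batched_z2z(&plan, 128, 256, 1000);`, and the device buffer must hold 1000 * 128 * 256 double-complex elements, with transform b starting at offset b * 128 * 256. Check the returned `cufftResult` against `CUFFT_SUCCESS` before using the plan.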