cufftPlanMany R2C advanced layout problem

Hi,
I’m trying to do a 1D batched FFT on real-valued data with the 4.1 SDK.
The data is in column major order: a0,a1,…,a511, b0,b1,b2,…,b511,… and I want the FFT to operate on the batches:
(a0,b0,c0,…), (a1,b1,c1,…), …
Each row has 512 elements and the FFT size is 64, so I want 512 batches of a 64-point FFT.

The FFT plan goes like this:
int n[1] = {NUMBER_OF_CHANNELS}; // FFT size
cufftResult_t r = cufftPlanMany(&IFFT_plan, 1, n, NULL, // rank, n, inembed
512, 1, NULL, // istride, idist, onembed
1, (64/2) + 1, // ostride, odist
CUFFT_R2C, 512); // type, batch
I execute the FFT like this:
cufftExecR2C(IFFT_plan, RealInputData, ComplexOutputData);

But the output data doesn’t make sense.
Am I doing anything wrong??
Is cufftPlanMany supposed to work for R2C with the advanced layout format?
Thanks!!

OK, I found my problem.
I used NULL for inembed and onembed, as this is possible with FFTW for 1D transforms.
The manual says that if they are NULL, the stride and dist parameters are ignored.
Setting this:
int inembed[] = {64}; int onembed[] = {64/2 + 1};
fixed it.
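
For reference, the cuFFT documentation describes the 1D advanced layout as input[b * idist + x * istride] and output[b * odist + x * ostride] for element x of batch b. Below is a minimal sketch in plain C of that addressing, using the stride and dist values from the plan above. The type and function names (Layout1D, layout_index, kInput, kOutput) are made up for illustration and are not part of cuFFT.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type, not part of cuFFT: the 1D advanced-layout
   parameters for one side (input or output) of a cufftPlanMany plan. */
typedef struct {
    int stride; /* distance between successive elements of one batch */
    int dist;   /* distance between the first elements of two batches */
} Layout1D;

/* Linear index of element x of batch b: b * dist + x * stride. */
size_t layout_index(Layout1D l, int b, int x)
{
    assert(b >= 0 && x >= 0);
    return (size_t)b * (size_t)l.dist + (size_t)x * (size_t)l.stride;
}

/* Values from the plan above: 512 batches of a 64-point R2C transform. */
const Layout1D kInput  = { 512, 1 };        /* istride, idist */
const Layout1D kOutput = { 1, 64 / 2 + 1 }; /* ostride, odist */
```

With these values, layout_index(kInput, 1, x) gives 1, 513, 1025, … for x = 0, 1, 2, …, i.e. batch 1 reads a1, b1, c1, …, and its 33 complex outputs occupy output elements 33 to 65.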

How do I apply an FFT to a matrix with dimensions Num_tests*Num_signals,
where “Num_signals” is the number of time points (t1,t2,…,tn)
and “Num_tests” is the number of test points, each of which is a time series of length Num_signals?
(1) I tried the code below:

float* A = (float*)mxGetData(prhs[0]);
int numARows = (int)mxGetM(prhs[0]);
int numACols = (int)mxGetN(prhs[0]);
int outRows = numARows; // FFT each row
int outCols = numACols /2 + 1;

cufftHandle plan; // cuFFT plan handle
// Fixed-size arrays: writing through uninitialized int* pointers is undefined behavior
int ArraySize[2];
ArraySize[0]=numARows;
ArraySize[1]=numACols;
int inembed[2];
inembed[0]=numARows;
inembed[1]=numACols;
int onembed[2];
onembed[0]=outRows;
onembed[1]=outCols;

const int ISTRIDE=numARows; // distance between two successive samples of one signal
const int OSTRIDE=1;
const int IDIST=1;
const int ODIST=outCols;
const int BATCH=numARows;
cufftPlanMany(&plan, 1, ArraySize, inembed,ISTRIDE, IDIST, onembed, OSTRIDE,ODIST, CUFFT_R2C, BATCH);


(2) What I got is as below:

C=cufft_EachRow(A)

C =

3.3618 + 0.0000i -0.0591 + 0.3879i -0.7942 + 0.0000i
-0.1428 + 0.3248i 1.2878 - 0.0000i 2.5728
0.1826 - 0.0000i 1.3634 - 0.0000i -0.0572 - 0.4050i
2.4536 + 0.0000i -0.0306 + 0.5218i 1.1952 + 0.0000i
(3) What I expected is as below:
% A to be transformed is shown below:
A =

0.8147    0.6324    0.9575    0.9572
0.9058    0.0975    0.9649    0.4854
0.1270    0.2785    0.1576    0.8003
0.9134    0.5469    0.9706    0.1419

% 1D FFT of matrix A (dimensions Num_tests*Num_signals), applied to each row, i.e. one transform per index of the first dimension.
BB2 =

3.3618 -0.1428 + 0.3248i 0.1827
2.4536 -0.0591 + 0.3878i 1.2878
1.3634 -0.0306 + 0.5218i -0.7942
2.5727 -0.0572 - 0.4050i 1.1952
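
One way to read the result in (2): it appears to hold the same values as (3), but laid out as if the output were row-major, while MATLAB reads memory column-major. Below is a minimal sketch in plain C of that index arithmetic, assuming the cuFFT output buffer is copied back into a MATLAB array unchanged. The function names are made up for illustration and are not part of cuFFT or the MEX API.

```c
#include <assert.h>

/* Hypothetical helpers, not part of cuFFT or the MEX API. */

/* Linear index where bin k of batch b is written under the
   1D advanced layout: b * odist + k * ostride. */
int out_index(int b, int k, int ostride, int odist)
{
    assert(b >= 0 && k >= 0);
    return b * odist + k * ostride;
}

/* Zero-based row and column at which a column-major array with
   `rows` rows shows the element stored at linear index idx. */
int matlab_row(int idx, int rows) { return idx % rows; }
int matlab_col(int idx, int rows) { return idx / rows; }
```

For the 4x3 output above, OSTRIDE=1 and ODIST=3 put bin 1 of row 0 at linear index 1, which MATLAB displays at row 1, column 0 (zero-based); that is where -0.1428 + 0.3248i shows up in (2). With ostride=outRows and odist=1, the same bin lands at index 4, i.e. row 0, column 1, as in (3). Separately, with rank 1 only the first element of the size array is used as the transform length, so it would need to hold numACols; in this 4x4 example the two happen to be equal.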