Hello everyone,
I have a problem with a piece of code where I am trying to execute some 1D FFTs. Specifically, the following code
constexpr int size = 3;
constexpr int offset = 1;
int n[] = {16};
const int howmany = 3;
const size_t in_stride = n[0] * size;
const size_t out_stride = n[0];
cufftHandle p;
if (cufftPlanMany(&p, 1, n,
                  nullptr, size, in_stride,
                  nullptr, 1, out_stride,
                  CUFFT_D2Z, howmany)
    != CUFFT_SUCCESS) {
    std::cerr << "CUFFT unable to create CUFFT_D2Z plan" << std::endl;
    exit(1);
}
cufftDoubleReal *d_in;
cufftDoubleComplex *d_out;
cudaMalloc((void**)&d_in, sizeof(cufftDoubleReal)*n[0]*howmany*size);
cudaMalloc((void**)&d_in, sizeof(cufftDoubleComplex)*n[0]*howmany);
// Execute the FFT
const cufftResult err = cufftExecD2Z(p, d_in+offset, d_out);
if (err != CUFFT_SUCCESS) {
    std::cerr << "CUFFT unable to cufftExecD2Z with error " << err << std::endl;
    exit(1);
}
cudaFree(d_in);
cudaFree(d_out);
fails with "CUFFT unable to cufftExecD2Z with error 4" unless I set the offset variable to 0.
Do you know any reason why I should not be able to offset the d_in device pointer in this scenario?
How can I get more information about what is triggering the CUFFT_INVALID_VALUE error?
It’s a requirement that the pointers be aligned properly. From the cuFFT documentation:
Pointers to idata and odata are both required to be aligned to cufftComplex data type in single-precision transforms and cufftDoubleComplex data type in double-precision transforms.
There are two cufftDoubleReal in one cufftDoubleComplex, so you should be able to use an offset that is zero or a positive whole-number multiple of 2.
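To make the "multiple of 2" rule concrete, here is a minimal host-only sketch of the alignment check. It is not part of cuFFT: DoubleComplexLike is a hypothetical stand-in for cufftDoubleComplex (assumed here to be two doubles with 16-byte alignment), and is_complex_aligned and offset_keeps_alignment are illustrative helper names. It also assumes the base pointer is itself at least 16-byte aligned, which should hold for memory returned by cudaMalloc.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for cufftDoubleComplex so this builds without CUDA headers.
// Assumption: the real type is two doubles with 16-byte alignment.
struct alignas(16) DoubleComplexLike {
    double x;
    double y;
};

// True when ptr meets the alignment of the complex stand-in type.
bool is_complex_aligned(const void* ptr) {
    return reinterpret_cast<std::uintptr_t>(ptr) % alignof(DoubleComplexLike) == 0;
}

// True when an offset counted in doubles keeps a complex-aligned base pointer aligned,
// i.e. when the byte offset is a multiple of the complex type's alignment.
bool offset_keeps_alignment(std::size_t offset_in_doubles) {
    return (offset_in_doubles * sizeof(double)) % alignof(DoubleComplexLike) == 0;
}
```

Under those assumptions, offset_keeps_alignment(1) is false and offset_keeps_alignment(2) is true, which matches the behaviour in the question: d_in + 1 lands 8 bytes past a 16-byte boundary, while d_in + 2 lands back on one.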
Aside: your posted code appears to have a typo or bug in it: the second cudaMalloc call assigns to d_in instead of d_out.
Considering that cufftDoubleComplex is 8-byte aligned, I thought I just needed to preserve the alignment of the input with respect to each element of the output, not with respect to the entire structure.
Regarding the typo in the posted code: you are right. The second cudaMalloc should allocate d_out. I forgot to change the name when copying and pasting.
In cufft.h, cufftDoubleComplex is a typedef based on cuDoubleComplex. In cuComplex.h, cuDoubleComplex is a typedef for double2. In vector_types.h, double2 is a typedef for the struct of the same name, and the definition of that struct is: