I am doing some FFT programming and using cuBLAS GEMM to accelerate the algorithm. But a question came to mind: is cuFFT optimized to take advantage of Tensor Cores? If so, I'd rather call the cuFFT library directly.
No, cuFFT doesn’t currently utilize Tensor Cores.
Hello, I see this question was posted 11 months ago and I would like to address it again in case there have been any new updates since then!
I recently ran some benchmarks of 1D batched FFTs on a Tesla V100 GPU and obtained at most 2.3 TFLOP/s for single-precision complex transforms.
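For reference, the kind of setup I benchmarked looks roughly like the sketch below: a batched 1D single-precision complex-to-complex transform via `cufftPlanMany`. The sizes (`n = 1024`, `batch = 4096`) are illustrative placeholders, not the exact parameters from my runs.

```cpp
// Sketch: batched 1D C2C FFT with cuFFT (requires a CUDA-capable GPU).
#include <cufft.h>
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const int n = 1024;      // transform length (illustrative)
    const int batch = 4096;  // number of transforms (illustrative)

    cufftComplex *data;
    cudaMalloc(&data, sizeof(cufftComplex) * n * batch);

    cufftHandle plan;
    int dims[1] = {n};
    // rank-1 transforms, contiguous input/output layout
    if (cufftPlanMany(&plan, 1, dims,
                      nullptr, 1, n,   // input: stride 1, distance n between batches
                      nullptr, 1, n,   // output: same layout
                      CUFFT_C2C, batch) != CUFFT_SUCCESS) {
        fprintf(stderr, "cuFFT plan creation failed\n");
        return 1;
    }

    cufftExecC2C(plan, data, data, CUFFT_FORWARD);  // in-place forward FFT
    cudaDeviceSynchronize();

    cufftDestroy(plan);
    cudaFree(data);
    return 0;
}
```

The usual FLOP estimate for such a benchmark is 5 * n * log2(n) per complex transform, times the batch count, divided by the measured execution time.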
I used CUDA 11.1.1; were Tensor Cores enabled in that version, or not?
cuFFT still doesn’t use Tensor Cores.
Okay, thanks for the update!