Is there any guide or sample code for a well-performing convolution with large sample/kernel sizes? It seems the example uses an FFT-based approach; does this give better performance compared to time-domain convolution? Also, for a large sample size, is it better to run multiple FFTs than one huge FFT?
You might want to take a look at the convolution class that I mentioned to you.
Multiple FFTs => run them batched; that is almost always more efficient, IMO.
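To make the "multiple smaller FFTs" idea concrete, here is a rough overlap-add sketch in plain Python. It is only an illustration, not the convolution class mentioned above: `fft`, `direct_convolve`, `overlap_add_convolve` and the `block_len` parameter are names made up for this example. The per-block loop is where a real library's batched FFT call would go, so treat this as a way to check the math rather than as a performance reference.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def direct_convolve(signal, kernel):
    """Time-domain reference, O(N*M) multiply-adds."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(kernel):
            out[i + j] += s * h
    return out

def overlap_add_convolve(signal, kernel, block_len=8):
    """Linear convolution via many small FFTs (overlap-add)."""
    m = len(kernel)
    # FFT size must cover block_len + m - 1 samples so that the
    # circular convolution of each block equals the linear one.
    n_fft = 1
    while n_fft < block_len + m - 1:
        n_fft *= 2
    # The kernel spectrum is computed once and reused for every block.
    kernel_spec = fft(list(kernel) + [0.0] * (n_fft - m))
    out = [0.0] * (len(signal) + m - 1)
    # Each iteration is independent: this is the part that a
    # batched FFT would process in a single call.
    for start in range(0, len(signal), block_len):
        block = list(signal[start:start + block_len])
        block_spec = fft(block + [0.0] * (n_fft - len(block)))
        prod = [a * b for a, b in zip(block_spec, kernel_spec)]
        y = fft(prod, inverse=True)
        # Scale the unscaled inverse FFT and add the overlapping tail.
        for i in range(len(block) + m - 1):
            out[start + i] += y[i].real / n_fft
    return out
```

On the time-domain question: direct convolution costs on the order of N*M multiply-adds, while the FFT route costs on the order of N log N, so the FFT approach usually wins once the kernel is more than a few dozen taps. Where the crossover actually falls depends on the hardware and the FFT library, so it is worth measuring.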