Hello, there is plenty of support and discussion around 2D convolution in CUDA, but I was wondering whether anybody has experience implementing and profiling 3D convolutions? Even more interesting would be MATLAB support for this operation.
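Larger convolution radii will be hard due to shared memory constraints: each thread block has to stage its tile plus a border one radius wide on every side, and in 3D that border grows quickly. Of course, if your convolution is separable you can split it into one pass per direction, which is easier.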
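To make the "one pass per direction" idea concrete, here is a minimal CPU reference sketch in Python with NumPy (not CUDA, and not code from the SDK). It assumes odd-length filters, zero-padded borders, and that array axis 0 is treated as X; the function names are hypothetical. For a 5x5x5 filter the direct form costs 125 multiply-adds per voxel, while the three 1D passes cost 15.

```python
import numpy as np

def convolve_axis(volume, taps, axis):
    """Same-size 1D convolution along one axis, with zero-padded borders."""
    radius = len(taps) // 2          # assumes an odd number of taps
    pad = [(0, 0)] * volume.ndim
    pad[axis] = (radius, radius)
    padded = np.pad(volume, pad)     # zeros outside the volume
    n = volume.shape[axis]
    out = np.zeros(volume.shape)
    for k, w in enumerate(taps):
        # out[i] += taps[k] * in[i + radius - k]; in padded coordinates
        # that sample sits at index i + 2 * radius - k.
        window = [slice(None)] * volume.ndim
        start = 2 * radius - k
        window[axis] = slice(start, start + n)
        out += w * padded[tuple(window)]
    return out

def convolve_3d_separable(volume, taps_x, taps_y, taps_z):
    """Three 1D passes; valid only when the 3D kernel is an outer product."""
    out = convolve_axis(volume, taps_x, axis=0)
    out = convolve_axis(out, taps_y, axis=1)
    return convolve_axis(out, taps_z, axis=2)

def convolve_3d_direct(volume, kernel):
    """Brute-force reference; assumes a cubic kernel with odd side length."""
    radius = kernel.shape[0] // 2
    padded = np.pad(volume, radius)
    nx, ny, nz = volume.shape
    out = np.zeros(volume.shape)
    for a, b, c in np.ndindex(*kernel.shape):
        x, y, z = 2 * radius - a, 2 * radius - b, 2 * radius - c
        out += kernel[a, b, c] * padded[x:x + nx, y:y + ny, z:z + nz]
    return out
```

Building the 3D kernel as `np.einsum('i,j,k->ijk', taps_x, taps_y, taps_z)` and comparing the two functions with `np.allclose` is a quick way to check a GPU port against a known-good result.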
Typically I would be interested in convolving a small filter (5x5x5) with a somewhat larger image (50x50x50 or 100x100x100). So if I understand your reply correctly, 3D convolution is already supported as an operation in the CUDA API? Somehow this wasn’t obvious to me. Thanks so much!
There is a fast separable 2D convolution implementation in the SDK. If your convolution kernels are separable, you probably just need to add a third ‘Z’ kernel (with some minor modifications) to the existing kernels, which perform 1D convolutions in the ‘X’ and ‘Y’ directions. However, in the general case (with non-separable convolution kernels), FFT-based convolution looks more promising.
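For the non-separable case, the FFT route can be sketched on the CPU as follows. This is a Python/NumPy illustration under stated assumptions (odd kernel dimensions, zero-padded borders, hypothetical function name), not the SDK's implementation. On the GPU the same steps would presumably map onto a 3D FFT library such as cuFFT, but that mapping is an assumption rather than something stated in this thread.

```python
import numpy as np

def convolve_3d_fft(volume, kernel):
    """Same-size linear 3D convolution via the FFT, zero-padded borders."""
    # Pad both inputs to the full linear-convolution size so the circular
    # convolution computed by the FFT does not wrap around.
    full_shape = [v + k - 1 for v, k in zip(volume.shape, kernel.shape)]
    spectrum = (np.fft.rfftn(volume, s=full_shape)
                * np.fft.rfftn(kernel, s=full_shape))
    full = np.fft.irfftn(spectrum, s=full_shape)
    # Crop the centre so the output lines up with the input volume
    # (assumes odd kernel dimensions).
    starts = [(k - 1) // 2 for k in kernel.shape]
    region = tuple(slice(s, s + n) for s, n in zip(starts, volume.shape))
    return full[region]
```

The cost of the transforms depends on the padded volume size rather than on the filter radius, which is why this approach tends to become more attractive as filters grow; for a 5x5x5 filter it is worth profiling both approaches before committing to one.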