Is there any code one could learn from to benchmark CUDA routines such as:
- 3D matrix operations (add, mean, etc.)
- masking / thresholding (branch divergence issue?)
- convolution of 3D data with 2D kernel
Should one try to use textures, or code the kernels directly?
Thank you in advance for your time and help.
Brahim