My labmates and I started a project to implement a bunch of image processing algorithms in CUDA.
We've implemented some (such as convolution, Sobel, etc.) and got an average 10x speedup. This is a great start, and we're looking for more ideas about which algorithms to implement next. We prefer ones that are used frequently, take a lot of time, and parallelize well.
Has anybody got an idea? We will make it open when we finish the 1.0 BETA.
Another way to compute the SVD of a matrix is to find the QR factorization first, which can be done by computing Q as a combination of Givens rotations (there is already an example in cuBLAS for this).
AFAIK, you should be able to compute all of your Givens rotation matrices in parallel, then multiply them all together with some sort of reduction kernel. Then multiply the resulting matrix by your original to find R.
Then you can follow a few more steps to get the SVD. It's not the most efficient algorithm, but I believe that most (all?) of the steps this way are parallelizable on a large scale, so it should work out to be quite fast on the GPU.
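For reference, here is a minimal CPU sketch of the QR step described above, in plain Python rather than CUDA, so the logic can be checked before porting. The function names (givens, qr_givens) are made up for illustration and are not part of cuBLAS or any other library. Note that this sketch applies the rotations one after another, because each rotation is computed from the partially reduced matrix; only rotations that touch disjoint pairs of rows are independent of each other.

```python
import math

def givens(a, b):
    """Return (c, s) so that the rotation [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0  # nothing to eliminate, use the identity rotation
    r = math.hypot(a, b)
    return a / r, b / r

def qr_givens(A):
    """QR factorization of an m x n matrix (list of rows) using Givens rotations.

    Returns (Q, R) with Q orthogonal (m x m) and R upper triangular (m x n),
    such that A = Q R. Hypothetical helper written for this sketch only.
    """
    m, n = len(A), len(A[0])
    R = [list(map(float, row)) for row in A]
    # Qt accumulates the product of all rotations, so that Qt * A = R.
    Qt = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Work upwards from the bottom row, zeroing R[i][j] against row i-1.
        for i in range(m - 1, j, -1):
            c, s = givens(R[i - 1][j], R[i][j])
            for M in (R, Qt):
                upper, lower = M[i - 1], M[i]
                # Each element of the two rows is updated independently.
                M[i - 1] = [c * u + s * l for u, l in zip(upper, lower)]
                M[i] = [-s * u + c * l for u, l in zip(upper, lower)]
    Q = [list(col) for col in zip(*Qt)]  # Q is the transpose of Qt
    return Q, R
```

Calling qr_givens([[12.0, -51.0, 4.0], [6.0, 167.0, -68.0], [-4.0, 24.0, -41.0]]) returns a Q and R whose product reproduces the input up to rounding error. In a GPU version, the element-wise row updates in the inner loop are the obvious place to parallelize, along with batches of rotations that act on disjoint row pairs.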
CUVI Lib (CUDA for Vision and Imaging Lib) is an add-on library for NPP (NVIDIA Performance Primitives) and includes several advanced computer vision and image processing functions not presently available in NPP.
In this version of CUVI Lib you will find:
Optical Flow (Horn & Schunck)
Optical Flow (Lucas & Kanade)
Discrete Wavelet Transform (Forward and Inverse)
Hough Transform
Hough Lines (Lines Detector)
Color Conversion (RGB-to-Gray and RGBA-to-Gray)
Several more advanced features will be added to CUVI Lib in upcoming releases. A detailed function reference can be downloaded from: www.cuvilib.com/cuvimanual.pdf
We look forward to hearing your feedback and guidance on our forums (http://www.cuvilib.com/forums), and to making CUVI Lib a single complete source of computer vision and image processing functions implemented on the GPU.