I’m fairly new to parallel programming and the CUDA environment.
My problem involves a large number of small eigenvalue and singular value decompositions (EIG/SVD), so I think it could be accelerated by running them in parallel with CUDA. However, it is very hard to find implementations of these basic numerical algorithms that run entirely within a single thread.
Is that because the one-decomposition-per-thread approach is inefficient, or have I simply not found one?
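For concreteness, this is roughly the kind of routine I have in mind: a minimal sketch of a cyclic Jacobi eigenvalue iteration for a single small symmetric matrix, simple enough to run entirely inside one thread. It is written as plain C++ so it compiles without nvcc. The function name `jacobi_eig3`, the fixed 3x3 size, the sweep limit, and the tolerance are my own assumptions for illustration; in CUDA I assume the function would be marked `__device__` and called once per thread, with each thread handling its own matrix.

```cpp
#include <cmath>

// Hypothetical helper (not from any library): cyclic Jacobi eigenvalue
// iteration for one symmetric 3x3 matrix.
// Assumption: in CUDA this would be a __device__ function, one call per thread.
// On return, the diagonal of A holds the eigenvalues (unsorted) and the
// columns of V hold the corresponding eigenvectors. A is overwritten.
void jacobi_eig3(double A[3][3], double V[3][3]) {
    // Start the eigenvector accumulator as the identity.
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            V[i][j] = (i == j) ? 1.0 : 0.0;

    const int max_sweeps = 50;   // assumed limit; small matrices converge quickly
    const double tol = 1e-30;    // assumed threshold on squared off-diagonal norm

    for (int sweep = 0; sweep < max_sweeps; ++sweep) {
        double off = A[0][1] * A[0][1] + A[0][2] * A[0][2] + A[1][2] * A[1][2];
        if (off < tol) break;

        for (int p = 0; p < 2; ++p) {
            for (int q = p + 1; q < 3; ++q) {
                if (A[p][q] == 0.0) continue;

                // Rotation angle that zeroes A[p][q].
                double theta = (A[q][q] - A[p][p]) / (2.0 * A[p][q]);
                double sign = (theta >= 0.0) ? 1.0 : -1.0;
                double t = sign / (std::fabs(theta) + std::sqrt(theta * theta + 1.0));
                double c = 1.0 / std::sqrt(t * t + 1.0);
                double s = t * c;

                // A <- A * J (update columns p and q).
                for (int k = 0; k < 3; ++k) {
                    double akp = A[k][p], akq = A[k][q];
                    A[k][p] = c * akp - s * akq;
                    A[k][q] = s * akp + c * akq;
                }
                // A <- J^T * A (update rows p and q).
                for (int k = 0; k < 3; ++k) {
                    double apk = A[p][k], aqk = A[q][k];
                    A[p][k] = c * apk - s * aqk;
                    A[q][k] = s * apk + c * aqk;
                }
                // V <- V * J (accumulate eigenvectors).
                for (int k = 0; k < 3; ++k) {
                    double vkp = V[k][p], vkq = V[k][q];
                    V[k][p] = c * vkp - s * vkq;
                    V[k][q] = s * vkp + c * vkq;
                }
            }
        }
    }
}
```

Everything here uses only fixed-size local arrays and scalar arithmetic, which is why I would expect it to be portable to a per-thread device function, though I have not measured whether that is actually efficient on a GPU.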
Any ideas are appreciated.
Thanks in advance.