Hello all,
I am trying to implement vector-vector multiplication in CUDA.
There is a lot of reduction involved in this operation. Does anybody know of samples where vector-vector multiplication is done, or how reductions are done?
Thanks
Why not just use the CUBLAS dot-product routine (cublasSdot for single precision)?
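If you would rather write the reduction yourself than call CUBLAS, here is a minimal sketch of one common approach, assuming the "vector-vector multiplication" in question is a dot product of two float arrays. Each thread accumulates a partial sum, each block reduces its partial sums in shared memory, and thread 0 of each block adds the block total to the result with atomicAdd. The names dot_kernel, dot_product, and kThreads are made up for this illustration, the block size of 256 and the cap of 1024 blocks are arbitrary choices, and error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kThreads = 256;  // threads per block; assumed to be a power of two

// Launch with blockDim.x == kThreads. *result must be zeroed before launch.
__global__ void dot_kernel(const float* a, const float* b, float* result, int n) {
    __shared__ float cache[kThreads];

    // Grid-stride loop: each thread accumulates its own partial sum.
    float sum = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x) {
        sum += a[i] * b[i];
    }
    cache[threadIdx.x] = sum;
    __syncthreads();

    // Tree reduction in shared memory: halve the active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) {
            cache[threadIdx.x] += cache[threadIdx.x + stride];
        }
        __syncthreads();
    }

    // One atomic add per block combines the block totals.
    // (atomicAdd on float needs compute capability 2.0 or newer.)
    if (threadIdx.x == 0) {
        atomicAdd(result, cache[0]);
    }
}

// Hypothetical host wrapper: copies inputs to the device, runs the kernel,
// and returns the dot product. No error checking (sketch only).
float dot_product(const float* h_a, const float* h_b, int n) {
    if (n <= 0) return 0.0f;

    float *d_a = nullptr, *d_b = nullptr, *d_result = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_result, sizeof(float));
    cudaMemcpy(d_a, h_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_result, 0, sizeof(float));

    int blocks = (n + kThreads - 1) / kThreads;
    if (blocks > 1024) blocks = 1024;  // grid-stride loop covers the rest
    dot_kernel<<<blocks, kThreads>>>(d_a, d_b, d_result, n);

    float h_result = 0.0f;
    cudaMemcpy(&h_result, d_result, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_result);
    return h_result;
}

int main() {
    const int n = 1 << 20;
    std::vector<float> a(n, 1.0f), b(n, 2.0f);
    printf("dot = %f\n", dot_product(a.data(), b.data(), n));
    return 0;
}
```

Because the block totals are combined with floating-point atomics, the summation order can vary between runs, so the last few bits of the result may differ for general inputs. If you need reproducible results or the best performance, a library routine such as cublasSdot is likely the better choice.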