Vector Vector Multiplication

Hello all,

I am trying to implement vector-vector multiplication in CUDA.

This operation involves a lot of reduction. Does anybody know of samples where vector-vector multiplication is done, or how reductions are done in general?
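For anyone landing here with the same question, this is a minimal sketch of one common way to do the reduction, assuming "vector-vector multiplication" means a single-precision dot product. The names `dot_kernel`, `dot_product` and `kThreads` are made up for illustration. Each block sums its partial products in shared memory with a tree reduction, and thread 0 of each block adds the block total to the result with `atomicAdd` (float `atomicAdd` needs compute capability 2.0 or later). Error checking on the CUDA calls is left out to keep it short.

```cuda
#include <cuda_runtime.h>
#include <vector>

constexpr int kThreads = 256;  // threads per block; must be a power of two

// Hypothetical kernel: computes sum(a[i] * b[i]) and accumulates it in *result.
// *result must be zeroed before the launch.
__global__ void dot_kernel(const float* a, const float* b, float* result, int n)
{
    __shared__ float cache[kThreads];

    // Grid-stride loop: each thread accumulates its own partial sum.
    float sum = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += blockDim.x * gridDim.x) {
        sum += a[i] * b[i];
    }
    cache[threadIdx.x] = sum;
    __syncthreads();

    // Tree reduction in shared memory: halve the active threads each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride) {
            cache[threadIdx.x] += cache[threadIdx.x + stride];
        }
        __syncthreads();
    }

    // One atomic add per block combines the block totals.
    if (threadIdx.x == 0) {
        atomicAdd(result, cache[0]);
    }
}

// Hypothetical host wrapper: copies the inputs to the device, launches the
// kernel, and returns the dot product. Assumes a.size() == b.size().
float dot_product(const std::vector<float>& a, const std::vector<float>& b)
{
    const int n = static_cast<int>(a.size());
    if (n == 0) return 0.0f;

    const size_t bytes = n * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr, *d_result = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_result, sizeof(float));

    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_result, 0, sizeof(float));

    // Cap the grid size; the grid-stride loop covers any remaining elements.
    int blocks = (n + kThreads - 1) / kThreads;
    if (blocks > 1024) blocks = 1024;
    dot_kernel<<<blocks, kThreads>>>(d_a, d_b, d_result, n);

    float result = 0.0f;
    cudaMemcpy(&result, d_result, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_result);
    return result;
}
```

Calling `dot_product({1, 2, 3}, {4, 5, 6})` should give 32. Because the block totals are added atomically in no fixed order, results for large inputs can differ in the last bits from run to run; a second kernel pass over per-block sums avoids that if it matters.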

Thanks

Why not just use one of the CUBLAS dot-product routines, such as cublasSdot for single precision?
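A rough sketch of what that could look like with the handle-based cuBLAS API from `cublas_v2.h`, assuming single-precision data that starts on the host. The wrapper name `dot_product_cublas` is made up for illustration, and status checks are omitted for brevity. With the default host pointer mode, `cublasSdot` writes the result to a host variable.

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical wrapper: returns the dot product of a and b using cublasSdot.
// Assumes a.size() == b.size().
float dot_product_cublas(const std::vector<float>& a, const std::vector<float>& b)
{
    const int n = static_cast<int>(a.size());
    if (n == 0) return 0.0f;

    const size_t bytes = n * sizeof(float);
    float *d_a = nullptr, *d_b = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);

    // Stride 1 for both vectors; the result lands in a host variable
    // because the default pointer mode is CUBLAS_POINTER_MODE_HOST.
    float result = 0.0f;
    cublasSdot(handle, n, d_a, 1, d_b, 1, &result);

    cublasDestroy(handle);
    cudaFree(d_a);
    cudaFree(d_b);
    return result;
}
```

Build with something like `nvcc file.cu -lcublas`. In real code the handle would normally be created once and reused, since creating it is relatively expensive.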