I’m quite new to CUDA programming and am trying to compute the 2-norm of a vector. I tried it with a self-written kernel, but the threads seem to overwrite each other’s results: each one reads the value of the same variable, adds its own contribution, and writes it back, so some of the updates get lost.
So here is the problem: how can I compute the norm over all elements of the vector without losing the earlier partial results?
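As far as I understand, the overwriting is a race condition on the shared accumulator, and one common way around it is to let each thread sum its own squares privately and then combine them with atomicAdd. Below is a minimal sketch of that idea, not my original code. The names sum_of_squares and norm2_atomic and the launch configuration of 64 blocks with 256 threads are made up for illustration, and error checking is left out.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread accumulates its own partial sum of squares
// in a register, then adds it to *out with a single atomicAdd, so no update
// from another thread can be overwritten.
__global__ void sum_of_squares(const float* x, int n, float* out) {
    float local = 0.0f;
    // Grid-stride loop: works for any n, independent of the launch size.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        local += x[i] * x[i];
    }
    atomicAdd(out, local);
}

// Hypothetical host wrapper (error checks omitted to keep the sketch short).
float norm2_atomic(const std::vector<float>& h_x) {
    const int n = static_cast<int>(h_x.size());
    float* d_x = nullptr;
    float* d_sum = nullptr;
    float h_sum = 0.0f;

    cudaMalloc(&d_x, n * sizeof(float));
    cudaMalloc(&d_sum, sizeof(float));
    cudaMemcpy(d_x, h_x.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_sum, 0, sizeof(float));  // the accumulator must start at zero

    sum_of_squares<<<64, 256>>>(d_x, n, d_sum);  // assumed launch configuration

    // This copy runs on the default stream, so it waits for the kernel.
    cudaMemcpy(&h_sum, d_sum, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_sum);
    return std::sqrt(h_sum);  // the square root is taken once, on the host
}
```

Calling norm2_atomic on a std::vector<float> returns the norm on the host. A shared-memory tree reduction per block would probably be faster, and because atomicAdd does not fix the order of the additions, the floating-point result can differ slightly from run to run.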
I also tried the cuBLAS library, but all I get is 0 or a segmentation fault, because I can’t find an example of how to use cublasSnrm2 correctly. :(
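For reference, this is roughly how I understand cublasSnrm2 is meant to be called through the handle-based API in cublas_v2.h. It is a sketch under assumptions, not a verified fix for my crash: the wrapper name norm2_cublas is made up, the input vector must already be copied to device memory, and with the default host pointer mode the result pointer must be a host address. Mixing those two up seems a plausible cause of a segmentation fault, but I have not confirmed that.

```cuda
#include <cassert>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Hypothetical wrapper: returns the 2-norm of a host vector, or -1.0f on failure.
// Link with -lcublas.
float norm2_cublas(const std::vector<float>& h_x) {
    const int n = static_cast<int>(h_x.size());
    float* d_x = nullptr;
    float result = -1.0f;

    // cuBLAS reads the vector from device memory, so copy it over first.
    if (cudaMalloc(&d_x, n * sizeof(float)) != cudaSuccess) return -1.0f;
    cudaMemcpy(d_x, h_x.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    if (cublasCreate(&handle) == CUBLAS_STATUS_SUCCESS) {
        // Default pointer mode is CUBLAS_POINTER_MODE_HOST, so &result is a
        // host address. The stride incx = 1 means consecutive elements.
        if (cublasSnrm2(handle, n, d_x, 1, &result) != CUBLAS_STATUS_SUCCESS) {
            result = -1.0f;
        }
        cublasDestroy(handle);
    }

    cudaFree(d_x);
    return result;
}
```

In host pointer mode the call returns only after the result has been written, so no extra synchronization should be needed before reading it.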