Adding a Vector

Is there a function in CUDA somewhere that will add the elements in a vector?

CUBLAS offers a function for computing the sum of the absolute values of all elements in a vector. This is the {S,D}ASUM functionality provided by all BLAS implementations. That would work if your elements are of a floating-point type and all non-negative, since the absolute sum and the plain sum only coincide in that case.
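For illustration, here is a minimal sketch of calling the single-precision variant, cublasSasum, through the cublas_v2 API. The helper name abs_sum_on_device and the bare-bones error handling are my own assumptions for the example, not anything from the library; link with -lcublas.

```cuda
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper: computes sum(|x[i]|) on the GPU with cublasSasum.
// Returns false if any CUDA/cuBLAS call fails; error handling is kept minimal.
bool abs_sum_on_device(const std::vector<float>& host, float* result)
{
    if (host.empty()) { *result = 0.0f; return true; }

    const size_t bytes = host.size() * sizeof(float);
    float* d_x = nullptr;
    if (cudaMalloc(&d_x, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(d_x, host.data(), bytes,
                         cudaMemcpyHostToDevice) == cudaSuccess;

    cublasHandle_t handle;
    if (ok && cublasCreate(&handle) == CUBLAS_STATUS_SUCCESS) {
        // n elements with stride 1. With the default (host) pointer mode,
        // the result is written to host memory.
        ok = cublasSasum(handle, static_cast<int>(host.size()),
                         d_x, 1, result) == CUBLAS_STATUS_SUCCESS;
        cublasDestroy(handle);
    } else {
        ok = false;
    }

    cudaFree(d_x);
    return ok;
}
```

Note that an input of {1, -2, 3} yields 6 here rather than 2, which is why this only substitutes for a plain sum when no element is negative.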

Note that the computation of such a sum is a straightforward implementation of a reduction, so you might want to check whether you can find something suitable in Thrust.
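As a rough sketch of the Thrust route, thrust::reduce with thrust::plus gives a signed sum directly. The function name sum_elements and the copy from a std::vector are just assumptions to keep the example self-contained; in real code the data would usually already live in a device_vector.

```cuda
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>
#include <vector>

// Hypothetical helper: signed sum of all elements, reduced on the device.
int sum_elements(const std::vector<int>& host)
{
    // Copy the host data to the GPU.
    thrust::device_vector<int> d(host.begin(), host.end());

    // Reduce with initial value 0 and binary operator plus.
    return thrust::reduce(d.begin(), d.end(), 0, thrust::plus<int>());
}
```

Unlike ASUM, this keeps the signs and also works for integer element types. The type of the initial value determines the accumulator type, so pass e.g. 0.0 or a long long zero if the sum could overflow int.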

Thanks, that should work since my numbers are positive. Out of curiosity, what’s the reason for having an absolute sum function but not a signed function? Is there something about it that’s computationally easier?

Also, is there an easy way to initialize an array to, e.g., all zeros, short of writing a kernel or doing a cudaMemcpy from the host?

We would have to ask the designers of BLAS to find out. I don’t think I have ever seen a rationale for {S,D}ASUM in particular, but you might want to check the following references:

C. Lawson, R. Hanson, D. Kincaid, and F. Krogh, “Basic Linear Algebra Subprograms for Fortran Usage,” ACM Trans. on Math. Soft. 5 (1979) 308-325

J.J. Dongarra, J. DuCroz, S. Hammarling, and R. Hanson, “An Extended Set of Fortran Basic Linear Algebra Subprograms,” ACM Trans. on Math. Soft. 14,1 (1988) 1-32

J.J. Dongarra, I. Duff, J. DuCroz, and S. Hammarling, “A Set of Level 3 Basic Linear Algebra Subprograms,” ACM Trans. on Math. Soft. 16,1 (1990) 1-17

For zeroing an array, you can call cudaMemset() from your host code. I believe CUDA supports device-side memset() as well. Check the documentation. Worst case, write a simple kernel for clearing device storage.
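A minimal sketch of the host-side cudaMemset() approach follows. The helper zeroed_device_array is hypothetical and copies the buffer back to the host only so the result can be inspected.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical helper: allocates n ints on the device, zeroes them with
// cudaMemset, and copies them back to the host for inspection.
// On allocation failure (or n == 0) the sentinel-filled vector is returned.
std::vector<int> zeroed_device_array(size_t n)
{
    std::vector<int> host(n, 123);  // sentinel so a failed clear is visible
    int* d = nullptr;
    if (n == 0 || cudaMalloc(&d, n * sizeof(int)) != cudaSuccess) return host;

    // The count is in bytes, and the value is written to every byte.
    cudaMemset(d, 0, n * sizeof(int));

    cudaMemcpy(host.data(), d, n * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d);
    return host;
}
```

Because cudaMemset() fills byte by byte, it is a good fit for zero (all-zero bytes are also 0.0f for floats) but not for arbitrary values such as 1; for those, something like thrust::fill or a small kernel is the usual alternative.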

Do you need something like this?

#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/functional.h>

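void test(void)
{
  auto const size = size_t{1'000'000};
  auto const init = int{-1};

  // Parentheses select the (count, value) constructor, so this cannot be
  // read as a two-element initializer list.
  auto const vec_a = thrust::device_vector<int>(size, init);
  auto const vec_b = vec_a;
  auto       vec_c = vec_a;

  // Element-wise addition: vec_c[i] = vec_a[i] + vec_b[i]
  thrust::transform(
    vec_a.begin(), vec_a.end(), vec_b.begin(),
    vec_c.begin(), thrust::plus<int>{});
}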


Oh, you meant like a reduction, didn’t you?

Look at: