Adding a Vector

Tuee22 · July 10, 2017, 8:17pm

Is there a function in CUDA somewhere that will add the elements in a vector?

njuffa · July 10, 2017, 8:37pm

CUBLAS offers a function for computing the sum of the absolute values of all elements in a vector. This is the {S,D}ASUM functionality provided by all BLAS implementations. That would work if your elements are of a floating-point type and all non-negative.
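To make the caveat about signs concrete, here is a minimal host-side C++ sketch of what an ASUM-style routine computes next to a plain signed sum. The helper names `asum_reference` and `signed_sum` are made up for illustration and are not part of CUBLAS; the real routine runs on the device.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical host-side reference: sum of |x[i]|, i.e. what {S,D}ASUM computes.
inline float asum_reference(const std::vector<float>& x) {
    float total = 0.0f;
    for (float v : x) total += std::fabs(v);
    return total;
}

// Plain signed sum of the same elements, for comparison.
inline float signed_sum(const std::vector<float>& x) {
    return std::accumulate(x.begin(), x.end(), 0.0f);
}
```

For {1, 2, 3} both functions return 6. For {1, -2, 3} the ASUM-style result is still 6 while the signed sum is 2, so the two only agree when no element is negative.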

Note that the computation of such a sum is a straightforward implementation of a reduction, so you might want to check whether you can find something suitable in Thrust.
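As a rough illustration of why a sum counts as a reduction, here is a host-side C++ sketch of the tree-style pattern. The function name `tree_sum` is hypothetical, and this is not how Thrust or CUBLAS is implemented internally; it only shows the access pattern that a parallel version would spread across threads.

```cpp
#include <cstddef>
#include <vector>

// Host-side illustration of a tree-style sum reduction.
// Each pass adds the upper part of the active range onto the lower part,
// roughly halving the number of live elements per pass.
inline long long tree_sum(std::vector<long long> data) {  // by value: scratch copy
    std::size_t n = data.size();
    while (n > 1) {
        std::size_t half = (n + 1) / 2;  // round up so an odd middle element is kept
        for (std::size_t i = 0; i + half < n; ++i) {
            data[i] += data[i + half];   // iterations within a pass are independent
        }
        n = half;
    }
    return data.empty() ? 0 : data[0];
}
```

Because the additions within one pass do not depend on each other, a GPU can run them concurrently, and the whole sum takes about log2(n) passes. In practice a library call such as thrust::reduce does this work on the device for you.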

Tuee22 · July 10, 2017, 8:41pm

Thanks, that should work since my numbers are positive. Out of curiosity, what’s the reason for having an absolute sum function but not a signed function? Is there something about it that’s computationally easier?

Also, is there an easy way to initialize an array to, e.g., all zeros, short of writing a kernel or doing a cudaMemcpy from the host?

njuffa · July 10, 2017, 8:49pm

We would have to ask the designers of BLAS to find out. I don’t think I have ever seen a rationale for {S,D}ASUM in particular, but you might want to check the following references:

C. Lawson, R. Hanson, D. Kincaid, and F. Krogh, “Basic Linear Algebra Subprograms for Fortran Usage,” ACM Trans. on Math. Soft. 5 (1979) 308-325

J.J. Dongarra, J. DuCroz, S. Hammarling, and R. Hanson, “An Extended Set of Fortran Basic Linear Algebra Subprograms,” ACM Trans. on Math. Soft. 14,1 (1988) 1-32

J.J. Dongarra, I. Duff, J. DuCroz, and S. Hammarling, “A Set of Level 3 Basic Linear Algebra Subprograms,” ACM Trans. on Math. Soft. 16,1 (1990) 1-17

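For zeroing an array, you can call cudaMemset() from your host code. Note that it sets bytes rather than elements, so it works for zero but not for arbitrary element values. I believe CUDA supports device-side memset() as well; check the documentation. Worst case, write a simple kernel for clearing device storage.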

MutantJohn · July 10, 2017, 9:00pm

Do you need something like this?

#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/functional.h>

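void test(void)
{
  auto const size = size_t{1'000'000};
  auto const init = int{-1};

  // Parentheses select the (count, value) constructor. With braces, a Thrust
  // version that provides an initializer-list constructor could instead
  // build a two-element vector.
  auto const vec_a = thrust::device_vector<int>(size, init);
  auto const vec_b = vec_a;
  auto       vec_c = vec_a;

  // Element-wise addition on the device: vec_c[i] = vec_a[i] + vec_b[i]
  thrust::transform(
    vec_a.begin(), vec_a.end(), vec_b.begin(),
    vec_c.begin(),
    thrust::plus<int>{});
}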

Edit:

Oh, you meant like a reduction, didn’t you?

Look at the Reductions section of the Thrust documentation.