Cuda Thrust Custom function

iassael · September 20, 2011, 12:25pm

How can I impliment this function in Thrust?

for (i=0;i<n;i++)

    if (i==pos)

        h1[i]=1/h1[i];

    else

        h1[i]=-h1[i]/value;

In CUDA I did it like:

__global__ void inverse_1(double* h1, double value, int pos, int N)

{

    int i = blockDim.x * blockIdx.x + threadIdx.x;

    if (i < N){

        if (i == pos)

            h1[i] = 1 / h1[i];

        else

            h1[i] = -h1[i] / value;

    }

}

Thanks!

MarkusM · September 21, 2011, 8:07am

First you define a Functor for the transform, that includes pos and value and works with one h1-element and it’s index i as parameters.

struct InvertFunctor: thrust::binary_function<double, int, double> {

	int pos;

	double value;

	InvertFunctor(int pos_, double value_) :

		pos(pos_), value(value_) {

	}

	__host__ __device__

	double operator()(double h1, int i) {

		if (i == pos)

			return 1 / h1;

		else

			return -h1 / value;

	}

};

You use it with thrust’s binary transform on a device_vector h1 for example. A counting_iterator provides the indices.

thrust::transform(h1.begin(), h1.end(), thrust::make_counting_iterator(0), h1.begin(), InvertFunctor(pos, value));

iassael · September 21, 2011, 8:13am

First you define a Functor for the transform, that includes pos and value and works with one h1-element and it’s index i as parameters.
struct InvertFunctor: thrust::binary_function<double, int, double> {

	int pos;

	double value;

	InvertFunctor(int pos_, double value_) :

		pos(pos_), value(value_) {

	}

	__host__ __device__

	double operator()(double h1, int i) {

		if (i == pos)

			return 1 / h1;

		else

			return -h1 / value;

	}

};
You use it with thrust’s binary transform on a device_vector h1 for example. A counting_iterator provides the indices.
thrust::transform(h1.begin(), h1.end(), thrust::make_counting_iterator(0), h1.begin(), InvertFunctor(pos, value));

thank you very much!!! One last thing if I wanted to pass another vector to copy values such as this?

for (i=0;i<n;i++)

    for (j=0;j<n;j++)

        y[i*n+j]=h1[i]*a1[pos*n+j];

Its still not clear to me how to pass values to the function created Thanks!

MarkusM · September 21, 2011, 12:48pm

Once you want to input/output even more vectors it gets complicated. The zip_iterator is what you need. (Especially note the two examples linked in that section.)
However you might want to ponder first whether it’s really worth the effort at this level of complexity. A handwritten kernel will probably be much easier and possibly faster. You can even use it with your thrust vectors by unwrapping their pointers.

iassael · September 21, 2011, 5:33pm

I have it in a Cuda kernel however I want to make my program use Thrust.

Yes but the arrays have different lengths

MarkusM · September 22, 2011, 10:29am

Okay, if you insist. I’d say this version with permutation_iterators should be clearest. Note: idx = i * n + j.

First a Functor to calculate h1’s indices i = idx/n.

struct Indexh1Functor: thrust::unary_function<int, int> {

	int n;

	Indexh1Functor(int n_) :

		n(n_) {

	}

	__host__ __device__

	int operator()(int idx) {

		return idx / n;

	}

};

Second a Functor to calculate a1’s indices pos * n + j with j = idx%n.

struct Indexa1Functor: thrust::unary_function<int, int> {

	int n;

	int pos;

	Indexa1Functor(int n_, int pos_) :

		n(n_), pos(pos_) {

	}

	__host__ __device__

	int operator()(int idx) {

		return pos * n + idx % n;

	}

};

But the call is probably quite obscure.

thrust::transform(

		thrust::make_permutation_iterator(

				h1.begin(),

				thrust::make_transform_iterator(

						thrust::make_counting_iterator(0),

						Indexh1Functor(n))),

		thrust::make_permutation_iterator(

				h1.begin(),

				thrust::make_transform_iterator(

						thrust::make_counting_iterator(0),

						Indexh1Functor(n))) + n * n,

		thrust::make_permutation_iterator(

				a1.begin(),

				thrust::make_transform_iterator(

						thrust::make_counting_iterator(0),

						Indexa1Functor(n, pos))),

		y.begin(),

		thrust::multiplies<double>());

The permutation_iterators choose the elements from h1 and a1 by the indices we calculate with the Functors. The multiplication can then simply be done with thrust’s multiplies and outputted directly to y.

iassael · September 25, 2011, 11:09am

Thanks mate!
I m giving it a try right now External Image

Topic		Replies	Views
Can we call a thrust transform of a functor inside another functor? CUDA Programming and Performance	7	1640	July 18, 2018
Thrust Complex Transform of 3 different vectors CUDA Programming and Performance	0	2379	October 2, 2011
Using thrust::transform_output_iterator in real code CUDA Programming and Performance	3	882	October 19, 2019
How to efficiently sort 5 arrays of integers? CUDA Programming and Performance	7	1162	June 19, 2015
Using thrust to handle vectors in Cuda classes CUDA Programming and Performance	0	1792	April 13, 2011
thrust::exclusive_scan with thrust::zip_iterator? CUDA Programming and Performance	9	1542	November 24, 2014
Thrust functor return array; code optimization CUDA Programming and Performance	3	1104	December 27, 2019
Help requested for a three vector thrust::transform CUDA Programming and Performance	10	831	August 8, 2022
How to merge these operations. CUDA Programming and Performance	1	1606	December 7, 2011
Dispatch iterator (THRUST) CUDA Programming and Performance	7	625	October 17, 2019

Cuda Thrust Custom function

Related topics