OpenACC routine bind

Hi Mat,
My question is about the routine directive and its bind clause. I have read the OpenACC Application Programming Interface specification, but I don’t understand what the bind clause means. Here is a simple example:

#include <stdio.h>
#include <stdlib.h>

#pragma acc routine worker bind(sum1)
int sum(int ,float *);

int sum(int n,float *A)
{
	int i;
	float s=0.0f;
	for(i=0;i<n;i++){
		s=s+A[i];
	}
	return s;
}

#pragma acc routine worker 
int sum1(int n,float *A)
{
	int i;
	float s=0.0f;
	#pragma acc loop vector reduction(+:s)
	for(i=0;i<n;i++){
		s=s+A[i]+2;
	}
	return s;
}

int main()
{
	float *X,*Y;
	X=(float*)malloc(sizeof(float)*100*200);
	Y=(float*)malloc(sizeof(float)*100);
	int j,i;
	for(j=0;j<100;j++){
		for(i=0;i<200;i++){
			X[j*200+i]=j;
		}
	}
	#pragma acc parallel copyout(Y[0:100]) copyin(X[0:100*200])
	{
		#pragma acc loop gang
		for(j=0;j<100;j++){
			Y[j]=sum(200,(X+j*200));
		}
	}

	for(j=0;j<10;j++){
		printf("Y[%d]=%f\n",j,Y[j]);
	}
	free(X);
	free(Y);
	return 0;
}

I wrote this simple example based on the PPT, but I don’t understand what the code does.

Hi uestc0626,

By default, the compiler generates two versions of a subroutine/function: one for the host and one for the target device. The bind clause changes which subroutine/function the device version comes from, so device code that calls the routine is directed to the name given in bind instead.

In this case, the program calls “sum” in its body. If the call is made from host code, the host “sum” function is used. If it is made from an OpenACC compute region, the device version of “sum1” is called instead.
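A smaller sketch of the same idea may make the host/device split easier to see. The names twice, twice_device and compare_calls are hypothetical and not part of the example above, and the +1000 offset exists only so the two versions can be told apart. It assumes the compiler accepts the bind(identifier) form on a prototype, as in the posted code.

```c
/* Hypothetical device-side implementation. The +1000 offset is only
   there to make it obvious which version produced a result. */
#pragma acc routine seq
float twice_device(float x)
{
    return 2.0f * x + 1000.0f;
}

/* bind(twice_device) asks the compiler to call twice_device wherever
   twice() is called from device code. Host calls are unaffected. */
#pragma acc routine seq bind(twice_device)
float twice(float x);

/* Host implementation of twice(). */
float twice(float x)
{
    return 2.0f * x;
}

/* Calls twice() once from host code and once from a compute region,
   and hands both results back to the caller. */
void compare_calls(float x, float *host_result, float *region_result)
{
    float out[1] = { 0.0f };

    *host_result = twice(x);          /* host call: uses twice() */

    #pragma acc parallel loop copyout(out[0:1])
    for (int i = 0; i < 1; i++) {
        out[i] = twice(x);            /* device call: bound to twice_device() */
    }
    *region_result = out[0];
}
```

Calling compare_calls(3.0f, &h, &d) from main and printing h and d shows which version each call reached. If the file is built without OpenACC enabled, the pragmas are ignored and both values come from the host twice(). With offloading enabled, the value from the compute region should carry the offset, per the explanation above.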

Note that “bind” can also name a CUDA device routine, so that calls from device code go to that routine.

Hope this helps clarify things,
Mat