Hi, I’m new to OpenACC.
To test how well the OpenACC compiler works and how much it speeds up a program,
I compiled the code below with pgcc -acc -o test03.exe test03.c -ta=tesla:cc1x -Minfo=accel and ran it.
But when I vary n from 1 to 10^6, the times reported with PGI_ACC_TIME are not what I expected.
I have heard that when n is small, there is some overhead for transferring data from the host to the device and vice versa.
Approximately, the time is 41 for n = 1, 41 for n = 10, 41 for n = 100, 82 for n = 1000, about 400 for n = 10000, and about 4000 for n = 100000.
Is there a problem in my code, or do you have any advice for me?
Thank you very much in advance for your kind reply :).
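One way to read the timings above, as a rough sketch rather than a measurement: a fixed cost of about 41 (in whatever unit PGI_ACC_TIME printed) plus about 0.04 per element reproduces the quoted figures fairly closely (81 at n = 1000, 441 at n = 10000, 4041 at n = 100000). The two constants and the function names model_time and overhead_fraction are assumptions fitted by eye to those six numbers; nothing here comes from the compiler or its documentation.

```c
#include <assert.h>

/* Hypothetical linear cost model: time = fixed overhead + per-element cost * n.
   Both constants are eyeballed from the figures quoted in the post. */
static const double FIXED_OVERHEAD = 41.0;  /* reported time for n = 1, 10, 100 */
static const double PER_ELEMENT    = 0.04;  /* roughly 4000 / 100000 */

/* Predicted time for a vector of n elements under the model above. */
double model_time(long n)
{
    assert(n >= 0);
    return FIXED_OVERHEAD + PER_ELEMENT * (double)n;
}

/* Share of the predicted time that is fixed overhead (between 0 and 1). */
double overhead_fraction(long n)
{
    return FIXED_OVERHEAD / model_time(n);
}
```

Under this model the fixed part is roughly 91% of the total at n = 100 and about 1% at n = 100000, which would be consistent with the time staying flat for small n and then growing in proportion to n.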
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include "stdafx.h"
#include <windows.h>

int main( int argc, char* argv[] )
{
    int n;              /* size of the vector */
    float *a;           /* the vector */
    float *restrict r;  /* the results */
    float *e;           /* expected results */
    int i;

    if( argc > 1 )
        n = atoi( argv[1] );
    else
        n = 1000;
    if( n <= 0 ) n = 1000;

    a = (float*)malloc(n*sizeof(float));
    r = (float*)malloc(n*sizeof(float));
    e = (float*)malloc(n*sizeof(float));

    /* initialize */
    for( i = 0; i < n; ++i ) a[i] = (float)(i+1);

    printf("start!!\n");

    #pragma acc kernels loop
    for( i = 0; i < n; ++i ) r[i] = a[i]*2.0f;

    /* compute on the host to compare */
    for( i = 0; i < n; ++i ) e[i] = a[i]*2.0f;

    /* check the results */
    for( i = 0; i < n; ++i )
    {
        assert( r[i] == e[i] );
    }

    printf("end!!\n");
    printf( "%d iterations completed\n", n );
    return 0;
}
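A minimal sketch of the same loop with explicit data clauses, in case it helps when attributing the time to transfers versus computation. The copyin and copyout clauses are standard OpenACC and spell out which array is copied to the device before the loop and which is copied back afterwards. The helper names scale_by_two and check_scale are made up for illustration, and whether this changes the measured times on this setup is untested. A compiler without OpenACC support should simply ignore the pragma and run the loop on the host.

```c
#include <stdlib.h>

/* Writes a[i] * 2 into r[i] for i in [0, n). */
void scale_by_two(int n, const float *restrict a, float *restrict r)
{
    /* copyin:  a[0:n] goes host -> device before the loop.
       copyout: r[0:n] comes device -> host after the loop. */
    #pragma acc kernels loop copyin(a[0:n]) copyout(r[0:n])
    for (int i = 0; i < n; ++i)
        r[i] = a[i] * 2.0f;
}

/* Hypothetical self-check: returns 1 if the result matches a plain
   host computation, 0 on mismatch or allocation failure. */
int check_scale(int n)
{
    float *a = malloc((size_t)n * sizeof *a);
    float *r = malloc((size_t)n * sizeof *r);
    int ok = (a != NULL && r != NULL);

    if (ok) {
        for (int i = 0; i < n; ++i)
            a[i] = (float)(i + 1);

        scale_by_two(n, a, r);

        for (int i = 0; i < n; ++i)
            if (r[i] != a[i] * 2.0f)
                ok = 0;
    }

    free(a);
    free(r);
    return ok;
}
```

Calling check_scale(1000) runs the same test as the program above for n = 1000. Building it with the same -Minfo=accel flag should show what data movement the compiler generates for the region.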