Hi. I am new to CUDA and would really appreciate it if you could help me with two questions.
Given a 3D point named POINT, and an array of 3D points named COMP_POINTS_ARRAY, I need to determine which points in COMP_POINTS_ARRAY are close to POINT, i.e., points for which the euclidean distance to POINT is smaller than MAXDISTANCE.
QUESTION 1 - Is this the type of task where a CUDA implementation could be faster than a CPU implementation?
I have this coded in Java, running on the CPU. Given POINT and an array of 15,000,000 points, a single comparison pass does not take long. However, since I need to compare thousands of points against the ones in the array, the time adds up and the computations can take days.
The Euclidean distance is given by:
distance = SQRT( (x2 - x1)^2 + (y2 - y1)^2 + (z2 - z1)^2 ), where (x1, y1, z1) are the coordinates of one point and (x2, y2, z2) are the coordinates of the second point.
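As a small sketch of the closeness test: assuming MAXDISTANCE is non-negative, distance < MAXDISTANCE gives the same answer as distance^2 < MAXDISTANCE^2 (up to floating-point rounding right at the boundary), so the square root can be skipped. The helper name isWithinDistance is hypothetical and not part of my actual code; it is written as plain C++ here, and the same body could be used inside a kernel.

```cpp
// Hypothetical helper (illustration only): returns true when (x2, y2, z2)
// lies strictly within maxDistance of (x1, y1, z1).
// Assumes maxDistance >= 0, so comparing squared values preserves the result.
inline bool isWithinDistance(float x1, float y1, float z1,
                             float x2, float y2, float z2,
                             float maxDistance)
{
    float dx = x2 - x1;
    float dy = y2 - y1;
    float dz = z2 - z1;

    // Squared Euclidean distance: no square root needed for a threshold test.
    float squaredDistance = dx * dx + dy * dy + dz * dz;

    return squaredDistance < maxDistance * maxDistance;
}
```

For example, the points (0, 0, 0) and (3, 4, 0) are exactly 5 apart, so the test succeeds for a threshold of 5.1 and fails for a threshold of 5.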
So here’s my CUDA strategy:
- load into the GPU memory 3 float arrays compX, compY, and compZ with 15,000,000 elements each; these arrays contain the coordinates of all the points to compare (for example, the coordinates of the first point are compX[0], compY[0] and compZ[0])
- compute which points are close to the POINT with coordinates pointX, pointY and pointZ using the kernel shown at the bottom, and put the comparison results in an array of 15,000,000 booleans
- copy the array with the booleans to the host; leave the other arrays (compX, compY, compZ) in GPU memory so that comparing the next point does not require loading them to the GPU again
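For the kernel launch in the second step, the grid has to cover all 15,000,000 elements with one thread each. The arithmetic below is a rough sketch of how the block count works out; the block size of 256 threads is an assumption chosen for illustration, and the helper names blocksNeeded and surplusThreads are hypothetical.

```cpp
// Hypothetical helper: number of blocks needed so that
// blocks * threadsPerBlock >= numPoints (integer ceiling division).
inline int blocksNeeded(int numPoints, int threadsPerBlock)
{
    return (numPoints + threadsPerBlock - 1) / threadsPerBlock;
}

// Hypothetical helper: number of launched threads whose index falls
// past the end of the arrays because the last block is only partly used.
inline int surplusThreads(int numPoints, int threadsPerBlock)
{
    return blocksNeeded(numPoints, threadsPerBlock) * threadsPerBlock - numPoints;
}
```

With 15,000,000 points and 256 threads per block this gives 58,594 blocks, i.e. 15,000,064 threads, so 64 threads have an index beyond the last element. A kernel launched this way therefore needs to compare its index against the number of points before touching the arrays.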
QUESTION 2 - The CUDA version takes longer than the CPU version when comparing thousands of points. Am I doing something wrong, or was this to be expected?
Thanks in advance.
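#include <math.h>

// One thread per comparison point: thread `index` tests the point
// (compX[index], compY[index], compZ[index]) against (pointX, pointY, pointZ).
// numPoints is the number of elements in compX, compY, compZ and isClose;
// the host has to pass it as the last kernel argument.
extern "C" __global__ void isClose(float maxDistance, float pointX, float pointY, float pointZ, float *compX, float *compY, float *compZ, bool *isClose, int numPoints)
{
    int index = blockIdx.x * blockDim.x + threadIdx.x;

    // Threads past the end of the arrays must not read or write anything.
    if (index < numPoints) {
        float diffX = compX[index] - pointX;
        float diffY = compY[index] - pointY;
        float diffZ = compZ[index] - pointZ;

        diffX = diffX * diffX;
        diffY = diffY * diffY;
        diffZ = diffZ * diffZ;

        // sqrtf is the single-precision square root.
        float distance = sqrtf(diffX + diffY + diffZ);

        isClose[index] = (distance < maxDistance);
    }
}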