Hope you guys can help a newcomer. I’m using Vista, a GeForce 8600M GT, and Visual Studio 2008 to try a simple CUDA program. However, the following CPU function and CUDA kernel give me different results, even though both use the same input array arranged as a matrix:
void computeSimple(float* reference, float* idata, const unsigned int rows, const unsigned int columns)
{
    for(int r = 0; r < rows; r++)
        for(int c = 0; c < columns; c++)
        {
            reference[r * columns + c] = idata[r * columns + c] * c;
        }
}
__global__ void simpleKernel(float* g_oelevdata, float* g_ielevdata, const unsigned int num_rows, const unsigned int num_cols)
{
    int r = blockIdx.y * blockDim.y + threadIdx.y;
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    g_oelevdata[r * num_cols + c] = g_ielevdata[r * num_cols + c] * c;
}
idata and g_ielevdata hold the same data, and the results are written to the reference and g_oelevdata arrays, respectively. What am I doing wrong?
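For reference, here is a minimal sketch of how the two versions could be compared end to end. The post does not show the launch configuration or the host code, so everything around the arithmetic is an assumption: the 16x16 block size, the rounded-up grid, the test data, and the hypothetical names computeSimpleRef, simpleKernelGuarded and maxCpuGpuDiff. The bounds guard in the kernel is also an addition. The original kernel has none, and without a guard any thread whose r or c falls outside the matrix reads and writes out of range, which is one common cause of CPU/GPU mismatches when the grid does not divide the matrix evenly.

```cuda
#include <cmath>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

// CPU reference: same arithmetic as computeSimple in the post.
void computeSimpleRef(float* reference, const float* idata,
                      unsigned int rows, unsigned int columns)
{
    for (unsigned int r = 0; r < rows; r++)
        for (unsigned int c = 0; c < columns; c++)
            reference[r * columns + c] = idata[r * columns + c] * c;
}

// Same arithmetic as simpleKernel, plus a bounds guard.
// The guard is an assumption added for illustration; it is not in the post.
__global__ void simpleKernelGuarded(float* g_odata, const float* g_idata,
                                    unsigned int num_rows, unsigned int num_cols)
{
    unsigned int r = blockIdx.y * blockDim.y + threadIdx.y;
    unsigned int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (r < num_rows && c < num_cols)
        g_odata[r * num_cols + c] = g_idata[r * num_cols + c] * c;
}

// Hypothetical helper: runs both versions on the same input and returns the
// largest absolute difference, or -1.0f if any CUDA call fails.
float maxCpuGpuDiff(unsigned int rows, unsigned int cols)
{
    const size_t n = (size_t)rows * cols;
    const size_t bytes = n * sizeof(float);
    std::vector<float> in(n), ref(n), out(n, 0.0f);
    for (size_t i = 0; i < n; i++)
        in[i] = (float)(i % 13);          // small integers, so products are exact

    computeSimpleRef(&ref[0], &in[0], rows, cols);

    float* d_in = 0;
    float* d_out = 0;
    if (cudaMalloc((void**)&d_in, bytes) != cudaSuccess) return -1.0f;
    if (cudaMalloc((void**)&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return -1.0f; }

    cudaError_t err = cudaMemcpy(d_in, &in[0], bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        // Assumed launch configuration: 16x16 threads, grid rounded up so
        // that every matrix element is covered by some thread.
        dim3 block(16, 16);
        dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
        simpleKernelGuarded<<<grid, block>>>(d_out, d_in, rows, cols);
        err = cudaGetLastError();         // catches launch errors
    }
    if (err == cudaSuccess)               // the copy also waits for the kernel
        err = cudaMemcpy(&out[0], d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1.0f;

    float maxDiff = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float d = fabsf(ref[i] - out[i]);
        if (d > maxDiff) maxDiff = d;
    }
    return maxDiff;
}
```

Calling maxCpuGpuDiff(5, 7) from main and printing the result is expected to give 0 when the two versions agree. A value of -1 means a CUDA call failed, which is worth ruling out before comparing numbers at all.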
Thanks