New to CUDA, simple kernel gives output of zero

I’m a programmer new to CUDA, looking to use GPGPU for an embarrassingly parallel problem I’m working on. I recently found out that my crappy integrated GeForce 8200 chipset actually supports CUDA, so I installed drivers for it, hooked up a second display, installed the SDK and examples, and got the example apps running on the hardware.

Next, I tried following a tutorial to get a simple floating-point benchmark running. It’s a program I write in various languages, with different compilers and optimization settings, to see what kind of performance each system offers. The program is a (slow and inaccurate) Monte Carlo estimate of pi, which depends heavily on floating-point performance. I love this problem for benchmarking because it’s easy to write and embarrassingly parallel.
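For reference, here’s roughly what the plain CPU version of the benchmark looks like (a minimal single-threaded sketch; the variable names are just for illustration):

#include <cstdlib>
#include <ctime>
#include <iostream>

int main()
{
	srand(time(NULL));
	const unsigned int iterations = 100000000;
	unsigned int hits = 0;
	for(unsigned int i = 0; i < iterations; i++)
	{
		// Random point in the unit square.
		float x = rand() / (float)RAND_MAX;
		float y = rand() / (float)RAND_MAX;
		// Count it if it falls inside the quarter circle of radius 1.
		if(x * x + y * y <= 1.0f)
			hits++;
	}
	// hits/iterations approximates pi/4, so scale by 4.
	std::cout << 4.0f * hits / (float)iterations << std::endl;
	return 0;
}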

So I wrote a version of this program using CUDA, and I must say the API has been surprisingly easy to learn and use. However, I’ve run into a confusing problem: whenever I run the application, the array copied back from the GPU is always full of zeros. I’m not sure what I’m doing wrong here, and I’d greatly appreciate advice from people more experienced with CUDA. The full program is below.

#include <iostream>
#include <stdio.h>
#include <cuda.h>
#include <ctime>
#include <cmath>

using namespace std;

__global__ void withinCircle(float* x, float* y, float* out, unsigned int num)
{
	int idx = blockIdx.x * blockDim.x + threadIdx.x;
	out[idx] = sqrt((x[idx] * x[idx]) + (y[idx] * y[idx]));
}

int main()
{
	unsigned int iterations = 100000000;
	srand(time(NULL));

	// Host pointers
	float *randomX, *randomY, *out;

	// Device pointers
	float *gRandomX, *gRandomY, *gOut;

	randomX = new float[iterations];
	randomY = new float[iterations];
	out = new float[iterations];

	cudaMalloc((void**) &gRandomX, iterations);
	cudaMalloc((void**) &gRandomY, iterations);
	cudaMalloc((void**) &gOut, iterations);

	for(unsigned int i = 0; i < iterations; i++)
	{
		randomX[i] = (rand() / (float)RAND_MAX);
		randomY[i] = (rand() / (float)RAND_MAX);
	}

	cout << "Finished generating random input." << endl;

	cudaMemcpy(gRandomX, randomX, sizeof(float)*iterations, cudaMemcpyHostToDevice);
	cudaMemcpy(gRandomY, randomY, sizeof(float)*iterations, cudaMemcpyHostToDevice);
	cudaMemcpy(gOut, out, sizeof(float)*iterations, cudaMemcpyHostToDevice);

	unsigned int blockSize = 4;
	unsigned int numBlocks = iterations/blockSize + (iterations % blockSize == 0 ? 0 : 1);

	withinCircle<<< numBlocks, blockSize >>>(gRandomX, gRandomY, gOut, iterations);
	cudaThreadSynchronize();

	cudaMemcpy(out, gOut, sizeof(float)*iterations, cudaMemcpyDeviceToHost);

	unsigned int hits = 0;
	for(int i = 0; i < iterations; i++)
	{
		cout << out[i] << endl;
		hits += (out[i] <= 1.0f) ? 1 : 0;
	}

	delete [] randomX;
	delete [] randomY;
	delete [] out;

	cudaFree(gRandomX);
	cudaFree(gRandomY);
	cudaFree(gOut);

	cout << (iterations/(float)hits)*4 << endl;
	return 0;
}
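One thing worth mentioning: I’m not checking the return codes of any of the CUDA calls above. I assume something along these lines would be the usual way to do that (just a sketch, untested on my setup; checkCuda is my own helper name):

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Print the error string and bail out if a CUDA runtime call fails.
static void checkCuda(cudaError_t err, const char* what)
{
	if (err != cudaSuccess)
	{
		printf("%s failed: %s\n", what, cudaGetErrorString(err));
		exit(1);
	}
}

// How I imagine using it around the calls in my program:
// checkCuda(cudaMalloc((void**) &gOut, sizeof(float) * iterations), "cudaMalloc gOut");
// withinCircle<<< numBlocks, blockSize >>>(gRandomX, gRandomY, gOut, iterations);
// checkCuda(cudaGetLastError(), "kernel launch");
// checkCuda(cudaMemcpy(out, gOut, sizeof(float) * iterations, cudaMemcpyDeviceToHost), "cudaMemcpy out");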

Cheers.

EDIT: Oops, I just realized this might have been better off in a different forum. I’m closing this topic and moving it to the general CUDA discussion forum. Sorry for the mix-up.