Newbie trying to find fault in program

javkrei · January 25, 2008, 5:03pm

Hi!

This is my first post here, and I’m also new to CUDA programming. When I compile and run the program below, I get the following output:

0 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1 -1

when it should simply output 0 1 2 3 4 5 6 … etc. It is as if only the first thread had done its job. There’s probably a very simple and stupid mistake; if someone could enlighten me I’d be very grateful. Thank you!

BTW, I can run the examples in the SDK: BlackScholes, Binomial and the others I’ve tried seem to run just fine. I have an XFX GeForce 8800GT 256MB.

#include <stdlib.h>
#include <stdio.h>
#include <cutil.h>

#define N_DIV 32

__global__ void createTestGrid_kernel(int* d_TestGrid)
{
    d_TestGrid[threadIdx.x]=threadIdx.x;
}

int main( int argc, char** argv)
{
    CUT_DEVICE_INIT();

    int *h_TestGrid;
    int *d_TestGrid;

    h_TestGrid=(int *)malloc(N_DIV*sizeof(int));

    int i;
    for (i=0;i<N_DIV;i++){
        h_TestGrid[i]=-1;
    }

    CUDA_SAFE_CALL( cudaMalloc((void **)&d_TestGrid,N_DIV*sizeof(int)));

    dim3 threads(N_DIV,1,1); dim3 grid(1,1,1);

    CUDA_SAFE_CALL( cudaMemcpy(d_TestGrid, h_TestGrid, N_DIV*sizeof(int), cudaMemcpyHostToDevice));

    createTestGrid_kernel<<<threads,grid>>>(d_TestGrid);

    CUDA_SAFE_CALL( cudaThreadSynchronize() );

    CUDA_SAFE_CALL( cudaMemcpy(h_TestGrid, d_TestGrid, N_DIV*sizeof(int), cudaMemcpyDeviceToHost));

    CUDA_SAFE_CALL( cudaFree(d_TestGrid));

    FILE *output;
    fopen_s(&output,"output.txt","w");

    for (i=0;i<N_DIV;i++){
        fprintf_s(output, "%i ",h_TestGrid[i]);
    }

    fclose(output);
    free(h_TestGrid);

    CUT_EXIT(argc, argv);
}

tanmay.Learns · January 25, 2008, 5:09pm

Shouldn’t your kernel call be
createTestGrid_kernel<<<grid, threads>>>(d_TestGrid);

instead of what you have?
createTestGrid_kernel<<<threads, grid>>>(d_TestGrid);

mfatica · January 25, 2008, 5:15pm

The execution configuration should be <<<grid, threads>>>.

You have them transposed in your code:
createTestGrid_kernel<<<threads,grid>>>(d_TestGrid);
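
To spell out why the swap produces 0 -1 -1 -1 …: the first launch argument is the grid size and the second is the block size, so <<<threads,grid>>> launches 32 blocks of 1 thread each. Every one of those threads has threadIdx.x == 0, so they all write element 0 and nothing else is touched. Below is a minimal sketch of a corrected version, not the original poster's exact program: it drops the SDK's cutil.h in favour of a hypothetical CHECK macro, adds an `n` parameter to the kernel, and indexes by global thread ID so the kernel no longer depends on a single-block launch. Those three changes are my own assumptions for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N_DIV 32

// Hypothetical helper (not from cutil.h): report a failed runtime call and bail out of main.
#define CHECK(call)                                                        \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "%s failed: %s\n", #call,                      \
                    cudaGetErrorString(err_));                             \
            return 1;                                                      \
        }                                                                  \
    } while (0)

// Each thread writes its global index. Unlike indexing with threadIdx.x alone,
// this gives the same result however the threads are split into blocks.
__global__ void createTestGrid_kernel(int *d_TestGrid, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        d_TestGrid[idx] = idx;
    }
}

int main()
{
    int h_TestGrid[N_DIV];
    for (int i = 0; i < N_DIV; i++) {
        h_TestGrid[i] = -1;  // sentinel: any -1 left afterwards was never written
    }

    int *d_TestGrid = NULL;
    CHECK(cudaMalloc((void **)&d_TestGrid, sizeof(h_TestGrid)));
    CHECK(cudaMemcpy(d_TestGrid, h_TestGrid, sizeof(h_TestGrid),
                     cudaMemcpyHostToDevice));

    dim3 grid(1, 1, 1);         // number of blocks
    dim3 threads(N_DIV, 1, 1);  // threads per block

    // Grid first, block second.
    createTestGrid_kernel<<<grid, threads>>>(d_TestGrid, N_DIV);
    CHECK(cudaGetLastError());  // catches an invalid launch configuration

    // A blocking cudaMemcpy on the default stream waits for the kernel to finish.
    CHECK(cudaMemcpy(h_TestGrid, d_TestGrid, sizeof(h_TestGrid),
                     cudaMemcpyDeviceToHost));
    CHECK(cudaFree(d_TestGrid));

    for (int i = 0; i < N_DIV; i++) {
        printf("%d ", h_TestGrid[i]);
    }
    printf("\n");
    return 0;
}
```

With the global index, even the transposed launch would fill the array, since blockIdx.x then runs from 0 to 31. If you keep the explicit synchronize from the original program, note that cudaThreadSynchronize() has since been deprecated in favour of cudaDeviceSynchronize().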

javkrei · January 25, 2008, 5:23pm

Thank you guys! It’s working now, thank god. I had been stuck since yesterday, not knowing what to do. Such a relief; you’ve made one person a little happier. Thanks!
