Floating Point Subtraction

Mainak · July 1, 2010, 6:24pm

I have been trying to do floating-point subtraction using CUDA. The code works fine in emulation mode, but it isn't working on the device: the difference always comes out as 0, which is really strange. Please help! Here is the code:

[codebox]#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h>
#include "cutil.h"

extern "C" double *zeros(int);
extern int MODE;

__global__ void d_Idifference(double *d_M, double *d_S, double *d_Idiff, int n)
{
	int x, y, idx;

	x = blockIdx.x*blockDim.x + threadIdx.x;
	y = blockIdx.y*blockDim.y + threadIdx.y;
	idx = n*y + x;

	if (x < n && y < n)
		d_Idiff[idx] = d_M[idx] - d_S[idx];
}

extern "C" double *Idifference(double *M, double *S, int height, int width, double *timer){
	double *Idiff;
	int n = width;
	unsigned int timer1 = 0;

	Idiff = zeros(width);
	cutCreateTimer(&timer1);
	cutStartTimer(timer1);

	if (MODE){
		double *d_Idiff, *d_M, *d_S;
		dim3 dimBlock(16,16);
		dim3 dimGrid(height/dimBlock.x, width/dimBlock.y);

		cudaMalloc((void**)&d_Idiff, n*n*sizeof(double));
		cudaMalloc((void**)&d_M, n*n*sizeof(double));
		cudaMalloc((void**)&d_S, n*n*sizeof(double));
		cudaMemcpy(d_S, S, n*n*sizeof(double), cudaMemcpyHostToDevice);
		cudaMemcpy(d_M, M, n*n*sizeof(double), cudaMemcpyHostToDevice);
		cudaMemcpy(d_Idiff, Idiff, n*n*sizeof(double), cudaMemcpyHostToDevice);

		d_Idifference<<<dimGrid,dimBlock>>>(d_M, d_S, d_Idiff, n);

		cudaThreadSynchronize();

		cudaMemcpy(Idiff, d_Idiff, n*n*sizeof(double), cudaMemcpyDeviceToHost);
		printf("%f %f %f ", S[51*n + 134], M[51*n + 134], Idiff[51*n + 134]);

		cutStopTimer(timer1);
		*timer += cutGetTimerValue(timer1);
		cudaFree(d_M);
		cudaFree(d_S);
		cudaFree(d_Idiff);
	}
	else{
		int x, y;

		for (y = 0; y < height; y++){
			for (x = 0; x < width; x++)
				Idiff[n*y + x] = M[n*y + x] - S[n*y + x];
		}
		cutStopTimer(timer1);
		*timer += cutGetTimerValue(timer1);
	}
	return Idiff;
}[/codebox]

This is in fact a portion of a bigger program, so not all of it may make sense on its own. But my problem is essentially this: in emulation mode, the printf gives the right result. In non-emulation mode, Idiff[…] gives 0.00 regardless of the indices I feed to it. Please help!

seibert · July 1, 2010, 7:26pm

You should check the return codes from all the CUDA functions. There is a wide variety of problems that could prevent your kernel from running at all (giving you zero as the answer), and you will only know about them if you check the return codes.
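The advice above can be sketched roughly as follows. This is only an illustration, not code from the thread: the `CUDA_CHECK` macro and the `fill` kernel are hypothetical names, and `float` is used purely to keep the example small. It also assumes a toolkit that provides `cudaDeviceSynchronize`, which is the later replacement for the `cudaThreadSynchronize` call used in the original post.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the CUDA API): abort with a readable
// message if a CUDA runtime call does not return cudaSuccess.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Hypothetical test kernel: writes a constant into every element.
__global__ void fill(float *out, int n, float value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = value;
}

int main()
{
    const int n = 256;
    float host[n];
    float *dev = NULL;

    CUDA_CHECK(cudaMalloc((void **)&dev, n * sizeof(float)));

    fill<<<(n + 127) / 128, 128>>>(dev, n, 134.0f);
    CUDA_CHECK(cudaGetLastError());       // errors raised by the launch itself
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    CUDA_CHECK(cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(dev));

    printf("%f\n", host[0]);
    return 0;
}
```

A kernel launch returns no status of its own, which is why the sketch checks `cudaGetLastError()` right after the launch and then checks the result of the synchronize call as well.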

Mainak · July 2, 2010, 9:36am

OK. Thanks a lot for the speedy reply. I tried checking the error codes of the CUDA functions. However, it seems that the kernels are running all right. They are returning cudaSuccess and the error string is "no error". I tried this for the cudaMalloc and cudaMemcpy calls and the d_Idifference kernel, but there was no error. Also, instead of using d_Idifference to find the difference, I tried modifying it a little: I did d_Idiff[idx] = 134; so I should be getting 134 when I do the printf(). But I am still getting 0.0. I think I am making some silly mistake, but it would be nice if you could shed some light on it. I am running the program on an NVIDIA GeForce 9500 GT with driver version 191.07. The sample programs are running fine. But this program isn't working!

Thanks a lot!

Mainak · July 2, 2010, 10:30am

OK. I found the solution myself. Thanks for your time. The trouble is with the data type I am using. double isn't supported in device code unless the GPU has compute capability 1.3 or higher and you compile with the -arch sm_13 switch (even that was causing problems). So I decided to convert my double data types into float. That solves the problem! Thanks a lot!
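One way to see whether a given card can run double-precision device code at all is to query its compute capability at startup. The sketch below is an illustration rather than anything posted in the thread; it assumes device index 0 and relies on the rule that native double support starts at compute capability 1.3. To my knowledge the GeForce 9500 GT reports 1.1, which would explain why the -arch sm_13 switch did not help here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const int device = 0;  // assumption: the first CUDA device is the one in use
    cudaDeviceProp prop;

    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceProperties failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    // Native double precision needs compute capability 1.3 or higher.
    bool hasDouble = (prop.major > 1) || (prop.major == 1 && prop.minor >= 3);

    printf("%s: compute capability %d.%d, native double support: %s\n",
           prop.name, prop.major, prop.minor, hasDouble ? "yes" : "no");
    return 0;
}
```

If the check reports no double support, switching the host buffers, device buffers, and kernel parameters to float together, as described above, keeps the host and device views of the data consistent.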
