A beginner's CUDA.net problem

kingsering · June 24, 2012, 9:53am

Hi all,
I just wrote some simple code that uses CUDA.net to do a vector summation, but somehow it doesn't give me the correct result. Below are the kernel code and the host code. I really can't find anything wrong with them. Could somebody help me point out what is wrong?

Test.cu:

/* Add two vectors on the GPU */
extern "C" __global__ void vectorAddGPU(float *a, float *b, float *c, int N)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < N)
        c[idx] = a[idx] + b[idx];
}

Program.cs:
        CUDA cuda = new CUDA(0, true);
        string s = Path.Combine(Environment.CurrentDirectory, "Test.ptx");
        CUfunction func;
        try
        {
            cuda.LoadModule(s);
            func = cuda.GetModuleFunction("vectorAddGPU");
        }
        catch (CUDAException e)
        {
            Console.WriteLine(e);
            return;
        }
        float[] a = new float[1 << 10];
        for (int i = 0; i < a.Length; i++)
            a[i] = i;

        float[] b = new float[1 << 10];
        for (int i = 0; i < b.Length; i++)
            b[i] = 2 * i + 1;
        float[] c = new float[1 << 10];
        float[] c1 = new float[1 << 10];            
            
        CUdeviceptr d_a = cuda.CopyHostToDevice<float>(a);            

        CUdeviceptr d_b = cuda.CopyHostToDevice<float>(b);            
                    
        int N = 1<<10;

        for (int i = 0; i < N; i++)
            c1[i] = -1;
        CUdeviceptr d_c = cuda.CopyHostToDevice<float>(c1);
                    
        try
        {
            cuda.SetParameter(func, 0, (uint)d_a.Pointer);
            cuda.SetParameter(func, IntPtr.Size, (uint)d_b.Pointer);
            cuda.SetParameter(func, IntPtr.Size * 2, (uint)d_c.Pointer);
            cuda.SetParameter(func, IntPtr.Size * 3, (uint)N);
            cuda.SetParameterSize(func, (uint)(IntPtr.Size * 3 + sizeof(int)));
            cuda.SetFunctionBlockShape(func, 1<<10, 1, 1);
            cuda.Launch(func, 1, 1);
            cuda.CopyDeviceToHost<float>(d_c, c1);
        }
        catch (CUDAException e)
        {
            Console.WriteLine(e);
            return;
        }            

        for (int i = 0; i < 1 << 10; i++)
        {
            c[i] = a[i] + b[i];
            if (c1[i] != c[i])
                Console.WriteLine("not OK\n");
        }

        cuda.Free(d_a);
        cuda.Free(d_b);
        cuda.Free(d_c);

In the code I initialize every element of the buffer behind d_c (a pointer on the device) to -1. But after launching the kernel, it still holds the value -1. I really can't figure out what is wrong. Please help me. Thanks.
