Simple "Hello CUDA" example?

Could someone point me to a simple example CUDA program, preferably in C?

I’ve installed the drivers & SDK, and can use the script to compile & run most of the example programs. However, when I look at the source and try to figure out how I might actually write a program of my own, I find that most of the details are hidden away in obfuscated .h files, which are themselves stuck off somewhere out of the standard path. I could of course spend the time to slowly hack my way through all this, but I would rather be doing something useful.

Thanks,
James

if windows: http://forums.nvidia.com/index.php?showtopic=83054

else if (linux_suse_11) ?

But it seems I wasn’t clear about what I’m looking for. I don’t want a “wizard” that adds more behind-the-curtain stuff to organize what NVidia stuffed behind the curtain; I want a C (preferably, though FORTRAN would do at a pinch) program that doesn’t have the curtains in it.

Or to put it another way, I’m the wizard (hey, it’s right there in my job description :-)), and I’d like to learn the details of this particular spell…

All the details you need are in the programming guide, including a step-by-step run-through of a simple matrix multiplication example.
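If you want the flavor of it before opening the guide, the kernel there boils down to something like this (a rough from-memory sketch, not the guide’s exact code: one thread computes one element of C = A * B for square WIDTH x WIDTH row-major matrices, with devA/devB/devC standing in for pointers you’ve already set up with cudaMalloc/cudaMemcpy):

#define WIDTH 16

__global__ void matMul(const float *A, const float *B, float *C)
{
    int row = threadIdx.y;
    int col = threadIdx.x;
    float sum = 0.0f;
    /* dot product of row `row` of A with column `col` of B */
    for (int k = 0; k < WIDTH; ++k)
        sum += A[row * WIDTH + k] * B[k * WIDTH + col];
    C[row * WIDTH + col] = sum;
}

/* launched from the host with one WIDTH x WIDTH block of threads:
   matMul<<<1, dim3(WIDTH, WIDTH)>>>(devA, devB, devC); */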

The only “hidden details” in the SDK samples are in cutil.h, which just has some useful error-checking macros. There simply isn’t a lot to the Runtime API. You don’t need to learn what this API is managing for you, especially if you’d like to be doing something useful. Just copy what the samples do, and worry about device code, not boilerplate host code.
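For what it’s worth, the error checking in cutil.h amounts to little more than this (paraphrased from memory, not the exact macro):

#include <stdio.h>
#include <stdlib.h>

/* wrap runtime calls: CUDA_SAFE_CALL(cudaMalloc(...)); */
#define CUDA_SAFE_CALL(call) do {                                   \
        cudaError_t err = (call);                                   \
        if (err != cudaSuccess) {                                   \
            fprintf(stderr, "CUDA error at %s:%d: %s\n",            \
                    __FILE__, __LINE__, cudaGetErrorString(err));   \
            exit(EXIT_FAILURE);                                     \
        }                                                           \
    } while (0)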

Or if you really love boilerplate code, look at some of the samples that use the Driver API. The Driver API, however, doesn’t add anything useful. Resist the urge to use it.
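To see what I mean, here’s roughly the setup the Driver API needs before you can launch anything (a sketch; it assumes the kernel was compiled separately into cosine.cubin, and all error checking is omitted):

#include <cuda.h>

int main(void)
{
    CUdevice   dev;
    CUcontext  ctx;
    CUmodule   mod;
    CUfunction fn;

    cuInit(0);                                  /* must be called first */
    cuDeviceGet(&dev, 0);                       /* grab device 0 */
    cuCtxCreate(&ctx, 0, dev);                  /* create a context on it */
    cuModuleLoad(&mod, "cosine.cubin");         /* load precompiled module */
    cuModuleGetFunction(&fn, mod, "cos_main");  /* look up the kernel */

    /* ...and you still need cuMemAlloc, cuParamSetv, cuParamSetSize,
       cuFuncSetBlockShape and cuLaunchGrid before anything runs... */

    cuCtxDetach(ctx);
    return 0;
}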

Here is a simple test program I wrote to get my feet wet: it fills an array with numbers, and then takes the cosine of each element. Compile with “nvcc cosine.cu -use_fast_math” (on Suse 10.3…)

N.B.: I claim to be an expert in neither C nor CUDA…

/* CUDA program that uses the GPU to compute the cosine of an array of numbers */

/* --------------------------- header section ---------------------------*/
#include <stdio.h>
#include <cuda.h>

#define COS_THREAD_CNT 200
#define N 10000

/* --------------------------- target code ------------------------------*/
struct cosParams {
    float *arg;
    float *res;
    int n;
};

/* Each of the COS_THREAD_CNT threads strides through the array,
   handling elements i, i + COS_THREAD_CNT, i + 2*COS_THREAD_CNT, ... */
__global__ void cos_main(struct cosParams parms)
{
    int i;
    for (i = threadIdx.x; i < parms.n; i += COS_THREAD_CNT) {
        parms.res[i] = __cosf(parms.arg[i]);
    }
}

/* --------------------------- host code --------------------------------*/
int main(int argc, char *argv[])
{
    int i = 0;
    cudaError_t cudaStat;
    float *cosRes = 0;
    float *cosArg = 0;
    float *arg = (float *) malloc(N * sizeof(arg[0]));
    float *res = (float *) malloc(N * sizeof(res[0]));
    struct cosParams funcParams;

    /* ... fill arguments array "arg" ... */
    for (i = 0; i < N; i++) {
        arg[i] = (float)i;
    }

    cudaStat = cudaMalloc((void **)&cosArg, N * sizeof(cosArg[0]));
    if (cudaStat)
        printf(" value = %d : Memory Allocation on GPU Device failed\n", cudaStat);

    cudaStat = cudaMalloc((void **)&cosRes, N * sizeof(cosRes[0]));
    if (cudaStat)
        printf(" value = %d : Memory Allocation on GPU Device failed\n", cudaStat);

    cudaStat = cudaMemcpy(cosArg, arg, N * sizeof(arg[0]), cudaMemcpyHostToDevice);
    if (cudaStat)
        printf(" value = %d : Memory Copy from Host to Device failed\n", cudaStat);

    funcParams.res = cosRes;
    funcParams.arg = cosArg;
    funcParams.n = N;

    /* one block of COS_THREAD_CNT threads */
    cos_main<<<1, COS_THREAD_CNT>>>(funcParams);

    /* this copy also waits for the kernel to finish, and will report
       any error the launch produced */
    cudaStat = cudaMemcpy(res, cosRes, N * sizeof(cosRes[0]), cudaMemcpyDeviceToHost);
    if (cudaStat)
        printf(" value = %d : Memory Copy from Device to Host failed\n", cudaStat);

    for (i = 0; i < N; i++) {
        if (i % 10 == 0)
            printf("\n cosf(%f) = %f ", arg[i], res[i]);
    }

    cudaFree(cosArg);
    cudaFree(cosRes);
    free(arg);
    free(res);
    return 0;
}

/* nvcc cosine.cu -use_fast_math */

Thanks, that worked just fine (after I put in a couple of degree/radian conversions), and was much, much easier to understand. Especially the compile command: one line vs ~350 lines of the SDK Makefile :-)
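(For anyone following along, the conversion is just something like

    arg[i] = (float)i * 3.14159265f / 180.0f;   /* degrees -> radians */

since __cosf, like the standard cosf, expects radians.)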

Oh, I guess the fact that you’re on Linux is a factor there. On Windows the samples have VS solutions where the custom build step also boils down to a line. NVIDIA should really look into simplifying this facet of the examples. Actually, even on Windows the line is unnecessarily long and hard to find (buried in the properties of the .cu file).

Plus I hate how there are some very important nvcc options, like --ptxas-options=-v, that are pretty tricky to discover.
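For example, tacking it onto the one-liner above, “nvcc cosine.cu -use_fast_math --ptxas-options=-v”, passes -v through to ptxas, which then reports each kernel’s register and shared memory usage at compile time; exactly the numbers you want once you start tuning occupancy.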

It would be great if NVIDIA had its own IDE, e.g. one based on Eclipse or on the VS2k8 Shell. It could make for a much better and smoother experience.

Yeah, though I find it hard to understand why anyone would even contemplate doing serious number-crunching on Windoze. Games, sure, but (looking for handy foxhole) there’s more to life than games. Yet I can’t even install the NVidia graphics SDK, 'cause the download’s a .exe file.

Thanks for the hint :-)

I have to disagree there. I have my own preferred editor & so on, which I know how to use well and have customized (over many years) to fit my needs. Why take a big jump back down the learning curve, and take several times longer to accomplish a given task because I’d only know some bare minimum command set of whatever editor the IDE developers liked? Then drop it all and shift to something different when I need to work on a non-NVidia project?

Because unlike your beloved vi, you don’t need to spend a year learning a new GUI IDE ;) Obviously, having a GUI tailored for NVCC and its options, one that reports back important information, and, in general, one custom-designed for CUDA, would cut down the learning curve for everyone, including you.

Lol, and why wouldn’t they? Numbers multiply the same way over here…

vi? Oh, get serious :-) As to spending a year, or however long it takes, to learn some new GUI IDE, I suppose that depends on what the IDE can do. If the IDE’s editor, for instance, just lets you type things in (and I have seen ones in the Windoze world that not only do just that, but do it incorrectly), then yes, it may not take long to learn. But it will take several times longer to do any given task.

Perhaps having one that reports back useful information would be of some benefit, but I think it would be a whole lot simpler not to hide that information in the first place.

Only slower :-)

Where do you get that? CUDA has been observed to run the same on Linux and Windows. In general, anything that doesn’t spawn ten thousand threads and use all kinds of OS resources will run at the same speed no matter what the kernel is.

There’s nothing wrong with using Windows for HPC, and there are some very obvious usability advantages (especially if that’s what you’re familiar with). Linux has a clear advantage only if there is a particular application you want to use or code for. (E.g., right now I’m programming a project for Asterisk, which is a Linux telephony server.) Of course there are many such applications, but there is no other, general, magical reason why Linux is better.