Learning how to harness CUDA.

Guys,

I am an IT specialist with some programming background, mainly in Java, but I haven’t had a chance to use C++ yet. I have already downloaded CUDA Toolkit 4.0 and Microsoft Visual Studio 2008 (Express).

I have skimmed the book “CUDA by Example”, and its examples include a file called book.h. I am unable to compile code that uses book.h without also including stdafx.h, which I understand is the norm in Visual Studio projects. The book’s examples don’t include it, yet the compiler keeps bugging me to include stdafx.h.

Furthermore, is there any decent book or training on how to program CUDA with C++? I really want to learn.

Here is an example. I am copying and pasting the code from “CUDA by Example”:

  • I am aware that stdafx.h is required as a standard convention. However, the CUDA example doesn’t include it, only book.h, and the compile fails without stdafx.h.

Any ideas?

1>------ Build started: Project: test2, Configuration: Debug Win32 ------
1>Compiling…
1>stdafx.cpp
1>Compiling…
1>test2.cpp
1>c:\users\neo\documents\visual studio 2008\projects\test2\test2\test2.cpp(1) : warning C4627: ‘#include “…/cuda/common/book.h”’: skipped when looking for precompiled header use
1> Add directive to ‘stdafx.h’ or rebuild precompiled header
1>c:\users\neo\documents\visual studio 2008\projects\test2\test2\test2.cpp(48) : fatal error C1010: unexpected end of file while looking for precompiled header. Did you forget to add ‘#include “stdafx.h”’ to your source?
1>Build log was saved at “file://c:\Users\NEO\Documents\Visual Studio 2008\Projects\test2\test2\Debug\BuildLog.htm”
1>test2 - 1 error(s), 1 warning(s)
========== Build: 0 succeeded, 1 failed, 0 up-to-date, 0 skipped ==========

Here is the code; I hope you get the idea. Here are my system specs as well:

Windows 7 x64
CUDA TOOLKIT 4.0
6 GB RAM
Intel 990X
GeForce GTX 480

#include "…/cuda/common/book.h"
int main( void ) {
    cudaDeviceProp prop;
    int count;
    HANDLE_ERROR( cudaGetDeviceCount( &count ) );
    for (int i=0; i< count; i++) {
        HANDLE_ERROR( cudaGetDeviceProperties( &prop, i ) );
        printf( " --- General Information for device %d ---\n", i );
        printf( "Name: %s\n", prop.name );
        printf( "Compute capability: %d.%d\n", prop.major, prop.minor );
        printf( "Clock rate: %d\n", prop.clockRate );
        printf( "Device copy overlap: " );
        if (prop.deviceOverlap)
            printf( "Enabled\n" );
        else
            printf( "Disabled\n" );
        printf( "Kernel execution timeout : " );
        if (prop.kernelExecTimeoutEnabled)
            printf( "Enabled\n" );
        else
            printf( "Disabled\n" );
        printf( " --- Memory Information for device %d ---\n", i );
        printf( "Total global mem: %ld\n", prop.totalGlobalMem );
        printf( "Total constant Mem: %ld\n", prop.totalConstMem );
        printf( "Max mem pitch: %ld\n", prop.memPitch );
        printf( "Texture Alignment: %ld\n", prop.textureAlignment );
        printf( " --- MP Information for device %d ---\n", i );
        printf( "Multiprocessor count: %d\n",
                prop.multiProcessorCount );
        printf( "Shared mem per mp: %ld\n", prop.sharedMemPerBlock );
        printf( "Registers per mp: %d\n", prop.regsPerBlock );
        printf( "Threads in warp: %d\n", prop.warpSize );
        printf( "Max threads per block: %d\n",
                prop.maxThreadsPerBlock );
        printf( "Max thread dimensions: (%d, %d, %d)\n",
                prop.maxThreadsDim[0], prop.maxThreadsDim[1],
                prop.maxThreadsDim[2] );
        printf( "Max grid dimensions: (%d, %d, %d)\n",
                prop.maxGridSize[0], prop.maxGridSize[1],
                prop.maxGridSize[2] );
        printf( "\n" );
    }
}

I’d download & read the CUDA C Programming Guide. The “CUDA by Example” book does a lot of fakey stuff like the book.h file, in order to hide implementation details that IMHO are better not hidden. So you can, for instance, just #include <cuda_runtime.h>.

My $0.02 worth, anyway.

I don’t think the purpose of the book.h file is to hide anything; it is designed to minimize bloat and clutter in the book so that maximum focus can be given to the main features of each example. The idea is similar to how the cutil library is used by the CUDA SDK.
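
For anyone curious what that kind of helper boils down to, here is a minimal sketch of the check-every-call pattern that a macro such as HANDLE_ERROR wraps. It is not the contents of book.h. Status, status_string, describe_error, handle_error and CHECK are all hypothetical names, and Status is only a stand-in for cudaError_t so that the snippet builds without the CUDA toolkit. With the toolkit installed you would presumably include <cuda_runtime.h> and use cudaError_t and cudaGetErrorString instead.

```cpp
// Sketch only: hypothetical stand-ins, not the real book.h or CUDA API.
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <sstream>
#include <string>

// Stand-in for cudaError_t (assumption: 0 means success, as with cudaSuccess).
enum Status { kSuccess = 0, kInvalidValue = 1 };

// Stand-in for cudaGetErrorString: maps a status code to readable text.
inline const char* status_string(Status s) {
    return s == kSuccess ? "no error" : "invalid value";
}

// Builds "<error text> in <file> at line <line>".
inline std::string describe_error(Status s, const char* file, int line) {
    assert(file != NULL);
    std::ostringstream msg;
    msg << status_string(s) << " in " << file << " at line " << line;
    return msg.str();
}

// Does nothing on success; on failure prints where the call failed and exits.
inline void handle_error(Status s, const char* file, int line) {
    if (s != kSuccess) {
        std::fprintf(stderr, "%s\n", describe_error(s, file, line).c_str());
        std::exit(EXIT_FAILURE);
    }
}

// Wrap a call so the failing file and line are captured automatically.
#define CHECK(call) handle_error((call), __FILE__, __LINE__)
```

Usage would look like CHECK( some_call() ); a failing call prints the message and stops the program, which is all the clutter the helper keeps out of each listing. Separately, in a Visual Studio project that has precompiled headers switched on, #include "stdafx.h" has to be the first include in each .cpp file, or precompiled headers have to be turned off in the project settings. That is what error C1010 in the build log above is complaining about.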