I’m really new to this stuff and want to dig into it and explore the features and benefits.
So far the SDK examples work and I’ve started applying my own changes.
I appreciate that NVIDIA wrote the small macros (e.g. CUT_DEVICE_INIT) to make the examples much more readable. However, it is really hard to find out what each macro does and how it does it. Am I blind, or is there really no good reference or documentation for these toolkit additions?
There you can find the useful predefined macros you’re talking about.
The definitions and the code are not that difficult to understand.
AFAIK there is no documentation for these utilities.
I have not found any documentation either and in fact I am having a problem when I use the CUDA_SAFE_CALL macro. If you find anything, please (please please …) make sure that you post it up here for everyone’s benefit.
The Linux beta 1.1 toolkit correctly installs a new cutil_readme.txt in the common directory, but the file isn’t being installed on Windows. We’ll fix this.
It now has a little bit of documentation. Please note the described difference between Release and Debug builds of the samples (this is handled in the included Make and Visual Studio project files).
Here’s what the readme contains (so far):
CUDA Utility Library
CUTIL is a simple utility library designed for use in the CUDA SDK samples.
It provides functions for:
parsing command line arguments
reading and writing binary files and PPM format images
comparing arrays of data (typically used for comparing GPU results with CPU)
timers
macros for checking error codes
checking for shared memory bank conflicts
CUTIL is not part of CUDA
Note that CUTIL is not part of the CUDA Toolkit and is not supported by NVIDIA.
It exists only for the convenience of writing concise and platform-independent
example code.
Library Functions
Most of the functions should be self-explanatory. The function parameters are
documented in the “cutil.h” file.
Macros
CUTIL includes a number of macros that can be used to easily initialize the
device, and automatically check the error codes returned by CUDA runtime
functions when debugging.
These macros are compiled out in release builds and so they will not affect
performance. Note that in debug mode they call cudaThreadSynchronize()
to ensure that kernel execution has completed, which can affect performance.
CUT_DEVICE_INIT
this macro finds the first available CUDA device and initializes it. When
compiling for device emulation it has no effect.
CUT_EXIT
this simply exits the program, prompting the user to press enter so that
the console window doesn’t disappear too quickly under Windows. You can force
SDK samples to exit without a prompt by passing the “--noprompt” command line
option.
CUDA_SAFE_CALL(call)
this macro is intended to be wrapped around a CUDA runtime API call. It
checks the returned error code and exits with a message if there is an error.
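
For anyone who wants to see roughly what such a macro boils down to, here is a minimal sketch written from the description above, not copied from cutil.h. The names MY_SAFE_CALL, checkCuda, runSketch and the SKETCH_DEBUG flag are all made up for illustration; the real macros may differ in detail. It picks device 0 as "the first available device", wraps a few runtime calls, and drops the check (but still makes the call) when the debug flag is not defined.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: print a message and exit if a runtime call failed.
static void checkCuda(cudaError_t err, const char *expr,
                      const char *file, int line)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error at %s:%d: %s failed: %s\n",
                file, line, expr, cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
}

// Assumption: SKETCH_DEBUG stands in for whatever flag the debug build sets.
#ifdef SKETCH_DEBUG
// Debug build: run the call and check the returned error code.
#define MY_SAFE_CALL(call) checkCuda((call), #call, __FILE__, __LINE__)
#else
// Release build: the check is compiled out, the call itself still runs.
#define MY_SAFE_CALL(call) (call)
#endif

// Hypothetical driver: select the first device and do a trivial round trip.
int runSketch(void)
{
    int deviceCount = 0;
    MY_SAFE_CALL(cudaGetDeviceCount(&deviceCount));
    if (deviceCount == 0) {
        fprintf(stderr, "No CUDA device found.\n");
        return EXIT_FAILURE;
    }
    MY_SAFE_CALL(cudaSetDevice(0));  // device 0 taken as "first available"

    float *d_data = NULL;
    const size_t bytes = 256 * sizeof(float);
    MY_SAFE_CALL(cudaMalloc((void **)&d_data, bytes));
    MY_SAFE_CALL(cudaMemset(d_data, 0, bytes));
    MY_SAFE_CALL(cudaFree(d_data));
    return EXIT_SUCCESS;
}
```

Call runSketch() from your own main() and build with nvcc; adding -DSKETCH_DEBUG turns the checks on. Because the release variant ignores return codes entirely, a failing call goes unnoticed there, which is the Release versus Debug difference the readme warns about.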