Toolkit Documentation

Hello there,

I’m really new to this stuff and want to dig into it and explore the features and benefits.
So far the SDK examples work and I’ve started making my own changes.

I appreciate that NVIDIA wrote the small macros (e.g. CUT_DEVICE_INIT) to make the examples much more readable. However, it is really hard to find out what each macro does and how it does it. Am I blind, or is there really no good reference or documentation for these toolkit additions?



Have a look at CUDA_SDK/common/inc/cutil.h

There you can find the useful predefined macros you’re talking about.
The definitions and the code are not that difficult to understand.
AFAIK there is no documentation for these utilities.

I have not found any documentation either and in fact I am having a problem when I use the CUDA_SAFE_CALL macro. If you find anything, please (please please …) make sure that you post it up here for everyone’s benefit.


The Linux beta 1.1 toolkit correctly installs a new cutil_readme.txt in the common directory; however, the file isn’t getting installed on Windows. We’ll fix this.

The readme now has a little bit of documentation. Please note the difference it describes between Release and Debug builds of the samples (this is handled in the included makefiles and Visual Studio project files).

Here’s what the readme contains (so far):

CUDA Utility Library

CUTIL is a simple utility library designed for use in the CUDA SDK samples.

It provides functions for:

  • parsing command line arguments
  • reading and writing binary files and PPM format images
  • comparing arrays of data (typically used for comparing GPU results with CPU)
  • timers
  • macros for checking error codes
  • checking for shared memory bank conflicts
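To make the array comparison item concrete, here is a minimal sketch in plain C of what checking GPU results against a CPU reference can look like. The function name `arrays_match` and its parameters are made up for illustration; this is not the CUTIL function, whose actual signature is in cutil.h.

```c
#include <stddef.h>

/* Hypothetical helper, NOT a CUTIL function: returns 1 when every element
   of `result` is within `epsilon` of the matching element of `reference`,
   and 0 otherwise. */
int arrays_match(const float *reference, const float *result,
                 size_t count, float epsilon)
{
    for (size_t i = 0; i < count; ++i) {
        float diff = reference[i] - result[i];
        if (diff < 0.0f) {
            diff = -diff;              /* absolute difference */
        }
        /* Written as !(diff <= epsilon) so that a NaN also counts
           as a mismatch. */
        if (!(diff <= epsilon)) {
            return 0;
        }
    }
    return 1;
}
```

An absolute tolerance like this is the simplest choice; for values spanning many orders of magnitude a relative tolerance may be more appropriate.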

CUTIL is not part of CUDA

Note that CUTIL is not part of the CUDA Toolkit and is not supported by NVIDIA.
It exists only for the convenience of writing concise and platform-independent
example code.

Library Functions

Most of the functions should be self-explanatory. The function parameters are
documented in the “cutil.h” file.


CUTIL includes a number of macros that can be used to easily initialize the
device, and automatically check the error codes returned by CUDA runtime
functions when debugging.

These macros are compiled out in release builds and so they will not affect
performance. Note that in debug mode they call cudaThreadSynchronize()
to ensure that kernel execution has completed, which can affect performance.
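As a rough illustration of how an error check can be compiled out of release builds, here is a minimal sketch in plain C. It deliberately avoids the CUDA runtime so it can be built anywhere, and it leaves out the synchronization step mentioned above. `status_t`, `SAFE_CALL_SKETCH`, `SKETCH_DEBUG`, and `set_flag` are all made-up names; the real definitions in cutil.h may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for an API return code; 0 means success (assumption). */
typedef int status_t;

#ifdef SKETCH_DEBUG
/* Debug build: run the call, check its return code, stop on failure. */
#define SAFE_CALL_SKETCH(call)                                   \
    do {                                                         \
        status_t status_ = (call);                               \
        if (status_ != 0) {                                      \
            fprintf(stderr, "error %d in file %s, line %d\n",    \
                    status_, __FILE__, __LINE__);                \
            exit(EXIT_FAILURE);                                  \
        }                                                        \
    } while (0)
#else
/* Release build: the call still runs, but the check is compiled out. */
#define SAFE_CALL_SKETCH(call) ((void)(call))
#endif

/* Hypothetical call that always succeeds and records that it ran. */
status_t set_flag(int *flag)
{
    *flag = 1;
    return 0;
}
```

Building with `-DSKETCH_DEBUG` enables the check; without it the macro expands to the bare call, so the wrapped function runs either way and only the checking cost disappears.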


  • this macro finds the first available CUDA device and initializes it. When
    compiling for device emulation it has no effect.


  • this simply exits the program, prompting the user to press enter so that
    the console window doesn’t disappear too quickly under Windows. You can force
    SDK samples to exit without a prompt by passing the “–noprompt” command line


  • this macro is intended to be wrapped around a CUDA runtime API call. It
    checks the returned error code and exits with a message if there is an error.


  • as above, but designed for CUDA driver API calls


  • as above, but for CUTIL functions.


  • checks for CUDA runtime errors.


  • checks for OpenGL errors
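For the exit behaviour described in the list above (wait for Enter unless a "noprompt" option was passed), a minimal sketch in plain C might look like the following. `has_flag` and `exit_sketch` are hypothetical names, and accepting any number of leading dashes is an assumption; this is not the CUTIL implementation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` appears among the arguments
   with one or more leading dashes, e.g. "-noprompt" or "--noprompt". */
int has_flag(int argc, char **argv, const char *name)
{
    for (int i = 1; i < argc; ++i) {      /* argv[0] is the program name */
        const char *arg = argv[i];
        if (arg[0] != '-') {
            continue;                     /* not a flag at all */
        }
        while (*arg == '-') {
            ++arg;                        /* skip all leading dashes */
        }
        if (strcmp(arg, name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical exit routine: waits for Enter unless the flag was given,
   so a console window does not close before the output can be read. */
void exit_sketch(int argc, char **argv)
{
    if (!has_flag(argc, argv, "noprompt")) {
        printf("\nPress ENTER to exit...\n");
        fflush(stdout);
        getchar();
    }
    exit(EXIT_SUCCESS);
}
```

Skipping the prompt matters mostly for scripted or batch runs, where nobody is there to press Enter.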

I thought as much!

However, being able to treat the macros as a black box would have made for an easier start.

Another task on my list.