Which preprocessor macros does the NVCC compiler define itself?
It would be nice to have a preprocessor macro that tells me which compute capability (e.g. arch=sm_13) the code is currently being compiled for, since for the Fermi architecture, for example, I want to set thread block sizes differently.
In host code? Nothing.
In host code, e.g. for setting different thread block sizes as you suggested, you need to figure it out at runtime by calling the cudaGetDeviceProperties function and looking at the major and minor values.
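A minimal sketch of that runtime check, assuming the same 64/128 split as the device-side macro in this thread. The helper name pickBlockSize is hypothetical. The CUDA part is wrapped in __CUDACC__, which nvcc defines when compiling CUDA source, so the helper also builds with a plain C++ compiler.

```cpp
#include <cstdio>

// Hypothetical helper: maps the compute capability major version to a
// thread block size. The 64/128 values are only the ones used in this thread.
inline int pickBlockSize(int major)
{
    return (major < 2) ? 64 : 128;
}

#ifdef __CUDACC__  // defined by nvcc when it compiles CUDA source files
#include <cuda_runtime.h>

int main()
{
    int device = 0;
    cudaDeviceProp prop;

    // Query the properties of the currently selected device.
    if (cudaGetDevice(&device) != cudaSuccess ||
        cudaGetDeviceProperties(&prop, device) != cudaSuccess) {
        std::fprintf(stderr, "could not query device properties\n");
        return 1;
    }

    std::printf("compute capability %d.%d, using %d threads per block\n",
                prop.major, prop.minor, pickBlockSize(prop.major));
    return 0;
}
#endif
```

The value returned by pickBlockSize would then be passed as the block dimension in the kernel launch.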
In device code you can use something like this:
#if (__CUDA_ARCH__ < 200)
#define KERNEL_BLOCK_THREAD_SIZE 64
#else
#define KERNEL_BLOCK_THREAD_SIZE 128
#endif
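One thing to keep in mind with the snippet above: __CUDA_ARCH__ is only defined while nvcc compiles device code. In the host pass it is undefined, and an undefined macro evaluates to 0 inside #if, so the host would silently see the "< 200" branch. A small sketch that makes the difference visible; HOST_DEVICE, compiledArch and reportArch are hypothetical names used only for illustration.

```cpp
// Let the same source build with nvcc and with a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Returns the architecture this code is being compiled for,
// e.g. 130 for sm_13 or 200 for sm_20, and 0 in the host pass.
HOST_DEVICE inline int compiledArch()
{
#ifdef __CUDA_ARCH__
    return __CUDA_ARCH__;  // device compilation pass
#else
    return 0;              // host pass: __CUDA_ARCH__ is not defined
#endif
}

#ifdef __CUDACC__
// Kernel that writes the device-side value so the host can inspect it.
__global__ void reportArch(int *out)
{
    *out = compiledArch();
}
#endif
```

So the macro is fine for choosing code paths inside kernels, but a block size used in the host-side launch should come from the runtime query instead.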
eyal
Thanks, that's what I was looking for.