cudaOccupancyMaxPotentialBlockSize() deleted in CUDA 6.5?

Has this function been removed in CUDA 6.5?

It is not defined in “cuda_runtime.h”, “cuda_occupancy.h” or “device_launch_parameters.h”.

Grepping the headers shows it is there, at least on Windows with CUDA 6.5 (and 7.0 RC).

I have CUDA 6.5 installed on Linux, and it is in cuda_runtime.h:

/**
 * \brief Returns grid and block size that achieves maximum potential occupancy for a device function
 *
 * Returns in \p *minGridSize and \p *blocksize a suggested grid /
 * block size pair that achieves the best potential occupancy
 * (i.e. the maximum number of active warps with the smallest number
 * of blocks).
 *
 * Use \sa ::cudaOccupancyMaxPotentialBlockSizeVariableSMem if the
 * amount of per-block dynamic shared memory changes with different
 * block sizes.
 *
 * \param minGridSize - Returned minimum grid size needed to achieve the best potential occupancy
 * \param blockSize   - Returned block size
 * \param func        - Device function symbol
 * \param dynamicSMemSize - Per-block dynamic shared memory usage intended, in bytes
 * \param blockSizeLimit  - The maximum block size \p func is designed to work with. 0 means no limit.
 *
 * \return
 * ::cudaSuccess,
 * ::cudaErrorCudartUnloading,
 * ::cudaErrorInitializationError,
 * ::cudaErrorInvalidDevice,
 * ::cudaErrorInvalidDeviceFunction,
 * ::cudaErrorInvalidValue,
 * ::cudaErrorUnknown,
 * \notefnerr
 *
 * \sa ::cudaOccupancyMaxActiveBlocksPerMultiprocessor
 * \sa ::cudaOccupancyMaxPotentialBlockSizeVariableSMem
 */
template<class T>
__inline__ __host__ CUDART_DEVICE cudaError_t cudaOccupancyMaxPotentialBlockSize(
    int    *minGridSize,
    int    *blockSize,
    T       func,
    size_t  dynamicSMemSize = 0,
    int     blockSizeLimit = 0)
{
  return cudaOccupancyMaxPotentialBlockSizeVariableSMem(minGridSize, blockSize, func, __cudaOccupancyB2DHelper(dynamicSMemSize), blockSizeLimit);
}

cudaOccupancyMaxPotentialBlockSize() was added as a new API function in CUDA 6.5. In CUDA 6.5 the function definition is located in the header file cuda_runtime.h at line 1271.
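For reference, here is a minimal sketch of how the function is typically called from a .cu file compiled with nvcc. The kernel scaleKernel and the wrapper launchScale are hypothetical names made up for this illustration; only the runtime API calls come from the toolkit.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, used only to have something to query occupancy for.
__global__ void scaleKernel(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= factor;
    }
}

// Hypothetical wrapper: asks the runtime for a suggested block size,
// derives a grid size that covers n elements, and launches the kernel.
cudaError_t launchScale(float *d_data, float factor, int n)
{
    if (n <= 0) {
        return cudaSuccess;  // nothing to do; avoids a zero-sized grid
    }

    int minGridSize = 0;  // minimum grid size for best potential occupancy
    int blockSize = 0;    // suggested block size

    // No dynamic shared memory (0) and no block size limit (0).
    cudaError_t err = cudaOccupancyMaxPotentialBlockSize(
        &minGridSize, &blockSize, scaleKernel, 0, 0);
    if (err != cudaSuccess) {
        return err;
    }

    // Round up so that every element gets a thread.
    int gridSize = (n + blockSize - 1) / blockSize;
    scaleKernel<<<gridSize, blockSize>>>(d_data, factor, n);

    // Reports launch configuration errors, not errors raised during execution.
    return cudaGetLastError();
}
```

Here minGridSize is only the smallest grid that can reach the best potential occupancy; the grid actually launched is sized from the problem size n.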

I do see a condition that has to be satisfied during preprocessing:

#if defined(__CUDACC__)

Could it be that this macro is not defined during compilation? What should I do to change this?
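One quick way to find out is to let the preprocessor report it. The sketch below relies on the fact that nvcc defines __CUDACC__ when it compiles a file as CUDA source; the helper names cudaccDefined and reportCompiler are hypothetical and exist only for this illustration.

```cuda
#include <cstdio>

// Hypothetical helper: tells whether this translation unit saw __CUDACC__.
static bool cudaccDefined()
{
#if defined(__CUDACC__)
    return true;   // nvcc is compiling this file as CUDA source
#else
    return false;  // a plain host compiler (or an editor parser) is processing it
#endif
}

// Hypothetical helper: prints the result so it shows up in the program output.
static void reportCompiler()
{
    std::printf("__CUDACC__ defined: %s\n", cudaccDefined() ? "yes" : "no");
}
```

Calling reportCompiler() from a .cu file built with nvcc and from a .cpp file built with the host compiler should show the difference.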

IntelliSense can find the function in a .cpp file, not in a .cu file…

How are you compiling?

Since the occupancy API generally expects a kernel function pointer as a parameter:

it stands to reason that in typical usage the call should be in a .cu file and compiled by nvcc.

The __global__ keyword is not recognized by host compilers.

It might be possible to circumvent this, I’m not sure. It’s not clear what your problem is or what you’re doing.

EDIT: If this discussion is entirely based on IntelliSense, you should state that up front. It requires some effort to get IntelliSense to play nice with CUDA.

I didn’t recognize it as an IntelliSense issue at first, my bad.

I’m going to look up how to get IntelliSense to work properly with CUDA. Or can’t it be done?

There are numerous suggestions on the web, even blogs written about it.

Some variation of this may work for you; it’s the one that made the most sense to me. However, you may want to move some (more) of the include files after the __CUDACC__ define:

#pragma once


#include <cuda.h>
#include <cuda_runtime.h>
#include <device_launch_parameters.h>

#define __CUDACC__

#include <device_functions.h>


This has the advantage (in my view) that it is completely transparent when your code is actually being compiled.
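As a variation, here is a sketch that makes the transparency explicit by guarding the define. It assumes the Visual Studio IntelliSense parser predefines __INTELLISENSE__ and that real compilers do not; treat it as a starting point rather than a tested recipe for every toolkit version.

```cuda
#pragma once

// Assumption: __INTELLISENSE__ is predefined only by the Visual Studio
// IntelliSense parser, so nvcc and the host compiler skip this block.
#ifdef __INTELLISENSE__
#ifndef __CUDACC__
#define __CUDACC__  // lets the editor see declarations guarded by __CUDACC__
#endif
#endif

// Included after the define so the guarded templates, such as
// cudaOccupancyMaxPotentialBlockSize, become visible to the editor.
#include <cuda_runtime.h>
#include <device_launch_parameters.h>
```

With the guard in place, a real build never sees the extra define, so the generated code is unaffected.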

By the way I lifted this from here: