New cudaDeviceSetCacheConfig and cudaFuncSetCacheConfig mode

GK110 supports a new cache configuration mode in which L1 cache and shared memory are split evenly, 32 KB each (32:32).

References:
http://docs.nvidia.com/cuda/kepler-tuning-guide/index.html#shared-memory-and-warp-shuffle

and slide 29 of:

http://developer.download.nvidia.com/GTC/PDF/GTC2012/PresentationPDF/S0514-GTC2012-GPU-Performance-Analysis.pdf

However, the CUDA reference manual lists only the older 16:48 and 48:16 splits for cudaDeviceSetCacheConfig and cudaFuncSetCacheConfig (pages 23 and 52, respectively, of the Toolkit Reference Manual).

How do I set the 32:32 split?

For the Runtime API, pass cudaFuncCachePreferEqual, the last value of this enum:

/**
 * CUDA function cache configurations
 */
enum __device_builtin__ cudaFuncCache
{
    cudaFuncCachePreferNone   = 0,    /**< Default function cache configuration, no preference */
    cudaFuncCachePreferShared = 1,    /**< Prefer larger shared memory and smaller L1 cache  */
    cudaFuncCachePreferL1     = 2,    /**< Prefer larger L1 cache and smaller shared memory */
    cudaFuncCachePreferEqual  = 3     /**< Prefer equal size L1 cache and shared memory */
};
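
As a rough illustration of how the enum value above might be used, here is a minimal sketch that requests the equal split both device-wide and for a single kernel. The kernel `scaleKernel` and its parameters are hypothetical placeholders, not part of the original question. The setting is only a preference: the runtime may pick a different configuration, and it has no effect on devices where the L1 and shared memory sizes are fixed.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (an assumption for illustration); it touches shared
// memory so that the L1/shared split is relevant to it.
__global__ void scaleKernel(float *data, float factor, int n)
{
    __shared__ float tile[256];
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        tile[threadIdx.x] = data[i];
        data[i] = tile[threadIdx.x] * factor;
    }
}

// Print a message if a runtime call did not succeed.
static void check(cudaError_t err, const char *what)
{
    if (err != cudaSuccess)
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
}

int main()
{
    // Device-wide preference: equal-sized L1 cache and shared memory.
    check(cudaDeviceSetCacheConfig(cudaFuncCachePreferEqual),
          "cudaDeviceSetCacheConfig");

    // Per-kernel preference for the hypothetical kernel above.
    check(cudaFuncSetCacheConfig(scaleKernel, cudaFuncCachePreferEqual),
          "cudaFuncSetCacheConfig");

    // Read back the device-wide preference currently in effect.
    cudaFuncCache cfg = cudaFuncCachePreferNone;
    check(cudaDeviceGetCacheConfig(&cfg), "cudaDeviceGetCacheConfig");
    printf("device cache config = %d (cudaFuncCachePreferEqual = %d)\n",
           (int)cfg, (int)cudaFuncCachePreferEqual);
    return 0;
}
```

This should build with something like `nvcc example.cu`. The read-back value may differ from what was requested on hardware that does not support a configurable split.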

… and for the Driver API, pass CU_FUNC_CACHE_PREFER_EQUAL:

/**
 * Function cache configurations
 */
typedef enum CUfunc_cache_enum {
    CU_FUNC_CACHE_PREFER_NONE    = 0x00, /**< no preference for shared memory or L1 (default) */
    CU_FUNC_CACHE_PREFER_SHARED  = 0x01, /**< prefer larger shared memory and smaller L1 cache */
    CU_FUNC_CACHE_PREFER_L1      = 0x02, /**< prefer larger L1 cache and smaller shared memory */
    CU_FUNC_CACHE_PREFER_EQUAL   = 0x03  /**< prefer equal sized L1 cache and shared memory */
} CUfunc_cache;
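
The Driver API equivalent might look like the sketch below. It assumes device 0 and uses the device's primary context. The helper `preferEqualSplit` is a hypothetical name introduced here for illustration. It is not called from `main`, because obtaining a `CUfunction` requires loading a module, which is outside the scope of this example.

```cuda
#include <cstdio>
#include <cuda.h>

// Hypothetical helper: request the equal split for one already-loaded
// kernel handle (obtained elsewhere, e.g. via cuModuleGetFunction).
static CUresult preferEqualSplit(CUfunction fn)
{
    return cuFuncSetCacheConfig(fn, CU_FUNC_CACHE_PREFER_EQUAL);
}

int main()
{
    CUdevice dev;
    CUcontext ctx;

    // Initialize the driver API and make the primary context of device 0
    // current (device 0 is an assumption for this sketch).
    if (cuInit(0) != CUDA_SUCCESS ||
        cuDeviceGet(&dev, 0) != CUDA_SUCCESS ||
        cuDevicePrimaryCtxRetain(&ctx, dev) != CUDA_SUCCESS) {
        fprintf(stderr, "CUDA driver initialization failed\n");
        return 1;
    }
    if (cuCtxSetCurrent(ctx) != CUDA_SUCCESS) {
        fprintf(stderr, "cuCtxSetCurrent failed\n");
        cuDevicePrimaryCtxRelease(dev);
        return 1;
    }

    // Context-wide preference: equal-sized L1 cache and shared memory.
    if (cuCtxSetCacheConfig(CU_FUNC_CACHE_PREFER_EQUAL) != CUDA_SUCCESS)
        fprintf(stderr, "cuCtxSetCacheConfig failed\n");

    // Read back the preference currently set on the context.
    CUfunc_cache cfg = CU_FUNC_CACHE_PREFER_NONE;
    if (cuCtxGetCacheConfig(&cfg) == CUDA_SUCCESS)
        printf("context cache config = 0x%02x\n", (unsigned)cfg);

    cuDevicePrimaryCtxRelease(dev);
    (void)preferEqualSplit;  // silence the unused-function warning
    return 0;
}
```

This should build with something like `nvcc example.cu -lcuda`. As with the Runtime API, the value is a preference that the driver is free to override.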

Thanks!