cudaOccupancyMaxPotentialBlockSize and warning C4267

paulrichmond · February 5, 2015, 2:33pm

Maybe I’m losing my mind, but I can’t seem to avoid C4267 warnings when using cudaOccupancyMaxPotentialBlockSize in CUDA 7 with VS2012.

Compiler output:

1>C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include\cuda_runtime.h(1541): warning C4267: '=' : conversion from 'size_t' to 'int', possible loss of data
1>          C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include\cuda_runtime.h(1623) : see reference to function template instantiation 'cudaError_t cudaOccupancyMaxPotentialBlockSizeVariableSMemWithFlags<__cudaOccupancyB2DHelper,void(__cdecl *)(unsigned int *,unsigned int *)>(int *,int *,T,UnaryFunction,int,unsigned int)' being compiled
1>          with
1>          [
1>              T=void (__cdecl *)(unsigned int *,unsigned int *),
1>              UnaryFunction=__cudaOccupancyB2DHelper
1>          ]
1>          C:/my_example/simulation.cu(384) : see reference to function template instantiation 'cudaError_t cudaOccupancyMaxPotentialBlockSize<void(__cdecl *)(unsigned int *,unsigned int *)>(int *,int *,T,size_t,int)' being compiled
1>          with
1>          [
1>              T=void (__cdecl *)(unsigned int *,unsigned int *)
1>          ]

Line 384 makes the following call:

cudaOccupancyMaxPotentialBlockSize( &minGridSize, &blockSize, my_kernel, 0, count);

where minGridSize, blockSize and count are all ints, as the API docs specify.

Is there a suggested fix for this other than disabling the warning with a pragma?

Thanks

Paul

paulrichmond · February 5, 2015, 8:54pm

The problem seems to be caused internally by the use of size_t in the __cudaOccupancyB2DHelper struct, which cudaOccupancyMaxPotentialBlockSize uses. To answer my own question: it is possible (if a bit clunky) to get around this by defining a unary function that returns an int rather than a size_t and passing it to cudaOccupancyMaxPotentialBlockSizeVariableSMem, as follows:

cudaOccupancyMaxPotentialBlockSizeVariableSMem( &minGridSize, &blockSize, my_kernel, my_kernel_sm_size, count);

For this to work, you need to define a function that just returns 0 (the kernel here uses no dynamic shared memory, matching the 0 passed in the original call):

// Dynamic shared memory (in bytes) for block size b: none for this kernel.
int my_kernel_sm_size(int b){
	return 0;
}
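
To illustrate why swapping the callable changes things, here is a minimal host-only C++ sketch of the mechanism as I understand it. It is not the code from cuda_runtime.h: FixedSMemSizeT and query_smem are hypothetical stand-ins for the internal helper and the templated occupancy routine, and only my_kernel_sm_size comes from the workaround above. The assumption is that the header assigns the callable's result to an int, so a size_t return type is what gets narrowed.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the header's internal helper: a functor that
// returns a fixed dynamic shared memory size as size_t.
struct FixedSMemSizeT {
    std::size_t n;
    std::size_t operator()(int /*blockSize*/) const { return n; }
};

// The workaround from the post: a plain function returning int.
int my_kernel_sm_size(int /*blockSize*/) {
    return 0;
}

// Hypothetical stand-in for the templated occupancy routine. It stores the
// callable's result in an int, as the warning text suggests the header does.
template <typename UnaryFunction>
int query_smem(UnaryFunction blockSizeToDynamicSMemSize, int blockSize) {
    int dynamicSMemSize;
    // If the callable returns size_t, this '=' narrows size_t to int, which
    // is the kind of conversion C4267 reports. If it returns int, no
    // conversion happens here.
    dynamicSMemSize = blockSizeToDynamicSMemSize(blockSize);
    return dynamicSMemSize;
}

// Compile-time checks of the two return types involved.
static_assert(
    std::is_same<decltype(std::declval<FixedSMemSizeT>()(0)), std::size_t>::value,
    "the stand-in helper returns size_t");
static_assert(
    std::is_same<decltype(my_kernel_sm_size(0)), int>::value,
    "the workaround function returns int");

// Both callables give the same value; only the return type differs.
inline void run_demo() {
    assert(query_smem(FixedSMemSizeT{0}, 128) == 0);
    assert(query_smem(my_kernel_sm_size, 128) == 0);
}
```

Calling run_demo() from main exercises both paths. The result is the same in each case; the only difference is whether the assignment inside the template has to convert from size_t.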

P