To what extent should the size of the object code produced after compiling CUDA kernels be considered (ignoring the compilation time for the moment)?
For example, I could have a C++ template that produces 10,000 unique device functions at compile time, each saving a small amount of work.
I’m not doing anything quite that silly, but must the binary fit into GPU memory or a cache of a certain size, or am I going to start eating away at the memory available on the GPU if I really push this?
The size of a kernel is limited by the number of GPU instructions: 2 million instructions is the maximum. However, it is still not clear how exactly to count those instructions :-) I have asked NVIDIA people three times without getting a meaningful answer. In summary, the following observations may be useful:
(0) The size of a kernel is limited to 2 million GPU instructions.
(1) The minimal size of a single GPU instruction is 32 bits (4 bytes); the maximal size is 64 bits (8 bytes).
(2) You can inspect the .cubin binary file by compiling the kernel with the -keep option.
(3) The size of the .cubin file is, I believe, close to the size of the actual binary code of the compiled kernel (when compiled for a single architecture).
(4) From (0) and (1), the maximal size of a .cubin file must be somewhere between 8 and 16 megabytes.
(5) You can therefore estimate how far your kernel is from the 2-million-instruction limit by looking at the size of its .cubin file.
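As a rough back-of-the-envelope check, the estimate above can be sketched in a few lines of Python. The 32/64-bit instruction sizes and the 2 million limit are the assumptions from this answer, not documented NVIDIA constants, so treat the numbers as approximate:

```python
# Assumptions taken from the discussion above (not official NVIDIA figures):
# - a single GPU instruction is between 4 bytes (32 bits) and 8 bytes (64 bits)
# - a kernel may contain at most 2 million instructions
MIN_INSTR_BYTES = 4
MAX_INSTR_BYTES = 8
INSTR_LIMIT = 2_000_000

def instruction_estimate(cubin_size_bytes):
    """Estimate the possible instruction-count range for a .cubin of the
    given size, and what fraction of the 2M limit the worst case reaches."""
    low = cubin_size_bytes // MAX_INSTR_BYTES   # if every instruction is 8 bytes
    high = cubin_size_bytes // MIN_INSTR_BYTES  # if every instruction is 4 bytes
    return low, high, high / INSTR_LIMIT

# Example: a 1 MiB .cubin produced with `nvcc -keep`
low, high, frac = instruction_estimate(1 * 1024 * 1024)
print(f"{low}..{high} instructions, at most {frac:.1%} of the limit")
```

With these assumptions, a 1 MiB .cubin is at most a few hundred thousand instructions, i.e. comfortably below the limit; only a .cubin approaching 8 MB could plausibly be near it.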