How to reduce compile time for a big kernel function?

wlx · November 21, 2009, 7:41am

Hi, I have redesigned a model to use CUDA. The kernel function is very big, about 10,000 lines.
I have to wait about 2 hours for it to compile. Is there any way to reduce the compile time?

SPWorley · November 21, 2009, 7:57am

In my current project, my kernel is about 3000 lines of code but it compiles in 15-20 seconds or so.
My raytracing and GI kernels are something like 13000 lines (split across many files) and compile in about a minute. I haven’t even bothered to set up parallel make for it.

BUT I did find and report one compilation bug dealing with multiplication of 64-bit constants. When such a line was used, compilation time shot up to hours!
This was just a simple x=12345ULL*y; The generated code was correct and ran at full speed… it was the COMPILATION that slowed by 3 orders of magnitude.
NVIDIA fixed it for the 3.0 toolkit nvcc.
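
For reference, here is a minimal sketch of the kind of line described above, isolated in its own small file so its compile time can be measured separately from the rest of a kernel. The function name `scale_by_constant` and the host fallback macros are assumptions made for illustration, not code from the original project, and whether a given nvcc version compiles it slowly depends on the toolkit.

```cpp
// Assumption: let the file also build with a plain C++ compiler for quick checks.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Hypothetical helper: multiplies its argument by a 64-bit constant,
// the same shape as the x = 12345ULL * y line mentioned above.
__host__ __device__ inline unsigned long long scale_by_constant(unsigned long long y)
{
    return 12345ULL * y;
}
```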

You might not be hitting that exact issue, but perhaps there’s some code structure which has similar compile slowdowns. It may be literally one line of code.
Start chopping out functions and lines of code, ignoring functionality, just to see if suddenly compile speed improves.
If so, you have a good compiler bug to report to NVIDIA!
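
One way to do that chopping without deleting code is to wrap each section in a preprocessor switch and rebuild with sections turned off one at a time. This is only a sketch: `ENABLE_STAGE_A`, `ENABLE_STAGE_B`, `stage_a`, `stage_b` and `big_kernel_body` are made-up names, and the stage bodies are placeholders for real chunks of a kernel.

```cpp
// Assumption: let the file also build with a plain C++ compiler for quick checks.
#ifndef __CUDACC__
#define __host__
#define __device__
#endif

// Each section is enabled by default and can be switched off from the command line.
#ifndef ENABLE_STAGE_A
#define ENABLE_STAGE_A 1
#endif
#ifndef ENABLE_STAGE_B
#define ENABLE_STAGE_B 1
#endif

// Placeholder sections standing in for large pieces of the real kernel.
__host__ __device__ inline float stage_a(float x) { return x * 2.0f; }
__host__ __device__ inline float stage_b(float x) { return x + 1.0f; }

// Hypothetical kernel body: every section sits behind its own switch, so a
// rebuild with one switch set to 0 shows how much compile time that section costs.
__host__ __device__ inline float big_kernel_body(float x)
{
#if ENABLE_STAGE_A
    x = stage_a(x);
#endif
#if ENABLE_STAGE_B
    x = stage_b(x);
#endif
    return x;
}
```

Building with `-DENABLE_STAGE_B=0` on the nvcc command line compiles the file without that section. If the compile time suddenly drops, the slow construct is somewhere inside it, and the same trick can be repeated within that section to narrow it down.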

You may also try the 3.0 toolkit beta just for fun.

wlx · November 21, 2009, 9:07am

I am using the 3.0 beta version now.

nvcc reports only these warnings:

clm_cuda.cu(4590): warning: variable "scvold" is used before its value is set

clm_cuda.cu(4590): warning: variable "scvold" is used before its value is set

/tmp/tmpxft_000044cc_00000000-7_clm_cuda.cpp3.i(0): Warning: Olimit was exceeded on function process_patch_device; will not perform function-scope optimization.

	To still perform function-scope optimization, use -OPT:Olimit=0 (no limit) or -OPT:Olimit=164494

/tmp/tmpxft_000044cc_00000000-7_clm_cuda.cpp3.i(0): Warning: To override Olimit for all functions in file, use -OPT:Olimit=164494

	(Compiler may run out of memory or run very slowly for large Olimit values)

wlx · November 23, 2009, 2:40am

Is there any limit on code length in a kernel function?
If I remove some core functions, it compiles quickly.
