I'm having trouble with long compile times for my CUDA code. I noticed that compiling some functions takes much more time than others. Why is that, and how can I fix it?
Same here. I have eight entry functions. The --ptxas-options=-v output shows that the time needed increases from function to function. While the first one takes much less than a second, I have to wait about 2 minutes until the last function has been processed. The functions are quite small and all the same size.
I have already written kernels without that effect. The only thing that is different now: in THIS kernel I declared a static 48 kB array in constant memory.
Does anybody have an idea?
I solved my problem. The compiler indeed took that much time because of the constant data. At first I had something like this:
__constant__ double data[4321] = { .................many entries....... }; // causes the compiler to allocate the memory and fill it with values
I changed my code so that the array is not initialized statically. Instead, I copy the data into constant memory explicitly:
__constant__ double data[4321]; //just allocate the memory
double cpudata[4321] = { ..........many entries...........}; // initialize a second array with the constants in CPU memory
...
cudaMemcpyToSymbol(data, cpudata, sizeof(double)*4321, 0, cudaMemcpyHostToDevice); // before the first kernel use of the constant array: copy the data to constant memory
That shrank my compilation time from 3 minutes to 3 seconds :)
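For reference, here is a minimal, self-contained sketch of the pattern described above. The kernel name, the fill values, and the launch configuration are placeholders I made up for illustration; only the declaration/copy pattern and cudaMemcpyToSymbol come from the posts above.

```cuda
#include <cuda_runtime.h>

// Uninitialized __constant__ array: the compiler only reserves the
// space, so no large initializer has to be embedded and processed
// at compile time.
__constant__ double data[4321];

// Hypothetical kernel that reads the constant array.
__global__ void useData(double *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0 * data[i]; // placeholder computation
}

int main() {
    const int n = 4321;

    // Host-side copy of the constants, filled at runtime instead of
    // via a static initializer (placeholder values here).
    static double cpudata[n];
    for (int i = 0; i < n; ++i) cpudata[i] = (double)i;

    // Copy the data into constant memory before the first kernel launch.
    cudaMemcpyToSymbol(data, cpudata, sizeof(double) * n, 0,
                       cudaMemcpyHostToDevice);

    double *out;
    cudaMalloc(&out, sizeof(double) * n);
    useData<<<(n + 255) / 256, 256>>>(out, n);
    cudaDeviceSynchronize();
    cudaFree(out);
    return 0;
}
```

The trade-off is that the values are no longer baked into the module, so the host must perform the copy once before any kernel that reads the array is launched.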
Thanks for your answer. I changed my code the same way as yours:
__constant__ double data[4321]; // just allocate the memory
double cpudata[4321] = { …many entries… }; // initialize a second array with the constants in CPU memory
It still takes too much time to compile. I also noticed that the function that takes so long to compile uses many registers: 88 registers.
Does anyone else have the same problem?