Slow compilation

Hi! Recently I ran into a problem: my CUDA program compiles too slowly. In Task Manager I can see that ptxas.exe takes most of the time. Its memory use climbs to a limit of about 300-400 MB, then it restarts and does the same thing again a few times. After that my program compiles and everything works well.

I’m using the complex class implementation from http://forums.nvidia.com/index.php?showtopic=73978, with a few functions added that I needed:

// complex ExpComplex(complex)
__device__ singlecomplex ExpComplex(const singlecomplex REF(a)) {
   float value = expf(a.value.x);
   singlecomplex result = { value * cosf(a.value.y), value * sinf(a.value.y) };
   return result;
}

// complex LogComplex(complex)
__device__ singlecomplex LogComplex(const singlecomplex REF(a)) {
   // log(a) = ln|a| + i*arg(a), with arg(a) = atan2(imaginary, real)
   singlecomplex result = { logf(sqrtf(a.value.x * a.value.x + a.value.y * a.value.y)),
                            atan2f(a.value.y, a.value.x) };
   return result;
}

// complex PowComplex(complex, exponent)
__device__ singlecomplex PowComplex(const singlecomplex REF(a), const float REF(b)) {
   singlecomplex result;
   if (a.value.y == 0 && b >= 1)
   {
      // purely real base, exponent >= 1
      result = make_singlecomplex(powf(a.value.x, b), 0);
   }
   else if (a.value.y == 0 && a.value.x >= 0 && b < 1)
   {
      // non-negative real base, exponent < 1
      result = make_singlecomplex(powf(a.value.x, b), 0);
   }
   else
   {
      // general case: a^b = exp(b * log(a))
      result = ExpComplex(LogComplex(a) * b);
   }
   return result;
}

I found that if I remove the trigonometric functions from either ExpComplex or LogComplex, the compiler has no problems and everything compiles in a few seconds. But when both functions contain them (sinf and cosf in ExpComplex, the atan2 call in LogComplex) and are used together, as in result = ExpComplex(LogComplex(a)*b);, the compiler seems to get stuck in some kind of loop and consumes a lot of memory and time.
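For anyone who wants to try this without the linked class, here is a stripped-down sketch of the same call chain. It rests on assumptions: the struct `cplx` is a hypothetical stand-in for singlecomplex (x is the real part, y the imaginary part), arguments are passed by value instead of through REF, and `HD`, `exp_cplx`, `log_cplx`, `pow_cplx` and `pow_kernel` are names made up for illustration. Whether this reduced form triggers the same slow ptxas run has not been verified.

```cpp
#include <math.h>

// Under nvcc the helpers are usable on both host and device; under a plain
// C++ compiler the qualifier disappears so the math can be checked on the host.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical stand-in for singlecomplex: x = real part, y = imaginary part.
struct cplx { float x; float y; };

// exp(x + iy) = exp(x) * (cos(y) + i*sin(y))
HD inline cplx exp_cplx(cplx a) {
    float m = expf(a.x);
    cplx r = { m * cosf(a.y), m * sinf(a.y) };
    return r;
}

// log(a) = ln|a| + i*arg(a), with arg(a) = atan2(imaginary, real)
HD inline cplx log_cplx(cplx a) {
    cplx r = { logf(sqrtf(a.x * a.x + a.y * a.y)), atan2f(a.y, a.x) };
    return r;
}

// General path of PowComplex: a^b = exp(b * log(a)).
// This is where sinf, cosf and atan2f all end up in one expression.
HD inline cplx pow_cplx(cplx a, float b) {
    cplx l = log_cplx(a);
    cplx scaled = { l.x * b, l.y * b };
    return exp_cplx(scaled);
}

#ifdef __CUDACC__
// Hypothetical kernel whose only job is to make the compiler emit pow_cplx.
__global__ void pow_kernel(const cplx* in, cplx* out, float b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = pow_cplx(in[i], b);
}
#endif
```

Built with nvcc, pow_kernel pulls in the whole exp(b * log(a)) chain in one place, which is the combination described above. Built with an ordinary host compiler, only the three math helpers are compiled, which is enough to check the values they return.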

I can't understand what is wrong. Has anyone run into the same problem, or does anyone have ideas on how to fix it?