Hello!
I have a problem with a kernel: it uses too many registers. When I looked at the PTX file, I found the following:
mov.f32 %f1, 0f00000000; // 0
mov.f32 %f2, %f1; //
mov.f32 %f3, 0f00000000; // 0
mov.f32 %f4, %f3; //
mov.f32 %f5, 0f00000000; // 0
...
mov.f32 %f59, 0f00000000; // 0
mov.f32 %f60, %f59; //
mov.f32 %f61, 0f00000000; // 0
mov.f32 %f62, %f61; //
mov.f32 %f63, 0f00000000; // 0
mov.f32 %f64, %f63; //
The actual C code looks like this:
float c1[16] = {0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f};
float c2[16] = {0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f,0.0f};
Instead of using 32 registers for the two arrays, the compiler uses 64. Interestingly, the even-numbered registers are never used as a source, but whenever one of the odd-numbered registers is set, a copy of it is stored in the following even-numbered one.
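Since the registers in a PTX file are virtual and the final allocation is done later by ptxas, it may be worth checking the count the compiled kernel actually ends up with. Below is a minimal sketch of one way to do that, using cudaFuncGetAttributes from the CUDA runtime. The kernel name accumulate and its body are hypothetical stand-ins, not the real kernel from this post; only the two 16-element arrays are taken from it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in kernel (not the original one): it only mirrors
// the two zero-initialized 16-element arrays from the post.
__global__ void accumulate(const float *in, float *out)
{
    float c1[16] = {0.0f};  // remaining elements are zero-initialized
    float c2[16] = {0.0f};

    // Fully unrolled so that constant indices let the arrays live in registers.
    #pragma unroll
    for (int i = 0; i < 16; ++i) {
        float v = in[threadIdx.x * 16 + i];
        c1[i] += v;
        c2[i] += v * v;
    }

    float s = 0.0f;
    #pragma unroll
    for (int i = 0; i < 16; ++i)
        s += c1[i] * c2[i];
    out[threadIdx.x] = s;
}

int main()
{
    // Query the resource usage of the compiled kernel (after ptxas),
    // rather than counting virtual registers in the PTX text.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, accumulate);
    if (err != cudaSuccess) {
        std::printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("registers per thread:    %d\n", attr.numRegs);
    std::printf("local memory per thread: %zu bytes\n", attr.localSizeBytes);
    return 0;
}
```

Compiling with nvcc --ptxas-options=-v should print the same per-kernel register count at build time, which can then be compared with the number of %f registers declared in the PTX.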
Does anybody know what happened there?
Thanks!
Moritz