Is there an instruction limit?

Is there an instruction limit for a CUDA function?

I have the following problem:

My renderer works fine as it is now, but if I add just one more line to the source (even a meaningless one), the kernel does not seem to execute at all :(

I can't paste the complete source because it is too big…

The main CUDA render function has about 400 lines, and I can't continue development because it won't execute if I add even one more line.

(I get a black screen and the frame rate jumps up.)

Here is an example of a modification that stops it from running (I actually tested various things):

Version that runs OK:

    unsigned int test_color1 = ...;
    unsigned int color_final = test_color1;

Version that doesn't run at all:

    int position_x = ...;
    int position_y = ...;
    ...
    unsigned int test_color1 = ...;
    unsigned int test_color2 = ( position_x ^ position_z );
    unsigned int color_final = test_color1 ^ test_color2;

My system:

VS2005, WinXP SP2, CUDA 1.1, GF8800GTS, Driver 6.14.11.6921 (2007/12/05), CPU Pentium D 3.0 GHz, 1.0 GB RAM

Has anybody had a similar problem?

cudaGetLastError() gives me:

“error :too many resources requested for launchtime:14”

CUT_CHECK_ERROR doesn't return an error.

Edit: Just found

http://forums.nvidia.com/index.php?act=ST&f=71&t=51905

Perhaps it's the solution.

-Sven

Yes, there is a limit of 2 million native instructions per kernel (see the CUDA Programming Guide, Appendix A.1).

Normally that is more than enough instructions, even for complicated calculations.

I think the link you found is more relevant to your problem than the instruction limit is.

Thank you for the quick answer.

OK, I found the error: I ran out of registers.

(similar to the previous link)

128 threads * 66 regs = 8448 > 8192 (the registers available per multiprocessor)

From the .obj file:

    name = _Z10cudaRenderP6Render
    lmem = 120
    smem = 20
    reg = 66
    bar = 0
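The register arithmetic above can be sketched as a small host-side check. This is only a rough sketch: the function and constant names are made up for illustration, the 8192 figure is the per-multiprocessor register count quoted in this thread, and the model is simply threads * registers. Real hardware allocates registers in coarser units, so the true limit may be somewhat lower than what this reports.

```cpp
#include <cassert>

// Registers per multiprocessor, as quoted in this thread for the 8800 GTS.
// Assumption: other GPUs have different values, so treat this as a parameter.
const int kRegistersPerMultiprocessor = 8192;

// Threads are scheduled in warps of 32.
const int kWarpSize = 32;

// Simplified model: a block fits if threads * registers-per-thread does not
// exceed the register file. (Hypothetical helper, not part of the CUDA API.)
bool blockFitsInRegisterFile(int threadsPerBlock, int regsPerThread) {
    assert(threadsPerBlock > 0 && regsPerThread > 0);
    return threadsPerBlock * regsPerThread <= kRegistersPerMultiprocessor;
}

// Largest block size, rounded down to a whole number of warps, that still
// fits under the same simplified model. (Hypothetical helper.)
int maxThreadsPerBlock(int regsPerThread) {
    assert(regsPerThread > 0);
    int threads = kRegistersPerMultiprocessor / regsPerThread;
    return (threads / kWarpSize) * kWarpSize;
}
```

Under this model, blockFitsInRegisterFile(128, 66) is false, and maxThreadsPerBlock(66) suggests 96 threads per block. So the options would be either to launch smaller blocks or to get the kernel down to 64 registers or fewer per thread, for example by capping register use with nvcc's -maxrregcount option (which may push more data into local memory).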

-Sven

I don’t think you can even exceed that instruction limit of 2 million. Believe me, I’ve tried :) Before you get that far you will get a segmentation fault in the compiler, because of too many virtual registers.