Hello,
I ran into the following error:
nvcc -O3 -use_fast_math --prec-sqrt=false --prec-div=false --keep --keep-dir keep -L/home/tener/localopt/NVIDIA_GPU_Computing_SDK/C/lib -lcurand -lcutil_x86_64 -lparamgl_x86_64 -lglut -lGL -lglfw -lGLEW -lpthread -lboost_system -lboost_thread-mt -gencode arch=compute_20,code=sm_21 -I/home/tener/localopt/NVIDIA_GPU_Computing_SDK/C/common/inc -I/home/tener/localopt/NVIDIA_GPU_Computing_SDK/shared/inc/ -I. --compiler-options -mtune=native,-march=native,-O3 --ptxas-options '--verbose -O4' *.cu obj/graphics.o obj/main.o obj/server.o obj/utils.o -o rt
keep/kernel.cpp3.i(0): Warning: Olimit was exceeded on function _ZN8RayTraceI7SurfaceIL4Surf0E6float3fE12ModelViewRayIS2_fEEclEi; will not perform function-scope optimization.
To still perform function-scope optimization, use -OPT:Olimit=0 (no limit) or -OPT:Olimit=47393
### Assertion failure at line 2761 of ../../be/cg/NVISA/cgtarget.cxx:
### Compiler Error in file keep/kernel.cpp3.i during Register Allocation phase:
### ran out of registers in float
nvopencc INTERNAL ERROR: /home/tener/localopt/cuda/open64/lib//be returned non-zero status 1
Does this mean I have to reduce the complexity of my kernel, or is it indeed an internal error that I should report?
– edit –
Adding "--opencc-options -OPT:Olimit=0" to the nvcc options makes the error go away.
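For reference, here is a shortened sketch of where the option goes on the command line. Most of the flags from the full command above (include paths, libraries, --keep, --ptxas-options, the extra object files) are left out for brevity, so treat this as an illustration rather than the exact build line:

```shell
# Shortened form of the build command above; only a few of the original flags are shown.
# --opencc-options forwards -OPT:Olimit=0 to nvopencc, as the warning message suggests.
nvcc -O3 -use_fast_math \
     -gencode arch=compute_20,code=sm_21 \
     --opencc-options -OPT:Olimit=0 \
     *.cu -o rt
```

The warning also mentions -OPT:Olimit=47393 as an alternative to removing the limit entirely; I have only tried the Olimit=0 variant.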