Does the following error mean I have run out of registers for executing my kernel?
Is there any way to solve this? I used -g and -maxrregcount, and they had no effect.
Thanks.
Assertion failure at line 2433 of …/…/be/cg/NVISA/cgtarget.cxx:
Compiler Error in file Test.cpp3.i during Register Allocation phase:
ran out of registers in float
nvopencc INTERNAL ERROR: /usr/local/cuda/open64/lib//be returned non-zero status 1
Are you sure? Because of single assignment (each intermediate value gets its own register), registers get used up very quickly. You might only need a few hundred lines of code to use that many. I have kernels with several thousand lines.
-maxrregcount only affects ptxas. This error comes from nvcc's Open64 front end (nvopencc), which runs before ptxas.
1>### Assertion failure at line 2433 of …/…/be/cg/NVISA/cgtarget.cxx:
1>### Compiler Error in file XXXX\Temp/tmpxft_00000b98_00000000-9_cudaEntry.cpp3.i during Register Allocation phase:
1>### ran out of registers in float
1>nvopencc ERROR: F:\CUDA\bin/…/open64/lib//be.exe returned non-zero status 1
I admit my .cu file is a bit long. Is there any way to get around this? (Can device functions be called as external functions?)
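One workaround sometimes suggested for very long kernels is to split the work into several smaller kernels that pass intermediate results through global memory, so that each kernel the compiler sees is shorter. The sketch below only illustrates that idea. The names stage_one, stage_two and run_split and the arithmetic inside them are hypothetical placeholders, and nothing in this thread confirms that this avoids the "ran out of registers" assertion for any particular file.

```cuda
#include <cassert>
#include <cuda_runtime.h>

// Hypothetical first half of a long computation.
// Writes its intermediate result to global memory instead of keeping it in registers.
__global__ void stage_one(const float* in, float* tmp, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tmp[i] = in[i] * in[i] + 1.0f;
}

// Hypothetical second half.
// Reads the intermediate result back, so it is compiled as a separate, shorter kernel.
__global__ void stage_two(const float* in, const float* tmp, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 0.5f * tmp[i] + in[i];
}

// Hypothetical host helper: runs the two stages back to back on the default stream.
// Error handling is kept minimal; only the status of the final copy is returned.
cudaError_t run_split(const float* h_in, float* h_out, int n) {
    assert(n > 0);
    const size_t bytes = (size_t)n * sizeof(float);
    float *d_in = 0, *d_tmp = 0, *d_out = 0;

    cudaMalloc((void**)&d_in, bytes);
    cudaMalloc((void**)&d_tmp, bytes);
    cudaMalloc((void**)&d_out, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    stage_one<<<blocks, threads>>>(d_in, d_tmp, n);
    stage_two<<<blocks, threads>>>(d_in, d_tmp, d_out, n);

    // A blocking copy on the default stream waits for both kernels to finish.
    cudaError_t err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_in);
    cudaFree(d_tmp);
    cudaFree(d_out);
    return err;
}
```

The trade-off is extra global memory traffic and one more kernel launch, so this is only worth trying when the single long kernel will not compile at all.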
It looks like the advertised “2 million PTX instructions” limit can never be reached because of nvopencc’s puzzling limitation. Is there no simple way to just increase the maximum?