Mangled name limit?

Disclaimer: I know this is quite specific, and I'm probably the only one here to run into this problem. It may not even be related to CUDA, but to the gcc front end.

Hello everybody,

My library uses lots and lots of templates, which the front end digests and which work in CUDA. But when too many templates are involved in a single CUDA call, I get a strange error:

1>Signal: caught in Writing WHIRL file phase.

1>Error: Signal caught in phase Writing WHIRL file -- processing aborted

1>nvopencc: Permission denied

1>nvopencc INTERNAL ERROR: cannot unlink temp file C:/DOCUME~1/degomme/LOCALS~1/Temp/ccB#.a05000

(This is on Windows; I didn't try it on Linux. It only happens in real device mode, not in emulation. I only tried it with CUDA 1.1.)

The thing is that when I look at the .cubin file, my function names have been mangled by gcc and have become quite long.

For instance:

name = _Z12CUDA_arctan2I21CUDA_Linear_ConvolverIssLb1ELi5ELcn1ELcn2ELc0ELc2ELc1ELc0ELc0ELc0ELc0ELc0EES0_IssLb1ELi5ELc1ELc2ELc4ELc2ELc1ELc0ELc0ELc0ELc0ELc0EELi48ELi32ELi32ELi2EEvPNT_11_pixel_typeES5_ii

This one works (the error isn't triggered), because it is short enough, I guess (the mangled name is 193 characters long).

When I add more parameters to my calls, the name gets longer. Once it exceeds 242 characters, I don't get any compilation error, but this .cubin is generated:

code  {

	name = __dummy_entry__

	lmem = 0

	smem = 0

	reg = 0

	bar = 0

	bincode  {

  0xf0000001 0xe0000001 

	}

When I add even more parameters, the error above is triggered.

Counting the “name = ” prefix, the length of that line comes quite close to 256 characters.

Is there a limit (256) on the length of the function names CUDA can accept?

Could it be increased, and/or can the mangling be avoided or modified in nvcc?

If someone from Nvidia could answer, I would be really pleased.

Last-minute update: I tried the template project from the SDK, replacing testKernel with a long run of ‘a’ characters. It works until the name reaches 241 characters, produces the dummy cubin up to 245, and fails after that.
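To check which kernels are getting close to that length, something like the following could be run over the text of a .cubin file. This is only a sketch: it assumes the plain-text layout shown above ("name = <symbol>" on its own line), and the function names cubin_entry_names and is_dummy_entry are made up for this example, not part of any CUDA tool.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Collect every "name = <symbol>" entry found in the text of a .cubin file,
// paired with the length of the symbol (the mangled kernel name).
// Assumption: the file uses the plain-text layout quoted in this post.
std::vector<std::pair<std::string, std::size_t>>
cubin_entry_names(const std::string& cubin_text) {
    std::vector<std::pair<std::string, std::size_t>> entries;
    std::istringstream lines(cubin_text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string key, equals, symbol;
        // Keep only lines of the form: name = <symbol>
        if (fields >> key >> equals >> symbol && key == "name" && equals == "=") {
            entries.emplace_back(symbol, symbol.size());
        }
    }
    return entries;
}

// True when the entry is the placeholder that shows up in the empty cubin
// quoted above, i.e. the real kernel was silently dropped.
bool is_dummy_entry(const std::string& symbol) {
    return symbol == "__dummy_entry__";
}
```

Printing the lengths next to the names makes it easy to see which instantiations approach the 241 to 245 character range where things went wrong in my tests.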

You are not the only one:

Signal: Segmentation fault in Writing WHIRL file phase.
(0): Error: Signal Segmentation fault in phase Writing WHIRL file – processing aborted
*** Internal stack backtrace:
/usr/local/cuda/open64/lib//gfec [0x766582]
/usr/local/cuda/open64/lib//gfec [0x767030]
/usr/local/cuda/open64/lib//gfec(ErrMsgLine+0x8f) [0x76671f]
/usr/local/cuda/open64/lib//gfec [0x7678c8]
/lib/libc.so.6 [0x2ac57e44a7d0]
/usr/local/cuda/open64/lib//gfec(WFE_Alias_Finish+0x2a) [0x4b367a]
/usr/local/cuda/open64/lib//gfec(compile_file+0xca) [0x6ba52a]
/usr/local/cuda/open64/lib//gfec(main+0x56) [0x4d94d6]
/lib/libc.so.6(__libc_start_main+0xf4) [0x2ac57e436b44]
/usr/local/cuda/open64/lib//gfec [0x45a7ba]
nvopencc INTERNAL ERROR: /usr/local/cuda/open64/lib//gfec died due to signal 4
nvopencc ERROR: core dumped

This is on the 2.0 beta, with kernels generated from C++ templates.

Yes, with CUDA 2.0 I found that templates are handled very differently and that my code no longer worked, even after making it compliant with GCC 4 (typename everywhere, etc.). There were still lots and lots of errors in the files CUDA itself generates (missing typename in extern “C” functions, I guess), causing this kind of error:

z:\nvidia_cuda_sdk\projects\flop\CUDA_ops.cudafe1.stub.h(113) : error C4430: missing type specifier - int assumed. Note: C++ does not support default-int

When I look at the generated code, the error is here in the stub file:

template
void CUDA_horizontal(const int sizeX, const _pixType_in *data, _pixType_out *result);

which was generated from my code:

template
<
typename weaver
>
void
__global__ CUDA_horizontal
(
const int sizeX,
const typename weaver::_convolver::_pixType_in * data,
typename weaver::_convolver::_pixType_out * result
)

(Note the missing typename keywords in the generated code. I guess they make the compiler treat _pixType_in as the name of a variable, which then causes a large number of syntax errors. I didn't try to fix them by hand and recompile from there, because I rolled back to CUDA 1.1, which works fully and is template-friendly.)
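For anyone not familiar with the rule, here is a minimal host-only sketch of why the keyword matters. The nested names _convolver, _pixType_in and _pixType_out are the ones from my signature; everything else (ExampleConvolver, ExampleWeaver, horizontal_copy, the short/int pixel types) is made up for the illustration, and __global__ is left out so that it builds with a plain C++ compiler.

```cpp
// Hypothetical stand-ins for the library types. Only the nested typedef
// names (_convolver, _pixType_in, _pixType_out) come from the post.
struct ExampleConvolver {
    typedef short _pixType_in;
    typedef int   _pixType_out;
};

struct ExampleWeaver {
    typedef ExampleConvolver _convolver;
};

// Inside a template, weaver::_convolver::_pixType_in depends on the template
// parameter, so the compiler cannot know it names a type until instantiation.
// The typename keyword states that it does; dropping it from these parameter
// declarations is what produces "missing type specifier"-style errors.
template <typename weaver>
void horizontal_copy(const int sizeX,
                     const typename weaver::_convolver::_pixType_in* data,
                     typename weaver::_convolver::_pixType_out* result) {
    for (int x = 0; x < sizeX; ++x) {
        // Plain element-wise copy with conversion, just to have a body.
        result[x] = static_cast<typename weaver::_convolver::_pixType_out>(data[x]);
    }
}
```

Because weaver only appears in nested type names, it cannot be deduced from the arguments and has to be given explicitly, e.g. horizontal_copy<ExampleWeaver>(3, in, out).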

I think Nvidia only tried simple template code to test the compliance of the new version because, as templates are not officially supported, template programs are not meant to exist :p

(I saw a topic on the forum about a guy with the same experience, maybe he had more luck and managed to have his library working again, but I doubt that)

The earlier problem with the name limit wasn't that annoying compared to these.