Hi there,
We are running Teslas on an Ubuntu 9.10 installation with GCC 4.4.
This works fine with version 2.3 of the SDK as long as -O2 is
not used for compilation. Our hope was that
GCC 4.4 support would make it into the next release of the SDK.
However, the code still only compiles without -O2, and some
examples do not work at all.
dct8x8 fails due to the newly introduced shrUtils.h stuff.
cudafe++ inserts a __builtin_stdarg_start, which is no longer available in
GCC 4.4 (GCC 4.4 Release Series — Changes, New Features, and Fixes - GNU Project):
cudafe++ --m64 --gnu_version=40401 --diag_error=host_device_limited_call --diag_error=ms_asm_decl_not_allowed --parse_templates --gen_c_file_name "/tmp/tmpxft_0000237c_00000000-1_dct8x8.cudafe1.cpp" --stub_file_name "/tmp/tmpxft_0000237c_00000000-1_dct8x8.cudafe1.stub.c" --stub_header_file_name "/tmp/tmpxft_0000237c_00000000-1_dct8x8.cudafe1.stub.h" "/tmp/tmpxft_0000237c_00000000-10_dct8x8.cpp4.ii"
→
/tmp/tmpxft_0000237c_00000000-1_dct8x8.cudafe1.cpp:22293:__builtin_stdarg_start(__args,__fmt);
This is easily fixed by replacing __builtin_stdarg_start with __builtin_va_start, but doing this manually would be a little
tedious. Would it be possible to fix this, or to provide a workaround?
The only thing I can think of right now would be a wrapper around nvcc that uses nvcc --dryrun to create a script and inserts a
sed call after each cudafe++ call before executing the script. But this is a bit clumsy and not very nice.
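To make the wrapper idea concrete, here is a rough sketch in Python. It is untested against real nvcc output and rests on two assumptions: that nvcc --dryrun prints each command prefixed with "#$ ", and that the generated file is the argument following --gen_c_file_name, as in the cudafe++ call above. The function name patch_dryrun is made up for illustration. It only builds the patched script lines; actually running them is left out.

```python
import shlex

OLD = "__builtin_stdarg_start"
NEW = "__builtin_va_start"


def patch_dryrun(dryrun_text):
    """Return the dry-run commands with a sed fix-up after each cudafe++ call."""
    out = []
    for raw in dryrun_text.splitlines():
        # Assumption: nvcc --dryrun prefixes every command with "#$ ".
        line = raw[3:] if raw.startswith("#$ ") else raw
        if not line.strip():
            continue
        out.append(line)
        try:
            args = shlex.split(line)
        except ValueError:
            # Line could not be tokenized; pass it through unchanged.
            continue
        if args and args[0].endswith("cudafe++") and "--gen_c_file_name" in args:
            idx = args.index("--gen_c_file_name")
            if idx + 1 < len(args):
                gen_file = shlex.quote(args[idx + 1])
                # In-place replacement in the file cudafe++ just generated
                # (sed -i as provided by GNU sed).
                out.append("sed -i 's/%s/%s/g' %s" % (OLD, NEW, gen_file))
    return out
```

The resulting lines could be written to a shell script and executed in place of the original nvcc call, for example with the captured output of nvcc --dryrun as input.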
Thanks!