GPU SDK 3.0-beta1 on GCC4.4 Problem with stdarg.h appears with gcc4.4

Hi there,

We are running Teslas on an Ubuntu 9.10 installation with gcc 4.4.
This works just fine with the 2.3 version of the SDK when -O2 is
not used for compilation. Our hope was that
gcc 4.4 support would get into the next release of the SDK.

However, with 3.0-beta1 the code still only compiles without -O2, and
some examples do not build at all.

dct8x8 fails due to the newly introduced shrUtils.h code.
cudafe++ inserts a __builtin_stdarg_start, which is no longer available in
GCC 4.4 (GCC 4.4 Release Series — Changes, New Features, and Fixes - GNU Project):
cudafe++ --m64 --gnu_version=40401 --diag_error=host_device_limited_call --diag_error=ms_asm_decl_not_allowed --parse_templates --gen_c_file_name "/tmp/tmpxft_0000237c_00000000-1_dct8x8.cudafe1.cpp" --stub_file_name "/tmp/tmpxft_0000237c_00000000-1_dct8x8.cudafe1.stub.c" --stub_header_file_name "/tmp/tmpxft_0000237c_00000000-1_dct8x8.cudafe1.stub.h" "/tmp/tmpxft_0000237c_00000000-10_dct8x8.cpp4.ii"

/tmp/tmpxft_0000237c_00000000-1_dct8x8.cudafe1.cpp:22293:__builtin_stdarg_start(__args,__fmt);

This is easily fixed by replacing __builtin_stdarg_start with __builtin_va_start, but doing this manually would be a little
tedious. Would it be possible to fix this, or to work around it?

The only thing I can think of right now would be a wrapper around nvcc that uses nvcc --dryrun to create a script and inserts a
sed call after each cudafe++ call before executing the script. But this is a bit clumsy and not very nice.
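
A rough sketch of the "insert a sed after each cudafe++ call" step might look like the following. This is not the script attached later in the thread; the function name patch_script is made up, and it assumes that every cudafe++ command sits on one line and names its generated file with --gen_c_file_name "<path>" in plain ASCII quotes, as in the command shown above.

```shell
# Hypothetical helper: read a build script on stdin and print it back with
# one extra sed command after every cudafe++ line.
# Assumption: each cudafe++ call is on a single line and passes its output
# file as --gen_c_file_name "<path>" using plain ASCII double quotes.
patch_script() {
    awk '
    { print }                                   # keep every original line
    /^[ \t]*cudafe\+\+ / {
        n = split($0, parts, /--gen_c_file_name "/)
        if (n >= 2) {
            file = parts[2]
            sub(/".*$/, "", file)               # cut at the closing quote
            printf "sed -i \"s/__builtin_stdarg_start/__builtin_va_start/g\" \"%s\"\n", file
        }
    }'
}
```

The command listing from nvcc --dryrun, once any per-line prefix has been stripped, could be piped through patch_script and saved as the script to execute. Because the sed call names exactly the file cudafe++ wrote, nothing else in /tmp is touched.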

Thanks!

gcc 4.4 isn’t supported yet, so you’re best off rolling back to gcc 4.3 and waiting for 4.4 support (which, as far as I know, is not in 3.0).

Well, fine. I know it’s not supported.

Just tinkering around to see what’s possible.

Anyway, __builtin_stdarg_start seems to be deprecated since gcc 4.0, so it might be a good idea to change it anyway?

If it doesn’t work with 4.4, that’s fine. I’m not complaining ;) Just looking for ideas.

Thanks for your quick reply.

OK,
so I can compile the SDK 3.0 examples without -O2 using the attached script nvcc4.4.gz (218 bytes), which is a very crude wrapper around nvcc.

haraldkl,

Thanks for putting together this script. I have switched to CUDA Toolkit and SDK 3.0 now that it’s no longer in beta. I am running Ubuntu 9.10 with gcc 4.4, and don’t really want to install an older version of gcc just for the purpose of compiling the SDK examples. Here’s what I have done by way of installation:

  1. I gunzip’ped the file, and copied the nvcc4.4 script to the same directory as nvcc (in my case, /usr/local/cuda/bin), and set executable permissions on the file.

  2. I edited nvcc4.4 to point to the location of nvcc on my system (/usr/local/cuda/bin/nvcc).

  3. I edited NVIDIA_GPU_Computing_SDK/C/common/common.mk to contain the following line: "NVCC := $(CUDA_INSTALL_PATH)/bin/nvcc4.4"

But I get the following error message from "make verbose=1":

/usr/local/cuda/bin/nvcc4.4 -gencode=arch=compute_10,code="sm_10,compute_10" -gencode=arch=compute_20,code="sm_20,compute_20" --compiler-options -fno-strict-aliasing --compiler-options -fno-inline -po maxrregcount=16 -I. -I/usr/local/cuda/include -I…/…/common/inc -I…/…/…/shared//inc -DUNIX -O0 -o obj/x86_64/release/BlackScholes.cu.o -c BlackScholes.cu

sed: can’t read /tmp/gedit.root.2710946334: No such device or address

sed: couldn’t edit /tmp/keyring-PpTVRw: not a regular file

sed: can’t read /tmp/gedit.root.2710946334: No such device or address

sed: couldn’t edit /tmp/keyring-PpTVRw: not a regular file

sed: can’t read /tmp/gedit.root.2710946334: No such device or address

sed: couldn’t edit /tmp/keyring-PpTVRw: not a regular file

sed: can’t read /tmp/gedit.root.2710946334: No such device or address

sed: couldn’t edit /tmp/keyring-PpTVRw: not a regular file

sed: can’t read /tmp/gedit.root.2710946334: No such device or address

sed: couldn’t edit /tmp/keyring-PpTVRw: not a regular file

/usr/include/c++/4.4/x86_64-linux-gnu/bits/c++locale.h: In function ‘int std::__convert_from_v(__locale_struct* const&, char*, int, const char*, …)’:

/usr/include/c++/4.4/x86_64-linux-gnu/bits/c++locale.h:86: error: ‘__builtin_stdarg_start’ was not declared in this scope

make[1]: *** [obj/x86_64/release/BlackScholes.cu.o] Error 1

make[1]: Leaving directory `/home/cmejia/NVIDIA_GPU_Computing_SDK/C/src/BlackScholes’

make: *** [src/BlackScholes/Makefile.ph_build] Error 2

Do you or anyone else know what is going wrong? I appreciate the fact that you took the time to put together this clever wrapper, but unfortunately I don’t know enough about sed or nvcc to figure out what is going on.

Thanks in advance,

–Chris

Hi,

The script should create a file named .cudacompile.tmp.sh by running the original nvcc with the --dryrun option, and then apply some sed commands to it. In your case this doesn’t seem to happen. Instead it is trying to act on the files /tmp/gedit.root.2710946334 and /tmp/keyring-PpTVRw, which is very strange. Are you sure the script is not corrupted?

I didn’t test the newly released files, so it might also be due to changed behaviour of nvcc --dryrun, though I doubt it.

Hi haraldkl,

I just had the problem too – seems the second-to-last line of your script tries to operate on all files in /tmp:

[codebox]sed -i "/^ cudafe++/a sed -i 's/__builtin_stdarg_start/__builtin_va_start/g' /tmp/*" .cudacompile.tmp.sh[/codebox]

I haven’t figured out yet what filename pattern to replace your “/tmp/*” with, but a quick workaround (if you don’t need any of the stuff in /tmp) is to just delete everything from /tmp between runs.

Ah, right, sorry.

You need to modify the cudafe++ generated files, which will be in /tmp.

As I said the workaround is very crude and I didn’t bother finding some more elegant way to identify the generated files…

Anyway the files in question should be modified, even if there is other stuff in /tmp.

You could have a look at the .cudacompile.tmp.sh script and the cudafe++ generated files in /tmp and try to figure out why there is still a __builtin_stdarg_start instead of a __builtin_va_start in the files handed over to gcc.
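
One possible way to narrow the sed call down to the generated files, and to check which of them still contain the old builtin, is sketched below. The function names are made up, the tmpxft_*.cudafe1.* pattern is only guessed from the file names in the first post, and GNU sed and find are assumed (sed -i without a suffix, find -maxdepth).

```shell
# Hypothetical helpers: only touch cudafe++ output, not everything in /tmp.
# Assumption: generated files are regular files named tmpxft_*.cudafe1.*,
# as in the path shown in the first post. GNU sed is assumed for "-i".

# Rewrite the old builtin in the generated files of one directory.
fix_cudafe_files() {
    dir=${1:-/tmp}
    find "$dir" -maxdepth 1 -type f -name 'tmpxft_*.cudafe1.*' \
        -exec sed -i 's/__builtin_stdarg_start/__builtin_va_start/g' {} +
}

# List generated files that still contain the old builtin
# (prints nothing if there are none).
list_unpatched() {
    dir=${1:-/tmp}
    find "$dir" -maxdepth 1 -type f -name 'tmpxft_*.cudafe1.*' \
        -exec grep -l '__builtin_stdarg_start' {} +
}
```

Restricting find to regular files should also avoid the "not a regular file" and "No such device or address" complaints, since sockets and other special files in /tmp are skipped.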

I worked around the problem in a simple way. Just add this option to the nvcc command:

-Xcompiler -D__builtin_stdarg_start=__builtin_va_start

In this way the bad symbol __builtin_stdarg_start will be replaced with the correct symbol __builtin_va_start just by the preprocessor.
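
If you would rather not edit any makefiles, the same option can be added by a small shell function. This is only a sketch: the name nvcc_gcc44 and the NVCC variable are made up for illustration, and NVCC falls back to whatever nvcc is on the PATH.

```shell
# Hypothetical wrapper function: call nvcc with the macro always defined.
# NVCC is an assumed environment variable holding the compiler path; it
# defaults to the nvcc found on PATH.
nvcc_gcc44() {
    "${NVCC:-nvcc}" -Xcompiler -D__builtin_stdarg_start=__builtin_va_start "$@"
}

# Example call (needs a CUDA installation):
#   nvcc_gcc44 -c BlackScholes.cu -o BlackScholes.cu.o
```

All arguments are forwarded unchanged, so the function can stand in wherever nvcc is called directly.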

Hey, that’s great! This is obviously a much better solution ;)

It seems I’m not working enough with preprocessors.

Anyway, I put this in nvcc.profile:

[codebox]INCLUDES += "-D__builtin_stdarg_start=__builtin_va_start"[/codebox]

According to the documentation, nvcc hands the contents of INCLUDES to the host compiler as if they had been given with the -Xcompiler flag.

It looks like the CUDA SDK 3.0 works fine with that.

Is there any better place to put this?

Like a CFLAGS in the same file?

I put it in the common.mk:

[codebox]

NVCCFLAGS := --compiler-options -fno-inline -Xcompiler -D__builtin_stdarg_start=__builtin_va_start

[/codebox]

This does the job.

Thanks to the previous posters for this hint!