Nvcc 12.3 with gcc 13.2 not working

After updating to CUDA 12.3, nvcc fails to compile most code. Is there some configuration that I should change, or will I need to roll back the version? Any help is much appreciated.

When using make to compile cuda samples:

make[1]: Entering directory '/data/opt/cuda-samples-12.3/Samples/6_Performance/alignedTypes'
/opt/cuda/bin/nvcc -ccbin g++ -I../../../Common -m64 --threads 0 --std=c++11 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_89,code=sm_89 -gencode arch=compute_90,code=sm_90 -gencode arch=compute_90,code=compute_90 -o alignedTypes.o -c alignedTypes.cu
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "alignedTypes.cu".
make[1]: *** [Makefile:328: alignedTypes.o] Error 255
make[1]: Leaving directory '/data/opt/cuda-samples-12.3/Samples/6_Performance/alignedTypes'
make: *** [Makefile:45: Samples/6_Performance/alignedTypes/Makefile.ph_build] Error 2

When using cmake to compile my own code:

CMake Error at /usr/share/cmake/Modules/CMakeDetermineCompilerId.cmake:753 (message):
  Compiling the CUDA compiler identification source file
  "CMakeCUDACompilerId.cu" failed.

  Compiler: /opt/cuda/bin/nvcc

  Build flags:

  Id flags: --keep;--keep-dir;tmp -v



  The output was:

  2

  #$ _NVVM_BRANCH_=nvvm

  #$ _SPACE_=

  #$ _CUDART_=cudart

  #$ _HERE_=/opt/cuda/bin

  #$ _THERE_=/opt/cuda/bin

  #$ _TARGET_SIZE_=

  #$ _TARGET_DIR_=

  #$ _TARGET_DIR_=targets/x86_64-linux

  #$ TOP=/opt/cuda/bin/..

  #$ NVVMIR_LIBRARY_DIR=/opt/cuda/bin/../nvvm/libdevice

  #$ LD_LIBRARY_PATH=/opt/cuda/bin/../lib:

  #$
  PATH=/opt/cuda/bin/../nvvm/bin:/opt/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/bin:/opt/cuda/bin:/opt/cuda/nsight_compute:/opt/cuda/nsight_systems/bin:/usr/lib/jvm/default/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl


  #$ INCLUDES="-I/opt/cuda/bin/../targets/x86_64-linux/include"

  #$ LIBRARIES= "-L/opt/cuda/bin/../targets/x86_64-linux/lib/stubs"
  "-L/opt/cuda/bin/../targets/x86_64-linux/lib"

  #$ CUDAFE_FLAGS=

  #$ PTXAS_FLAGS=

  #$ rm tmp/a_dlink.reg.c

  #$ gcc -D__CUDA_ARCH_LIST__=520 -E -x c++ -D__CUDACC__ -D__NVCC__
  "-I/opt/cuda/bin/../targets/x86_64-linux/include" -D__CUDACC_VER_MAJOR__=12
  -D__CUDACC_VER_MINOR__=3 -D__CUDACC_VER_BUILD__=52
  -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=3
  -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64
  "CMakeCUDACompilerId.cu" -o "tmp/CMakeCUDACompilerId.cpp4.ii"

  #$ cudafe++ --c++17 --gnu_version=130201 --display_error_number
  --orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name
  "/tmp/tmp.7ejqaF2sAk/build/Core/CMakeFiles/3.27.7/CompilerIdCUDA/CMakeCUDACompilerId.cu"
  --allow_managed --m64 --parse_templates --gen_c_file_name
  "tmp/CMakeCUDACompilerId.cudafe1.cpp" --stub_file_name
  "CMakeCUDACompilerId.cudafe1.stub.c" --gen_module_id_file
  --module_id_file_name "tmp/CMakeCUDACompilerId.module_id"
  "tmp/CMakeCUDACompilerId.cpp4.ii"

  /usr/include/bits/floatn.h(86): error: invalid combination of type
  specifiers

    typedef __float128 _Float128;
                       ^

  

  /usr/include/bits/floatn-common.h(214): error: invalid combination of type
  specifiers

    typedef float _Float32;
                  ^

  

  /usr/include/bits/floatn-common.h(251): error: invalid combination of type
  specifiers

    typedef double _Float64;
                   ^

  

  /usr/include/bits/floatn-common.h(268): error: invalid combination of type
  specifiers

    typedef double _Float32x;
                   ^

  

  /usr/include/bits/floatn-common.h(285): error: invalid combination of type
  specifiers

    typedef long double _Float64x;
                        ^

  

  5 errors detected in the compilation of "CMakeCUDACompilerId.cu".

  # --error 0x2 --





Call Stack (most recent call first):
  /usr/share/cmake/Modules/CMakeDetermineCompilerId.cmake:8 (CMAKE_DETERMINE_COMPILER_ID_BUILD)
  /usr/share/cmake/Modules/CMakeDetermineCompilerId.cmake:53 (__determine_compiler_id_test)
  /usr/share/cmake/Modules/CMakeDetermineCUDACompiler.cmake:307 (CMAKE_DETERMINE_COMPILER_ID)
  CMakeLists.txt:2 (project)


CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
make: Makefile: No such file or directory
make: *** No rule to make target 'Makefile'.  Stop.
Core build failed
cppsrc build failed

Same thing is happening to me trying to install a Python package with a C++ component. I was told in a Github issue to make sure that CUDA supported my version of GCC, but was hoping for a fix without having to roll one of them back.

The supported/tested gcc versions for any given CUDA version can be found in the CUDA linux install guide for that CUDA version. At the moment, here (and here) is the one for 12.3, and you can see that gcc 13.x is not listed anywhere.

So there is no expectation by NVIDIA that CUDA 12.3 works with gcc 13.x. For CUDA 12.3, the stated gcc support goes up to gcc 12.2.

The same issue here:

gcc --version
gcc (GCC) 13.2.1 20230801

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Sep__8_19:17:24_PDT_2023
Cuda compilation tools, release 12.3, V12.3.52
Build cuda_12.3.r12.3/compiler.33281558_0
/opt/cuda/bin/nvcc -ccbin g++ -DDUFFING_MODEL -DLYAPUNOV_METRICS -Iinclude/ -dc --threads 0 --std=c++11 -lcuda -o tests_lyapunov.o -c tests/tests_lyapunov.cu
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "tests/tests_lyapunov.cu".
make: *** [Makefile:92: tests_lyapunov.o] Error 255

I believe this may be an nvcc issue. Currently, on arch linux gcc 13 is the default gcc version. However, cuda uses (should use?) gcc 12, and indeed, gcc 12 is a dependency of the cuda package and it is correctly installed.

Unfortunately, nvcc is mixing gcc 12 and 13, and this seems to cause the issue.

For example, as of today:

$ cat test.cu
#include <stdio.h>

int main() { }
$ nvcc test.cu
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "test.cu".

Even though the headers (which are glibc headers) have the necessary guards, e.g.:

#  if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 0))
typedef float _Float32;
#  endif

they are defeated by nvcc mixing gcc 12 and 13.
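
To make the guard's behaviour concrete, here is a minimal C++ sketch that models the condition quoted above for an arbitrary compiler version. The helper names gnuc_prereq and typedef_emitted are made up for this illustration; the comparison follows the usual glibc definition of __GNUC_PREREQ. It only models the preprocessor condition and says nothing about what cudafe++ does internally.

```cpp
#include <cassert>

// Models glibc's __GNUC_PREREQ(maj, min) for an explicitly given
// compiler version instead of the built-in __GNUC__/__GNUC_MINOR__.
constexpr bool gnuc_prereq(int gnuc, int gnuc_minor, int maj, int min) {
    return ((gnuc << 16) + gnuc_minor) >= ((maj << 16) + min);
}

// True when the guard around `typedef float _Float32;` lets the typedef
// through for a translation unit preprocessed by the given GCC version.
// (Hypothetical helper, not part of glibc or CUDA.)
constexpr bool typedef_emitted(int gnuc, int gnuc_minor, bool is_cplusplus = true) {
    return !gnuc_prereq(gnuc, gnuc_minor, 7, 0) ||
           (is_cplusplus && !gnuc_prereq(gnuc, gnuc_minor, 13, 0));
}

void self_check() {
    assert(typedef_emitted(12, 3));   // preprocessed by g++ 12: typedef is kept
    assert(!typedef_emitted(13, 2));  // preprocessed by g++ 13: typedef is dropped
}
```

So if the preprocessing step is done by g++ 12, the typedef ends up in the intermediate file, and a front end that is then told it is emulating GCC 13 sees a typedef for a name it may already treat as built in.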

$ strace -e execve -s 9999 -f nvcc test.cu
[...]
[pid 189798] execve("/usr/lib/gcc/x86_64-pc-linux-gnu/13.2.1/cc1plus", ["/usr/lib/gcc/x86_64-pc-linux-gnu/13.2.1/cc1plus", "-E", "-quiet", "-D_GNU_SOURCE", "/tmp/tmpxft_0002e564_00000000-2.cpp", "-mtune=generic", "-march=x86-64", "-dumpbase", "tmpxft_0002e564_00000000-2.cpp", "-dumpbase-ext", ".cpp"], 0x1cc0d80 /* 64 vars */) = 0
[...]
[pid 189800] execve("/usr/lib/gcc/x86_64-pc-linux-gnu/12.3.0/cc1plus", ["/usr/lib/gcc/x86_64-pc-linux-gnu/12.3.0/cc1plus", "-E", "-quiet", "-I", "/opt/cuda/bin/../targets/x86_64-linux/include", "-D_GNU_SOURCE", "-D", "__CUDA_ARCH_LIST__=520", "-D", "__CUDACC__", "-D", "__NVCC__", "-D", "__CUDACC_VER_MAJOR__=12", "-D", "__CUDACC_VER_MINOR__=3", "-D", "__CUDACC_VER_BUILD__=52", "-D", "__CUDA_API_VER_MAJOR__=12", "-D", "__CUDA_API_VER_MINOR__=3", "-D", "__NVCC_DIAG_PRAGMA_SUPPORT__=1", "-include", "cuda_runtime.h", "test.cu", "-o", "/tmp/tmpxft_0002e564_00000000-5_test.cpp4.ii", "-m64", "-mtune=generic", "-march=x86-64", "-dumpdir", "/tmp/", "-dumpbase", "tmpxft_0002e564_00000000-5_test.cpp4.cu", "-dumpbase-ext", ".cu"], 0x21cd730 /* 77 vars */) = 0
[...]
[pid 189801] execve("/opt/cuda/bin/cudafe++", ["cudafe++", "--c++17", "--gnu_version=130201", "--display_error_number", "--orig_src_file_name", "test.cu", "--orig_src_path_name", "/tmp/test.cu", "--allow_managed", "--m64", "--parse_templates", "--gen_c_file_name", "/tmp/tmpxft_0002e564_00000000-6_test.cudafe1.cpp", "--stub_file_name", "tmpxft_0002e564_00000000-6_test.cudafe1.stub.c", "--gen_module_id_file", "--module_id_file_name", "/tmp/tmpxft_0002e564_00000000-4_test.module_id", "/tmp/tmpxft_0002e564_00000000-5_test.cpp4.ii"], 0x555e63c36340 /* 75 vars */) = 0

Note that gcc 13 is used first, then gcc 12, then cudafe++ is executed with --gnu_version=130201, which ultimately fails with the above error messages.
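
For reading these logs, it helps to decode the --gnu_version value. The sketch below assumes the encoding is major * 10000 + minor * 100 + patch, which is only inferred from the values seen in this thread (13.2.1 appears as 130201, 12.3.0 as 120300); it is not taken from NVIDIA documentation. The struct and function names are made up for the example.

```cpp
#include <cassert>

// Hypothetical helper type for this illustration.
struct GccVersion {
    int major_version;
    int minor_version;
    int patch_level;
};

// Assumed encoding, inferred from the logs: 13.2.1 -> 130201.
constexpr int encode_gnu_version(GccVersion v) {
    return v.major_version * 10000 + v.minor_version * 100 + v.patch_level;
}

// Inverse of the assumed encoding above.
constexpr GccVersion decode_gnu_version(int encoded) {
    return GccVersion{encoded / 10000, (encoded / 100) % 100, encoded % 100};
}

void self_check() {
    assert(encode_gnu_version(GccVersion{13, 2, 1}) == 130201);
    assert(decode_gnu_version(120300).major_version == 12);
}
```

Under that assumption, a cudafe++ call with --gnu_version=130201 is emulating gcc 13.2.1 even though the preprocessed input came from the gcc 12.3.0 cc1plus.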

Specifying -ccbin=/opt/cuda/bin (which has the gcc and g++ symbolic links to the gcc 12 versions) fixes the issue, and gcc 12 will be used all the way (and --gnu_version=120300 will be specified when executing cudafe++). (If you use meson, you can set -D cuda_ccbindir=/opt/cuda/bin.)

EDIT: It looks like I'm way too late. Based on the following comment, this has already been reported to NVIDIA: cuda.sh · 4e66a3ddd593027aaadbf731b6c56de154292344 · Arch Linux / Packaging / Packages / cuda · GitLab

If your system gcc is now 13 (like on Arch) you can install gcc 12 alongside it and point CUDA_HOST_COMPILER to your 12 install when building with nvcc.

This is what I had to do to get the latest opencv built, and it links fine with the rest of my application code that builds with gcc 13.2

CUDA 12.4 was recently released and it claims support for GCC 13.2:

6.x - 13.2

However the issue persists:

/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
typedef __float128 _Float128;

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
typedef float _Float32;

It still happens during invocation of cudafe++ with --gnu_version=130200

For reference:

$ cudafe++ --version
cudafe: NVIDIA (R) Cuda Language Front End
Portions Copyright (c) 2005-2024 NVIDIA Corporation
Portions Copyright (c) 1988-2018 Edison Design Group Inc.
Based on Edison Design Group C/C++ Front End, version 6.5 (Feb 27 2024 16:19:42)
Cuda compilation tools, release 12.4, V12.4.99

@Robert_Crovella This is not expected, is it?

Well, I'm not really sure what your test case is. I don't know what you are compiling, what compile command you issued, or how your machine is set up. If your machine is mixed up or corrupted, it might be expected.

For a proper install of CUDA 12.4 along with gcc 13.2, I wouldn't expect that.

Here's what I did:

  1. I started with an Ubuntu 22.04 machine that had CUDA 12.1 installed via the runfile install method. I updated that to CUDA 12.4, using the runfile installer. This machine had gcc/g++ 11.3, which I believe was the default gcc shipped with Ubuntu 22.04.
  2. I built gcc/g++ 13.2 from source using the method here. I then installed it as indicated there. Executables got installed in /usr/local/gcc-13.2.0/bin
  3. I created symlinks for gcc and g++ in that directory, to point to the freshly built versions.
  4. I updated the PATH to include that directory first.
  5. Then I compiled a random code I had sitting around, specifying the -ccbin switch to point to that directory. Here is the verbose output:
$ nvcc -ccbin /usr/local/gcc-13.2.0/bin t2.cu -o t2 --verbose
#$ _NVVM_BRANCH_=nvvm
#$ _NVVM_BRANCH_SUFFIX_=
#$ _SPACE_=
#$ _CUDART_=cudart
#$ _HERE_=/usr/local/cuda/bin
#$ _THERE_=/usr/local/cuda/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_DIR_=targets/x86_64-linux
#$ TOP=/usr/local/cuda/bin/..
#$ NVVMIR_LIBRARY_DIR=/usr/local/cuda/bin/../nvvm/libdevice
#$ LD_LIBRARY_PATH=/usr/local/cuda/bin/../lib::/usr/local/cuda/lib64
#$ PATH=/usr/local/cuda/bin/../nvvm/bin:/usr/local/cuda/bin:/usr/local/gcc-13.2.0/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/usr/local/cuda/bin
#$ INCLUDES="-I/usr/local/cuda/bin/../targets/x86_64-linux/include"
#$ LIBRARIES=  "-L/usr/local/cuda/bin/../targets/x86_64-linux/lib/stubs" "-L/usr/local/cuda/bin/../targets/x86_64-linux/lib"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
#$ "/usr/local/gcc-13.2.0/bin"/gcc -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -E -x c++ -D__CUDACC__ -D__NVCC__  "-I/usr/local/cuda/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=4 -D__CUDACC_VER_BUILD__=99 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=4 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "t2.cu" -o "/tmp/tmpxft_000340b3_00000000-5_t2.cpp4.ii"
#$ cudafe++ --c++17 --gnu_version=130200 --display_error_number --orig_src_file_name "t2.cu" --orig_src_path_name "/home/bob/t2.cu" --allow_managed  --m64 --parse_templates --gen_c_file_name "/tmp/tmpxft_000340b3_00000000-6_t2.cudafe1.cpp" --stub_file_name "tmpxft_000340b3_00000000-6_t2.cudafe1.stub.c" --gen_module_id_file --module_id_file_name "/tmp/tmpxft_000340b3_00000000-4_t2.module_id" "/tmp/tmpxft_000340b3_00000000-5_t2.cpp4.ii"
#$ "/usr/local/gcc-13.2.0/bin"/gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -E -x c++  -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__  "-I/usr/local/cuda/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=4 -D__CUDACC_VER_BUILD__=99 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=4 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "t2.cu" -o "/tmp/tmpxft_000340b3_00000000-9_t2.cpp1.ii"
#$ cicc --c++17 --gnu_version=130200 --display_error_number --orig_src_file_name "t2.cu" --orig_src_path_name "/home/bob/t2.cu" --allow_managed   -arch compute_52 -m64 --no-version-ident -ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 --include_file_name "tmpxft_000340b3_00000000-3_t2.fatbin.c" -tused --module_id_file_name "/tmp/tmpxft_000340b3_00000000-4_t2.module_id" --gen_c_file_name "/tmp/tmpxft_000340b3_00000000-6_t2.cudafe1.c" --stub_file_name "/tmp/tmpxft_000340b3_00000000-6_t2.cudafe1.stub.c" --gen_device_file_name "/tmp/tmpxft_000340b3_00000000-6_t2.cudafe1.gpu"  "/tmp/tmpxft_000340b3_00000000-9_t2.cpp1.ii" -o "/tmp/tmpxft_000340b3_00000000-6_t2.ptx"
#$ ptxas -arch=sm_52 -m64  "/tmp/tmpxft_000340b3_00000000-6_t2.ptx"  -o "/tmp/tmpxft_000340b3_00000000-10_t2.sm_52.cubin"
#$ fatbinary -64 --cicc-cmdline="-ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 " "--image3=kind=elf,sm=52,file=/tmp/tmpxft_000340b3_00000000-10_t2.sm_52.cubin" "--image3=kind=ptx,sm=52,file=/tmp/tmpxft_000340b3_00000000-6_t2.ptx" --embedded-fatbin="/tmp/tmpxft_000340b3_00000000-3_t2.fatbin.c"
#$ rm /tmp/tmpxft_000340b3_00000000-3_t2.fatbin
#$ "/usr/local/gcc-13.2.0/bin"/gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -c -x c++  -DCUDA_DOUBLE_MATH_FUNCTIONS -Wno-psabi "-I/usr/local/cuda/bin/../targets/x86_64-linux/include"   -m64 "/tmp/tmpxft_000340b3_00000000-6_t2.cudafe1.cpp" -o "/tmp/tmpxft_000340b3_00000000-11_t2.o"
#$ nvlink -m64 --arch=sm_52 --register-link-binaries="/tmp/tmpxft_000340b3_00000000-7_t2_dlink.reg.c"    "-L/usr/local/cuda/bin/../targets/x86_64-linux/lib/stubs" "-L/usr/local/cuda/bin/../targets/x86_64-linux/lib" -cpu-arch=X86_64 "/tmp/tmpxft_000340b3_00000000-11_t2.o"  -lcudadevrt  -o "/tmp/tmpxft_000340b3_00000000-12_t2_dlink.sm_52.cubin" --host-ccbin "/usr/local/gcc-13.2.0/bin/gcc"
#$ fatbinary -64 --cicc-cmdline="-ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 " -link "--image3=kind=elf,sm=52,file=/tmp/tmpxft_000340b3_00000000-12_t2_dlink.sm_52.cubin" --embedded-fatbin="/tmp/tmpxft_000340b3_00000000-8_t2_dlink.fatbin.c"
#$ rm /tmp/tmpxft_000340b3_00000000-8_t2_dlink.fatbin
#$ "/usr/local/gcc-13.2.0/bin"/gcc -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -c -x c++ -DFATBINFILE="\"/tmp/tmpxft_000340b3_00000000-8_t2_dlink.fatbin.c\"" -DREGISTERLINKBINARYFILE="\"/tmp/tmpxft_000340b3_00000000-7_t2_dlink.reg.c\"" -I. -D__NV_EXTRA_INITIALIZATION= -D__NV_EXTRA_FINALIZATION= -D__CUDA_INCLUDE_COMPILER_INTERNAL_HEADERS__  -Wno-psabi "-I/usr/local/cuda/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=4 -D__CUDACC_VER_BUILD__=99 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=4 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -m64 "/usr/local/cuda/bin/crt/link.stub" -o "/tmp/tmpxft_000340b3_00000000-13_t2_dlink.o"
#$ "/usr/local/gcc-13.2.0/bin"/g++ -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -m64 -Wl,--start-group "/tmp/tmpxft_000340b3_00000000-13_t2_dlink.o" "/tmp/tmpxft_000340b3_00000000-11_t2.o"   "-L/usr/local/cuda/bin/../targets/x86_64-linux/lib/stubs" "-L/usr/local/cuda/bin/../targets/x86_64-linux/lib"  -lcudadevrt  -lcudart_static  -lrt -lpthread  -ldl  -Wl,--end-group -o "t2"
$

I didn't seem to have any issue using gcc 13.2 with CUDA 12.4. I also didn't have any trouble compiling the CUDA sample codes.

OK, let's compare:

  • I'm working on an HPC machine running Rocky Linux 8.7
  • I installed GCC 13.2 from source and set up $PATH and $LD_LIBRARY_PATH (HPC installations don't go to /usr)
  • I installed CUDA/12.4 using the runfiles method and set up the environment variables ($CUDA_HOME, $CUDA_ROOT, $CUDA_PATH to the install root; add $CUDA_HOME/include:$CUDA_HOME/extras/CUPTI/include:$CUDA_HOME/nvvm/include to $CPATH, $CUDA_HOME/lib64:$CUDA_HOME/extras/CUPTI/lib64:$CUDA_HOME/nvvm/lib64 to $LD_LIBRARY_PATH, $CUDA_HOME/lib64 to $LIBRARY_PATH, $CUDA_HOME/bin:$CUDA_HOME/nvvm/bin to $PATH)
  • Create a trivial test source: echo "int main() {}" > main.cu
  • Compile the code with nvcc main.cu (-ccbin is AFAIK optional as gcc is found via $PATH already, but also using -ccbin made no difference. In both cases I see --gnu_version=130200 indicating the correct GCC was found)

The verbose output is:

#$ _NVVM_BRANCH_=nvvm
#$ _NVVM_BRANCH_SUFFIX_=
#$ _SPACE_= 
#$ _CUDART_=cudart
#$ _HERE_=/software/CUDA/12.4.0/bin
#$ _THERE_=/software/CUDA/12.4.0/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_DIR_=targets/x86_64-linux
#$ TOP=/software/CUDA/12.4.0/bin/..
#$ NVVMIR_LIBRARY_DIR=/software/CUDA/12.4.0/bin/../nvvm/libdevice
#$ LD_LIBRARY_PATH=/software/CUDA/12.4.0/bin/../lib:/software/CUDA/12.4.0/nvvm/lib64:/software/CUDA/12.4.0/extras/CUPTI/lib64:/software/CUDA/12.4.0/lib:/software/binutils/2.40-GCCcore-13.2.0/lib:/software/zlib/1.2.13-GCCcore-13.2.0/lib:/software/GCCcore/13.2.0/lib64
#$ PATH=/software/CUDA/12.4.0/bin/../nvvm/bin:/software/CUDA/12.4.0/bin:/software/CUDA/12.4.0/nvvm/bin:/software/CUDA/12.4.0/bin:/software/binutils/2.40-GCCcore-13.2.0/bin:/software/GCCcore/13.2.0/bin:/home/s3248973/.local/EasyBuildDev/easybuild-framework:/home/s3248973/.local/bin:/home/s3248973/bin:/software/foundation/generic/bin:/software/foundation/x86_64/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/software/util/dtwrapper/bin
#$ INCLUDES="-I/software/CUDA/12.4.0/bin/../targets/x86_64-linux/include"  
#$ LIBRARIES=  "-L/software/CUDA/12.4.0/bin/../targets/x86_64-linux/lib/stubs" "-L/software/CUDA/12.4.0/bin/../targets/x86_64-linux/lib"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
#$ gcc -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -E -x c++ -D__CUDACC__ -D__NVCC__  "-I/software/CUDA/12.4.0/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=4 -D__CUDACC_VER_BUILD__=99 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=4 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "main.cu" -o "/tmp/tmpxft_000db2c5_00000000-5_main.cpp4.ii" 
#$ cudafe++ --c++17 --gnu_version=130200 --display_error_number --orig_src_file_name "main.cu" --orig_src_path_name "/tmp/main.cu" --allow_managed  --m64 --parse_templates --gen_c_file_name "/tmp/tmpxft_000db2c5_00000000-6_main.cudafe1.cpp" --stub_file_name "tmpxft_000db2c5_00000000-6_main.cudafe1.stub.c" --gen_module_id_file --module_id_file_name "/tmp/tmpxft_000db2c5_00000000-4_main.module_id" "/tmp/tmpxft_000db2c5_00000000-5_main.cpp4.ii" 
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "main.cu".
# --error 0x2 --

Doing exactly the same but with GCC 12.3 instead leads to --gnu_version=120300 in the cudafe++ call. Complete output:

#$ _NVVM_BRANCH_=nvvm
#$ _NVVM_BRANCH_SUFFIX_=
#$ _SPACE_= 
#$ _CUDART_=cudart
#$ _HERE_=/software/CUDA/12.4.0/bin
#$ _THERE_=/software/CUDA/12.4.0/bin
#$ _TARGET_SIZE_=
#$ _TARGET_DIR_=
#$ _TARGET_DIR_=targets/x86_64-linux
#$ TOP=/software/CUDA/12.4.0/bin/..
#$ NVVMIR_LIBRARY_DIR=/software/CUDA/12.4.0/bin/../nvvm/libdevice
#$ LD_LIBRARY_PATH=/software/CUDA/12.4.0/bin/../lib:/software/binutils/2.40-GCCcore-12.3.0/lib:/software/zlib/1.2.13-GCCcore-12.3.0/lib:/software/GCCcore/12.3.0/lib64:/software/CUDA/12.4.0/nvvm/lib64:/software/CUDA/12.4.0/extras/CUPTI/lib64:/software/CUDA/12.4.0/lib
#$ PATH=/software/CUDA/12.4.0/bin/../nvvm/bin:/software/CUDA/12.4.0/bin:/software/binutils/2.40-GCCcore-12.3.0/bin:/software/GCCcore/12.3.0/bin:/software/CUDA/12.4.0/nvvm/bin:/software/CUDA/12.4.0/bin:/home/s3248973/.local/EasyBuildDev/easybuild-framework:/home/s3248973/.local/bin:/home/s3248973/bin:/software/foundation/generic/bin:/software/foundation/x86_64/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/software/util/dtwrapper/bin
#$ INCLUDES="-I/software/CUDA/12.4.0/bin/../targets/x86_64-linux/include"  
#$ LIBRARIES=  "-L/software/CUDA/12.4.0/bin/../targets/x86_64-linux/lib/stubs" "-L/software/CUDA/12.4.0/bin/../targets/x86_64-linux/lib"
#$ CUDAFE_FLAGS=
#$ PTXAS_FLAGS=
#$ gcc -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -E -x c++ -D__CUDACC__ -D__NVCC__  "-I/software/CUDA/12.4.0/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=4 -D__CUDACC_VER_BUILD__=99 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=4 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "main.cu" -o "/tmp/tmpxft_000db684_00000000-5_main.cpp4.ii" 
#$ cudafe++ --c++17 --gnu_version=120300 --display_error_number --orig_src_file_name "main.cu" --orig_src_path_name "/tmp/main.cu" --allow_managed  --m64 --parse_templates --gen_c_file_name "/tmp/tmpxft_000db684_00000000-6_main.cudafe1.cpp" --stub_file_name "tmpxft_000db684_00000000-6_main.cudafe1.stub.c" --gen_module_id_file --module_id_file_name "/tmp/tmpxft_000db684_00000000-4_main.module_id" "/tmp/tmpxft_000db684_00000000-5_main.cpp4.ii" 
#$ gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -E -x c++  -DCUDA_DOUBLE_MATH_FUNCTIONS -D__CUDACC__ -D__NVCC__  "-I/software/CUDA/12.4.0/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=4 -D__CUDACC_VER_BUILD__=99 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=4 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include "cuda_runtime.h" -m64 "main.cu" -o "/tmp/tmpxft_000db684_00000000-9_main.cpp1.ii" 
#$ cicc --c++17 --gnu_version=120300 --display_error_number --orig_src_file_name "main.cu" --orig_src_path_name "/tmp/main.cu" --allow_managed   -arch compute_52 -m64 --no-version-ident -ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 --include_file_name "tmpxft_000db684_00000000-3_main.fatbin.c" -tused --module_id_file_name "/tmp/tmpxft_000db684_00000000-4_main.module_id" --gen_c_file_name "/tmp/tmpxft_000db684_00000000-6_main.cudafe1.c" --stub_file_name "/tmp/tmpxft_000db684_00000000-6_main.cudafe1.stub.c" --gen_device_file_name "/tmp/tmpxft_000db684_00000000-6_main.cudafe1.gpu"  "/tmp/tmpxft_000db684_00000000-9_main.cpp1.ii" -o "/tmp/tmpxft_000db684_00000000-6_main.ptx"
#$ ptxas -arch=sm_52 -m64  "/tmp/tmpxft_000db684_00000000-6_main.ptx"  -o "/tmp/tmpxft_000db684_00000000-10_main.sm_52.cubin" 
#$ fatbinary -64 --cicc-cmdline="-ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 " "--image3=kind=elf,sm=52,file=/tmp/tmpxft_000db684_00000000-10_main.sm_52.cubin" "--image3=kind=ptx,sm=52,file=/tmp/tmpxft_000db684_00000000-6_main.ptx" --embedded-fatbin="/tmp/tmpxft_000db684_00000000-3_main.fatbin.c" 
#$ rm /tmp/tmpxft_000db684_00000000-3_main.fatbin
#$ gcc -D__CUDA_ARCH__=520 -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -c -x c++  -DCUDA_DOUBLE_MATH_FUNCTIONS -Wno-psabi "-I/software/CUDA/12.4.0/bin/../targets/x86_64-linux/include"   -m64 "/tmp/tmpxft_000db684_00000000-6_main.cudafe1.cpp" -o "/tmp/tmpxft_000db684_00000000-11_main.o" 
#$ nvlink -m64 --arch=sm_52 --register-link-binaries="/tmp/tmpxft_000db684_00000000-7_a_dlink.reg.c"    "-L/software/CUDA/12.4.0/bin/../targets/x86_64-linux/lib/stubs" "-L/software/CUDA/12.4.0/bin/../targets/x86_64-linux/lib" -cpu-arch=X86_64 "/tmp/tmpxft_000db684_00000000-11_main.o"  -lcudadevrt  -o "/tmp/tmpxft_000db684_00000000-12_a_dlink.sm_52.cubin" --host-ccbin "gcc"
#$ fatbinary -64 --cicc-cmdline="-ftz=0 -prec_div=1 -prec_sqrt=1 -fmad=1 " -link "--image3=kind=elf,sm=52,file=/tmp/tmpxft_000db684_00000000-12_a_dlink.sm_52.cubin" --embedded-fatbin="/tmp/tmpxft_000db684_00000000-8_a_dlink.fatbin.c" 
#$ rm /tmp/tmpxft_000db684_00000000-8_a_dlink.fatbin
#$ gcc -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -c -x c++ -DFATBINFILE="\"/tmp/tmpxft_000db684_00000000-8_a_dlink.fatbin.c\"" -DREGISTERLINKBINARYFILE="\"/tmp/tmpxft_000db684_00000000-7_a_dlink.reg.c\"" -I. -D__NV_EXTRA_INITIALIZATION= -D__NV_EXTRA_FINALIZATION= -D__CUDA_INCLUDE_COMPILER_INTERNAL_HEADERS__  -Wno-psabi "-I/software/CUDA/12.4.0/bin/../targets/x86_64-linux/include"    -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=4 -D__CUDACC_VER_BUILD__=99 -D__CUDA_API_VER_MAJOR__=12 -D__CUDA_API_VER_MINOR__=4 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -m64 "/software/CUDA/12.4.0/bin/crt/link.stub" -o "/tmp/tmpxft_000db684_00000000-13_a_dlink.o" 
#$ g++ -D__CUDA_ARCH_LIST__=520 -D__NV_LEGACY_LAUNCH -m64 -Wl,--start-group "/tmp/tmpxft_000db684_00000000-13_a_dlink.o" "/tmp/tmpxft_000db684_00000000-11_main.o"   "-L/software/CUDA/12.4.0/bin/../targets/x86_64-linux/lib/stubs" "-L/software/CUDA/12.4.0/bin/../targets/x86_64-linux/lib"  -lcudadevrt  -lcudart_static  -lrt -lpthread  -ldl  -Wl,--end-group -o "a.out" 

Switching back to GCC 13.2:
As it is clear that cudafe++ fails I ran nvcc main.cu --verbose --keep and verified the failing cudafe++ call:

$ cudafe++ --c++17 --gnu_version=130200 --display_error_number --orig_src_file_name "main.cu" --orig_src_path_name "/tmp/main.cu" --allow_managed  --m64 --parse_templates --gen_c_file_name "main.cudafe1.cpp" --stub_file_name "main.cudafe1.stub.c" --gen_module_id_file --module_id_file_name "main.module_id" "main.cpp4.ii"
/usr/include/bits/floatn.h(86): error: invalid combination of type specifiers
  typedef __float128 _Float128;
                     ^

/usr/include/bits/floatn-common.h(214): error: invalid combination of type specifiers
  typedef float _Float32;
                ^

/usr/include/bits/floatn-common.h(251): error: invalid combination of type specifiers
  typedef double _Float64;
                 ^

/usr/include/bits/floatn-common.h(268): error: invalid combination of type specifiers
  typedef double _Float32x;
                 ^

/usr/include/bits/floatn-common.h(285): error: invalid combination of type specifiers
  typedef long double _Float64x;
                      ^

5 errors detected in the compilation of "main.cu".

And changing --gnu_version=130200 to --gnu_version=120300 makes it pass. Well, actually it then chokes on code from the GCC 13.2 standard library (the *.ii file is preprocessed source from GCC 13 after all) but only after it passed the typedef __float128 _Float128; line.

And indeed, if I remove basically everything after those Float* typedefs, the main.cpp4.ii can be processed with cudafe++ --gnu_version=120300 [...] but not with cudafe++ --gnu_version=130200 [...]

I'll attach the original main.cpp4.ii as generated by --keep and the modified file (that passes with --gnu_version=120300 but not with --gnu_version=130200): src.tar.gz (128.4 KB)

I believe the issue is caused by the introduction of _Float32 & Co. in GCC 13 as native types. The following minimal program compiles fine with GCC 13.2 but not with GCC 12.3:

int main(){
  return int(_Float32(1));
}
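
A guarded variant of the program above can be built by either compiler, which makes the difference easy to probe. This is only a sketch: the macro HAS_NATIVE_FLOAT32 and the function name are invented for the example, and the version check rests on the assumption stated in this post that g++ accepts _Float32 as a built-in type from GCC 13 on (Clang is excluded because it also defines __GNUC__).

```cpp
#include <cassert>

// Assumption from this thread: g++ 13 and newer treat _Float32 as a
// built-in type in C++; older g++ versions do not.
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 13
#define HAS_NATIVE_FLOAT32 1
#else
#define HAS_NATIVE_FLOAT32 0
#endif

// Hypothetical helper wrapping the expression from the post.
int float32_to_int() {
#if HAS_NATIVE_FLOAT32
    return int(_Float32(1));  // compiles only where _Float32 is built in
#else
    return int(float(1));     // fallback for compilers without it
#endif
}

void self_check() {
    assert(float32_to_int() == 1);
}
```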

In any case: typedef float _Float32; from the system glibc is used, and cudafe++ seems to choke on it when _Float32 already exists as a type, apparently reading float _Float32 as a combination of two type specifiers (similar to long double).

g++ has some special handling for the glibc bits headers. I.e. if I try to compile the following source with g++ -c foo.cpp with GCC 13, it fails with "redeclaration of C++ built-in type '_Float32'":

typedef float _Float32;

But if I claim that this code comes from a system header it succeeds:

// This is the original: # 214 "/usr/include/bits/floatn-common.h" 3 4
# 1 "header.h" 3
typedef float _Float32;

So can you check if your intermediate file passed to cudafe++ contains that typedef float _Float32 line from the system header? And if not, if the system header (/usr/include/bits/floatn-common.h or /usr/include/x86_64-linux-gnu/bits/floatn-common.h) contains it? I suspect that might be the difference here.

It doesn't seem to. I have attached it.
main.cpp4.ii (1.1 MB)

Well, when I look at that main.cpp4.ii file, it doesn't appear to be including from either of those. Instead it is including from

 /usr/local/gcc-13.2.0/lib/gcc/x86_64-linux-gnu/13.2.0/include-fixed/x86_64-linux-gnu/bits/floatn-common.h

(you can check it)

My Ubuntu 22.04 machine does not have a /usr/include/bits directory, but it does have a /usr/include/x86_64-linux-gnu/bits directory, and the floatn-common.h file there does have a

typedef float _Float32; 

line in it, but as I indicated, I don't think that is getting included in the main.cpp4.ii file.

It seems you have already consumed this, but for completeness I observe that:

the directory /usr/local/gcc-13.2.0/lib/gcc/x86_64-linux-gnu/13.2.0/include-fixed/x86_64-linux-gnu/bits seems to contain only floatn.h and floatn-common.h

and the contents of the floatn-common.h file around the typedef in question look like:

/* The remaining of this file provides support for older compilers.  */
# if __HAVE_FLOAT16

#  if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 0))
typedef float _Float16 __attribute__ ((__mode__ (__HF__)));
#  endif

#  if !__GNUC_PREREQ (7, 0)
#   define __builtin_huge_valf16() ((_Float16) __builtin_huge_val ())
#   define __builtin_inff16() ((_Float16) __builtin_inf ())
#   define __builtin_nanf16(x) ((_Float16) __builtin_nan (x))
#   define __builtin_nansf16(x) ((_Float16) __builtin_nans (x))
#  endif

# endif

# if __HAVE_FLOAT32

#  if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 0))
typedef float _Float32;
#  endif

Oh sorry I forgot to click submit:

I just did that myself on a fresh Ubuntu machine and used the same installation instructions as you did. And it worked for me too with GCC 13.2!

A bit of investigation shows that my above suspicion is true: The preprocessed source does not contain the problematic typedef anymore because floatn-common.h is modified and copied to include-fixed in the GCC installation folder. That adds a preprocessor condition on top of the typedef that removes it for GCC 13.

However, we were compiling GCC with --disable-fixincludes, because fixincludes had previously caused us trouble on system updates (even with minor releases).