Dear Support team,
I have filed some issues with reproducers before and am generally aware of preprocessing flags etc. But with code that mixes C++, OpenACC and CUDA header includes, I sometimes end up with weird compilation errors when compiling the preprocessed source file, and then wonder what the exact recipe is.
So my intention with this issue is to ask this naive question to the (compiler) experts and settle this confusion once and for all :) : what is the recipe to create a preprocessed reproducer, especially when the code is a mix of C++, OpenACC/OpenMP and CUDA?
I can add more examples later; let's begin with this one:
I have simple code like below:
#include <Random123/philox.h>
#ifdef __CUDACC__
#define g_k_qualifiers __device__ __constant__
#else
#define g_k_qualifiers
#endif
g_k_qualifiers philox4x32_key_t g_k{{0}};
and that ends up in a compiler error:
$ nvc++ foo.cpp -c -I/home/external/Random123/include -DR123_USE_SSE=0 -cuda
NVC++-F-0000-Internal compiler error. size of unknown type 0 (foo.cpp)
NVC++/x86-64 Linux 23.11-0: compilation aborted
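For readers without Random123 at hand, the qualifier-macro pattern above can be sketched in a dependency-free form. This is only an illustration: `key_standin_t` and `key_is_zero` are hypothetical names, the struct layout is an assumption rather than the real `philox4x32_key_t` definition, and whether this variant triggers the same internal compiler error is not something the snippet establishes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for philox4x32_key_t (assumed layout, not the
// actual Random123 definition): a struct wrapping a small array.
struct key_standin_t {
    std::uint32_t v[2];
};

// Same macro pattern as in the original snippet: CUDA qualifiers are
// applied only when the compiler defines __CUDACC__.
#ifdef __CUDACC__
#define g_k_qualifiers __device__ __constant__
#else
#define g_k_qualifiers
#endif

// Global key, brace-initialized to zero as in the original code.
g_k_qualifiers key_standin_t g_k{{0}};

#ifndef __CUDACC__
// Host-only sanity check: in a plain C++ build the macro expands to
// nothing, so g_k is an ordinary global that can be read directly.
bool key_is_zero() {
    assert(sizeof(g_k.v) == 2 * sizeof(std::uint32_t));
    return g_k.v[0] == 0 && g_k.v[1] == 0;
}
#endif
```

In a host-only build this compiles as ordinary C++; the CUDA qualifiers only come into play when `__CUDACC__` is defined.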
So my next step is to preprocess the file:
nvc++ foo.cpp -c -I/home/external/Random123/include -DR123_USE_SSE=0 -cuda -E -o bar.cpp
And then I thought I was ready to send this as a reproducer. But when checking it myself, I see:
$ nvc++ -cuda bar.cpp
"/usr/include/x86_64-linux-gnu/bits/types.h", line 155: error: invalid redeclaration of type name "__fsid_t" (declared at line 155)
typedef struct { int __val[2]; } __fsid_t;
^
"/usr/include/ctype.h", line 48: error: "_ISupper" has already been declared in the current scope
_ISupper = ((0) < 8 ? ((1 << (0)) << 8) : ((1 << (0)) >> 8)),
^
"/usr/include/ctype.h", line 49: error: "_ISlower" has already been declared in the current scope
_ISlower = ((1) < 8 ? ((1 << (1)) << 8) : ((1 << (1)) >> 8)),
^
"/usr/include/ctype.h", line 50: error: "_ISalpha" has already been declared in the current scope
_ISalpha = ((2) < 8 ? ((1 << (2)) << 8) : ((1 << (2)) >> 8)),
^
....
"/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/cuda/12.3/include/vector_types.h", line 300: error: invalid redeclaration of type name "ulonglong1" (declared at line 404)
struct __attribute__((device_builtin)) ulonglong1
^
"/opt/nvidia/hpc_sdk/Linux_x86_64/23.11/cuda/12.3/include/vector_types.h", line 305: error: invalid redeclaration of type name "longlong2" (declared at line 405)
struct __attribute__((device_builtin)) __attribute__((aligned(16))) longlong2
^
Error limit reached. Use -fmax-errors=N to change the limit, N=0 for unlimited.
100 errors detected in the compilati
Then I looked at the available CLI flags from nvc++ -help:
$ nvc++ --help | grep include
-idirafter<incdir> Add a directory to the end of the include file search path, after the standard include directories, and mark it as a system include directory
--include_directory<incdir>
Add directory to include file search path
-iquote<incdir> Add a directory to the beginning of the include file search path, and use it only when processing includes with quotes
--[no_]implicit_include
-include<name> compatibility: File to include at start of compilation
--no_preincludes Ignore all preincluded files: used for compiling preprocessed files
--preinclude<name> File to include at start of compilation
-cuda Add CUDA include paths. Link with the CUDA runtime libraries. Please refer to -gpu for target specific options
I am able to remove some errors by mixing some of the flags above, but then other errors appear. Hence this confusion.
So it would be great if you could provide some general instructions on the most common CLI flags for creating a preprocessed reproducer and avoiding the above type of error when mixing C++ / CUDA / OpenMP / OpenACC.
Thanks in advance!