nvcc compilation fails when including C++-based header files

I am trying to compile a simple CUDA kernel, and I want to include some header files from the application-domain in my .cu file.

kernel.cu

#include <shrUtils.h>
#include <cutil_inline.h>
#include <cusparse.h>
#include <cublas.h>
#include "PCG.H" // Domain specific header file

void run()
{
return;
}

Compilation Process (Includes the path for all the dependencies)

SOURCE=matrices/lduMatrix/solvers/PCG/kernel.cu ; /usr/local/cuda/bin/nvcc -m64 -Dlinux64 -DWM_DP -O2 -DNoRepository --compiler-options -fno-strict-aliasing -IlnInclude -I. -I/home/rphull/OpenFOAM/OpenFOAM-1.6/src/OpenFOAM/lnInclude -I/home/rphull/OpenFOAM/OpenFOAM-1.6/src/OSspecific/POSIX/lnInclude -I/home/rphull/NVIDIA_GPU_Computing_SDK/shared/inc -I/home/rphull/NVIDIA_GPU_Computing_SDK/C/common/inc -DUNIX -c $SOURCE -o Make/linux64GccDPOpt/kernel.o

On compiling, nvcc chokes on the domain-specific header files and reports errors such as the following:
lnInclude/UList.H(316): error: expected an operator
lnInclude/UList.H(316): error: invalid friend declaration
lnInclude/UList.H(316): error: expected a ";"
lnInclude/UList.H(320): warning: parsing restarts here after previous syntax error
lnInclude/InfoProxy.H(62): error: expected an operator
lnInclude/InfoProxy.H(62): error: invalid friend declaration
lnInclude/InfoProxy.H(62): error: expected a ";"
lnInclude/OSstream.H(54): warning: overloaded virtual function "Foam::Ostream::write" is only partially overridden in class "Foam::OSstream"
lnInclude/errorManip.H(79): error: expected an operator
lnInclude/errorManip.H(79): error: invalid friend declaration
lnInclude/errorManip.H(79): error: expected a ";"
lnInclude/errorManip.H(112): error: expected an operator
lnInclude/errorManip.H(112): error: invalid friend declaration
lnInclude/errorManip.H(112): error: expected a ";"
lnInclude/HashTable.H(426): error: expected an operator
lnInclude/HashTable.H(426): error: invalid friend declaration
lnInclude/HashTable.H(426): error: expected a ";"
lnInclude/HashTable.H(430): warning: parsing restarts here after previous syntax error
lnInclude/FixedList.H(338): error: expected an operator
lnInclude/FixedList.H(338): error: invalid friend declaration
lnInclude/FixedList.H(338): error: expected a ";"
lnInclude/FixedList.H(342): warning: parsing restarts here after previous syntax error
lnInclude/LList.H(336): error: expected an operator
lnInclude/LList.H(336): error: invalid friend declaration
lnInclude/LList.H(336): error: expected a ";"
lnInclude/LList.H(340): warning: parsing restarts here after previous syntax error
lnInclude/PtrList.H(302): error: expected an operator
lnInclude/PtrList.H(302): error: invalid friend declaration
lnInclude/PtrList.H(302): error: expected a ";"
lnInclude/LPtrList.H(269): error: expected an operator
lnInclude/LPtrList.H(269): error: invalid friend declaration

I am using CUDA 3.2 on a Fermi GPU. Does anyone know a fix for this kind of problem? Thanks!

You haven’t actually shown the code that is failing, but I gather it is from OpenFOAM. The simplest solution is not to pass those C++ template headers through nvcc. You can’t use the declarations they contain inside a CUDA kernel anyway, so there really isn’t a need to do it.
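One way to keep those headers away from nvcc is to put a plain C-style interface between the two translation units. The following is only a minimal single-file sketch of that split: the file names pcg_gpu.h, kernel.cu and PCG.C in the comments, the functions gpuScale and scaleOnGpu, and the FieldLike class (a stand-in for an OpenFOAM container) are all hypothetical, and the body of gpuScale is a CPU loop so that the sketch builds with any C++ compiler.

```cpp
// Single-file sketch; comments mark which file each part would live in.
#include <cassert>
#include <cstddef>
#include <vector>

// ---- pcg_gpu.h (hypothetical): the only project header kernel.cu includes ----
// Raw pointers and scalars only, so nvcc never parses UList.H, HashTable.H, etc.
extern "C" void gpuScale(double* x, std::size_t n, double alpha);

// ---- kernel.cu (hypothetical): the file compiled by nvcc ----
// CPU stand-in for illustration. In a real .cu file this body would copy the
// array to the device, launch a __global__ kernel, and copy the result back.
extern "C" void gpuScale(double* x, std::size_t n, double alpha)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        x[i] *= alpha;
    }
}

// ---- PCG.C (hypothetical): compiled by the host compiler with OpenFOAM headers ----
// FieldLike is an assumed stand-in for an OpenFOAM container class.
class FieldLike
{
    std::vector<double> v_;

public:
    FieldLike(std::size_t n, double init) : v_(n, init) {}

    double* data() { return v_.data(); }
    std::size_t size() const { return v_.size(); }

    double operator[](std::size_t i) const
    {
        assert(i < v_.size());
        return v_[i];
    }
};

// Host-side wrapper: unpacks the container into a raw pointer and a length
// before crossing into the GPU translation unit.
void scaleOnGpu(FieldLike& f, double alpha)
{
    gpuScale(f.data(), f.size(), alpha);
}
```

With this layout, only the host-compiled file ever sees the template-heavy headers, and the .cu file depends on nothing but the small interface header.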

Thanks for your reply. You are right about the file, it’s from OpenFOAM.

I have the following questions:

  1. Can you explain in more detail which kinds of declarations can’t be used inside a kernel?

  2. I am trying to include PCG.H in my kernel. PCG.H transitively depends on header files like UList.H, InfoProxy.H, etc. How can I avoid passing those through nvcc? My aim is to use some of the OpenFOAM declarations as parameters to my kernel.

CUDA supports a fairly limited subset of C++ features. It is usually best to assume that a given C++ declaration won’t work and be pleasantly surprised when it does, rather than the other way around. Because of the CUDA compilation model (all functions are expanded inline, and there is no linker), a lot of C++ class features don’t work in CUDA device code.

Passing OpenFOAM objects as kernel parameters will almost certainly fail. One of the big problems is that the GPU and the host CPU have separate address spaces, so pointers and memory allocations aren’t portable between them. To pass classes and structures as arguments, you must first build them in GPU memory using host-side API functions. For complex graphs or trees, and for “deep” structures with several levels of pointer indirection, that quickly becomes impractical. Ideally you want flat, linear memory to pass to kernels, so the best designs tend to be “structures of arrays” rather than “arrays of structures”, and working with indexes rather than pointers is preferable for portability.
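To make the “structure of arrays” idea concrete, here is a small host-side sketch that flattens a list of sparse matrix entries into three linear arrays and then multiplies using integer indexes only. The names Entry, FlatMatrix, flatten and multiply are made up for illustration and are not OpenFOAM or CUDA APIs; the multiply loop is a CPU reference, not a kernel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// "Array of structures": convenient on the host, awkward to ship to a GPU.
struct Entry
{
    int row;
    int col;
    double val;
};

// "Structure of arrays": three flat arrays, each of which could be copied to
// the device in one piece. Rows and columns are integer indexes, which mean
// the same thing in both address spaces, unlike host pointers.
struct FlatMatrix
{
    std::vector<int> row;
    std::vector<int> col;
    std::vector<double> val;
};

// Converts the array-of-structures layout into the flat layout.
FlatMatrix flatten(const std::vector<Entry>& entries)
{
    FlatMatrix m;
    m.row.reserve(entries.size());
    m.col.reserve(entries.size());
    m.val.reserve(entries.size());
    for (std::size_t k = 0; k < entries.size(); ++k)
    {
        m.row.push_back(entries[k].row);
        m.col.push_back(entries[k].col);
        m.val.push_back(entries[k].val);
    }
    return m;
}

// CPU reference for y = A * x on the flat layout. A kernel would receive the
// three raw pointers and the entry count instead of a FlatMatrix.
std::vector<double> multiply(const FlatMatrix& m,
                             const std::vector<double>& x,
                             std::size_t nRows)
{
    assert(m.row.size() == m.col.size() && m.col.size() == m.val.size());
    std::vector<double> y(nRows, 0.0);
    for (std::size_t k = 0; k < m.val.size(); ++k)
    {
        y[m.row[k]] += m.val[k] * x[m.col[k]];
    }
    return y;
}
```

Nothing in FlatMatrix refers to a host address, so the same data stays valid after being copied to the device.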

I have used OpenFOAM a little, but I am not familiar with its internals and can’t comment on the details. Often the best codes to extend to CUDA are MPI codes whose classes and container structures are designed to be “collapsed” into linear arrays of simple scalar types for message passing, or codes with Fortran bindings (so they can already pass C++ data into class-ignorant Fortran 90 subroutines).

CUDA is based on C, not C++, so treat CUDA code as C code. You are still free to use C++ in host code.

CUDA works best on arrays of aligned data, where each thread processes a different part of the array (read about memory coalescing in CUDA), and performs very poorly on large amounts of unaligned data (in my test kernel, coalesced access was 60 times faster than uncoalesced access).
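As a rough illustration of why the access pattern matters, the sketch below counts how many distinct memory segments a group of threads touches for a given stride. It is a deliberately simplified model, not a description of the hardware: the function name segmentsTouched is made up, the 128-byte segment size is an assumption, the array is assumed to start on a segment boundary, and real devices apply further rules.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Counts the distinct memory segments touched when thread t reads the element
// at index t * stride. Simplified model: the 128-byte segment size and the
// aligned array start are assumptions made for this sketch.
std::size_t segmentsTouched(std::size_t stride,
                            std::size_t elemBytes = 8,
                            std::size_t threads = 32,
                            std::size_t segmentBytes = 128)
{
    assert(elemBytes > 0 && segmentBytes >= elemBytes);
    std::set<std::size_t> segments;
    for (std::size_t t = 0; t < threads; ++t)
    {
        std::size_t byteOffset = t * stride * elemBytes;
        segments.insert(byteOffset / segmentBytes);
    }
    return segments.size();
}
```

Under these assumptions, 32 threads reading consecutive doubles (stride 1) touch 2 segments, while a stride of 16 doubles makes every thread touch a segment of its own, 32 in total.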