(copied my question from stack overflow)
I am trying to do something like this:
__global__ void foo()
{
// do stuff
}
__global__ void boo()
{
foo<<<m, n>>>();
}
but I am getting the error “kernel launch from __device__ or __global__ functions requires separate compilation mode”
I tried googling for an answer and saw some results about “dynamic parallelism”, which reportedly requires compute capability 3.5 or above. I have that (GTX 750 Ti, compute capability 5.0).
I also saw that I need to turn the “rdc” flag on. While that does make the error go away, it makes the compilation fail no matter what (even if I comment everything out).
So how can I achieve my goal, or what might be the problem?
(using CUDA 11.0)
I also added “cudadevrt.lib;cudart.lib;” to the linker input in the project properties.
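For anyone who wants to reproduce this outside Visual Studio, here is a minimal sketch of a nested kernel launch built with relocatable device code. It is only an illustration under stated assumptions: the file name nested.cu, the out buffer, and the launchNested helper are hypothetical and not part of the original project, and sm_50 is chosen to match the GTX 750 Ti. It shows the general pattern, not necessarily the fix for the build error below.

```cuda
// Assumed command-line build (hypothetical file name nested.cu):
//   nvcc -rdc=true -arch=sm_50 -o nested nested.cu -lcudadevrt
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: each thread writes twice its index.
__global__ void foo(int *out)
{
    out[threadIdx.x] = 2 * threadIdx.x;
}

// Parent kernel: launches the child grid from device code.
__global__ void boo(int *out, int n)
{
    // Only one parent thread launches the child; without this guard
    // every parent thread would launch its own child grid.
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        foo<<<1, n>>>(out);
    }
}

// Hypothetical host helper: runs boo and copies n results into h_out.
cudaError_t launchNested(int *h_out, int n)
{
    int *d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, n * sizeof(int));
    if (err != cudaSuccess) return err;

    boo<<<1, 1>>>(d_out, n);
    err = cudaGetLastError();                 // launch-time errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();        // waits for parent and child
    if (err == cudaSuccess)
        err = cudaMemcpy(h_out, d_out, n * sizeof(int),
                         cudaMemcpyDeviceToHost);
    cudaFree(d_out);
    return err;
}

int main()
{
    const int n = 8;
    int h_out[n] = {0};

    cudaError_t err = launchNested(h_out, n);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < n; ++i)
        std::printf("%d ", h_out[i]);
    std::printf("\n");
    return 0;
}
```

With a single nvcc invocation like the one in the comment, the device-link step happens implicitly. In a Visual Studio project that step is the separate “nvcc -dlink” command, so a failure there points at the device link rather than at the kernel code itself.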
EDIT:
The error it gives when rdc is set to true:
Error MSB3721 The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\nvcc.exe" -dlink -o "x64\Debug\crimson cuda.device-link.obj" -Xcompiler "/EHsc /W3 /nologo /Od /Zi /Fdx64\Debug\vc142.pdb /RTC1 /MDd " -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin/crt" -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\lib\x64" cudadevrt.lib cudart.lib cudart_static.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib -gencode=arch=compute_50,code=sm_50 -G --machine 64 x64\Debug\CrimsonNet.cu.obj x64\Debug\kernel.cu.obj" exited with code 1.
EDIT 2: I continued to investigate, and it seems the problem occurs while linking the files. I don’t fully understand how linking works when rdc is enabled.