Call kernel inside CUDA kernel

(Copied from my question on Stack Overflow.)
I am trying to do something like this:

__global__ void foo() {
    // do stuff
}

__global__ void boo() {
    foo<<<m, n>>>();  // launch foo from inside another kernel
}

but I am getting the error “kernel launch from __device__ or __global__ functions requires separate compilation mode”

I tried googling for an answer and saw some results about “dynamic parallelism”, which requires compute capability 3.5 or above. I have that (GTX 750 Ti, compute capability 5.0).
I also saw that I need to turn the “rdc” (relocatable device code) flag on. That does make the error go away, but then the compilation fails no matter what (even if I comment everything out).

So how can I achieve my goal or what might be the problem?
(Using CUDA 11.0.)
I also added “cudadevrt.lib;cudart.lib;” to the linker input in the project properties.

The error it gives when rdc is set to true:

Error MSB3721 The command ““C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin\nvcc.exe” -dlink -o “x64\Debug\crimson cuda.device-link.obj” -Xcompiler “/EHsc /W3 /nologo /Od /Zi /Fdx64\Debug\vc142.pdb /RTC1 /MDd " -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin/crt” -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\lib\x64” cudadevrt.lib cudart.lib cudart_static.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib -gencode=arch=compute_50,code=sm_50 -G --machine 64 x64\Debug\ x64\Debug\" exited with code 1.

EDIT 2: I continued to investigate, and it seems the problem occurs while linking the files. I don’t fully understand how linking works when rdc is used.

I suggest following the instructions I have posted on your SO question.

Fixed the problem. Thank you @Robert_Crovella for trying to help. The solution is in my question on Stack Overflow.
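
For anyone who lands here with the same error message, below is a minimal sketch of a kernel launching another kernel (dynamic parallelism). It is not the fix from the Stack Overflow question, only an illustration of the setup discussed in this thread. The kernel names parent and child and the file name dp_example.cu are made up for the example, and the build line in the comment assumes a command-line nvcc build for a compute capability 5.0 card rather than a Visual Studio project.

```cuda
// Assumed build command (file name is hypothetical):
//   nvcc -arch=sm_50 -rdc=true dp_example.cu -o dp_example -lcudadevrt
// -rdc=true enables separate compilation (relocatable device code);
// cudadevrt is the device runtime library needed for device-side launches.
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: launched from device code by the parent kernel.
__global__ void child(int parentThread)
{
    printf("child thread %d, launched by parent thread %d\n",
           threadIdx.x, parentThread);
}

// Parent kernel: every parent thread launches its own small child grid.
__global__ void parent()
{
    child<<<1, 2>>>(threadIdx.x);
}

int main()
{
    // Host-side launch of the parent kernel.
    parent<<<1, 2>>>();

    // Catch launch-time errors (bad configuration, unsupported device, ...).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // A parent grid is not complete until its child grids have finished,
    // so one host-side synchronize waits for both levels.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "execution failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
```

If a file like this builds and runs from the command line but the Visual Studio project still fails at the -dlink step, that would point at the project's link settings rather than at the kernel code itself.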