Compiling with modern CMake add_executable produces slower kernel runtime than cuda_add_executable

Hello,

Building my CUDA algorithm with the modern CMake function
add_executable(...)

makes my algorithm run ~10x slower than when it is built the old way:

find_package(CUDA REQUIRED)
cuda_add_executable(...)

My guess is that the compilation options differ between the two methods, but I cannot find which ones. Has anyone run into a similar issue?
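
For reference, here is a minimal sketch of how the flags on the modern side could be printed for comparison. The project name `flag_check` is hypothetical, and the sketch assumes CMake 3.18 or newer (needed for `CMAKE_CUDA_ARCHITECTURES`) and a Makefile or Ninja generator. It is not my actual CMakeLists.txt.

```cmake
# Sketch only: print the settings that feed the nvcc command line
# when CUDA is enabled as a first-class language.
cmake_minimum_required(VERSION 3.18)
project(flag_check LANGUAGES CXX CUDA)  # hypothetical project name

# An empty build type means no per-configuration flags are applied
# (single-configuration generators such as Makefiles or Ninja).
message(STATUS "CMAKE_BUILD_TYPE         = '${CMAKE_BUILD_TYPE}'")
message(STATUS "CMAKE_CUDA_FLAGS         = '${CMAKE_CUDA_FLAGS}'")
message(STATUS "CMAKE_CUDA_FLAGS_RELEASE = '${CMAKE_CUDA_FLAGS_RELEASE}'")
message(STATUS "CMAKE_CUDA_ARCHITECTURES = '${CMAKE_CUDA_ARCHITECTURES}'")

# Write compile_commands.json into the build directory so the full
# compiler command lines can be inspected.
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
```

On the FindCUDA side, the corresponding variable would be `CUDA_NVCC_FLAGS`. Building both variants with verbose output (for example `make VERBOSE=1`) should show the actual nvcc invocations, which could then be compared directly.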

Thanks