Kernels launched after a persistent kernel aren't executed unless running under Nsight Systems

Try forcing eager module loading:

CUDA_MODULE_LOADING=EAGER ./my_app

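See here. The likely cause is CUDA lazy loading: a kernel is only loaded on its first launch, and that load can require a context synchronization, which cannot complete while the persistent kernel is still running.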
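If setting the environment variable is not an option, another workaround is to make sure the second kernel is already loaded before the persistent kernel starts. Below is a minimal sketch, assuming a setup like the one in the question: the kernel names (persistent_kernel, worker_kernel), the stop flag in mapped pinned memory, and the warm-up launch are all hypothetical, not taken from the original code. It also assumes the device can actually run the two kernels concurrently, which CUDA does not guarantee.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                    \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(err_));                    \
            std::exit(EXIT_FAILURE);                                   \
        }                                                              \
    } while (0)

// Hypothetical persistent kernel: spins until the host sets *stop to nonzero.
__global__ void persistent_kernel(volatile int *stop) {
    while (*stop == 0) { }
}

// Hypothetical follow-up kernel; a null pointer makes it a no-op (warm-up).
__global__ void worker_kernel(int *out) {
    if (out != nullptr) *out = 42;
}

int main() {
    // Stop flag in mapped pinned host memory so the host can write it
    // while the persistent kernel is running.
    volatile int *h_stop = nullptr;
    int *d_stop = nullptr;
    CHECK(cudaHostAlloc((void **)&h_stop, sizeof(int), cudaHostAllocMapped));
    *h_stop = 0;
    CHECK(cudaHostGetDevicePointer((void **)&d_stop, (void *)h_stop, 0));

    int *d_out = nullptr;
    int *h_out = nullptr;
    CHECK(cudaMalloc((void **)&d_out, sizeof(int)));
    CHECK(cudaMallocHost((void **)&h_out, sizeof(int)));

    // Non-blocking streams so neither kernel implicitly waits on the other.
    cudaStream_t s_persist, s_work;
    CHECK(cudaStreamCreateWithFlags(&s_persist, cudaStreamNonBlocking));
    CHECK(cudaStreamCreateWithFlags(&s_work, cudaStreamNonBlocking));

    // Warm-up launch: the first launch of worker_kernel triggers its load,
    // so do it BEFORE the persistent kernel occupies the device.
    worker_kernel<<<1, 1, 0, s_work>>>(nullptr);
    CHECK(cudaGetLastError());
    CHECK(cudaStreamSynchronize(s_work));

    // Start the persistent kernel, then launch the real work next to it.
    persistent_kernel<<<1, 1, 0, s_persist>>>(d_stop);
    CHECK(cudaGetLastError());
    worker_kernel<<<1, 1, 0, s_work>>>(d_out);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpyAsync(h_out, d_out, sizeof(int),
                          cudaMemcpyDeviceToHost, s_work));
    CHECK(cudaStreamSynchronize(s_work));
    std::printf("worker result: %d\n", *h_out);

    // Tell the persistent kernel to exit and wait for it.
    *h_stop = 1;
    CHECK(cudaStreamSynchronize(s_persist));

    CHECK(cudaStreamDestroy(s_persist));
    CHECK(cudaStreamDestroy(s_work));
    CHECK(cudaFree(d_out));
    CHECK(cudaFreeHost(h_out));
    CHECK(cudaFreeHost((void *)h_stop));
    return 0;
}
```

If the warm-up launch is removed and the program hangs at the second cudaStreamSynchronize(s_work) unless CUDA_MODULE_LOADING=EAGER is set, that would be consistent with the lazy loading explanation.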