cudaDeviceSynchronize from device code is deprecated

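FWIW, CUDA 12 introduced a new CUDA Dynamic Parallelism (CDP) interface, usually referred to as CDP2. It may also be of interest here, since it does not support calling cudaDeviceSynchronize from device code.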
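
To make that concrete, here is a minimal sketch of the pattern commonly used in place of a device-side cudaDeviceSynchronize under CDP2: the follow-up work is launched into the cudaStreamTailLaunch named stream, so it only starts after the parent grid and the child work it launched have finished. The kernel names (child, consume, parent), the helper run_demo, the sizes, and the build flags in the comment are all hypothetical choices for illustration, not something from the original post.

```cuda
// Assumed build line (adjust -arch for your GPU):
//   nvcc -rdc=true -arch=sm_70 cdp2_tail.cu -o cdp2_tail -lcudadevrt
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical child kernel: doubles every element.
__global__ void child(int *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2;
}

// Hypothetical follow-up kernel: needs the child's results.
__global__ void consume(const int *data, int n, int *sum) {
    int s = 0;
    for (int i = 0; i < n; ++i) s += data[i];
    *sum = s;
}

// Parent kernel, launched with a single thread in this sketch.
__global__ void parent(int *data, int n, int *sum) {
    // Ordinary device-side child launch (default stream of this block).
    child<<<(n + 127) / 128, 128>>>(data, n);
    // Older code would call cudaDeviceSynchronize() here and then read data.
    // Instead, queue the dependent work as a tail launch: it runs only after
    // this parent grid and the child launched above have completed.
    consume<<<1, 1, 0, cudaStreamTailLaunch>>>(data, n, sum);
}

// Host-side driver; returns the computed sum, or -1 on a CUDA error.
int run_demo() {
    const int n = 256;
    std::vector<int> host(n, 1);
    int *data = nullptr, *sum = nullptr;
    if (cudaMalloc(&data, n * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&sum, sizeof(int)) != cudaSuccess) { cudaFree(data); return -1; }
    cudaMemcpy(data, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(sum, 0, sizeof(int));

    parent<<<1, 1>>>(data, n, sum);

    int result = -1;
    // Host-side cudaDeviceSynchronize is unaffected by the deprecation.
    if (cudaDeviceSynchronize() == cudaSuccess)
        cudaMemcpy(&result, sum, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(data);
    cudaFree(sum);
    return result;
}

int main() {
    printf("sum = %d\n", run_demo());
    return 0;
}
```

The trade-off is that the parent kernel itself can no longer consume the child's results; anything that depends on them has to move into the tail-launched kernel.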