Real-time profiler? Any option other than nvprof?

I’m trying to get somewhat real-time profiling data.
I have realized that most nvprof data is produced only at the end of the CUDA program, or even only after several kernel replays.
Since I want somewhat real-time data, I tried setting start and end points for profiling. However, it is almost impossible to get meaningful data that way, because it is hard to set valid start and end points inside a kernel.
Then I looked at other profilers, such as “remotely”.
But “remotely” does not provide enough data for my needs.
Can you help me find a profiling tool that lets me collect specific metrics or events in the middle of a program run, or in real time?

Thank you