I am developing tools for P100 and V100 that intercept CUDA calls and collect hardware utilization information. My original implementation is based on CUPTI. Since CUPTI is deprecated, I am thinking of replacing it with the Nsight Perf SDK. However, I have had difficulty finding any documentation. I would appreciate your help.