The PCIe metrics are collected per kernel, as are all metrics in Nsight Compute. This means you will only see non-zero values here if the kernel accesses pinned host memory that is mapped to the device while it is running. If the kernel accesses “regular” device memory that was transferred from the host beforehand, e.g. with a cudaMemcpy call, that transfer happens outside the kernel and is not measured by this metric.
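To make the distinction concrete, here is a minimal sketch of the two cases. The kernel `scale`, the helpers `run_mapped` and `run_copied`, and the `CHECK` macro are hypothetical names made up for illustration; only the CUDA runtime calls are real. I would expect the kernel launched in `run_mapped` to be the one that shows PCIe traffic when profiled, while the kernel in `run_copied` should not, but please treat that as an expectation to verify on your setup rather than a guaranteed result.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper macro: report a CUDA error and bail out of the function.
// (For brevity, allocations are not freed on the error path.)
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Hypothetical kernel: multiplies every element by `factor` in place.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Case 1: the kernel reads and writes pinned host memory mapped into the
// device address space, so the accesses cross PCIe *during* the kernel.
int run_mapped(int n) {
    float *h = nullptr, *d_alias = nullptr;
    CHECK(cudaHostAlloc((void **)&h, n * sizeof(float), cudaHostAllocMapped));
    for (int i = 0; i < n; ++i) h[i] = 1.0f;
    CHECK(cudaHostGetDevicePointer((void **)&d_alias, h, 0));

    scale<<<(n + 255) / 256, 256>>>(d_alias, n, 2.0f);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    int bad = (h[0] != 2.0f) || (h[n - 1] != 2.0f);
    CHECK(cudaFreeHost(h));
    return bad;  // 0 on success
}

// Case 2: the data is copied with cudaMemcpy before and after the launch,
// so the kernel itself only touches device memory.
int run_copied(int n) {
    float *h = (float *)malloc(n * sizeof(float));
    if (!h) return 1;
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float *d = nullptr;
    CHECK(cudaMalloc((void **)&d, n * sizeof(float)));
    CHECK(cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice));

    scale<<<(n + 255) / 256, 256>>>(d, n, 2.0f);
    CHECK(cudaGetLastError());
    CHECK(cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost));

    int bad = (h[0] != 2.0f) || (h[n - 1] != 2.0f);
    CHECK(cudaFree(d));
    free(h);
    return bad;  // 0 on success
}

int main() {
    const int n = 1 << 20;
    // Typically only needed on older setups without unified addressing.
    CHECK(cudaSetDeviceFlags(cudaDeviceMapHost));
    if (run_mapped(n) != 0 || run_copied(n) != 0) {
        fprintf(stderr, "result check failed\n");
        return 1;
    }
    printf("both variants completed\n");
    return 0;
}
```

If you profile this, the two `scale` launches appear as separate kernel results, so you can compare the PCIe values of the mapped and the copied variant side by side.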
If you have further questions, please share the exact source code, the Nsight Compute version and command line, and your OS, GPU, and driver version, as those may be needed to analyze the problem.