I followed this tutorial: Unified Memory for CUDA Beginners | NVIDIA Technical Blog
According to the tutorial, since my GPU is a GTX 1060, which uses the Pascal architecture, I should see some GPU page faults in the profile.
But my nvprof output shows none:
nvprof --unified-memory-profiling per-process-device .\cuda_learnig.exe
==3560== NVPROF is profiling process 3560, command: .\cuda_learnig.exe
Max error: 0
==3560== Profiling application: .\cuda_learnig.exe
==3560== Warning: Found 53 invalid records in the result.
==3560== Warning: This can happen if device ran out of memory or if a device kernel was stopped due to an assertion.
==3560== Profiling result:
Type Time(%) Time Calls Avg Min Max Name
GPU activities: 100.00% 83.584us 1 83.584us 83.584us 83.584us add(int, float*, float*)
API calls: 72.72% 205.41ms 2 102.71ms 1.1742ms 204.24ms cudaMallocManaged
15.42% 43.542ms 1 43.542ms 43.542ms 43.542ms cuDevicePrimaryCtxRelease
10.55% 29.813ms 1 29.813ms 29.813ms 29.813ms cudaLaunchKernel
0.69% 1.9627ms 2 981.33us 699.08us 1.2636ms cudaFree
0.41% 1.1502ms 44 26.140us 364ns 596.60us cuDeviceGetAttribute
0.10% 287.00us 2 143.50us 120.34us 166.66us cuModuleUnload
0.08% 229.74us 1 229.74us 229.74us 229.74us cudaDeviceSynchronize
0.02% 51.783us 1 51.783us 51.783us 51.783us cuDeviceTotalMem
0.00% 9.4820us 1 9.4820us 9.4820us 9.4820us cuDeviceGetPCIBusId
0.00% 1.4580us 2 729ns 364ns 1.0940us cuDeviceGetCount
0.00% 1.0950us 2 547ns 365ns 730ns cuDeviceGet
0.00% 730ns 1 730ns 730ns 730ns cuDeviceGetName
==3560== Unified Memory profiling result:
Device "GeForce GTX 1060 (0)"
Count Avg Size Min Size Max Size Total Size Total Time Name
2048 4.0000KB 4.0000KB 4.0000KB 8.000000MB 17.63118ms Host To Device
146 84.164KB 32.000KB 1.0000MB 12.00000MB 3.055590ms Device To Host
Can you explain why?
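
In case it helps with diagnosing this, here is a minimal sketch of how the platform could be checked for concurrent managed access. I am assuming that the `cudaDevAttrConcurrentManagedAccess` device attribute is the relevant indicator for whether GPU page faults can occur. The helper `describeManagedAccess` is a hypothetical name used only for illustration and is not part of the tutorial.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the tutorial): maps the attribute value
// to a short human-readable description.
const char* describeManagedAccess(int concurrent) {
    return concurrent
        ? "concurrent managed access supported; GPU page faults may be reported"
        : "concurrent managed access not supported; GPU page faults not expected";
}

int main() {
    int device = 0;
    cudaError_t err = cudaGetDevice(&device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDevice failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Assumption: this attribute reflects whether the device can access
    // managed memory concurrently with the CPU on this OS/driver combination.
    int concurrent = 0;
    err = cudaDeviceGetAttribute(&concurrent,
                                 cudaDevAttrConcurrentManagedAccess,
                                 device);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaDeviceGetAttribute failed: %s\n",
                cudaGetErrorString(err));
        return 1;
    }

    printf("Device %d: concurrentManagedAccess = %d (%s)\n",
           device, concurrent, describeManagedAccess(concurrent));
    return 0;
}
```

If this prints 0 on my machine, that might explain why the profile only lists "Host To Device" and "Device To Host" transfers and no page-fault line, but I have not confirmed this.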