I compiled my code with nvcc. When I profile it with nvprof:
nvprof ./test
it outputs what I’d expect:
==5911== Profiling application: ./test
==5911== Profiling result:
            Type  Time(%)      Time     Calls       Avg       Min       Max  Name
 GPU activities:  100.00%  4.1506ms         1  4.1506ms  4.1506ms  4.1506ms  render(vec3*, int, int, vec3, vec3, vec3, vec3)
      API calls:   98.25%  690.10ms         1  690.10ms  690.10ms  690.10ms  cudaMallocManaged
                    0.62%  4.3625ms         1  4.3625ms  4.3625ms  4.3625ms  cudaDeviceSynchronize
                    0.52%  3.6506ms         1  3.6506ms  3.6506ms  3.6506ms  cudaFree
                    0.32%  2.2770ms         1  2.2770ms  2.2770ms  2.2770ms  cuDeviceGetPCIBusId
                    0.27%  1.8872ms         1  1.8872ms  1.8872ms  1.8872ms  cudaLaunchKernel
                    0.01%  101.70us       101  1.0060us     800ns  2.3000us  cuDeviceGetAttribute
                    0.00%  12.900us         2  6.4500us  1.3000us  11.600us  cuDeviceGet
                    0.00%  6.0000us         3  2.0000us     900ns  3.0000us  cuDeviceGetCount
                    0.00%  2.9000us         1  2.9000us  2.9000us  2.9000us  cuDeviceGetName
                    0.00%  1.3000us         1  1.3000us  1.3000us  1.3000us  cuDeviceTotalMem
                    0.00%  1.2000us         1  1.2000us  1.2000us  1.2000us  cudaGetLastError
                    0.00%  1.1000us         1  1.1000us  1.1000us  1.1000us  cuDeviceGetUuid
However, when I add the --metrics flag, specifically to collect inst_fp_32 and inst_fp_64, I get the following:
==5856== NVPROF is profiling process 5856, command: ./test
==5856== Error: Internal profiling error 4190:38.
took 0.010927 seconds.
======== Profiling result:
No events/metrics were profiled.
======== Error: CUDA profiling error.
I’ve tried searching but can’t find anything about error 4190:38 or how to fix it.
Specifications:
- Compiled with nvcc: release 11.5, V11.5.119
- Architecture is sm_61, for a GTX 1060 6GB
- Using WSL2 - Ubuntu 20.04 LTS
Any help would be greatly appreciated. Everything compiles and runs as I’d expect; only nvprof gives me this error.