It seems that none of the recent visual profiler releases - at least CUDA 3.2, 4.0rc1 and 4.0rc2 - can actually profile code which is not run on the first CUDA device. After some hardware rearrangement in my development box yesterday, the device which used to be enumerated as 0 is now enumerated as 1. It is the only device in the system on which compute is permitted. I can no longer profile code, because the compute profiler seems to be hardwired to look for profiler output from device 0 only. To illustrate:
If I start the compute visual profiler and run it without selecting a device, the code runs, then the visual profiler reports the usual “temp_compute_profiler_0_0.csv’ for application run 0 not found” error. I can sort of understand that - the profiler is looking for output from device 0 and the code ran on device 1 - although one might wonder why the profiler would offer to profile code on a compute prohibited device, but I digress…
If I select device 1 explicitly in the session settings and run the application, the program runs and the profiler reports the same “temp_compute_profiler_0_0.csv’ for application run 0 not found” error. When I look in the temporary directory, the driver has dropped the requisite csv files:
$ ls *.csv
temp_compute_profiler_0_1.csv   temp_compute_profiler_12_1.csv  temp_compute_profiler_3_1.csv  temp_compute_profiler_7_1.csv
temp_compute_profiler_10_1.csv  temp_compute_profiler_13_1.csv  temp_compute_profiler_4_1.csv  temp_compute_profiler_8_1.csv
temp_compute_profiler_11_1.csv  temp_compute_profiler_14_1.csv  temp_compute_profiler_5_1.csv  temp_compute_profiler_9_1.csv
temp_compute_profiler_1_1.csv   temp_compute_profiler_2_1.csv   temp_compute_profiler_6_1.csv
but the compute profiler seems to be looking only for csv files which end in _0, which I presume signifies the device from which the profiler output was generated. The result is a compute profiler failure.
Has anyone else seen this? Am I doing something wrong? Is there some sort of workaround? I can’t believe I am going to be stuck without access to the visual profiler until the 4.1 release cycle…
EDIT: There is a workaround of sorts. Do a run in the compute profiler, rename the .csv output from that run by hand, then do a second run. The compute profiler will find the renamed results of the first run and process them. It correctly parses the csv headers and reports the correct session, context and device information. Seriously, fix this please.
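For anyone who wants to script the renaming rather than do it by hand, here is a minimal sketch in Python. It rests on my presumption above that the trailing number in temp_compute_profiler_<run>_<n>.csv is the device index and that the profiler only looks for files ending in _0. The function name and its arguments are my own invention, not part of any CUDA tool, and the directory you pass in is whatever temporary directory your session writes to.

```python
import re
from pathlib import Path

# Assumed naming scheme, inferred from the listing above:
#   temp_compute_profiler_<run>_<suffix>.csv
PATTERN = re.compile(r"^temp_compute_profiler_(\d+)_(\d+)\.csv$")


def rename_profiler_output(directory, from_suffix=1, to_suffix=0):
    """Rename profiler csv files ending in _<from_suffix> to _<to_suffix>.

    Hypothetical helper: returns the list of new file names.
    Existing target files are left alone rather than overwritten.
    """
    renamed = []
    for path in sorted(Path(directory).iterdir()):
        match = PATTERN.match(path.name)
        if match is None or int(match.group(2)) != from_suffix:
            continue  # not a profiler file, or not the suffix we want
        run = match.group(1)
        target = path.with_name(f"temp_compute_profiler_{run}_{to_suffix}.csv")
        if target.exists():
            continue  # never clobber output the profiler may still need
        path.rename(target)
        renamed.append(target.name)
    return renamed

# Example call (the path is a placeholder, substitute your own):
# rename_profiler_output("/path/to/profiler/temp/dir")
```

Run it between the first and second profiler runs. It only touches files matching the pattern above, so anything else in the directory is left as it is.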