I have a computer with 3 CUDA devices (a GTX 295, which is a dual-GPU card, plus an 8800 GTX), and I want to profile multi-GPU CUDA code. The problem is that the profiler only shows usage for the 2nd and 3rd devices. The 1st device invariably shows no data, although it is clearly running kernels as well.
The profiler also reports an empty column/header error when the profiling run completes, which I assume is related to the missing data for the 1st device.
Any ideas what’s going on here?