My current system has a GTX 480 installed as a secondary, non-display card, running CUDA 4.1 on Linux. Several of my applications work fine with the Visual Profiler, but for one particular application I want to profile, I cannot get the Kernel Memory Analysis to run to completion.
After 11 runs of my program, I get the following error:
“Unable to collect metric and event values.”
I’d like to know what I can do to get the dram_reads and dram_writes information so I can figure out where the bottlenecks in my code lie. I’d also like to know whether it is possible to get around this error, and if so how; the Analysis view tells me there is “Insufficient Global Memory Load Data”. Is the command-line profiler still available in CUDA 4.1, and would it help?
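For reference, this is roughly what I would try with the command-line profiler, assuming it still works in 4.1 the way it did in earlier toolkits (driven by the COMPUTE_PROFILE environment variables). It is only a sketch: the counter names in the config file are my assumption of the Fermi framebuffer counters behind dram_reads and dram_writes, and ./my_app and the file names are placeholders.

```shell
#!/bin/sh
# Sketch only: assumes the environment-variable-driven command-line profiler
# is still present in CUDA 4.1.

# Counters to collect, one per line. These names are assumed to be the Fermi
# framebuffer counters that the dram_reads / dram_writes metrics derive from.
cat > profile_config.txt <<'EOF'
fb_subp0_read_sectors
fb_subp1_read_sectors
fb_subp0_write_sectors
fb_subp1_write_sectors
EOF

export COMPUTE_PROFILE=1                          # turn profiling on
export COMPUTE_PROFILE_CSV=1                      # write the log as CSV
export COMPUTE_PROFILE_CONFIG=profile_config.txt  # counter list from above
export COMPUTE_PROFILE_LOG=profile_log.csv        # output file (placeholder name)

# Placeholder application path; run it only if it actually exists.
APP=${APP:-./my_app}
if [ -x "$APP" ]; then
    "$APP"
fi
```

If this works, profile_log.csv should contain one row per kernel launch with the raw counter values, which I could then add up by hand.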