This might be a stupid question, but I haven’t found anything about it: what is the best way to track memory bandwidth usage in Fortran code compiled with PGI?
For context, I want to get an idea of whether switching to dual-core Opteron processors would benefit my code.
Thanks in advance!
Sorry for the late reply. This is a tough problem, and it took a while to gather some information. As far as I can tell, there is no publicly available tool that measures memory bandwidth directly. (If anyone knows otherwise, please post!) However, a friend suggested downloading AMD’s CodeAnalyst tool. The best I can advise is counting the following event counters in CodeAnalyst:
Event E0: memory controller page access event
Event E3: memory controller turnaround
These events have masks that can count DRAM page misses, DRAM page hits, page conflicts, etc. separately.
From these, one can compute approximate memory bus usage by knowing:
Each memory controller page access event represents a 64-byte data transfer (for dual channel)
A page miss incurs a penalty of Trcd memcycles
A page conflict incurs a penalty of Trp + Trcd memcycles
R/W turnaround penalty: DRAM width (bytes) * 2 * 1 memcycles
W/R turnaround penalty: DRAM width (bytes) * 2 * (Tcl - 1) memcycles
DIMM turnaround penalty: DRAM width (bytes) * 2 * 2 memcycles
Tcl = CAS latency
Trp = row precharge time
Trcd = RAS-to-CAS delay
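To make the arithmetic above concrete, here is a minimal Python sketch of how the counts might be combined. It is not a CodeAnalyst feature. The function name, the parameter names, and the sample numbers are all hypothetical, and it assumes that the total number of page access events is the sum of the hit, miss, and conflict counts. The turnaround penalties are taken literally from the formulas listed above.

```python
# Rough sketch only: turns CodeAnalyst event counts into an approximate
# memory bus usage figure, using the rules of thumb listed in this post.
# All names here are hypothetical; nothing below is a CodeAnalyst API.

def estimate_bus_usage(page_hits, page_misses, page_conflicts,
                       rw_turnarounds, wr_turnarounds, dimm_turnarounds,
                       tcl, trp, trcd, dram_width_bytes,
                       elapsed_seconds, mem_clock_hz):
    # Assumption: every page access event is a hit, a miss, or a conflict.
    accesses = page_hits + page_misses + page_conflicts

    # Each page access event represents a 64-byte transfer (dual channel).
    bytes_moved = accesses * 64

    # Page miss: Trcd memcycles. Page conflict: Trp + Trcd memcycles.
    penalty = page_misses * trcd + page_conflicts * (trp + trcd)

    # Turnaround penalties, copied literally from the formulas in the post.
    penalty += rw_turnarounds * dram_width_bytes * 2 * 1
    penalty += wr_turnarounds * dram_width_bytes * 2 * (tcl - 1)
    penalty += dimm_turnarounds * dram_width_bytes * 2 * 2

    total_memcycles = elapsed_seconds * mem_clock_hz
    return {
        "bytes_moved": bytes_moved,
        "bandwidth_bytes_per_s": bytes_moved / elapsed_seconds,
        "penalty_memcycles": penalty,
        "penalty_fraction": penalty / total_memcycles,
    }


if __name__ == "__main__":
    # Made-up counts and timings, for illustration only.
    result = estimate_bus_usage(
        page_hits=1000, page_misses=100, page_conflicts=50,
        rw_turnarounds=10, wr_turnarounds=5, dimm_turnarounds=2,
        tcl=3, trp=3, trcd=3, dram_width_bytes=16,
        elapsed_seconds=1.0, mem_clock_hz=200e6)
    for key, value in result.items():
        print(key, value)
```

Comparing bandwidth_bytes_per_s against the theoretical peak of your memory configuration gives a rough idea of how close the code runs to the limit.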
Sorry it’s complicated, but I do not know of any public tool that will compute this automatically. Of course, the indirect way to check whether an application is bound by memory bandwidth is to rerun it with lower memory bandwidth (but the same latency) and see how much it slows down.
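If you try the indirect approach, the two timings can be turned into a rough estimate. The sketch below assumes a very simple model that is my own simplification, not something measured: run time is compute time plus memory time, with no overlap, and memory time scales inversely with bandwidth. The function name and parameters are hypothetical.

```python
# Rough sketch only: estimates what fraction of the run time is bound by
# memory bandwidth from two timed runs. Assumes time = compute + memory,
# with no overlap, and memory time inversely proportional to bandwidth.

def bandwidth_bound_fraction(t_full, t_reduced, bandwidth_ratio):
    """bandwidth_ratio is reduced bandwidth / full bandwidth (0 < ratio < 1)."""
    if not 0.0 < bandwidth_ratio < 1.0:
        raise ValueError("bandwidth_ratio must be between 0 and 1")
    if t_full <= 0.0:
        raise ValueError("t_full must be positive")

    # t_reduced - t_full = t_mem * (1 / ratio - 1), so solve for t_mem.
    t_mem = (t_reduced - t_full) / (1.0 / bandwidth_ratio - 1.0)

    # Clamp to [0, 1] so timing noise cannot produce a nonsensical result.
    return max(0.0, min(1.0, t_mem / t_full))


if __name__ == "__main__":
    # Made-up timings: 100 s at full bandwidth, 150 s at half bandwidth.
    print(bandwidth_bound_fraction(100.0, 150.0, 0.5))
```

A result near 0 suggests the code is not limited by memory bandwidth, while a result near 1 suggests that sharing the memory bus between two cores is likely to hurt.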
Hope this helps,