sysmem_read_throughput vs dram_read_throughput


Profiling on nvidia gpu with capability 3.x will produce system memory and dram throughput metrics. Dram measures the device(global, in my opinion) memory. What does system memory measure?



CUDA Profiler supports two dram metrics dram_read_transaction and dram_write_transaction. These metrics measure the read and write transactions for the device memory i.e. memory allocated on GPU for global, constant, and texture operations. For sysmem, we have sysmem_read_throughput and sysmem_write_throughput metrics. These metrics measure read and write throughput for the system memory i.e. memory allocated on system/host RAM.