We profiled our kernel function on two machines. The source code is the same, but some metrics differ a lot, e.g. “shared_load_transactions”.
The profiling results are shown below:
Invocations Metric Name Metric Description Min Max Avg
Device "GeForce 940MX (0)"
Kernel: match_kernel
5 shared_load_transactions_per_request Shared Memory Load Transactions Per Request 1.000000 1.000000 1.000000
5 shared_store_transactions_per_request Shared Memory Store Transactions Per Request 0.000000 0.000000 0.000000
5 shared_efficiency Shared Memory Efficiency 100.00% 100.00% 100.00%
5 shared_store_transactions Shared Store Transactions 0 0 0
5 shared_load_transactions Shared Load Transactions 1990656 1990656 1990656
5 ipc Executed IPC 1.899034 1.903520 1.901302
5 achieved_occupancy Achieved Occupancy 0.093046 0.093062 0.093053
5 issued_ipc Issued IPC 1.899049 1.903535 1.901318
Invocations Metric Name Metric Description Min Max Avg
Device "Xavier (0)"
Kernel: match_kernel
6 shared_load_transactions_per_request Shared Memory Load Transactions Per Request 1.003864 1.004223 1.004103
6 shared_store_transactions_per_request Shared Memory Store Transactions Per Request 0.000000 0.000000 0.000000
6 shared_efficiency Shared Memory Efficiency 99.58% 99.62% 99.59%
6 shared_store_transactions Shared Store Transactions 0 0 0
6 shared_load_transactions Shared Load Transactions 6661157 6663541 6662742
6 ipc Executed IPC 2.713783 2.733202 2.724108
6 achieved_occupancy Achieved Occupancy 0.124009 0.124711 0.124416
6 issued_ipc Issued IPC 2.713786 2.733206 2.724112
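To put the two runs side by side, here is a rough sketch that compares the "Avg" values for shared_load_transactions and estimates the number of load requests on each device. The dictionary layout and helper name are only illustrative, and the relation requests = transactions / transactions_per_request is our assumption about how the profiler derives the per-request metric, not something we have confirmed.

```python
# Values copied from the "Avg" column of the nvprof output above.
profiles = {
    "GeForce 940MX": {
        "load_transactions": 1990656,
        "transactions_per_request": 1.000000,
    },
    "Xavier": {
        "load_transactions": 6662742,
        "transactions_per_request": 1.004103,
    },
}


def estimated_requests(profile):
    """Estimate shared load requests.

    Assumption (not confirmed): requests = transactions / transactions_per_request.
    """
    return profile["load_transactions"] / profile["transactions_per_request"]


requests = {name: estimated_requests(p) for name, p in profiles.items()}

# How many times more transactions / requests Xavier reports than the 940MX.
transaction_ratio = (
    profiles["Xavier"]["load_transactions"]
    / profiles["GeForce 940MX"]["load_transactions"]
)
request_ratio = requests["Xavier"] / requests["GeForce 940MX"]

for name, count in requests.items():
    print(f"{name}: ~{count:.0f} estimated shared load requests")
print(f"transaction ratio (Xavier / 940MX): {transaction_ratio:.3f}")
print(f"request ratio (Xavier / 940MX): {request_ratio:.3f}")
```

If that assumption holds, both ratios come out at roughly 3.3, and transactions per request is close to 1 on both devices, so the gap seems to be in the number of load requests rather than in bank conflicts.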
What’s wrong? Why does the same kernel report so many more shared load transactions on Xavier?