How to determine whether a kernel is compute-bound or memory-bound from peak FLOPS and peak memory bandwidth?

A kernel performs 36 floating-point operations and 7 32-bit word global
memory accesses per thread. For each of the following device properties,
indicate whether this kernel is compute-bound or memory-bound.
A. Peak FLOPS = 200 GFLOPS, Peak Memory Bandwidth = 100 GB/s
B. Peak FLOPS = 300 GFLOPS, Peak Memory Bandwidth = 250 GB/s

Can someone explain how I can approach solving this question?

@Robert_Crovella Could you kindly guide me on how to approach this question?

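Compute the ratio of FLOPs to bytes using the per-thread figures given. Each 32-bit word is 4 bytes, so 7 accesses move 28 bytes per thread. Since every thread does the same work, this also defines the ratio for the kernel as a whole: 36/28 ≈ 1.29 FLOPs/byte.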

Then compute the same ratio from the machine definitions given in A and B (A = 200/100 = 2 FLOPs/byte, B = 300/250 = 1.2 FLOPs/byte). If the kernel's ratio is larger than the machine's ratio, the kernel is compute-bound on that machine. If it is smaller, the kernel is memory-bound on that machine. If the two are identical, the kernel is neither compute-bound nor memory-bound, or it is both. Here the kernel's ratio is below A's and above B's, so it is memory-bound on A and compute-bound on B.
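If it helps to see the comparison spelled out, here is a minimal Python sketch of the same arithmetic. The function names (`arithmetic_intensity`, `classify`) and the "balanced" label are my own choices for illustration and are not from the question. It also assumes each access moves exactly one 4-byte word and ignores caching, coalescing, and achievable (as opposed to peak) throughput.

```python
def arithmetic_intensity(flops_per_thread, words_per_thread, bytes_per_word=4):
    """FLOPs per byte of global memory traffic for one thread.

    Assumes every access moves exactly one word of bytes_per_word bytes
    (4 bytes for the 32-bit words in the question).
    """
    return flops_per_thread / (words_per_thread * bytes_per_word)


def classify(kernel_intensity, peak_gflops, peak_gbytes_per_s):
    """Compare the kernel's FLOPs/byte with the machine's FLOPs/byte."""
    # GFLOPS divided by GB/s gives FLOPs per byte for the machine.
    machine_ratio = peak_gflops / peak_gbytes_per_s
    if kernel_intensity > machine_ratio:
        return "compute-bound"
    if kernel_intensity < machine_ratio:
        return "memory-bound"
    return "balanced"  # hypothetical label for the exactly-equal case


kernel = arithmetic_intensity(36, 7)  # 36 / 28 FLOPs per byte

for name, gflops, gbs in [("A", 200, 100), ("B", 300, 250)]:
    print(name, classify(kernel, gflops, gbs))
# prints:
# A memory-bound
# B compute-bound
```

The exact-equality branch is only there to mirror the "identical" case in the explanation. With measured rather than textbook numbers you would likely want a tolerance instead of a strict comparison.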


@Robert_Crovella Thank you, nicely explained.
