A kernel performs 36 floating-point operations and 7 32-bit word global
memory accesses per thread. For each of the following device properties,
indicate whether this kernel is compute-bound or memory-bound.
A. Peak FLOPS = 200 GFLOPS, Peak Memory Bandwidth = 100 GB/s
B. Peak FLOPS = 300 GFLOPS, Peak Memory Bandwidth = 250 GB/s
Can someone explain how I can approach solving this question?
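Compute the ratio of FLOPs per byte using the data given per thread. Each 32-bit word is 4 bytes, so 7 accesses move 28 bytes, and the ratio is 36/28 ≈ 1.29 FLOP/byte. Because every thread does the same work, this is also the ratio for the kernel as a whole.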
Then compute the same ratio for the machines given in A and B by dividing peak FLOPS by peak memory bandwidth (A = 200/100 = 2 FLOP/byte, B = 300/250 = 1.2 FLOP/byte). If the kernel's ratio is larger than the machine's ratio, the kernel is compute-bound on that machine. If it is smaller, the kernel is memory-bound on that machine. If the two are identical, the kernel is neither compute- nor memory-bound, or it is both. Here the kernel's ratio is roughly 1.3, which is below 2 and above 1.2, so the kernel is memory-bound on A and compute-bound on B.
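If it helps to check the arithmetic, the comparison above can be sketched in a few lines of Python. The function names (`arithmetic_intensity`, `machine_balance`, `classify`) are made up for this illustration and are not from any library. The sketch assumes each 32-bit word is 4 bytes and that the peak numbers are actually achievable, which real hardware rarely allows.

```python
def arithmetic_intensity(flops_per_thread, words_per_thread, bytes_per_word=4):
    """FLOPs performed per byte of global memory traffic (assumes 4-byte words)."""
    return flops_per_thread / (words_per_thread * bytes_per_word)


def machine_balance(peak_gflops, peak_gb_per_s):
    """FLOPs the device can perform per byte it can move, at peak rates."""
    return peak_gflops / peak_gb_per_s


def classify(kernel_ratio, machine_ratio):
    """Compare the kernel's FLOP/byte ratio against the machine's."""
    if kernel_ratio > machine_ratio:
        return "compute-bound"
    if kernel_ratio < machine_ratio:
        return "memory-bound"
    # Exact equality is the degenerate case: both limits are hit at once.
    return "balanced"


# Numbers from the question: 36 FLOPs and 7 32-bit accesses per thread.
kernel = arithmetic_intensity(36, 7)

for name, gflops, gbps in [("A", 200, 100), ("B", 300, 250)]:
    machine = machine_balance(gflops, gbps)
    print(f"{name}: kernel {kernel:.2f} FLOP/B vs machine {machine:.2f} FLOP/B "
          f"-> {classify(kernel, machine)}")

# Prints:
# A: kernel 1.29 FLOP/B vs machine 2.00 FLOP/B -> memory-bound
# B: kernel 1.29 FLOP/B vs machine 1.20 FLOP/B -> compute-bound
```

The same comparison works for any other device: only the two peak numbers passed to `machine_balance` change.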