Hi,
I am using ncu to profile my kernel and want to see its roofline. ncu produces a roofline chart, but the chart does not show the achieved value for my kernel.
Is that because my kernel performs too few floating-point operations?
My kernel accelerates a graph algorithm, so most of its operations are integer operations rather than floating-point ones.
Which metric should I use to get the achieved value for my kernel through ncu?
The kernel has no FP operations at all, only integer and memory-access operations. Can I still get an achieved value in this case? It’s fine if there is no FLOPS figure, but is there a similar metric for the achieved performance?