I’m looking to create a benchmarking tool.
Why not take a look at the benchmarking tools that already exist in the CUDA samples and work out an idea from those?
bandwidthTest is probably a good starting point. deviceQuery will also provide you with some information.
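To give a rough idea of what such a tool measures, here is a minimal sketch, not the actual bandwidthTest source. It prints the device name (one of the things deviceQuery reports) and times host-to-device copies with CUDA events. The 64 MiB buffer size, the 20 repetitions, and the use of device 0 are arbitrary assumptions for illustration.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                         cudaGetErrorString(err_));                     \
            std::exit(EXIT_FAILURE);                                    \
        }                                                               \
    } while (0)

int main() {
    const size_t bytes = size_t(64) << 20;  // 64 MiB per copy (assumed size)
    const int reps = 20;                    // copies to average over (assumed)

    // Report which device is being measured (assumes device 0).
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    std::printf("Device 0: %s\n", prop.name);

    void* h_buf = nullptr;
    void* d_buf = nullptr;
    CHECK(cudaMallocHost(&h_buf, bytes));  // pinned (page-locked) host memory
    CHECK(cudaMalloc(&d_buf, bytes));

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    // Warm-up copy so one-time setup cost is not included in the timing.
    CHECK(cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice));

    // Time the repeated copies with events recorded on the default stream.
    CHECK(cudaEventRecord(start));
    for (int i = 0; i < reps; ++i) {
        CHECK(cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice));
    }
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;  // elapsed time between the two events, in milliseconds
    CHECK(cudaEventElapsedTime(&ms, start, stop));

    // Total bytes moved divided by elapsed seconds, reported in GB/s (1e9 bytes).
    const double gb_per_s = static_cast<double>(bytes) * reps / (ms * 1e-3) / 1e9;
    std::printf("Host-to-device: %.2f GB/s\n", gb_per_s);

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaFree(d_buf));
    CHECK(cudaFreeHost(h_buf));
    return 0;
}
```

Saved as, say, bandwidth_sketch.cu (a made-up file name), it should build with `nvcc bandwidth_sketch.cu -o bandwidth_sketch`. Swapping the copy direction or the buffer size gives you the other numbers a bandwidth benchmark usually reports.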
Seldom seen such a broad question around here.
As an NVIDIA GPU can compute nearly everything, you could also benchmark nearly everything.
FLOPS
Best Answer hahaha.