Could you share the code used to profile cuBLAS and cuDNN throughput at different input dimensions?

isaaclee2313 · January 22, 2020, 10:55am

Could you share the code you’ve used for profiling the throughput of cuBLAS and cuDNN operations at different input dimensions?

If not, what are some caveats in implementing such a profiler? Any advice or guidance would be appreciated!