I tested the inference performance of a V100 and a 2080 Ti using TensorRT and PyCUDA. The tested models were ResNet50 and Inception_v1.
In my code, however, the V100 was slower than the 2080 Ti, even though every reference I have found reports higher throughput for the V100.
I suspect I am not using TensorRT and CUDA appropriately. How can I use them properly?
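For reference, my timing loop is roughly shaped like the sketch below (simplified; `infer` is a hypothetical stand-in for the actual TensorRT execution call, e.g. running the engine's execution context). I am not sure whether this methodology itself, such as the warm-up count or averaging, could explain the result:

```python
import time

def benchmark(infer, warmup=10, iters=100):
    # Warm-up runs: the first few inferences include one-time costs
    # (CUDA context setup, engine initialization, lazy allocations)
    # that would otherwise skew the average.
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(iters):
        infer()
    elapsed = time.perf_counter() - start
    return elapsed / iters  # mean latency per run, in seconds

# Placeholder workload standing in for the real inference call.
mean_latency = benchmark(lambda: sum(range(1000)))
print(f"mean latency: {mean_latency * 1e6:.1f} us")
```

In the real run I also synchronize the CUDA stream before reading the clock, since kernel launches are asynchronous and timing without synchronization can report misleading numbers.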
If you need any further information, such as my full code or the frozen graphs, please let me know.