Why is training on a 4090 slower than on a P100 when running the same code?

Hi,

I’m asking because I have seen benchmarks where the 4090 beats the Tesla P100 GPU. But when I do the same work in a live class, I notice that their time is about 5 seconds/iteration (let's say) and mine is about 6.

Why is there such a difference even though I have the better machine?

Is it because they are on Google Colab and I am on a local machine?