Description
Hi,
I’ve run some tests to compare inference performance in a Windows 10 environment against an Ubuntu 22.04 one.
Software specs:
| | Windows | Ubuntu |
|---|---|---|
| Drivers | 535.98 | 535.86.05 |
| CUDA | 11.8 | 11.8 |
| CuDNN | 8.7.0 | 8.7.0 |
| TensorRT | 8.5.1 | 8.5.1 |
Test setup:
- Windows: install drivers, CUDA, cuDNN and TensorRT locally;
- Ubuntu: build the TensorRT container with the versions shown in the table above.
Then:
- Export a YOLO .pt model to TensorRT using the export script provided by Ultralytics (on Windows and inside the container);
- Run inference on the same batch of images with the same script, performing the same pre- and post-processing operations;
- Record the inference times.
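The timing step above can be sketched as follows (a minimal sketch, not the actual Ultralytics script: `run_inference` is a placeholder for the real TensorRT call, and with a real CUDA backend the stream/device must be synchronized before stopping the timer, otherwise only the kernel launch is measured):

```python
import time

def run_inference(batch):
    # Placeholder standing in for the real TensorRT engine execution.
    # With CUDA you would synchronize before the timer stops, e.g.
    # torch.cuda.synchronize() or stream.synchronize().
    return [x * 2 for x in batch]

def timed_runs(batch, n_runs=10, warmup=3):
    """Time n_runs inference calls after a few warm-up iterations,
    since the first runs typically include lazy initialization."""
    for _ in range(warmup):
        run_inference(batch)
    times = []
    for _ in range(n_runs):
        t0 = time.perf_counter()
        run_inference(batch)
        times.append(time.perf_counter() - t0)
    return times

times = timed_runs([1, 2, 3])
for t in times:
    print(f"Inference time : {t}")
```

Measuring this way, each printed value covers one full inference call on the batch, which matches the per-run times listed below.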
What we observe is stable behaviour on Ubuntu, with times in the 8 ms – 11 ms range for a batch of three images, while on Windows the times stay in that range for the first runs and then degrade to around 40 ms and above for the rest.
Using older driver versions (< 470), with CUDA, cuDNN and TensorRT unchanged, performance is aligned between the two systems, but we lose support for newer NVIDIA features.
Are you aware of any driver issues on Windows?
Thanks
Environment
TensorRT Version: 8.5.1
GPU Type: Quadro T1000
Nvidia Driver Version: 535.98 / 535.86.05
CUDA Version: 11.8
CUDNN Version: 8.7.0
Operating System + Version: Windows 10 / Ubuntu 22.04
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Relevant Files
Times on Windows 10
Inference time : 0.010017156600952148
Inference time : 0.011110782623291016
Inference time : 0.007866144180297852
Inference time : 0.006834983825683594
Inference time : 0.009020566940307617
Inference time : 0.01261758804321289
Inference time : 0.008991241455078125
Inference time : 0.008015632629394531
…
…
…
Inference time : 0.052073001861572266
Inference time : 0.048119306564331055
Inference time : 0.04847455024719238
Inference time : 0.05039215087890625
Inference time : 0.04845285415649414
Inference time : 0.05358147621154785
Inference time : 0.04916524887084961
Inference time : 0.05600476264953613
Inference time : 0.059531450271606445
Inference time : 0.055614471435546875
Inference time : 0.05081486701965332
Inference time : 0.05752444267272949
Inference time : 0.052495479583740234
Inference time : 0.049124956130981445
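Summarising the two regimes from the listing above (values copied verbatim from the log; the first eight "fast" samples versus the last fourteen "slow" samples):

```python
# Per-run inference times (seconds) copied from the Windows 10 log above.
fast = [0.010017156600952148, 0.011110782623291016, 0.007866144180297852,
        0.006834983825683594, 0.009020566940307617, 0.01261758804321289,
        0.008991241455078125, 0.008015632629394531]
slow = [0.052073001861572266, 0.048119306564331055, 0.04847455024719238,
        0.05039215087890625, 0.04845285415649414, 0.05358147621154785,
        0.04916524887084961, 0.05600476264953613, 0.059531450271606445,
        0.055614471435546875, 0.05081486701965332, 0.05752444267272949,
        0.052495479583740234, 0.049124956130981445]

print(f"fast mean: {sum(fast) / len(fast) * 1000:.1f} ms")  # → fast mean: 9.3 ms
print(f"slow mean: {sum(slow) / len(slow) * 1000:.1f} ms")  # → slow mean: 52.2 ms
```

So the later Windows runs are roughly 5–6× slower than the initial ones, versus a flat 8 ms – 11 ms on Ubuntu.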
Steps To Reproduce
- Export a YOLO .pt model to TensorRT using the export script provided by Ultralytics (on Windows and inside the container);
- Run inference on the same batch of images with the same script, performing the same pre- and post-processing operations;
- Record the inference times.