Slow real-time inference using WSL2

mutsuyuki · August 1, 2022, 8:23pm

I tried running real-time inference on Docker on WSL2.
The inference, which takes about 30ms in the Linux environment, takes about 80ms in the WSL2 environment.
(The inference is face detection with a model named MTCNN, implemented in PyTorch.)
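
When comparing figures like these across environments, it helps to time both the same way. Below is a minimal sketch of a timing harness, not the code used in the post. The names `benchmark`, `infer` and `sync` are hypothetical. It assumes that GPU work is queued asynchronously, so a synchronization call (for PyTorch, `torch.cuda.synchronize`) has to be passed in as `sync`, otherwise the timer may stop before the GPU has finished.

```python
import statistics
import time


def benchmark(infer, sync=None, warmup=10, iters=100):
    """Return the median latency of infer() in milliseconds."""
    # Warm-up calls absorb one-time costs such as context creation and caching.
    for _ in range(warmup):
        infer()
    if sync is not None:
        sync()

    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        infer()
        if sync is not None:
            sync()  # wait for queued GPU work before stopping the clock
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


if __name__ == "__main__":
    # Stand-in CPU workload so the sketch runs without a GPU.
    # With PyTorch the call would look something like:
    #   benchmark(lambda: detector(frame), sync=torch.cuda.synchronize)
    # where `detector` and `frame` are placeholders for your own model and input.
    print(f"{benchmark(lambda: sum(range(10_000))):.3f} ms")
```

Running the same harness on native Linux and inside the WSL2 container, with identical warm-up and iteration counts, makes the two numbers directly comparable.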

My environment
Host OS : Windows 11
WSL type : WSL2 (Ubuntu 20.04)
Docker image : nvcr.io/nvidia/tensorflow:20.10-tf1-py3 (Ubuntu 18.04.5)
GPU : RTX 3080 (laptop)
CUDA : 11.1
(I need to use both TensorFlow 1.15 and PyTorch in my software, so I installed PyTorch on top of the above Docker image.)

My Questions

In a WSL2 environment, does performing inference on small batches cause slowdowns?
Is there a possibility that running in a Docker environment on WSL2 will slow inference down further?
Are there any settings to avoid these slowdowns?

Other things I looked at
I read the following post about WSL2.

Looking at Figure 4, it appears that WSL2 is at a speed disadvantage when batches are small.
On the other hand, Figure 8 shows that asynchronous communication makes CUDA startup from WSL2 faster.

Does Figure 8 introduce a method that can run faster with smaller batches?
I don't understand the article well, so I would like to know whether it is possible to resolve the delay.

rboissel · August 3, 2022, 8:06pm

Hello,

In a WSL2 environment, does performing inference on small batches cause slowdowns?

Yes. The bigger the workload, the less relative overhead you will see.
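
As a rough illustration of why a larger workload hides the overhead, here is a toy model, not a measurement. It assumes the extra 50 ms seen on WSL2 (80 ms minus 30 ms, from the numbers in the question) is a fixed cost per inference call, and that compute time grows linearly with batch size. Both are assumptions made for the sake of the arithmetic; real behaviour will differ. The name `per_frame_ms` is hypothetical.

```python
# Toy model only: a fixed per-call overhead plus a linear per-frame compute cost.
COMPUTE_MS_PER_FRAME = 30.0  # native Linux figure quoted in the question
FIXED_OVERHEAD_MS = 50.0     # assumed: 80 ms on WSL2 minus 30 ms on Linux


def per_frame_ms(batch_size):
    """Average cost per frame when batch_size frames share one call."""
    total = FIXED_OVERHEAD_MS + COMPUTE_MS_PER_FRAME * batch_size
    return total / batch_size


for b in (1, 2, 5, 10):
    print(f"batch={b:2d}  per-frame={per_frame_ms(b):.1f} ms")
```

Under these assumptions the per-frame cost falls from 80 ms at batch size 1 towards the 30 ms compute cost as the batch grows, because the fixed part is shared by more frames.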

Is there a possibility that running in a Docker environment on WSL2 will slow inference down further?

Usually the slowdown introduced by the container is minor.

Are there any settings to avoid these slowdowns?

Although there is no way to completely avoid the slowdown, make sure of the following:

Run with Hardware Accelerated GPU Scheduling enabled (see "Hardware Accelerated GPU Scheduling", DirectX Developer Blog, microsoft.com).
Make sure your OS and WSL environment are up to date (you can run wsl --update).
Try increasing the size of your workload if possible.
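
One way to increase the workload size is to group several frames into a single inference call. The following is a minimal sketch, assuming the detector accepts a batch of frames. `batch_frames` is a hypothetical helper, and `detector` in the comment is a placeholder, not an API from the post.

```python
from itertools import islice


def batch_frames(frames, batch_size):
    """Yield lists of up to batch_size consecutive frames."""
    it = iter(frames)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for a camera stream; real frames would be image arrays.
frames = range(10)
for batch in batch_frames(frames, 4):
    # One call per batch instead of one per frame, e.g. detector(batch),
    # where `detector` is a placeholder for a model that takes batched input.
    print(len(batch), batch)
```

For a real-time pipeline there is a trade-off: earlier frames in a batch have to wait for later ones to arrive, so a larger batch may improve throughput while adding latency.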