StyleGAN2 performance issue on RTX 3090

I am running a StyleGAN2 model on 4x RTX 3090 GPUs and observed that training takes much longer to start up than on 1x RTX 3090. However, once training starts, it finishes earlier on 4x than on 1x. I am using CUDA 11.1 and TensorFlow 1.14 on both setups.

Secondly, when using 1x RTX 2080 Ti with CUDA 10.2 and TensorFlow 1.14, training starts much faster than on 1x RTX 3090 with CUDA 11.1 and TensorFlow 1.14. Roughly, it takes 5 minutes on 1x RTX 2080 Ti, 30-35 minutes on 1x RTX 3090, and 1.5 hours on 4x RTX 3090 to start training.

I would be grateful if anyone could help me resolve this issue. I am planning to purchase 4x RTX 3090 GPUs as soon as possible, but they are not giving the desired performance.

I am using Ubuntu 16.04, a Core™ i9-10980XE CPU, and 32 GB of RAM in both the 2080 Ti and 3090 machines.

Edit: After debugging, I found that line 162 of stylegan2/training_loop.py at master · NVlabs/stylegan2 · GitHub takes about 16 minutes on the RTX 3090.

Might you be confusing the 3090 with the 3090 Ti?
I'm watching this issue. It seems there is the same kind of problem starting a training process on the 3090 compared with the 1080 Ti.

Thanks @true_artificial_intellige, I have updated my query. Please let me know when you find the cause and a solution for it.

Yes. With CUDA 11.1, it will take around 30 minutes to start TensorFlow, because the driver has to JIT-compile the GPU kernels for the new Ampere architecture on first launch.

That is correct; it is not a problem.

But you can set `export CUDA_CACHE_MAXSIZE=2147483648` to force TensorFlow to start quickly the next time, so only the first run takes 30 minutes.
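A minimal sketch of how that might look in practice, assuming the slowdown is the CUDA driver's JIT compilation cache being too small to retain the recompiled kernels between runs (the `CUDA_CACHE_PATH` default and the training command shown are illustrative, not from the original post):

```shell
# Raise the CUDA JIT compilation cache to 2 GiB (2147483648 bytes) so that
# kernels recompiled for the RTX 3090 persist across runs instead of being
# evicted and rebuilt every time.
export CUDA_CACHE_MAXSIZE=2147483648

# The driver's default cache location on Linux; shown here for clarity only.
export CUDA_CACHE_PATH="${CUDA_CACHE_PATH:-$HOME/.nv/ComputeCache}"

# Then launch training as usual, e.g. (hypothetical command):
# python run_training.py --config=config-f --data-dir=datasets
```

Adding these lines to `~/.bashrc` makes them persistent, so every subsequent run after the first should skip the long JIT step.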

Can you please tell me whether it is important to use Ubuntu 16.04 specifically? I have been trying to launch StyleGAN2 on Ubuntu 18.04 with TensorFlow 1.14 and CUDA 11.1, but no luck. It crashes early with a 'segmentation fault' runtime error.
Any suggestion is much appreciated. Thank you!

StyleGAN2 will not even give correct results on Ubuntu 16.04 if you use CUDA 11.1 and TF 1.14, because CUDA 11.1 and TF 1.14 are incompatible.
In my case, with Ubuntu 16.04, CUDA 11.1, and TF 1.14, the code ran but the resulting images were just black.