This is my test result on 4 1080Ti GPUs:
This is the test result on 1 1080Ti GPU:
I only changed the batch size from 32 to 128; the dataset and model are the same.
Why is the time 300 ms/batch on 4 1080Ti GPUs instead of 80 ms?
When I increase the number of GPUs to 4, the amount of data per batch also increases by 4 times, so each GPU still gets 32 samples. Shouldn't the time per batch be close?
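
To put the two results side by side, here is a small Python sketch of the arithmetic. It assumes the 80 ms figure is the per-batch time on 1 GPU with batch size 32; the helper name `throughput` is only for illustration and is not part of my training code.

```python
def throughput(batch_size, ms_per_batch):
    """Samples processed per second for a given batch size and per-batch time."""
    return batch_size * 1000.0 / ms_per_batch


# 1 GPU, batch size 32, 80 ms/batch (assumed to be the single-GPU timing)
single_gpu = throughput(32, 80)

# 4 GPUs, batch size 128, 300 ms/batch
four_gpu = throughput(128, 300)

# Speedup relative to 1 GPU, and how much of the ideal 4x is reached
speedup = four_gpu / single_gpu
efficiency = speedup / 4

print(f"1 GPU : {single_gpu:.1f} samples/s")
print(f"4 GPUs: {four_gpu:.1f} samples/s")
print(f"speedup: {speedup:.2f}x, scaling efficiency: {efficiency:.0%}")
```

With these numbers the 4-GPU run is only slightly faster in samples per second than the 1-GPU run, which is what I don't understand.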
Can someone help me?