GPU Memory on K80 vs V100

Hi,

We have built an image classification model with a pre-trained ResNet50, using torch==1.4.0 and torchvision==0.5.0.

The model was initially deployed on a K80-based VM in Azure. Later, the application and model were moved to a Tesla V100-based GPU instance on AWS. There, each process takes noticeably more GPU memory than it did on the K80 VM (details pasted below).

Since we also deploy other models on this single-GPU machine, we have started getting out-of-memory errors. The same containers run on the K80 instance without any issue.
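In case it helps with debugging, here is a minimal snippet (my own sketch, not part of the deployed code) that we can run inside one of the containers to compare PyTorch's internal accounting against the nvidia-smi figures. torch.cuda.memory_allocated() and torch.cuda.memory_reserved() exist in torch 1.4.0; the gap between memory_reserved() and the nvidia-smi number is roughly the CUDA context, whose size can differ between GPU architectures:

```python
def mib(nbytes: int) -> float:
    # Convert bytes to MiB so the numbers line up with nvidia-smi's MiB column.
    return nbytes / (1024 ** 2)

try:
    import torch
    if torch.cuda.is_available():
        # memory_allocated: bytes held by live tensors.
        # memory_reserved: total blocks held by PyTorch's caching allocator;
        # nvidia-smi reports this plus the CUDA context overhead.
        print(f"allocated: {mib(torch.cuda.memory_allocated()):.0f} MiB")
        print(f"reserved:  {mib(torch.cuda.memory_reserved()):.0f} MiB")
except ImportError:
    # Allows the snippet to run (as a no-op) on machines without torch.
    pass
```

If allocated/reserved are similar on both machines but nvidia-smi differs, the extra usage is context/library overhead rather than our tensors.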

Please let me know what is causing the GPU memory to spike and how to resolve it.
Thank you

(Since I'm a new user I can't upload screenshots, so I've pasted the nvidia-smi output.)

On the K80, below is the memory usage:

| NVIDIA-SMI 440.100      Driver Version: 440.100      CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   42C    P0    70W / 149W |   2535MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2825      C   /usr/bin/python3                             504MiB |
|    0      2826      C   /usr/bin/python3                             504MiB |
|    0      2827      C   /usr/bin/python3                             504MiB |
|    0      2828      C   /usr/bin/python3                             504MiB |
|    0      2829      C   /usr/bin/python3                             504MiB |
+-----------------------------------------------------------------------------+

On the V100:

| NVIDIA-SMI 440.95.01    Driver Version: 440.95.01    CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2…    Off  | 00000000:00:1E.0 Off |                    0 |
| N/A   44C    P0    38W / 300W |   6376MiB / 16160MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      3263      C   /usr/bin/python3                            1273MiB |
|    0      3264      C   /usr/bin/python3                            1273MiB |
|    0      3265      C   /usr/bin/python3                            1273MiB |
|    0      3266      C   /usr/bin/python3                            1273MiB |
|    0      3268      C   /usr/bin/python3                            1273MiB |
+-----------------------------------------------------------------------------+