GTX 1080 doesn't release memory

I have a new computer specifically to use this GTX 1080 for deep learning.
The problem I am experiencing is that I am getting ResourceExhaustedError exceptions from TensorFlow even though I am feeding it very small batches that should fit in the GPU's memory.

At first, nvidia-smi shows that the X server and gnome-shell are taking up only a couple hundred MiB of memory.

Sat Nov  4 23:58:15 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.22                 Driver Version: 387.22                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:03:00.0  On |                  N/A |
| 24%   40C    P0    46W / 180W |    274MiB /  8105MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       698      G   /usr/lib/xorg-server/Xorg                     18MiB |
|    0       733      G   /usr/bin/gnome-shell                          28MiB |
|    0       924      G   /usr/lib/xorg-server/Xorg                    100MiB |
|    0       974      G   /usr/bin/gnome-shell                         124MiB |
+-----------------------------------------------------------------------------+

Then, in a Python shell, I import Keras and create a VGG network loaded with ImageNet weights. As soon as I have done this, memory usage is 7721MiB / 8105MiB, and it stays that way until I terminate the Python process.

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.22                 Driver Version: 387.22                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:03:00.0  On |                  N/A |
| 24%   41C    P2    44W / 180W |   7721MiB /  8105MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0       698      G   /usr/lib/xorg-server/Xorg                     18MiB |
|    0       733      G   /usr/bin/gnome-shell                          28MiB |
|    0       924      G   /usr/lib/xorg-server/Xorg                    100MiB |
|    0       974      G   /usr/bin/gnome-shell                         124MiB |
|    0      6264      C   /usr/bin/python3                            7437MiB |
+-----------------------------------------------------------------------------+

I think there must be an issue with my GPU not releasing memory.
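From what I have read, TensorFlow's allocator grabs nearly all free GPU memory up front by default and holds it for the lifetime of the process, so perhaps this is expected behavior rather than a leak. This is a sketch of the session configuration I understand should limit that (TF 1.x / Keras 2.0.x API; I have not verified it fixes my errors):

```python
import tensorflow as tf
from keras import backend as K

# Ask TensorFlow to allocate GPU memory on demand instead of
# grabbing almost all of it when the session is created.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

# Alternatively, cap the allocation at a fraction of total memory:
# config.gpu_options.per_process_gpu_memory_fraction = 0.5

# Make Keras use this session for all subsequent model operations.
K.set_session(tf.Session(config=config))
```

Even with `allow_growth`, memory that TensorFlow has allocated is not returned to the driver until the process exits; it is only reused internally.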

Here is my system:
MSI - X99A TOMAHAWK ATX LGA2011-3 Motherboard
Intel - Xeon E5-1620 V4 3.5GHz Quad-Core Processor
G.Skill - Ripjaws V Series 16GB (2 x 8GB) DDR4-3000 Memory
Gigabyte - GeForce GTX 1080 8GB Video Card
4.13.9-1-ARCH x86_64 GNU/Linux

Above I mentioned that I am loading Keras and TensorFlow; here are the versions:
TensorFlow 1.4.0
Keras 2.0.9
CUDA 9.0.176

Please let me know if any other information is needed…
Thank you

Since writing this, I have tried playing some games and running the CUDA sample tests (all pass), so I now suspect this might be an issue with TensorFlow 1.4 and CUDA 9, though I am not sure. I will try downgrading to CUDA 8 and post back here.

Can someone tell me whether it is normal for memory usage to stay this high even after all operations have completed? Shouldn't the GPU release the memory?