cudaDeviceSynchronize returns failure code 39

Dear all,

My machine:
Ubuntu 16.04
CUDA 9.0
4 x V100 16GB cards

I first run a large number of cuDNN operations on very large tensors, each close to 2GB in size.
At the end I call cudaDeviceSynchronize once. It waits for a very long time and then returns a failure, which my code reports like this:

Cuda failure: 39

I cannot find this failure code in the documentation or anywhere on the internet. Can anyone help?
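
For reference, here is a minimal sketch of how the raw number could be turned into a readable name, assuming the standard CUDA runtime calls cudaGetErrorName and cudaGetErrorString. The helper name report_cuda_error is hypothetical and is not taken from my real program. The numeric values of cudaError_t can differ between toolkit versions, so this should be built with the same toolkit that produced the failure (CUDA 9.0 here).

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper (not from the original program): prints the symbolic
// name and the description that the installed CUDA runtime associates with
// a raw error number.
static void report_cuda_error(int raw_code) {
    cudaError_t err = static_cast<cudaError_t>(raw_code);
    std::printf("Cuda failure %d: %s (%s)\n",
                raw_code, cudaGetErrorName(err), cudaGetErrorString(err));
}

int main() {
    // Decode the number that was printed by the failing run.
    report_cuda_error(39);

    // The same reporting applied directly to the synchronize call, so the
    // name is printed next to the number when the failure happens.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        report_cuda_error(static_cast<int>(err));
        return 1;
    }
    return 0;
}
```

Checking the return value of each earlier cuDNN and CUDA call in the same way might also show whether the failure really originates at the final synchronize or is only reported there.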