I run my application in a container and sometimes get errors. When the cache in the container reaches the cgroup's limit (e.g. after running dd or cp, or fetching a remote file), deviceQuery sometimes fails.
NVIDIA-SMI 375.66 Driver Version: 375.66
CUDA: 8.0
# ./deviceQuery
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
cudaGetDeviceCount returned 3
-> initialization error
Result = FAIL
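
To confirm that the cache really is at the cgroup limit when deviceQuery fails, something like the script below could be run inside the container just before ./deviceQuery. This is only a sketch: it assumes the cgroup v1 memory controller is mounted at /sys/fs/cgroup/memory (the path may differ on other setups, and cgroup v2 uses different file names), and the helper names parse_memory_stat, cache_fraction and report are made up for illustration.

```python
from pathlib import Path

# Assumption: cgroup v1 memory controller, as seen from inside the container.
CGROUP_MEM = Path("/sys/fs/cgroup/memory")


def parse_memory_stat(text):
    """Parse the 'key value' lines of memory.stat into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def cache_fraction(stat_text, limit_bytes):
    """Return the share of the memory limit currently taken by cache."""
    cache = parse_memory_stat(stat_text).get("cache", 0)
    return cache / limit_bytes if limit_bytes > 0 else 0.0


def report(cgroup_dir=CGROUP_MEM):
    """Print the limit, the current usage, and how much of the limit is cache."""
    limit = int((cgroup_dir / "memory.limit_in_bytes").read_text())
    usage = int((cgroup_dir / "memory.usage_in_bytes").read_text())
    stat_text = (cgroup_dir / "memory.stat").read_text()
    frac = cache_fraction(stat_text, limit)
    print(f"limit={limit} usage={usage} cache={frac:.1%} of limit")


if __name__ == "__main__":
    if (CGROUP_MEM / "memory.stat").exists():
        report()
    else:
        print("cgroup v1 memory controller not found at", CGROUP_MEM)
```

If the cache share is close to 100% of the limit every time deviceQuery returns the initialization error, and well below it when deviceQuery passes, that would support the correlation described above.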