nvidia-docker error - Check failed: error == cudaSuccess (35 vs. 0) CUDA driver version is insuffic

I am using nvidia-docker with the nvidia/cuda:7.5-cudnn5-devel image.
I hit the following error when trying to compile Caffe:

E0727 14:59:57.959830   674 common.cpp:113] Cannot create Cublas handle. Cublas won't be available.
E0727 14:59:57.960569   674 common.cpp:120] Cannot create Curand generator. Curand won't be available.
F0727 14:59:57.961776   674 syncedmem.hpp:18] Check failed: error == cudaSuccess (35 vs. 0)  CUDA driver version is insufficient for CUDA runtime version

Some of the build configuration output:

-- Added CUDA NVCC flags for: sm_20 sm_21 sm_30 sm_35 sm_50
-- OpenCV found (/usr/share/OpenCV)
-- Found Atlas (include: /usr/include, library: /usr/lib/libatlas.so)
-- NumPy ver. 1.11.1 found (include: /usr/local/lib/python2.7/dist-packages/numpy/core/include)
-- Boost version: 1.54.0
-- Found the following Boost libraries:
--   python
-- Could NOT find Doxygen (missing:  DOXYGEN_EXECUTABLE)
--
-- ******************* Caffe Configuration Summary *******************
-- General:
--   Version           :   1.0.0-rc3
--   Git               :   42cd785
--   System            :   Linux
--   C++ compiler      :   /usr/bin/c++
--   Release CXX flags :   -O3 -DNDEBUG -fPIC -Wall -Wno-sign-compare -Wno-uninitialized
--   Debug CXX flags   :   -g -fPIC -Wall -Wno-sign-compare -Wno-uninitialized
--   Build type        :   Release
--
--   BUILD_SHARED_LIBS :   ON
--   BUILD_python      :   ON
--   BUILD_matlab      :   OFF
--   BUILD_docs        :   ON
--   CPU_ONLY          :   OFF
--   USE_OPENCV        :   ON
--   USE_LEVELDB       :   ON
--   USE_LMDB          :   ON
--   ALLOW_LMDB_NOLOCK :   OFF
--
-- Dependencies:
--   BLAS              :   Yes (Atlas)
--   Boost             :   Yes (ver. 1.54)
--   glog              :   Yes
--   gflags            :   Yes
--   protobuf          :   Yes (ver. 2.5.0)
--   lmdb              :   Yes (ver. 0.9.10)
--   LevelDB           :   Yes (ver. 1.15)
--   Snappy            :   Yes (ver. 1.1.0)
--   OpenCV            :   Yes (ver. 2.4.8)
--   CUDA              :   Yes (ver. 7.5)
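
For anyone landing here with the same message: error 35 in the log above is cudaErrorInsufficientDriver, meaning the installed driver reports support for an older CUDA version than the runtime the program was built against (7.5 here, per the configuration summary). With nvidia-docker the driver comes from the host, so one possible cause is an outdated host driver. Below is a minimal Python sketch of the comparison, not an official tool. The helper names are made up for illustration, and the library name "libcudart.so" is an assumption about what is on the loader path inside the container; cudaDriverGetVersion and cudaRuntimeGetVersion are the real CUDA runtime calls.

```python
import ctypes

# Value of cudaErrorInsufficientDriver in the CUDA 7.5 runtime (the "35" in the log).
CUDA_ERROR_INSUFFICIENT_DRIVER = 35


def decode_cuda_version(value):
    """Split CUDA's integer encoding (1000 * major + 10 * minor) into (major, minor)."""
    return value // 1000, (value % 1000) // 10


def driver_is_sufficient(driver_version, runtime_version):
    """The driver must support at least the CUDA version of the runtime."""
    return driver_version >= runtime_version


def query_versions(libname="libcudart.so"):
    """Return (driver, runtime) version integers, or None if they cannot be queried.

    The library name is an assumption; adjust it to whatever your image ships.
    """
    try:
        cudart = ctypes.CDLL(libname)
    except OSError:
        return None
    driver, runtime = ctypes.c_int(0), ctypes.c_int(0)
    if cudart.cudaDriverGetVersion(ctypes.byref(driver)) != 0:
        return None
    if cudart.cudaRuntimeGetVersion(ctypes.byref(runtime)) != 0:
        return None
    return driver.value, runtime.value


if __name__ == "__main__":
    versions = query_versions()
    if versions is None:
        print("CUDA runtime library not found; nothing to compare")
    else:
        driver, runtime = versions
        print("driver supports CUDA %d.%d" % decode_cuda_version(driver))
        print("runtime is CUDA %d.%d" % decode_cuda_version(runtime))
        if not driver_is_sufficient(driver, runtime):
            print("driver too old for this runtime (would surface as error %d)"
                  % CUDA_ERROR_INSUFFICIENT_DRIVER)
```

If the driver version comes back lower than the runtime version (or 0, which the runtime reports when no driver is visible), that would be consistent with the failure shown above.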

Did you solve the problem?

I am trying to compile Caffe:

https://github.com/CMU-Perceptual-Computing-Lab/caffe_train

These run successfully:

make all
make test

But when I run

make runtest

I get the following error (note lines 9-10 and line 90 onward of the output):

Cuda number of devices: 1
Setting to use device 1
Current device id: 0
Current device name: GeForce GTX 950M
Note: Randomizing tests' orders with a seed of 48454 .
[==========] Running 2081 tests from 277 test cases.
[----------] Global test environment set-up.
[----------] 10 tests from PowerLayerTest/0, where TypeParam = caffe::CPUDevice<float>
[ RUN      ] PowerLayerTest/0.TestPowerTwo
E0504 19:57:11.898780 14435 common.cpp:113] Cannot create Cublas handle. Cublas won't be available.
[       OK ] PowerLayerTest/0.TestPowerTwo (550 ms)
[ RUN      ] PowerLayerTest/0.TestPowerOne
[       OK ] PowerLayerTest/0.TestPowerOne (0 ms)
[ RUN      ] PowerLayerTest/0.TestPowerOneGradient
[       OK ] PowerLayerTest/0.TestPowerOneGradient (1 ms)
[ RUN      ] PowerLayerTest/0.TestPower
[       OK ] PowerLayerTest/0.TestPower (0 ms)
[ RUN      ] PowerLayerTest/0.TestPowerGradient
[       OK ] PowerLayerTest/0.TestPowerGradient (3 ms)
[ RUN      ] PowerLayerTest/0.TestPowerGradientShiftZero
[       OK ] PowerLayerTest/0.TestPowerGradientShiftZero (5 ms)
[ RUN      ] PowerLayerTest/0.TestPowerTwoGradient
[       OK ] PowerLayerTest/0.TestPowerTwoGradient (1 ms)
[ RUN      ] PowerLayerTest/0.TestPowerTwoScaleHalfGradient
[       OK ] PowerLayerTest/0.TestPowerTwoScaleHalfGradient (2 ms)
[ RUN      ] PowerLayerTest/0.TestPowerZero
[       OK ] PowerLayerTest/0.TestPowerZero (0 ms)
[ RUN      ] PowerLayerTest/0.TestPowerZeroGradient
[       OK ] PowerLayerTest/0.TestPowerZeroGradient (1 ms)
[----------] 10 tests from PowerLayerTest/0 (563 ms total)

[----------] 3 tests from SplitLayerTest/1, where TypeParam = caffe::CPUDevice<double>
[ RUN      ] SplitLayerTest/1.Test
[       OK ] SplitLayerTest/1.Test (0 ms)
[ RUN      ] SplitLayerTest/1.TestGradient
[       OK ] SplitLayerTest/1.TestGradient (3 ms)
[ RUN      ] SplitLayerTest/1.TestSetup
[       OK ] SplitLayerTest/1.TestSetup (0 ms)
[----------] 3 tests from SplitLayerTest/1 (3 ms total)

[----------] 2 tests from EuclideanLossLayerTest/1, where TypeParam = caffe::CPUDevice<double>
[ RUN      ] EuclideanLossLayerTest/1.TestGradient
[       OK ] EuclideanLossLayerTest/1.TestGradient (1 ms)
[ RUN      ] EuclideanLossLayerTest/1.TestForward
[       OK ] EuclideanLossLayerTest/1.TestForward (0 ms)
[----------] 2 tests from EuclideanLossLayerTest/1 (1 ms total)

[----------] 8 tests from SliceLayerTest/3, where TypeParam = caffe::GPUDevice<double>
[ RUN      ] SliceLayerTest/3.TestSetupChannels
[       OK ] SliceLayerTest/3.TestSetupChannels (9 ms)
[ RUN      ] SliceLayerTest/3.TestSliceAcrossNum
[       OK ] SliceLayerTest/3.TestSliceAcrossNum (1 ms)
[ RUN      ] SliceLayerTest/3.TestTrivialSlice
[       OK ] SliceLayerTest/3.TestTrivialSlice (3 ms)
[ RUN      ] SliceLayerTest/3.TestSetupNum
[       OK ] SliceLayerTest/3.TestSetupNum (2 ms)
[ RUN      ] SliceLayerTest/3.TestGradientAcrossNum
[       OK ] SliceLayerTest/3.TestGradientAcrossNum (411 ms)
[ RUN      ] SliceLayerTest/3.TestGradientAcrossChannels
[       OK ] SliceLayerTest/3.TestGradientAcrossChannels (414 ms)
[ RUN      ] SliceLayerTest/3.TestGradientTrivial
[       OK ] SliceLayerTest/3.TestGradientTrivial (18 ms)
[ RUN      ] SliceLayerTest/3.TestSliceAcrossChannels
[       OK ] SliceLayerTest/3.TestSliceAcrossChannels (2 ms)
[----------] 8 tests from SliceLayerTest/3 (860 ms total)

[----------] 8 tests from LRNLayerTest/0, where TypeParam = caffe::CPUDevice<float>
[ RUN      ] LRNLayerTest/0.TestForwardAcrossChannelsLargeRegion
[       OK ] LRNLayerTest/0.TestForwardAcrossChannelsLargeRegion (0 ms)
[ RUN      ] LRNLayerTest/0.TestSetupWithinChannel
[       OK ] LRNLayerTest/0.TestSetupWithinChannel (0 ms)
[ RUN      ] LRNLayerTest/0.TestSetupAcrossChannels
[       OK ] LRNLayerTest/0.TestSetupAcrossChannels (0 ms)
[ RUN      ] LRNLayerTest/0.TestGradientAcrossChannelsLargeRegion
[       OK ] LRNLayerTest/0.TestGradientAcrossChannelsLargeRegion (533 ms)
[ RUN      ] LRNLayerTest/0.TestForwardWithinChannel
[       OK ] LRNLayerTest/0.TestForwardWithinChannel (0 ms)
[ RUN      ] LRNLayerTest/0.TestForwardAcrossChannels
[       OK ] LRNLayerTest/0.TestForwardAcrossChannels (0 ms)
[ RUN      ] LRNLayerTest/0.TestGradientAcrossChannels
[       OK ] LRNLayerTest/0.TestGradientAcrossChannels (483 ms)
[ RUN      ] LRNLayerTest/0.TestGradientWithinChannel
[       OK ] LRNLayerTest/0.TestGradientWithinChannel (438 ms)
[----------] 8 tests from LRNLayerTest/0 (1454 ms total)

[----------] 50 tests from NeuronLayerTest/2, where TypeParam = caffe::GPUDevice<float>
[ RUN      ] NeuronLayerTest/2.TestLogGradient
[       OK ] NeuronLayerTest/2.TestLogGradient (15 ms)
[ RUN      ] NeuronLayerTest/2.TestLogLayerBase2Shift1Scale3
F0504 19:57:14.253590 14435 math_functions.cu:85] Check failed: status == CUBLAS_STATUS_SUCCESS (1 vs. 0)  CUBLAS_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
    @     0x2b111fffadaa  (unknown)
    @     0x2b111ffface4  (unknown)
    @     0x2b111fffa6e6  (unknown)
    @     0x2b111fffd687  (unknown)
    @     0x2b1122183d17  caffe::caffe_gpu_scal<>()
    @     0x2b1122176279  caffe::LogLayer<>::Forward_gpu()
    @           0x477e46  caffe::Layer<>::Forward()
    @           0x548d90  caffe::NeuronLayerTest<>::TestLogForward()
    @           0x8fca63  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @           0x8f3747  testing::Test::Run()
    @           0x8f37ee  testing::TestInfo::Run()
    @           0x8f38f5  testing::TestCase::Run()
    @           0x8f6c38  testing::internal::UnitTestImpl::RunAllTests()
    @           0x8f6ec7  testing::UnitTest::Run()
    @           0x46cbbf  main
    @     0x2b1122ff1f45  (unknown)
    @           0x474819  (unknown)
    @              (nil)  (unknown)
make: *** [runtest] Aborted (core dumped)
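
In a long test run like this, the interesting lines are the ones glog marks as E (error) or F (fatal). The following is a small Python sketch for pulling them out of saved output together with their line numbers. The function name find_problems and the regular expression are my own and assume the usual glog prefix seen above (severity letter, mmdd date, time, thread id, file:line]).

```python
import re

# glog prefix: severity letter, mmdd, hh:mm:ss.uuuuuu, thread id, "file:line]", message.
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^:\s]+:\d+)\] (?P<message>.*)$"
)


def find_problems(log_text):
    """Return (line_number, severity, source, message) for each E or F glog line."""
    problems = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = GLOG_LINE.match(line.strip())
        if match and match.group("severity") in ("E", "F"):
            problems.append((number, match.group("severity"),
                             match.group("source"), match.group("message")))
    return problems


if __name__ == "__main__":
    # Two lines copied from the output above, with one ordinary test line between them.
    sample = "\n".join([
        "E0504 19:57:11.898780 14435 common.cpp:113] Cannot create Cublas handle. "
        "Cublas won't be available.",
        "[       OK ] PowerLayerTest/0.TestPowerTwo (550 ms)",
        "F0504 19:57:14.253590 14435 math_functions.cu:85] Check failed: "
        "status == CUBLAS_STATUS_SUCCESS (1 vs. 0)  CUBLAS_STATUS_NOT_INITIALIZED",
    ])
    for number, severity, source, message in find_problems(sample):
        print("line %d [%s] %s: %s" % (number, severity, source, message))
```

On this output it would point at the early "Cannot create Cublas handle" error, which comes well before the fatal CUBLAS_STATUS_NOT_INITIALIZED check in the first cuBLAS call.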

A thread describing the same problem points to this discussion for a solution:
https://github.com/BVLC/caffe/issues/5564

When I run “make runtest -j8”, it fails with:
Cannot create Cublas handle. Cublas won’t be available.

But running “sudo make runtest -j8” works.
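
Since the same command works under sudo, one thing worth ruling out is a permissions problem for the regular user. That is only a guess on my part; nothing in this thread confirms it. Here is a rough Python sketch that lists which NVIDIA device nodes the current user cannot read and write. The /dev/nvidia* pattern is the usual location on Linux but is an assumption about your system, and the function name inaccessible is made up for this example.

```python
import glob
import os


def inaccessible(paths):
    """Return the paths the current user cannot both read and write."""
    return [path for path in paths if not os.access(path, os.R_OK | os.W_OK)]


if __name__ == "__main__":
    # Assumed device node location; adjust if your system differs.
    device_nodes = sorted(glob.glob("/dev/nvidia*"))
    if not device_nodes:
        print("no /dev/nvidia* device nodes found")
    blocked = inaccessible(device_nodes)
    for path in blocked:
        print("no read/write access for this user: %s" % path)
    if device_nodes and not blocked:
        print("all device nodes are accessible; the cause is probably elsewhere")
```

Run it as the normal user, not with sudo, otherwise everything will appear accessible.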