Could not run deviceQuery with CUDA 10.1 with Tesla V100 on CentOS 7

Dear experts:

I would like to install CUDA 10.1 and run my PyTorch scripts on CentOS 7 with a Tesla V100 installed.
The remote compute unit runs CentOS 7, and I have successfully installed NVIDIA driver v418.87.01:

$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 418.87.01 Wed Sep 25 06:00:38 UTC 2019
GCC version: gcc version 4.8.5 20150623 (Red Hat 4.8.5-39) (GCC)

and nvidia-smi gives:
$ sudo nvidia-smi
Wed Dec 4 22:54:39 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.87.01    Driver Version: 418.87.01    CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  On   | 00000000:06:10.0 Off |                    0 |
| N/A   34C    P0    27W / 250W |      0MiB / 32480MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
After installing the CUDA compiler, nvcc gives:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Fri_Feb__8_19:08:17_PST_2019
Cuda compilation tools, release 10.1, V10.1.105

I can also build the NVIDIA CUDA samples successfully, but running deviceQuery from the samples bin directory
gives no answer:

$ /usr/local/cuda-10.1/samples/bin/x86_64/linux/release/deviceQuery
/usr/local/cuda-10.1/samples/bin/x86_64/linux/release/deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Then it freezes here and waits indefinitely.

The same thing happens when I run some simple CUDA-related PyTorch commands.
Could someone explain what the problem is and how I can fix it?
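
To reproduce the hang without blocking the shell, either command can be wrapped in a timeout. The following is only a minimal sketch, assuming Python 3 is available on the machine. The helper name `run_with_timeout` and the 30-second limit are illustrative assumptions, not commands that were actually run for this post.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=30):
    """Run cmd and classify the outcome.

    Returns (status, output), where status is "ok" (exit code 0),
    "error" (non-zero exit code or the command could not be started),
    or "hang" (no exit within timeout_s seconds).
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # The child is killed on timeout; keep whatever it printed so far.
        out = exc.output or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return "hang", out
    except OSError as exc:
        # For example, the binary does not exist at the given path.
        return "error", str(exc)
    status = "ok" if proc.returncode == 0 else "error"
    return status, proc.stdout
```

For example, `run_with_timeout(["/usr/local/cuda-10.1/samples/bin/x86_64/linux/release/deviceQuery"])` would be expected to return a "hang" status on the machine described here, and the same helper can wrap a one-line check such as `[sys.executable, "-c", "import torch; print(torch.cuda.is_available())"]` (an assumed stand-in for the PyTorch commands mentioned above). One caveat: if the process is stuck inside the driver in uninterruptible sleep, the kill issued on timeout may not take effect and the helper itself can block.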

Thanks in advance.

Does nobody know the reason?