How can grid a100 use cuda tookit?

it seems cuda program cannot find gpu,but nvidia-smi can.here is my info
nvidia-smi -a:
Driver Version : 470.141.03
CUDA Version : 11.4

Attached GPUs : 1
GPU 00000000:00:08.0
Product Name : GRID A100D-1-10C
Product Brand : NVIDIA Virtual Compute Server
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : Enabled
MIG Mode
Current : Enabled
Pending : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : N/A
GPU UUID : GPU-bc83451e-649c-11ed-8386-4817b06ea88b
Minor Number : 0
VBIOS Version : 00.00.00.00.00
MultiGPU Board : No
Board ID : 0x8
GPU Part Number : N/A
Module ID : N/A
Inforom Version
Image Version : N/A
OEM Object : N/A
ECC Object : N/A
Power Management Object : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GSP Firmware Version : N/A
GPU Virtualization Mode
Virtualization Mode : VGPU
Host VGPU Mode : N/A
vGPU Software Licensed Product
Product Name : NVIDIA Virtual Compute Server
License Status : Licensed (Expiry: 2022-11-19 6:2:59 GMT)
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x00
Device : 0x08
Domain : 0x0000
Device Id : 0x20B510DE
Bus Id : 00000000:00:08.0
Sub System Id : 0x159110DE
GPU Link Info
PCIe Generation
Max : N/A
Current : N/A
Link Width
Max : N/A
Current : N/A
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : N/A
Replay Number Rollovers : N/A
Tx Throughput : N/A
Rx Throughput : N/A
Fan Speed : N/A
Performance State : P0
Clocks Throttle Reasons : N/A
FB Memory Usage
Total : 10235 MiB
Used : 1441 MiB
Free : 8794 MiB
BAR1 Memory Usage
Total : 4096 MiB
Used : 0 MiB
Free : 4096 MiB
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
Ecc Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows : N/A
Temperature
GPU Current Temp : N/A
GPU Shutdown Temp : N/A
GPU Slowdown Temp : N/A
GPU Max Operating Temp : N/A
GPU Target Temperature : N/A
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
Power Readings
Power Management : N/A
Power Draw : N/A
Power Limit : N/A
Default Power Limit : N/A
Enforced Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1275 MHz
Applications Clocks
Graphics : N/A
Memory : N/A
Default Applications Clocks
Graphics : N/A
Memory : N/A
Max Clocks
Graphics : N/A
SM : N/A
Memory : N/A
Video : N/A
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : N/A
Processes : None


lspci |grep -i nvidia
00:08.0 3D controller: NVIDIA Corporation GA100 [A100 PCIe 80GB] (rev a1)

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0

and testing with deviceQuery
CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 100
→ no CUDA-capable device is detected
Result = FAIL

can anyone could give me some advice?
Thanks.