Where is mdev_supported_types when using an A100 80GB PCIe?
root@gpu004:~# ls /sys/bus/pci/devices/0000:01:00.0/
aer_dev_correctable broken_parity_status current_link_speed dma_mask_bits i2c-4 irq max_link_speed msi_irqs remove resource resource1_resize resource3_wc sriov_offset sriov_vf_total_msix uevent virtfn10 virtfn14 virtfn18 virtfn4 virtfn8
aer_dev_fatal class current_link_width driver i2c-5 link max_link_width numa_node rescan resource0 resource1_wc revision sriov_stride subsystem vendor virtfn11 virtfn15 virtfn19 virtfn5 virtfn9
aer_dev_nonfatal config d3cold_allowed driver_override iommu local_cpulist modalias power reset resource0_resize resource3 sriov_drivers_autoprobe sriov_totalvfs subsystem_device virtfn0 virtfn12 virtfn16 virtfn2 virtfn6
ari_enabled consistent_dma_mask_bits device enable iommu_group local_cpus msi_bus power_state reset_method resource1 resource3_resize sriov_numvfs sriov_vf_device subsystem_vendor virtfn1 virtfn13 virtfn17 virtfn3 virtfn7
root@gpu004:~# ls /sys/bus/pci/devices/0000:01:00.0/virtfn0
ari_enabled class consistent_dma_mask_bits current_link_width device driver enable iommu_group link local_cpus max_link_width msi_bus physfn power_state reset_method resource0 resource1_wc resource3_wc sriov_vf_msix_count subsystem_device uevent
broken_parity_status config current_link_speed d3cold_allowed dma_mask_bits driver_override iommu irq local_cpulist max_link_speed modalias numa_node power reset resource resource1 resource3 revision subsystem subsystem_vendor vendor
root@gpu004:~# ls /sys/class/mdev_bus/*/mdev_supported_types
ls: cannot access '/sys/class/mdev_bus/*/mdev_supported_types': No such file or directory
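For SR-IOV GPUs, my understanding of the vGPU documentation is that mdev_supported_types is expected under each virtual function's directory rather than under /sys/class/mdev_bus, yet the virtfn0 listing above does not show it. A minimal sketch for checking every VF of a physical function at once follows. The function name list_vf_mdev_types and the SYSFS_PCI override variable are hypothetical names introduced here for illustration, not part of any NVIDIA tooling.

```shell
#!/bin/sh
# Hypothetical helper: for each VF of a physical function (PF), report whether
# an mdev_supported_types directory exists under it.
# SYSFS_PCI is an assumed override so the sysfs root can be swapped out for
# testing; it defaults to the real location.
list_vf_mdev_types() {
    pf="$1"
    base="${SYSFS_PCI:-/sys/bus/pci/devices}"
    for vf in "$base/$pf"/virtfn*; do
        # Skip the unexpanded glob when the PF has no VFs enabled.
        [ -e "$vf" ] || continue
        if [ -d "$vf/mdev_supported_types" ]; then
            echo "$(basename "$vf"): present"
        else
            echo "$(basename "$vf"): missing"
        fi
    done
}
```

Running list_vf_mdev_types 0000:01:00.0 would print one line per virtfn entry, which makes it quick to see whether the directory is missing on all VFs or only on some.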
root@gpu004:~# uname -a
Linux gpu004 6.5.0-44-generic #44~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Jun 18 14:36:16 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
root@gpu004:~# ./NVIDIA-Linux-x86_64-550.90.05-vgpu-kvm.run -no-drm
root@gpu004:~# lsmod | grep nvidia
nvidia_vgpu_vfio 106496 10
nvidia 54300672 3
vfio 65536 2 nvidia_vgpu_vfio,vfio_iommu_type1
kvm 1409024 2 kvm_amd,nvidia_vgpu_vfio
irqbypass 12288 2 nvidia_vgpu_vfio,kvm
mdev 24576 1 nvidia_vgpu_vfio
drm 765952 5 drm_kms_helper,ast,drm_shmem_helper,nvidia
root@gpu004:~# nvidia-smi -L
GPU 0: NVIDIA A100 80GB PCIe (UUID: GPU-74a6fa1b-b700-76d4-06f2-11d21415346b)
GPU 1: NVIDIA A100 80GB PCIe (UUID: GPU-39fd6ab0-4a0b-b355-7be8-040b9f0db869)
GPU 2: NVIDIA A100 80GB PCIe (UUID: GPU-9cd9c7fa-4e7f-37c6-cd79-e788ab4f69aa)
GPU 3: NVIDIA A100 80GB PCIe (UUID: GPU-fae63a05-f25d-12d9-5536-211387befa5d)
root@gpu004:~# nvidia-smi mig -i 0 -cgi 0
Successfully created GPU instance ID 0 on GPU 0 using profile MIG 7g.80gb (ID 0)
root@gpu004:~# nvidia-smi mig -i 1 -cgi 0
Successfully created GPU instance ID 0 on GPU 1 using profile MIG 7g.80gb (ID 0)
root@gpu004:~# nvidia-smi mig -i 2 -cgi 19
Successfully created GPU instance ID 9 on GPU 2 using profile MIG 1g.10gb (ID 19)
root@gpu004:~# nvidia-smi mig -i 3 -cgi 19
Successfully created GPU instance ID 9 on GPU 3 using profile MIG 1g.10gb (ID 19)
root@gpu004:~# /usr/lib/nvidia/sriov-manage -e ALL
Enabling VFs on 0000:01:00.0
Enabling VFs on 0000:41:00.0
Enabling VFs on 0000:81:00.0
Enabling VFs on 0000:c4:00.0
Note: After enabling the VFs, nvidia-smi no longer shows the MIG instances created above.
root@gpu004:~# nvidia-smi
Wed Jul 17 19:05:42 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.90.05 Driver Version: 550.90.05 CUDA Version: N/A |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A100 80GB PCIe On | 00000000:01:00.0 Off | On |
| N/A 46C P0 97W / 300W | 0MiB / 81920MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA A100 80GB PCIe On | 00000000:41:00.0 Off | On |
| N/A 47C P0 105W / 300W | 0MiB / 81920MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+
| 2 NVIDIA A100 80GB PCIe On | 00000000:81:00.0 Off | On |
| N/A 46C P0 99W / 300W | 0MiB / 81920MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+
| 3 NVIDIA A100 80GB PCIe On | 00000000:C4:00.0 Off | On |
| N/A 47C P0 101W / 300W | 0MiB / 81920MiB | N/A Default |
| | | Enabled |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| MIG devices: |
+------------------+----------------------------------+-----------+-----------------------+
| GPU GI CI MIG | Memory-Usage | Vol| Shared |
| ID ID Dev | BAR1-Usage | SM Unc| CE ENC DEC OFA JPG |
| | | ECC| |
|==================+==================================+===========+=======================|
| No MIG devices found |
+-----------------------------------------------------------------------------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
root@gpu004:~# nvidia-smi -q
==============NVSMI LOG==============
Timestamp : Wed Jul 17 19:13:11 2024
Driver Version : 550.90.05
CUDA Version : Not Found
vGPU Driver Capability
Heterogenous Multi-vGPU : Supported
Attached GPUs : 4
GPU 00000000:01:00.0
Product Name : NVIDIA A100 80GB PCIe
Product Brand : NVIDIA
Product Architecture : Ampere
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : N/A
vGPU Device Capability
Fractional Multi-vGPU : Not Supported
Heterogeneous Time-Slice Profiles : Supported
Heterogeneous Time-Slice Sizes : Not Supported
MIG Mode
Current : Enabled
Pending : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1324321032234
GPU UUID : GPU-74a6fa1b-b700-76d4-06f2-11d21415346b
Minor Number : 0
VBIOS Version : 92.00.68.00.01
MultiGPU Board : No
Board ID : 0x100
Board Part Number : 900-21001-0020-000
GPU Part Number : 20B5-893-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : 1001.0230.00.03
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : Host VGPU
Host VGPU Mode : SR-IOV
vGPU Heterogeneous Mode : Disabled
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x2
Device Id : 0x20B510DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x153310DE
GPU Link Info
PCIe Generation
Max : 4
Current : 4
Device Current : 4
Device Max : 4
Host Max : N/A
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : N/A
Performance State : P0
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 81920 MiB
Reserved : 699 MiB
Used : 0 MiB
Free : 81222 MiB
BAR1 Memory Usage
Total : 131072 MiB
Used : 1 MiB
Free : 131071 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
JPEG : N/A
OFA : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
SRAM Threshold Exceeded : No
Aggregate Uncorrectable SRAM Sources
SRAM L2 : 0
SRAM SM : 0
SRAM Microcontroller : 0
SRAM PCIE : 0
SRAM Other : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 640 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 46 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 92 C
GPU Slowdown Temp : 89 C
GPU Max Operating Temp : 85 C
GPU Target Temperature : N/A
Memory Current Temp : 56 C
Memory Max Operating Temp : 95 C
GPU Power Readings
Power Draw : 97.54 W
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
Min Power Limit : 150.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1275 MHz
Applications Clocks
Graphics : 1410 MHz
Memory : 1512 MHz
Default Applications Clocks
Graphics : 1410 MHz
Memory : 1512 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1290 MHz
Max Customer Boost Clocks
Graphics : 1410 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 900.000 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes : None
GPU 00000000:41:00.0
Product Name : NVIDIA A100 80GB PCIe
Product Brand : NVIDIA
Product Architecture : Ampere
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : N/A
vGPU Device Capability
Fractional Multi-vGPU : Not Supported
Heterogeneous Time-Slice Profiles : Supported
Heterogeneous Time-Slice Sizes : Not Supported
MIG Mode
Current : Enabled
Pending : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1324321150478
GPU UUID : GPU-39fd6ab0-4a0b-b355-7be8-040b9f0db869
Minor Number : 1
VBIOS Version : 92.00.68.00.01
MultiGPU Board : No
Board ID : 0x4100
Board Part Number : 900-21001-0020-000
GPU Part Number : 20B5-893-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : 1001.0230.00.03
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : Host VGPU
Host VGPU Mode : SR-IOV
vGPU Heterogeneous Mode : Disabled
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x41
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x2
Device Id : 0x20B510DE
Bus Id : 00000000:41:00.0
Sub System Id : 0x153310DE
GPU Link Info
PCIe Generation
Max : 4
Current : 4
Device Current : 4
Device Max : 4
Host Max : N/A
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : N/A
Performance State : P0
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 81920 MiB
Reserved : 699 MiB
Used : 0 MiB
Free : 81222 MiB
BAR1 Memory Usage
Total : 131072 MiB
Used : 1 MiB
Free : 131071 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
JPEG : N/A
OFA : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
SRAM Threshold Exceeded : No
Aggregate Uncorrectable SRAM Sources
SRAM L2 : 0
SRAM SM : 0
SRAM Microcontroller : 0
SRAM PCIE : 0
SRAM Other : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 640 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 47 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 92 C
GPU Slowdown Temp : 89 C
GPU Max Operating Temp : 85 C
GPU Target Temperature : N/A
Memory Current Temp : 58 C
Memory Max Operating Temp : 95 C
GPU Power Readings
Power Draw : 105.54 W
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
Min Power Limit : 150.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1275 MHz
Applications Clocks
Graphics : 1410 MHz
Memory : 1512 MHz
Default Applications Clocks
Graphics : 1410 MHz
Memory : 1512 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1290 MHz
Max Customer Boost Clocks
Graphics : 1410 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 893.750 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes : None
GPU 00000000:81:00.0
Product Name : NVIDIA A100 80GB PCIe
Product Brand : NVIDIA
Product Architecture : Ampere
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : N/A
vGPU Device Capability
Fractional Multi-vGPU : Not Supported
Heterogeneous Time-Slice Profiles : Supported
Heterogeneous Time-Slice Sizes : Not Supported
MIG Mode
Current : Enabled
Pending : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1324321150993
GPU UUID : GPU-9cd9c7fa-4e7f-37c6-cd79-e788ab4f69aa
Minor Number : 2
VBIOS Version : 92.00.68.00.01
MultiGPU Board : No
Board ID : 0x8100
Board Part Number : 900-21001-0020-000
GPU Part Number : 20B5-893-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : 1001.0230.00.03
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : Host VGPU
Host VGPU Mode : SR-IOV
vGPU Heterogeneous Mode : Disabled
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x81
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x2
Device Id : 0x20B510DE
Bus Id : 00000000:81:00.0
Sub System Id : 0x153310DE
GPU Link Info
PCIe Generation
Max : 4
Current : 4
Device Current : 4
Device Max : 4
Host Max : N/A
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : N/A
Performance State : P0
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 81920 MiB
Reserved : 699 MiB
Used : 0 MiB
Free : 81222 MiB
BAR1 Memory Usage
Total : 131072 MiB
Used : 1 MiB
Free : 131071 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
JPEG : N/A
OFA : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
SRAM Threshold Exceeded : No
Aggregate Uncorrectable SRAM Sources
SRAM L2 : 0
SRAM SM : 0
SRAM Microcontroller : 0
SRAM PCIE : 0
SRAM Other : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 640 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 46 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 92 C
GPU Slowdown Temp : 89 C
GPU Max Operating Temp : 85 C
GPU Target Temperature : N/A
Memory Current Temp : 58 C
Memory Max Operating Temp : 95 C
GPU Power Readings
Power Draw : 99.11 W
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
Min Power Limit : 150.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1275 MHz
Applications Clocks
Graphics : 1410 MHz
Memory : 1512 MHz
Default Applications Clocks
Graphics : 1410 MHz
Memory : 1512 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1290 MHz
Max Customer Boost Clocks
Graphics : 1410 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 900.000 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes : None
GPU 00000000:C4:00.0
Product Name : NVIDIA A100 80GB PCIe
Product Brand : NVIDIA
Product Architecture : Ampere
Display Mode : Enabled
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : N/A
vGPU Device Capability
Fractional Multi-vGPU : Not Supported
Heterogeneous Time-Slice Profiles : Supported
Heterogeneous Time-Slice Sizes : Not Supported
MIG Mode
Current : Enabled
Pending : Enabled
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1324321057915
GPU UUID : GPU-fae63a05-f25d-12d9-5536-211387befa5d
Minor Number : 3
VBIOS Version : 92.00.68.00.01
MultiGPU Board : No
Board ID : 0xc400
Board Part Number : 900-21001-0020-000
GPU Part Number : 20B5-893-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : 1001.0230.00.03
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : Host VGPU
Host VGPU Mode : SR-IOV
vGPU Heterogeneous Mode : Disabled
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : N/A
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0xC4
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x2
Device Id : 0x20B510DE
Bus Id : 00000000:C4:00.0
Sub System Id : 0x153310DE
GPU Link Info
PCIe Generation
Max : 4
Current : 4
Device Current : 4
Device Max : 4
Host Max : N/A
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Inbound : N/A
Atomic Caps Outbound : N/A
Fan Speed : N/A
Performance State : P0
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 81920 MiB
Reserved : 699 MiB
Used : 0 MiB
Free : 81222 MiB
BAR1 Memory Usage
Total : 131072 MiB
Used : 1 MiB
Free : 131071 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : N/A
Memory : N/A
Encoder : N/A
Decoder : N/A
JPEG : N/A
OFA : N/A
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
SRAM Threshold Exceeded : No
Aggregate Uncorrectable SRAM Sources
SRAM L2 : 0
SRAM SM : 0
SRAM Microcontroller : 0
SRAM PCIE : 0
SRAM Other : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 640 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 47 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 92 C
GPU Slowdown Temp : 89 C
GPU Max Operating Temp : 85 C
GPU Target Temperature : N/A
Memory Current Temp : 59 C
Memory Max Operating Temp : 95 C
GPU Power Readings
Power Draw : 102.35 W
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
Min Power Limit : 150.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1275 MHz
Applications Clocks
Graphics : 1410 MHz
Memory : 1512 MHz
Default Applications Clocks
Graphics : 1410 MHz
Memory : 1512 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 1410 MHz
SM : 1410 MHz
Memory : 1512 MHz
Video : 1290 MHz
Max Customer Boost Clocks
Graphics : 1410 MHz
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 881.250 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes : None