Hello,
I’m experiencing an issue with NVLink on my system: NVLink is not active, even though the bridges are physically connected on both pairs of NVIDIA RTX A5000 (24 GB) GPUs. The configuration and status details are below:
NVLink Serial Numbers:
- NVLink #1: 1423123003178
- NVLink #2: 1423123003092
Status:
- GPU 0: NVIDIA RTX A5000 (UUID: GPU-e44e0d52-a8ed-9245-289b-9a5207059c23)
  - NVML: Unable to retrieve NVLink information as all links are inactive
- GPU 1: NVIDIA RTX A5000 (UUID: GPU-bdd2e519-bfe1-1ab7-8f37-647d00c8b2b8)
  - NVML: Unable to retrieve NVLink information as all links are inactive
- GPU 2: NVIDIA RTX A5000 (UUID: GPU-cd480f1e-22f0-a845-0117-b41b364d66bc)
  - NVML: Unable to retrieve NVLink information as all links are inactive
- GPU 3: NVIDIA RTX A5000 (UUID: GPU-055aaa7c-2b78-7c85-14aa-23e8890a5a49)
  - NVML: Unable to retrieve NVLink information as all links are inactive
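For anyone who wants to check the same condition in a script, here is a minimal Python sketch that scans status text in the format shown above and lists the GPU indices whose links are all reported inactive. The function name `gpus_with_inactive_links` is hypothetical, and the sketch assumes the exact wording of the NVML message quoted in this post; other driver versions may phrase it differently.

```python
import re

def gpus_with_inactive_links(status: str) -> list[int]:
    """Return indices of GPUs whose status block says all links are inactive."""
    inactive = []
    current = None  # index of the GPU whose block is being read
    for raw in status.splitlines():
        # Drop list bullets and indentation so plain and bulleted text both work.
        line = raw.strip().lstrip("- ").strip()
        match = re.match(r"GPU (\d+):", line)
        if match:
            current = int(match.group(1))
        elif current is not None and "all links are inactive" in line.lower():
            inactive.append(current)
    return inactive

# Sample copied from the status list in this post (first two GPUs only).
SAMPLE_STATUS = """\
- GPU 0: NVIDIA RTX A5000 (UUID: GPU-e44e0d52-a8ed-9245-289b-9a5207059c23)
- NVML: Unable to retrieve NVLink information as all links are inactive
- GPU 1: NVIDIA RTX A5000 (UUID: GPU-bdd2e519-bfe1-1ab7-8f37-647d00c8b2b8)
- NVML: Unable to retrieve NVLink information as all links are inactive
"""

print(gpus_with_inactive_links(SAMPLE_STATUS))
```

On this system, all four GPUs would appear in the result.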
Topology:
| | GPU0 | GPU1 | GPU2 | GPU3 | CPU Affinity | NUMA Affinity | GPU NUMA ID |
---|---|---|---|---|---|---|---|
GPU0 | X | NODE | SYS | SYS | 0-27,56-83 | 0 | N/A |
GPU1 | NODE | X | SYS | SYS | 0-27,56-83 | 0 | N/A |
GPU2 | SYS | SYS | X | NODE | 28-55,84-111 | 1 | N/A |
GPU3 | SYS | SYS | NODE | X | 28-55,84-111 | 1 | N/A |
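In this matrix, an active NVLink connection between two GPUs would show up as an `NV#` label (for example `NV4`), whereas `NODE` and `SYS` both describe paths that go over PCIe. As a rough illustration, the Python sketch below reads rows in the pipe-separated form used above and splits GPU pairs into those with and without an `NV#` label. The helper name `nvlink_pairs` is hypothetical, and the parsing assumes one row per GPU with the GPU columns first.

```python
import re

# Rows copied from the topology table in this post.
TOPOLOGY = """\
GPU0 | X | NODE | SYS | SYS | 0-27,56-83 | 0 | N/A |
GPU1 | NODE | X | SYS | SYS | 0-27,56-83 | 0 | N/A |
GPU2 | SYS | SYS | X | NODE | 28-55,84-111 | 1 | N/A |
GPU3 | SYS | SYS | NODE | X | 28-55,84-111 | 1 | N/A |
"""

def nvlink_pairs(table: str):
    """Split GPU pairs into (linked, unlinked) based on NV# labels."""
    rows = {}
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        rows[cells[0]] = cells[1:]  # first cell is the row's GPU name
    names = list(rows)
    linked, unlinked = [], []
    for i, a in enumerate(names):
        # Only look at the upper triangle so each pair is reported once.
        for j in range(i + 1, len(names)):
            label = rows[a][j]
            pair = (a, names[j], label)
            if re.fullmatch(r"NV\d+", label):
                linked.append(pair)
            else:
                unlinked.append(pair)
    return linked, unlinked

linked, unlinked = nvlink_pairs(TOPOLOGY)
print("NVLink pairs:", linked)
print("PCIe-only pairs:", unlinked)
```

For the table in this post, no pair carries an `NV#` label, which matches the inactive-link messages from NVML.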
Despite the physical connection of the NVLink bridges, all links remain inactive, and I am unable to retrieve NVLink information using NVML.
All cards run at PCIe x16, as shown in the hwinfo report and the nvidia-smi log.
I also tried driver versions from 535 through 560, with no change.
I’ve attached the HWinfo report for more details. Could anyone advise on potential troubleshooting steps or known issues that might cause NVLink to remain inactive in this setup?
Thanks in advance for your help!
hw.txt (145.7 KB)
nvidia-smi -q
==============NVSMI LOG==============
Timestamp : Mon Aug 26 13:58:46 2024
Driver Version : 560.28.03
CUDA Version : 12.6
Attached GPUs : 4
GPU 00000000:31:00.0
Product Name : NVIDIA RTX A5000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1322722060255
GPU UUID : GPU-e44e0d52-a8ed-9245-289b-9a5207059c23
Minor Number : 0
VBIOS Version : 94.02.6D.00.0D
MultiGPU Board : No
Board ID : 0x3100
Board Part Number : 900-5G132-2200-000
GPU Part Number : 2231-850-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G132.0500.00.01
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : 560.28.03
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x31
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223110DE
Bus Id : 00000000:31:00.0
Sub System Id : 0x147E10DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 4
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 23028 MiB
Reserved : 413 MiB
Used : 21227 MiB
Free : 1390 MiB
BAR1 Memory Usage
Total : 32768 MiB
Used : 11 MiB
Free : 32757 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
SRAM Threshold Exceeded : No
Aggregate Uncorrectable SRAM Sources
SRAM L2 : 0
SRAM SM : 0
SRAM Microcontroller : 0
SRAM PCIE : 0
SRAM Other : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 25 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 90 C
GPU Target Temperature : 84 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Power Draw : 19.27 W
Current Power Limit : 230.00 W
Requested Power Limit : 230.00 W
Default Power Limit : 230.00 W
Min Power Limit : 100.00 W
Max Power Limit : 230.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 0 MHz
SM : 0 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : 1695 MHz
Memory : 8001 MHz
Default Applications Clocks
Graphics : 1695 MHz
Memory : 8001 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 8001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 0.000 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11776
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 20394 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11777
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11778
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11779
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
Capabilities
EGM : disabled
GPU 00000000:4B:00.0
Product Name : NVIDIA RTX A5000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1322822010361
GPU UUID : GPU-bdd2e519-bfe1-1ab7-8f37-647d00c8b2b8
Minor Number : 1
VBIOS Version : 94.02.6D.00.0D
MultiGPU Board : No
Board ID : 0x4b00
Board Part Number : 900-5G132-2200-000
GPU Part Number : 2231-850-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G132.0500.00.01
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : 560.28.03
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x4B
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223110DE
Bus Id : 00000000:4B:00.0
Sub System Id : 0x147E10DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 4
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 23028 MiB
Reserved : 413 MiB
Used : 21227 MiB
Free : 1390 MiB
BAR1 Memory Usage
Total : 32768 MiB
Used : 11 MiB
Free : 32757 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
SRAM Threshold Exceeded : No
Aggregate Uncorrectable SRAM Sources
SRAM L2 : 0
SRAM SM : 0
SRAM Microcontroller : 0
SRAM PCIE : 0
SRAM Other : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 24 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 90 C
GPU Target Temperature : 84 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Power Draw : 13.94 W
Current Power Limit : 230.00 W
Requested Power Limit : 230.00 W
Default Power Limit : 230.00 W
Min Power Limit : 100.00 W
Max Power Limit : 230.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 0 MHz
SM : 0 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : 1695 MHz
Memory : 8001 MHz
Default Applications Clocks
Graphics : 1695 MHz
Memory : 8001 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 8001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 0.000 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11776
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11777
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 20394 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11778
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11779
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
Capabilities
EGM : disabled
GPU 00000000:B1:00.0
Product Name : NVIDIA RTX A5000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1322722057363
GPU UUID : GPU-cd480f1e-22f0-a845-0117-b41b364d66bc
Minor Number : 2
VBIOS Version : 94.02.6D.00.0D
MultiGPU Board : No
Board ID : 0xb100
Board Part Number : 900-5G132-2200-000
GPU Part Number : 2231-850-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G132.0500.00.01
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : 560.28.03
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0xB1
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223110DE
Bus Id : 00000000:B1:00.0
Sub System Id : 0x147E10DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 4
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 4 KB/s
Rx Throughput : 1 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 23028 MiB
Reserved : 413 MiB
Used : 20887 MiB
Free : 1730 MiB
BAR1 Memory Usage
Total : 32768 MiB
Used : 11 MiB
Free : 32757 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
SRAM Threshold Exceeded : No
Aggregate Uncorrectable SRAM Sources
SRAM L2 : 0
SRAM SM : 0
SRAM Microcontroller : 0
SRAM PCIE : 0
SRAM Other : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 21 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 90 C
GPU Target Temperature : 84 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Power Draw : 12.61 W
Current Power Limit : 230.00 W
Requested Power Limit : 230.00 W
Default Power Limit : 230.00 W
Min Power Limit : 100.00 W
Max Power Limit : 230.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 0 MHz
SM : 0 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : 1695 MHz
Memory : 8001 MHz
Default Applications Clocks
Graphics : 1695 MHz
Memory : 8001 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 8001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 0.000 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11776
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11777
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11778
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 20054 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11779
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
Capabilities
EGM : disabled
GPU 00000000:CA:00.0
Product Name : NVIDIA RTX A5000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1322822008481
GPU UUID : GPU-055aaa7c-2b78-7c85-14aa-23e8890a5a49
Minor Number : 3
VBIOS Version : 94.02.6D.00.0D
MultiGPU Board : No
Board ID : 0xca00
Board Part Number : 900-5G132-2200-000
GPU Part Number : 2231-850-A1
FRU Part Number : N/A
Module ID : 1
Inforom Version
Image Version : G132.0500.00.01
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Reset Status
Reset Required : No
Drain and Reset Recommended : No
GSP Firmware Version : 560.28.03
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0xCA
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223110DE
Bus Id : 00000000:CA:00.0
Sub System Id : 0x147E10DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 4
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 0 KB/s
Rx Throughput : 0 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Sparse Operation Mode : N/A
FB Memory Usage
Total : 23028 MiB
Reserved : 413 MiB
Used : 20887 MiB
Free : 1730 MiB
BAR1 Memory Usage
Total : 32768 MiB
Used : 11 MiB
Free : 32757 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
Gpu : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
ECC Mode
Current : Enabled
Pending : Enabled
ECC Errors
Volatile
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
Aggregate
SRAM Correctable : 0
SRAM Uncorrectable Parity : 0
SRAM Uncorrectable SEC-DED : 0
DRAM Correctable : 0
DRAM Uncorrectable : 0
SRAM Threshold Exceeded : No
Aggregate Uncorrectable SRAM Sources
SRAM L2 : 0
SRAM SM : 0
SRAM Microcontroller : 0
SRAM PCIE : 0
SRAM Other : 0
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 25 C
GPU T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 90 C
GPU Target Temperature : 84 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Power Draw : 16.17 W
Current Power Limit : 230.00 W
Requested Power Limit : 230.00 W
Default Power Limit : 230.00 W
Min Power Limit : 100.00 W
Max Power Limit : 230.00 W
GPU Memory Power Readings
Power Draw : N/A
Module Power Readings
Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Clocks
Graphics : 0 MHz
SM : 0 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : 1695 MHz
Memory : 8001 MHz
Default Applications Clocks
Graphics : 1695 MHz
Memory : 8001 MHz
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 8001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Voltage
Graphics : 0.000 mV
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Bandwidth : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11776
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11777
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11778
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 266 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 11779
Type : C
Name : /opt/tritonserver/bin/tritonserver
Used GPU Memory : 20054 MiB
Capabilities
EGM : disabled
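To cross-check the PCIe x16 statement against the log above without reading it by hand, a small Python sketch along these lines could pull the PCIe generation and link width for each GPU. The function name `pcie_links` is hypothetical, and the parser assumes the layout seen in this log: section headers on their own line, followed by `key : value` lines.

```python
import re

GPU_HEADER = re.compile(
    r"GPU ([0-9A-Fa-f]{8}:[0-9A-Fa-f]{2}:[0-9A-Fa-f]{2}\.[0-9A-Fa-f])$"
)

def pcie_links(log: str) -> dict:
    """Map each GPU bus ID to its Max/Current PCIe generation and link width."""
    result = {}
    gpu = None      # bus ID of the GPU block being read
    section = None  # most recent header line, e.g. "Link Width"
    for raw in log.splitlines():
        line = raw.strip()
        match = GPU_HEADER.match(line)
        if match:
            gpu = match.group(1)
            result[gpu] = {}
            section = None
        elif " : " not in line:
            section = line  # a header such as "PCIe Generation"
        elif gpu and section in ("PCIe Generation", "Link Width"):
            key, value = (part.strip() for part in line.split(" : ", 1))
            if key in ("Max", "Current"):
                result[gpu][f"{section} {key}"] = value
    return result

# Excerpt copied from the first GPU block of the log in this post.
SAMPLE_LOG = """\
GPU 00000000:31:00.0
Product Name : NVIDIA RTX A5000
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 4
Link Width
Max : 16x
Current : 16x
"""

for bus_id, link in pcie_links(SAMPLE_LOG).items():
    print(bus_id, link)
```

Running it over the full log should report a current link width of `16x` for all four GPUs.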