Isaac Sim crashes after GPU driver update


Isaac Sim Version

6.0.0
5.1.0
5.0.0
4.5.0
4.2.0
4.1.0
4.0.0
2023.1.1
2023.1.0-hotfix.1
Other (please specify):

Operating System

Ubuntu 24.04
Ubuntu 22.04
Ubuntu 20.04
Windows 11
Windows 10
Other (please specify):

GPU Information

  • Model: NVIDIA RTX A6000
  • Driver Version: 610.43.02
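
The driver version reported above can be confirmed per GPU, which helps on a multi-GPU node where one device was reset separately. The following is a minimal sketch, assuming `nvidia-smi` is on the PATH; the helper names (`parse_driver_versions`, `query_driver_versions`, `driver_major`) are hypothetical and not part of Isaac Sim.

```python
import subprocess


def parse_driver_versions(csv_text):
    """Parse 'index, driver_version' CSV rows into {gpu_index: version}."""
    versions = {}
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue
        index, version = (field.strip() for field in line.split(",", 1))
        versions[int(index)] = version
    return versions


def query_driver_versions():
    """Ask nvidia-smi for the driver version of every attached GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,driver_version", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_driver_versions(result.stdout)


def driver_major(version):
    """Return the major branch of a driver version string, e.g. 610 for '610.43.02'."""
    return int(version.split(".")[0])
```

Calling `query_driver_versions()` on the affected node and mapping `driver_major` over the values shows at a glance whether every GPU is on the same driver branch.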

Topic Description

Detailed Description

I was collecting data using Isaac Sim 5.1 on a GPU cluster. Midway through the collection process, one of the GPUs went offline. After it came back online, the driver version had changed from 580 to 610, and Isaac Sim has been crashing consistently ever since.
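
The crash report under Error Messages records the application and driver versions as `key = 'value'` pairs, so the relevant fields can be pulled out of a saved log to compare runs before and after the driver change. This is a rough sketch under stated assumptions: the function name `parse_crash_metadata` and the regular expression are illustrative, the line format is inferred only from the log in this post, and the closing quote is treated as optional because pasted lines are often truncated.

```python
import re

# Matches "... [crash] key = 'value" lines. Both straight and curly quotes are
# accepted, and the closing quote is optional to tolerate truncated lines.
CRASH_FIELD = re.compile(r"\[crash\]\s+(\w[\w-]*)\s*=\s*['‘’]?(.*?)['‘’]?\s*$")


def parse_crash_metadata(log_text):
    """Collect crash-reporter metadata fields from Kit log text into a dict."""
    fields = {}
    for line in log_text.splitlines():
        match = CRASH_FIELD.search(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


# Sample lines copied from the log in this post.
SAMPLE = """\
2026-06-02T01:34:54Z [100ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] Crash metadata:
2026-06-02T01:34:54Z [147ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] appVersion = '5.1.0'
2026-06-02T01:34:54Z [210ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_0 = '610.43'
2026-06-02T01:34:54Z [132ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] Version = '107.3.3+production.229672.69cb
"""

metadata = parse_crash_metadata(SAMPLE)
print(metadata["appVersion"], metadata["gpuDriver_0"])
```

Running the same function over logs captured on driver 580 and on driver 610 would make it easy to confirm that the driver version is the only field that differs.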

Error Messages

[11.135s] app ready
2026-06-02T01:34:54Z [0ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] A crash has occurred. If a debugger should for up to that long for a debugger to attach before processing or sending the crash report.
2026-06-02T01:34:54Z [87ms] [Error] [carb.crashreporter-breakpad.plugin] [crash] Wrote dump file '/home/wli/Yifan/isaacsim/kit
2026-06-02T01:34:54Z [92ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] preventing upload of minidump due to user o
2026-06-02T01:34:54Z [94ms] [Error] [carb.crashreporter-breakpad.plugin] [crash] dump file size is 6619768 bytes, file is
2026-06-02T01:34:54Z [98ms] [Fatal] [carb.crashreporter-breakpad.plugin] [crash] Crash detected in pid 532629 thread 532661
2026-06-02T01:34:54Z [100ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] Crash metadata:
2026-06-02T01:34:54Z [102ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] BuildGitlabJobID = '209507876'
2026-06-02T01:34:54Z [104ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] CarbSdkVersion = '206.6+release.9587.07f1
2026-06-02T01:34:54Z [106ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] CrashTime = 'Tue Jun 2 01:34:54 2026 GMT
2026-06-02T01:34:54Z [108ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] DriverShaderCacheWrapper = 'disabled'
2026-06-02T01:34:54Z [110ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] DumpId = '152cbb1c-6265-42c1-f0d208b6-665
2026-06-02T01:34:54Z [112ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] LastUploadStatus = '0'
2026-06-02T01:34:54Z [114ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] ProductName = 'OmniverseKit'
2026-06-02T01:34:54Z [116ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] PythonTracebackStatus = ''py-spy' success
2026-06-02T01:34:54Z [119ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] RetryCount = '0'
2026-06-02T01:34:54Z [121ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] StartupTime = '1780364082'
2026-06-02T01:34:54Z [123ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] UploadSuccessful = '0'
2026-06-02T01:34:54Z [125ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] UptimeSeconds = '12'
2026-06-02T01:34:54Z [127ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] UserStory = ''
2026-06-02T01:34:54Z [129ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] UserStoryStatus = 'Running in headless mo
2026-06-02T01:34:54Z [132ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] Version = '107.3.3+production.229672.69cb
2026-06-02T01:34:54Z [134ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] aftermath_enabled = '1'
2026-06-02T01:34:54Z [136ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] aftermath_flags = '3'
2026-06-02T01:34:54Z [138ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] aftermath_status = 'auto-enabled'
2026-06-02T01:34:54Z [140ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] aftermath_version = '2024.1'
2026-06-02T01:34:54Z [143ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] appName = 'Isaac-Sim Python'
2026-06-02T01:34:54Z [145ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] appState = 'started'
2026-06-02T01:34:54Z [147ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] appVersion = '5.1.0'
2026-06-02T01:34:54Z [149ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] autoloadExts = ''
2026-06-02T01:34:54Z [151ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildBranch = 'production'
2026-06-02T01:34:54Z [153ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildCi = 'gl'
2026-06-02T01:34:54Z [156ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildConfig = 'release'
2026-06-02T01:34:54Z [158ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildGitlabJobName = 'kit-build-release-l
2026-06-02T01:34:54Z [160ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildGitlabJobStage = 'kit-build'
2026-06-02T01:34:54Z [162ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildGitlabPipelineID = '34875094'
2026-06-02T01:34:54Z [165ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildHash = '69cbf6ad'
2026-06-02T01:34:54Z [166ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildMajor = '107'
2026-06-02T01:34:54Z [168ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildMinor = '3'
2026-06-02T01:34:54Z [170ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildNumber = '229672'
2026-06-02T01:34:54Z [172ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildPatch = '3'
2026-06-02T01:34:54Z [174ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildVersion = '107.3.3'
2026-06-02T01:34:54Z [176ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] carboniteFrameworkVersion = '206.6+releas
2026-06-02T01:34:54Z [178ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] carboniteSdkVersion = '206.6+release.9587
2026-06-02T01:34:54Z [181ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] commandLine = 'python3 /home//Y
2026-06-02T01:34:54Z [183ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuCoreLimited = '96'
2026-06-02T01:34:54Z [185ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuId = 'Intel64 Family 6 Model 143 Stepp
2026-06-02T01:34:54Z [187ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuLogicalCoresBareMetal = '96'
2026-06-02T01:34:54Z [189ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuName = 'Intel(R) Xeon(R) Gold 5418Y'
2026-06-02T01:34:54Z [191ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuPhysicalCoresBareMetal = '48'
2026-06-02T01:34:54Z [193ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuUsageQuota = '-1.000000'
2026-06-02T01:34:54Z [195ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuVendor = 'GenuineIntel'
2026-06-02T01:34:54Z [197ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] desktopOrigin = '(0, 0)'
2026-06-02T01:34:54Z [199ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] desktopSize = '4000x2560'
2026-06-02T01:34:54Z [201ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] displayCount = '1'
2026-06-02T01:34:54Z [203ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] displayRes_0 = '4000x2560x32bit@0Hz'
2026-06-02T01:34:54Z [206ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] environmentName = 'Individual'
2026-06-02T01:34:54Z [208ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] externalBuild = '1'
2026-06-02T01:34:54Z [210ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_0 = '610.43'
2026-06-02T01:34:54Z [212ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_1 = '610.43'
2026-06-02T01:34:54Z [214ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_2 = '610.43'
2026-06-02T01:34:54Z [216ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_3 = '610.43'
2026-06-02T01:34:54Z [218ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_4 = '610.43'
2026-06-02T01:34:54Z [221ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_5 = '610.43'
2026-06-02T01:34:54Z [223ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_6 = '610.43'
2026-06-02T01:34:54Z [225ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_7 = '610.43'
2026-06-02T01:34:54Z [227ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuVRAM_0 = '51784974336'
2026-06-02T01:34:54Z [230ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuVRAM_1 = '51784974336'
2026-06-02T01:34:54Z [232ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuVRAM_2 = '51784974336'
2026-06-02T01:34:54Z [234ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuVRAM_3 = '51784974336'
2026-06-02T01:34:54Z [236ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuVRAM_4 = '51784974336'
2026-06-02T01:34:54Z [238ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuVRAM_5 = '51784974336'
2026-06-02T01:34:54Z [240ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuVRAM_6 = '51784974336'
2026-06-02T01:34:54Z [241ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuVRAM_7 = '51784974336'
2026-06-02T01:34:54Z [243ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpu_0 = 'NVIDIA RTX A6000'
2026-06-02T01:34:54Z [246ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpu_1 = 'NVIDIA RTX A6000'
2026-06-02T01:34:54Z [248ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpu_2 = 'NVIDIA RTX A6000'
2026-06-02T01:34:54Z [250ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpu_3 = 'NVIDIA RTX A6000'
2026-06-02T01:34:54Z [253ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpu_4 = 'NVIDIA RTX A6000'
2026-06-02T01:34:54Z [255ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpu_5 = 'NVIDIA RTX A6000'
2026-06-02T01:34:54Z [257ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpu_6 = 'NVIDIA RTX A6000'
2026-06-02T01:34:54Z [259ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpu_7 = 'NVIDIA RTX A6000'
2026-06-02T01:34:54Z [261ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] kitRendererDriverVersion = '610.43'
2026-06-02T01:34:54Z [263ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lastCommand = 'MJCFCreateImportConfig'
2026-06-02T01:34:54Z [265ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lastCommands = 'URDFCreateImportConfig,MJ
2026-06-02T01:34:54Z [267ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_isaacSim_buildBranch = 'release'
2026-06-02T01:34:54Z [270ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_isaacSim_buildDate = 'Fri Oct 17 03:5
2026-06-02T01:34:54Z [272ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_isaacSim_buildHash = '9c81211'
2026-06-02T01:34:54Z [278ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_isaacSim_buildRepo = 'https://gitlab-
2026-06-02T01:34:54Z [280ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_isaacSim_buildVersion = '5.1.0-rc.19'
2026-06-02T01:34:54Z [282ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_physx_buildBranch = 'HEAD'
2026-06-02T01:34:54Z [285ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_physx_buildDate = 'Oct-03-2025'
2026-06-02T01:34:54Z [287ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_physx_buildHash = '3a61992'
2026-06-02T01:34:54Z [289ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_physx_buildRepo = 'gitlab-master.nvid
2026-06-02T01:34:54Z [291ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] lib_physx_buildVersion = '107.3.26'
2026-06-02T01:34:54Z [293ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] memoryStats = '(avail/total) RAM: 240.285
2026-06-02T01:34:54Z [295ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] nvidia-smi = '
==============NVSMI LOG==============

Timestamp : Mon Jun 1 21:34:44 2026
Driver Version : 610.43.02 [Deprecated; will be removed in CUDA 14.0. Use KMD Version
CUDA Version : 13.3 [Deprecated; will be removed in CUDA 14.0. Use CUDA UMD Version
KMD Version : 610.43.02
CUDA UMD Version : 13.3

Attached GPUs : 8
GPU 00000000:17:00.0
Product Name : NVIDIA RTX A6000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Requested functionality has been deprecated
Display Attached : No
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1713724008907
GPU UUID : GPU-45662d1a-989d-389f-b81f-8bef23e5e5f4
GPU PDI : 0x868d183ea49f6db3
Minor Number : 0
VBIOS Version : 94.02.5C.00.07
MultiGPU Board : No
Board ID : 0x1700
Board Part Number : 900-5G133-0100-001
GPU Part Number : 2230-875-A1
FRU Part Number : N/A
Platform Info
Chassis Serial Number : N/A
Slot Number : N/A
Tray Index : N/A
Host ID : N/A
Peer Type : N/A
Module Id : 1
GPU Fabric GUID : N/A
Inforom Version
Image Version : G133.0500.00.05
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
Inforom Time Data
Time Run : 5007757 seconds
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Recovery Action : None
GSP Firmware Version : 610.43.02
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x17
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223010DE
Bus Id : 00000000:17:00.0
Sub System Id : 0x14591028
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 5
Link Width
Max : 16x
Current : 8x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 634 KB/s
Rx Throughput : 439 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Clocks Event Reasons Counters
SW Power Capping : 280330828183 us
Sync Boost : 0 us
SW Thermal Slowdown : 371414822 us
HW Thermal Slowdown : 0 us
HW Power Braking : 0 us
Sparse Operation Mode : N/A
FB Memory Usage
Total : 49140 MiB
Reserved : 602 MiB
Used : 2640 MiB
Free : 45900 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 4 MiB
Free : 252 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
GPU : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
DRAM Encryption Mode
Current : N/A
Pending : N/A
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Channel Repair Pending : No
TPC Repair Pending : No
Unrepairable Memory : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Inactive Correctable Error : 0
Uncorrectable Error : 0
Inactive Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 27 C
GPU Current T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 93 C
GPU Target Temperature Specification : 84 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Average Power Draw : 18.38 W
Instantaneous Power Draw : 18.49 W
GPU Ceiling Power Limit
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
GPU Base Power
Current Base Power : N/A
Requested Base Power : N/A
Default Base Power : N/A
Min Power Limit : 100.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Module Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Power Smoothing : N/A
Workload Power Profiles
Requested Profiles : N/A
Enforced Profiles : N/A
EDPp Multiplier : N/A
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : Requested functionality has been deprecated
Memory : Requested functionality has been deprecated
Default Applications Clocks
Graphics : Requested functionality has been deprecated
Memory : Requested functionality has been deprecated
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 8001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Summary : N/A
Bandwidth : N/A
Route Recovery in progress : N/A
Route Unhealthy : N/A
Access Timeout Recovery : N/A
Incorrect Configuration : N/A
Partition Assigned : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 5463
Type : G
Name : …/Xorg
Used GPU Memory : 4 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 2738815
Type : C
Name : python
Used GPU Memory : 2616 MiB
Capabilities
EGM : disabled

GPU 00000000:3D:00.0
Product Name : NVIDIA RTX A6000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Requested functionality has been deprecated
Display Attached : No
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1710324018356
GPU UUID : GPU-d9bee061-4bc8-1d23-be90-01bfcee446c2
GPU PDI : 0x6fe464a6f7e0e9b6
Minor Number : 1
VBIOS Version : 94.02.5C.00.02
MultiGPU Board : No
Board ID : 0x3d00
Board Part Number : 900-5G133-0000-000
GPU Part Number : 2230-875-A1
FRU Part Number : N/A
Platform Info
Chassis Serial Number : N/A
Slot Number : N/A
Tray Index : N/A
Host ID : N/A
Peer Type : N/A
Module Id : 1
GPU Fabric GUID : N/A
Inforom Version
Image Version : G133.0500.00.05
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
Inforom Time Data
Time Run : 3706095 seconds
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Recovery Action : None
GSP Firmware Version : 610.43.02
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x3D
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223010DE
Bus Id : 00000000:3D:00.0
Sub System Id : 0x145910DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 5
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 634 KB/s
Rx Throughput : 390 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Clocks Event Reasons Counters
SW Power Capping : 294537352400 us
Sync Boost : 0 us
SW Thermal Slowdown : 176707802478 us
HW Thermal Slowdown : 0 us
HW Power Braking : 0 us
Sparse Operation Mode : N/A
FB Memory Usage
Total : 49140 MiB
Reserved : 602 MiB
Used : 2388 MiB
Free : 46152 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 4 MiB
Free : 252 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
GPU : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
DRAM Encryption Mode
Current : N/A
Pending : N/A
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Channel Repair Pending : No
TPC Repair Pending : No
Unrepairable Memory : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Inactive Correctable Error : 0
Uncorrectable Error : 0
Inactive Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 27 C
GPU Current T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 93 C
GPU Target Temperature Specification : 84 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Average Power Draw : 24.17 W
Instantaneous Power Draw : 24.25 W
GPU Ceiling Power Limit
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
GPU Base Power
Current Base Power : N/A
Requested Base Power : N/A
Default Base Power : N/A
Min Power Limit : 100.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Module Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Power Smoothing : N/A
Workload Power Profiles
Requested Profiles : N/A
Enforced Profiles : N/A
EDPp Multiplier : N/A
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : Requested functionality has been deprecated
Memory : Requested functionality has been deprecated
Default Applications Clocks
Graphics : Requested functionality has been deprecated
Memory : Requested functionality has been deprecated
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 8001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Summary : N/A
Bandwidth : N/A
Route Recovery in progress : N/A
Route Unhealthy : N/A
Access Timeout Recovery : N/A
Incorrect Configuration : N/A
Partition Assigned : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 5463
Type : G
Name : …/Xorg
Used GPU Memory : 4 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 2738815
Type : C
Name : python
Used GPU Memory : 2364 MiB
Capabilities
EGM : disabled

GPU 00000000:50:00.0
Product Name : NVIDIA RTX A6000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Requested functionality has been deprecated
Display Attached : No
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1562521008652
GPU UUID : GPU-3965aa26-e649-8315-65b4-7beb4e5ac00a
GPU PDI : 0xc1f4f715a301bb7a
Minor Number : 2
VBIOS Version : 94.02.5C.00.02
MultiGPU Board : No
Board ID : 0x5000
Board Part Number : 900-5G133-1700-000
GPU Part Number : 2230-875-A1
FRU Part Number : N/A
Platform Info
Chassis Serial Number : N/A
Slot Number : N/A
Tray Index : N/A
Host ID : N/A
Peer Type : N/A
Module Id : 1
GPU Fabric GUID : N/A
Inforom Version
Image Version : G133.0500.00.05
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
Inforom Time Data
Time Run : 10862042 seconds
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Recovery Action : None
GSP Firmware Version : 610.43.02
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x50
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223010DE
Bus Id : 00000000:50:00.0
Sub System Id : 0x145910DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 5
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 781 KB/s
Rx Throughput : 537 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Clocks Event Reasons Counters
SW Power Capping : 111077003441 us
Sync Boost : 0 us
SW Thermal Slowdown : 0 us
HW Thermal Slowdown : 0 us
HW Power Braking : 0 us
Sparse Operation Mode : N/A
FB Memory Usage
Total : 49140 MiB
Reserved : 602 MiB
Used : 2388 MiB
Free : 46152 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 4 MiB
Free : 252 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
GPU : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
DRAM Encryption Mode
Current : N/A
Pending : N/A
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Channel Repair Pending : No
TPC Repair Pending : No
Unrepairable Memory : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Inactive Correctable Error : 0
Uncorrectable Error : 0
Inactive Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 25 C
GPU Current T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 93 C
GPU Target Temperature Specification : 84 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Average Power Draw : 6.40 W
Instantaneous Power Draw : 6.24 W
GPU Ceiling Power Limit
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
GPU Base Power
Current Base Power : N/A
Requested Base Power : N/A
Default Base Power : N/A
Min Power Limit : 100.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Module Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Power Smoothing : N/A
Workload Power Profiles
Requested Profiles : N/A
Enforced Profiles : N/A
EDPp Multiplier : N/A
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : Requested functionality has been deprecated
Memory : Requested functionality has been deprecated
Default Applications Clocks
Graphics : Requested functionality has been deprecated
Memory : Requested functionality has been deprecated
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 8001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Summary : N/A
Bandwidth : N/A
Route Recovery in progress : N/A
Route Unhealthy : N/A
Access Timeout Recovery : N/A
Incorrect Configuration : N/A
Partition Assigned : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 5463
Type : G
Name : …/Xorg
Used GPU Memory : 4 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 2738815
Type : C
Name : python
Used GPU Memory : 2364 MiB
Capabilities
EGM : disabled

GPU 00000000:63:00.0
Product Name : NVIDIA RTX A6000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Requested functionality has been deprecated
Display Attached : No
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1324521013057
GPU UUID : GPU-e81fa781-6c5a-9516-4e19-6437871cc144
GPU PDI : 0x38f062b4c7c5a023
Minor Number : 3
VBIOS Version : 94.02.5C.00.02
MultiGPU Board : No
Board ID : 0x6300
Board Part Number : 900-5G133-1700-000
GPU Part Number : 2230-875-A1
FRU Part Number : N/A
Platform Info
Chassis Serial Number : N/A
Slot Number : N/A
Tray Index : N/A
Host ID : N/A
Peer Type : N/A
Module Id : 1
GPU Fabric GUID : N/A
Inforom Version
Image Version : G133.0500.00.05
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
Inforom Time Data
Time Run : 3066502 seconds
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Recovery Action : None
GSP Firmware Version : 610.43.02
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x63
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223010DE
Bus Id : 00000000:63:00.0
Sub System Id : 0x145910DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 5
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 634 KB/s
Rx Throughput : 634 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Not Active
Applications Clocks Setting : Not Active
SW Power Cap : Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdown : Not Active
Display Clock Setting : Not Active
Clocks Event Reasons Counters
SW Power Capping : 293796621126 us
Sync Boost : 0 us
SW Thermal Slowdown : 151831439006 us
HW Thermal Slowdown : 0 us
HW Power Braking : 0 us
Sparse Operation Mode : N/A
FB Memory Usage
Total : 49140 MiB
Reserved : 602 MiB
Used : 2388 MiB
Free : 46152 MiB
BAR1 Memory Usage
Total : 256 MiB
Used : 4 MiB
Free : 252 MiB
Conf Compute Protected Memory Usage
Total : 0 MiB
Used : 0 MiB
Free : 0 MiB
Compute Mode : Default
Utilization
GPU : 0 %
Memory : 0 %
Encoder : 0 %
Decoder : 0 %
JPEG : 0 %
OFA : 0 %
Encoder Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
FBC Stats
Active Sessions : 0
Average FPS : 0
Average Latency : 0
DRAM Encryption Mode
Current : N/A
Pending : N/A
ECC Mode
Current : Disabled
Pending : Disabled
ECC Errors
Volatile
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
Aggregate
SRAM Correctable : N/A
SRAM Uncorrectable Parity : N/A
SRAM Uncorrectable SEC-DED : N/A
DRAM Correctable : N/A
DRAM Uncorrectable : N/A
SRAM Threshold Exceeded : N/A
Aggregate Uncorrectable SRAM Sources
SRAM L2 : N/A
SRAM SM : N/A
SRAM Microcontroller : N/A
SRAM PCIE : N/A
SRAM Other : N/A
Channel Repair Pending : No
TPC Repair Pending : No
Unrepairable Memory : N/A
Retired Pages
Single Bit ECC : N/A
Double Bit ECC : N/A
Pending Page Blacklist : N/A
Remapped Rows
Correctable Error : 0
Inactive Correctable Error : 0
Uncorrectable Error : 0
Inactive Uncorrectable Error : 0
Pending : No
Remapping Failure Occurred : No
Bank Remap Availability Histogram
Max : 192 bank(s)
High : 0 bank(s)
Partial : 0 bank(s)
Low : 0 bank(s)
None : 0 bank(s)
Temperature
GPU Current Temp : 27 C
GPU Current T.Limit Temp : N/A
GPU Shutdown Temp : 98 C
GPU Slowdown Temp : 95 C
GPU Max Operating Temp : 93 C
GPU Target Temperature Specification : 84 C
Memory Current Temp : N/A
Memory Max Operating Temp : N/A
GPU Power Readings
Average Power Draw : 26.80 W
Instantaneous Power Draw : 26.89 W
GPU Ceiling Power Limit
Current Power Limit : 300.00 W
Requested Power Limit : 300.00 W
Default Power Limit : 300.00 W
GPU Base Power
Current Base Power : N/A
Requested Base Power : N/A
Default Base Power : N/A
Min Power Limit : 100.00 W
Max Power Limit : 300.00 W
GPU Memory Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Module Power Readings
Average Power Draw : N/A
Instantaneous Power Draw : N/A
Current Power Limit : N/A
Requested Power Limit : N/A
Default Power Limit : N/A
Min Power Limit : N/A
Max Power Limit : N/A
Power Smoothing : N/A
Workload Power Profiles
Requested Profiles : N/A
Enforced Profiles : N/A
EDPp Multiplier : N/A
Clocks
Graphics : 210 MHz
SM : 210 MHz
Memory : 405 MHz
Video : 555 MHz
Applications Clocks
Graphics : Requested functionality has been deprecated
Memory : Requested functionality has been deprecated
Default Applications Clocks
Graphics : Requested functionality has been deprecated
Memory : Requested functionality has been deprecated
Deferred Clocks
Memory : N/A
Max Clocks
Graphics : 2100 MHz
SM : 2100 MHz
Memory : 8001 MHz
Video : 1950 MHz
Max Customer Boost Clocks
Graphics : N/A
Clock Policy
Auto Boost : N/A
Auto Boost Default : N/A
Fabric
State : N/A
Status : N/A
CliqueId : N/A
ClusterUUID : N/A
Health
Summary : N/A
Bandwidth : N/A
Route Recovery in progress : N/A
Route Unhealthy : N/A
Access Timeout Recovery : N/A
Incorrect Configuration : N/A
Partition Assigned : N/A
Processes
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 5463
Type : G
Name : …/Xorg
Used GPU Memory : 4 MiB
GPU instance ID : N/A
Compute instance ID : N/A
Process ID : 2738815
Type : C
Name : python
Used GPU Memory : 2364 MiB
Capabilities
EGM : disabled

GPU 00000000:99:00.0
Product Name : NVIDIA RTX A6000
Product Brand : NVIDIA RTX
Product Architecture : Ampere
Display Mode : Requested functionality has been deprecated
Display Attached : No
Display Active : Disabled
Persistence Mode : Enabled
Addressing Mode : None
MIG Mode
Current : N/A
Pending : N/A
Accounting Mode : Disabled
Accounting Mode Buffer Size : 4000
Driver Model
Current : N/A
Pending : N/A
Serial Number : 1324721048573
GPU UUID : GPU-0f222a99-77ff-5076-57dd-7402c0edab78
GPU PDI : 0xc425080f5e918529
Minor Number : 4
VBIOS Version : 94.02.5C.00.02
MultiGPU Board : No
Board ID : 0x9900
Board Part Number : 900-5G133-1700-000
GPU Part Number : 2230-875-A1
FRU Part Number : N/A
Platform Info
Chassis Serial Number : N/A
Slot Number : N/A
Tray Index : N/A
Host ID : N/A
Peer Type : N/A
Module Id : 1
GPU Fabric GUID : N/A
Inforom Version
Image Version : G133.0500.00.05
OEM Object : 2.0
ECC Object : 6.16
Power Management Object : N/A
Inforom BBX Object Flush
Latest Timestamp : N/A
Latest Duration : N/A
Inforom Time Data
Time Run : 4247586 seconds
GPU Operation Mode
Current : N/A
Pending : N/A
GPU C2C Mode : N/A
GPU Virtualization Mode
Virtualization Mode : None
Host VGPU Mode : N/A
vGPU Heterogeneous Mode : N/A
GPU Recovery Action : None
GSP Firmware Version : 610.43.02
IBMNPU
Relaxed Ordering Mode : N/A
PCI
Bus : 0x99
Device : 0x00
Domain : 0x0000
Base Classcode : 0x3
Sub Classcode : 0x0
Device Id : 0x223010DE
Bus Id : 00000000:99:00.0
Sub System Id : 0x145910DE
GPU Link Info
PCIe Generation
Max : 4
Current : 1
Device Current : 1
Device Max : 4
Host Max : 5
Link Width
Max : 16x
Current : 16x
Bridge Chip
Type : N/A
Firmware : N/A
Replays Since Reset : 0
Replay Number Rollovers : 0
Tx Throughput : 683 KB/s
Rx Throughput : 732 KB/s
Atomic Caps Outbound : N/A
Atomic Caps Inbound : N/A
Fan Speed : 30 %
Performance State : P8
Clocks Event Reasons
Idle : Active
Applications Clocks Setting : Not Active
SW Power Cap : Not Active
HW Slowdown : Not Active
HW Thermal Slowdown : Not Active
HW Power Brake Slowdown : Not Active
Sync Boost : Not Active
SW Thermal Slowdow
2026-06-02T01:34:55Z [322ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] osDistro = ‘ubuntu’
2026-06-02T01:34:55Z [325ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] osName = ‘22.04.5 LTS (Jammy Jellyfish)’
2026-06-02T01:34:55Z [327ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] osVersion = ‘22.04.5’
2026-06-02T01:34:55Z [329ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] portableMode = ‘1’
2026-06-02T01:34:55Z [331ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] primaryDisplayRes = ‘4000x2560x32bit@0Hz’
2026-06-02T01:34:55Z [333ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] runEnvironment = ‘Individual’
2026-06-02T01:34:55Z [335ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] runningInContainer = ‘0’
2026-06-02T01:34:55Z [337ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] shaderdb_debugSymbols = ‘0’
2026-06-02T01:34:55Z [340ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] shaderdb_dumpIncludeOverrides = ‘0’
2026-06-02T01:34:55Z [342ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] shaderdb_dumpIntermediates = ‘0’
2026-06-02T01:34:55Z [344ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] shaderdb_obfuscateCode = ‘1’
2026-06-02T01:34:55Z [346ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] shaderdb_optimizationLevel = ‘1’
2026-06-02T01:34:55Z [348ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] systemInfo = ’
|---------------------------------------------------------------------------------------------|
| Driver Version: 610.43.02 | Graphics API: Vulkan
|=============================================================================================|
| GPU | Name | Active | LDA | GPU Memory | Vendor-ID | LUID |
| | | | | | Device-ID | UUID |
| | | | | | Bus-ID | |
|---------------------------------------------------------------------------------------------|
| 0 | NVIDIA RTX A6000 | | | 49386 MB | 10de | 0 |
| | | | | | 2230 | 45662d1a.. |
| | | | | | 17 | |
|---------------------------------------------------------------------------------------------|
| 1 | NVIDIA RTX A6000 | | | 49386 MB | 10de | 0 |
| | | | | | 2230 | d9bee061.. |
| | | | | | 3d | |
|---------------------------------------------------------------------------------------------|
| 2 | NVIDIA RTX A6000 | | | 49386 MB | 10de | 0 |
| | | | | | 2230 | 3965aa26.. |
| | | | | | 50 | |
|---------------------------------------------------------------------------------------------|
| 3 | NVIDIA RTX A6000 | | | 49386 MB | 10de | 0 |
| | | | | | 2230 | e81fa781.. |
| | | | | | 63 | |
|---------------------------------------------------------------------------------------------|
| 4 | NVIDIA RTX A6000 | | | 49386 MB | 10de | 0 |
| | | | | | 2230 | 0f222a99.. |
| | | | | | 99 | |
|---------------------------------------------------------------------------------------------|
| 5 | NVIDIA RTX A6000 | | | 49386 MB | 10de | 0 |
| | | | | | 2230 | 46abb649.. |
| | | | | | bd | |
|---------------------------------------------------------------------------------------------|
| 6 | NVIDIA RTX A6000 | | | 49386 MB | 10de | 0 |
| | | | | | 2230 | 9d0ec067.. |
| | | | | | cf | |
|---------------------------------------------------------------------------------------------|
| 7 | NVIDIA RTX A6000 | Yes: 0 | | 49386 MB | 10de | 0 |
| | | | | | 2230 | 484a656f.. |
| | | | | | e1 | |
|=============================================================================================|
| OS: 22.04.5 LTS (Jammy Jellyfish) ubuntu, Version: 22.04.5, Kernel: 6.5.0-15-generic
| XServer Vendor: Moba/X, XServer Version: 12101015
| Processor: Intel(R) Xeon(R) Gold 5418Y
| Cores: 48 | Logical Cores: 96
|---------------------------------------------------------------------------------------------|
| Total Memory (MB): 257602 | Free Memory: 247699
| Total Page/Swap (MB): 4095 | Free Page/Swap: 4094
|---------------------------------------------------------------------------------------------|

2026-06-02T01:34:55Z [350ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] telemetrySessionId = '8199836311160711159
2026-06-02T01:34:55Z [352ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] terminatedByAbort = ‘0’
2026-06-02T01:34:55Z [355ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] totalRamBareMetalMB = ‘257602’
2026-06-02T01:34:55Z [357ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] totalRamLimitedMB = ‘257602’
2026-06-02T01:34:55Z [359ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] totalSwapBareMetalMB = ‘4095’
2026-06-02T01:34:55Z [361ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] totalSwapLimitedMB = ‘4095’
2026-06-02T01:34:55Z [363ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] userId = ‘default’
2026-06-02T01:34:55Z [365ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] workingDirectory = '/home//Yifa
2026-06-02T01:34:55Z [367ms] [Fatal] [carb.crashreporter-breakpad.plugin] [crash] Thread 532661 backtrace follows:
2026-06-02T01:34:55Z [447ms] [Fatal] [carb.crashreporter-breakpad.plugin] 000: libc.so.6!__sigaction+0x50 (libc_sigaction.c:?)
2026-06-02T01:34:55Z [461ms] [Fatal] [carb.crashreporter-breakpad.plugin] 001: librtx.scenedb.plugin.so!void std::vector<std::igned int> > >::_M_realloc_insert<std::tuple<char const*, float, float, unsigned int, unsigned int, unsigned int> >(__gnu_cxx:int, unsigned int, unsigned int>, std::allocator<std::tuple<char const*, float, float, unsigned int, unsigned int, unsigned in
2026-06-02T01:34:55Z [469ms] [Fatal] [carb.crashreporter-breakpad.plugin] 002: librtx.scenedb.plugin.so!void std::vector<std::igned int> > >::_M_realloc_insert<std::tuple<char const*, float, float, unsigned int, unsigned int, unsigned int> >(__gnu_cxx:int, unsigned int, unsigned int>, std::allocator<std::tuple<char const*, float, float, unsigned int, unsigned int, unsigned in
2026-06-02T01:34:55Z [476ms] [Fatal] [carb.crashreporter-breakpad.plugin] 003: librtx.scenedb.plugin.so!void std::vector<std::igned int> > >::_M_realloc_insert<std::tuple<char const*, float, float, unsigned int, unsigned int, unsigned int> >(__gnu_cxx:int, unsigned int, unsigned int>, std::allocator<std::tuple<char const*, float, float, unsigned int, unsigned int, unsigned in
2026-06-02T01:34:55Z [482ms] [Fatal] [carb.crashreporter-breakpad.plugin] 004: librtx.scenedb.plugin.so!carbOnPluginStartup+0x
2026-06-02T01:34:55Z [488ms] [Fatal] [carb.crashreporter-breakpad.plugin] 005: librtx.scenedb.plugin.so!void std::vector<int,
2026-06-02T01:34:55Z [494ms] [Fatal] [carb.crashreporter-breakpad.plugin] 006: libcarb.scenerenderer-rtx.plugin.so!std::unordeered_map()+0x4655f (??:?)
2026-06-02T01:34:55Z [500ms] [Fatal] [carb.crashreporter-breakpad.plugin] 007: libcarb.scenerenderer-rtx.plugin.so!std::unordeered_map()+0x64486 (??:?)
2026-06-02T01:34:55Z [505ms] [Fatal] [carb.crashreporter-breakpad.plugin] 008: libcarb.scenerenderer-rtx.plugin.so!carbOnPlugi
2026-06-02T01:34:55Z [508ms] [Fatal] [carb.crashreporter-breakpad.plugin] 009: libomni.hydra.rtx.plugin.so!+0x4459
2026-06-02T01:34:55Z [514ms] [Fatal] [carb.crashreporter-breakpad.plugin] 010: libomni.usd.so!void std::vector<unsigned long, d long> > >, unsigned long&&)+0x10da (??:?)
2026-06-02T01:34:55Z [520ms] [Fatal] [carb.crashreporter-breakpad.plugin] 011: libomni.usd.so!void std::vector<unsigned long, d long> > >, unsigned long&&)+0x58ce (??:?)
2026-06-02T01:34:55Z [528ms] [Fatal] [carb.crashreporter-breakpad.plugin] 012: libomni.usd.so!void std::vector<unsigned long, d long> > >, unsigned long&&)+0x5c5a (??:?)
2026-06-02T01:34:55Z [540ms] [Fatal] [carb.crashreporter-breakpad.plugin] 013: libcarb.tasking.plugin.so!std::thread::_State_ier*, carb::tasking::Fiber*, carb::tasking::Scheduler::TrackedThread*> > >::_M_run()+0xa3a7 (??:?)
2026-06-02T01:34:55Z [550ms] [Fatal] [carb.crashreporter-breakpad.plugin] 014: libcarb.tasking.plugin.so!void std::vector<int,
2026-06-02T01:34:55Z [561ms] [Fatal] [carb.crashreporter-breakpad.plugin] 015: libcarb.tasking.plugin.so!void std::vector<int,
2026-06-02T01:34:55Z [572ms] [Fatal] [carb.crashreporter-breakpad.plugin] 016: libcarb.tasking.plugin.so!make_fcontext+0x39 (?
Segmentation fault (core dumped)
[11.135s] app ready
2026-06-02T01:34:54Z [0ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] A crash has occurred. If a debugger should be attached, please set the ‘/crashreporter/debuggerAttachTimeoutMs’ setting to a timeout in milliseconds. This can be used to allow the crash reporter to wait for up to that long for a debugger to attach before processing or sending the crash report.
2026-06-02T01:34:54Z [87ms] [Error] [carb.crashreporter-breakpad.plugin] [crash] Wrote dump file '/home/wli/Yifan/isaacsim/kit/data/Kit/Isaac-Sim Python/5.1/152cbb1c-6265-42c1-f0d208b6-665d1c66.dmp’
2026-06-02T01:34:54Z [92ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] preventing upload of minidump due to user opt-out: '/home/wli/Yifan/isaacsim/kit/data/Kit/Isaac-Sim Python/5.1/152cbb1c-6265-42c1-f0d208b6-665d1c66.dmp’
2026-06-02T01:34:54Z [94ms] [Error] [carb.crashreporter-breakpad.plugin] [crash] dump file size is 6619768 bytes, file is readable.
2026-06-02T01:34:54Z [98ms] [Fatal] [carb.crashreporter-breakpad.plugin] [crash] Crash detected in pid 532629 thread 532661
2026-06-02T01:34:54Z [100ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] Crash metadata:
2026-06-02T01:34:54Z [102ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] BuildGitlabJobID = ‘209507876’
2026-06-02T01:34:54Z [104ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] CarbSdkVersion = '206.6+release.9587.07f17b1b.gl’
2026-06-02T01:34:54Z [106ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] CrashTime = 'Tue Jun 2 01:34:54 2026 GMT’
2026-06-02T01:34:54Z [108ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] DriverShaderCacheWrapper = ‘disabled’
2026-06-02T01:34:54Z [110ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] DumpId = '152cbb1c-6265-42c1-f0d208b6-665d1c66’
2026-06-02T01:34:54Z [112ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] LastUploadStatus = '0’
2026-06-02T01:34:54Z [114ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] ProductName = ‘OmniverseKit’
2026-06-02T01:34:54Z [116ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] PythonTracebackStatus = ‘‘py-spy’ successfully wrote info to ‘/home/wli/Yifan/isaacsim/kit/data/Kit/Isaac-Sim Python/5.1/152cbb1c-6265-42c1-f0d208b6-665d1c66.py.txt’’
2026-06-02T01:34:54Z [119ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] RetryCount = '0’
2026-06-02T01:34:54Z [121ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] StartupTime = ‘1780364082’
2026-06-02T01:34:54Z [123ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] UploadSuccessful = ‘0’
2026-06-02T01:34:54Z [125ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] UptimeSeconds = ‘12’
2026-06-02T01:34:54Z [127ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] UserStory = ‘’ ‘’
2026-06-02T01:34:54Z [129ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] UserStoryStatus = ‘Running in headless mode; not gathering user story’
2026-06-02T01:34:54Z [132ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] Version = '107.3.3+production.229672.69cbf6ad.gl’
2026-06-02T01:34:54Z [134ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] aftermath_enabled = ‘1’
2026-06-02T01:34:54Z [136ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] aftermath_flags = ‘3’
2026-06-02T01:34:54Z [138ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] aftermath_status = ‘auto-enabled’
2026-06-02T01:34:54Z [140ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] aftermath_version = ‘2024.1’
2026-06-02T01:34:54Z [143ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] appName = ‘Isaac-Sim Python’
2026-06-02T01:34:54Z [145ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] appState = ‘started’
2026-06-02T01:34:54Z [147ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] appVersion = ‘5.1.0’
2026-06-02T01:34:54Z [149ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] autoloadExts = '‘07.3.3’
2026-06-02T01:34:54Z [151ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildBranch = ‘production’
2026-06-02T01:34:54Z [153ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildCi = 'gl’
2026-06-02T01:34:54Z [156ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildConfig = ‘release’
2026-06-02T01:34:54Z [158ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildGitlabJobName = 'kit-build-release-linux-x86_64’
2026-06-02T01:34:54Z [160ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildGitlabJobStage = ‘kit-build’
2026-06-02T01:34:54Z [162ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildGitlabPipelineID = ‘34875094’
2026-06-02T01:34:54Z [165ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildHash = ‘69cbf6ad’
2026-06-02T01:34:54Z [166ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildMajor = ‘107’
2026-06-02T01:34:54Z [168ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildMinor = ‘3’
2026-06-02T01:34:54Z [170ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildNumber = ‘229672’
2026-06-02T01:34:54Z [172ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildPatch = ‘3’
2026-06-02T01:34:54Z [174ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] buildVersion = ‘107.3.3’
2026-06-02T01:34:54Z [176ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] carboniteFrameworkVersion = ‘206.6+release.9587.07f17b1b.gl’
2026-06-02T01:34:54Z [178ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] carboniteSdkVersion = ‘206.6+release.9587.07f17b1b.gl’
2026-06-02T01:34:54Z [181ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] commandLine = ‘python3 /home//Yifan/scripts/collection/04_collect_scripted.py’
2026-06-02T01:34:54Z [183ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuCoreLimited = ‘96’
2026-06-02T01:34:54Z [185ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuId = ‘Intel64 Family 6 Model 143 Stepping 8’
2026-06-02T01:34:54Z [187ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuLogicalCoresBareMetal = ‘96’
2026-06-02T01:34:54Z [189ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuName = ‘Intel(R) Xeon(R) Gold 5418Y’
2026-06-02T01:34:54Z [191ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuPhysicalCoresBareMetal = ‘48’
2026-06-02T01:34:54Z [193ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuUsageQuota = ‘-1.000000’
2026-06-02T01:34:54Z [195ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] cpuVendor = ‘GenuineIntel’
2026-06-02T01:34:54Z [197ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] desktopOrigin = ‘(0, 0)’
2026-06-02T01:34:54Z [199ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] desktopSize = ‘4000x2560’
2026-06-02T01:34:54Z [201ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] displayCount = ‘1’
2026-06-02T01:34:54Z [203ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] displayRes_0 = ‘4000x2560x32bit@0Hz’
2026-06-02T01:34:54Z [206ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] environmentName = ‘Individual’
2026-06-02T01:34:54Z [208ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] externalBuild = ‘1’
2026-06-02T01:34:54Z [210ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_0 = ‘610.43’
2026-06-02T01:34:54Z [212ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_1 = ‘610.43’
2026-06-02T01:34:54Z [214ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_2 = ‘610.43’
2026-06-02T01:34:54Z [216ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_3 = ‘610.43’
2026-06-02T01:34:54Z [218ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_4 = ‘610.43’
2026-06-02T01:34:54Z [221ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_5 = ‘610.43’
2026-06-02T01:34:54Z [223ms] [Warning] [carb.crashreporter-breakpad.plugin] [crash] gpuDriver_6 = ‘610.43’

Screenshots or Videos

(If applicable, add screenshots or links to videos that demonstrate the issue)

Additional Information

What I’ve Tried

I was collecting data using Isaac Sim 5.1 on a GPU cluster. Midway through the collection process, one of the GPUs went offline; after it came back online, the driver version had updated from 580 to 610, and since then, Isaac Sim has been crashing continuously.

@3024732340 Is there any way you can revert to 580 and see whether the crashing persists? There have been reports that drivers past 580 are not validated on 5.1.0:

The docs update will ship with the upcoming 6.0 GA release, adding links to recommended driver versions in Isaac Sim Requirements — Isaac Sim Documentation.

Since the maintenance of the GPU cluster falls outside our scope of responsibility—and given that we have been advised that rolling back the GPU driver could potentially trigger another crash—are there any alternative approaches available?

Driver version 610.43.02 is not from the production branch.

You may want to try the recommended production branch versions stated in Technical Requirements — Omniverse Developer Guide.
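For anyone who cannot control when the cluster driver changes, a pre-flight check can at least stop a collection job before Isaac Sim starts on an unexpected driver. The following is a minimal sketch, not an official tool: `EXPECTED_MAJOR = 580` is an assumption taken from this thread (the branch that was working before the update), and the helper names are hypothetical. Replace the expected value with whatever the requirements pages linked above recommend.

```python
import re
import shutil
import subprocess

# Assumption from this thread: the 580 branch worked with Isaac Sim 5.1.0.
# Use the version recommended in the official requirements instead.
EXPECTED_MAJOR = 580


def parse_driver_major(version):
    """Return the branch number from a string such as '610.43.02'."""
    match = re.fullmatch(r"\s*(\d+)(?:\.\d+)*\s*", version)
    if match is None:
        raise ValueError(f"unrecognized driver version: {version!r}")
    return int(match.group(1))


def query_driver_versions():
    """Ask nvidia-smi for the driver version reported for each GPU."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def unexpected_versions(versions, expected_major=EXPECTED_MAJOR):
    """Return the versions whose branch number differs from the expected one."""
    return [v for v in versions if parse_driver_major(v) != expected_major]


def main():
    """Exit with an error message unless every GPU reports the expected branch."""
    versions = query_driver_versions()
    if not versions:
        raise SystemExit("could not read a driver version from nvidia-smi")
    bad = sorted(set(unexpected_versions(versions)))
    if bad:
        raise SystemExit(
            f"driver {bad} is not on the {EXPECTED_MAJOR} branch; not launching"
        )
    print("driver check passed")
```

Calling `main()` at the top of the collection script, before Isaac Sim is started, makes the job fail with a readable message instead of a segmentation fault. It does not fix the crash itself; it only reports the driver mismatch early.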