Hi GDS team,
I have a quick question regarding GDS profiling and performance monitoring. I’m working with a DGX-A100 setup that includes 8 GPUs and multiple SSDs. Currently, I’m able to monitor global GDS activity across all SSDs via `/proc/driver/nvidia-fs/stats`, which works well. (Reference: GPUDirect Storage Installation and Troubleshooting Guide)
However, I’m wondering if it’s possible to obtain GDS statistics at per-SSD granularity. Is there any way to break down GDS performance metrics (for example the read counts, bandwidth, and latency shown below) by individual SSD?
Here is the output of `/proc/driver/nvidia-fs/stats`:
GDS Version: 1.7.1.12
NVFS statistics(ver: 4.0)
NVFS Driver(version: 2.17.3)
Mellanox PeerDirect Supported: True
IO stats: Enabled, peer IO stats: Disabled
Logging level: info
Active Shadow-Buffer (MiB): 0
Active Process: 0
Batches : n=0 ok=0 err=0 Avg-Submit-Latency(usec)=0
Reads : n=174140085 ok=174140076 err=9 readMiB=298128271 io_state_err=0
Reads : Bandwidth(MiB/s)=5039 Avg-Latency(usec)=944
Sparse Reads : n=0 io=0 holes=0 pages=0
Writes : n=0 ok=0 err=0 writeMiB=0 io_state_err=0 pg-cache=0 pg-cache-fail=0 pg-cache-eio=0
Writes : Bandwidth(MiB/s)=0 Avg-Latency(usec)=0
Mmap : n=50751 ok=50751 err=0 munmap=50751
Bar1-map : n=50751 ok=50751 err=0 free=50751 callbacks=0 active=0 delay-frees=0
Error : cpu-gpu-pages=0 sg-ext=0 dma-map=0 dma-ref=0
Ops : Read=0 Write=0 BatchIO=0
GPU 0000:4e:00.0 uuid:15e58378-e584-57e2-fe70-5bb20adcf902 : Registered_MiB=0 Cache_MiB=0 max_pinned_MiB=1024
GPU 0000:b7:00.0 uuid:b0e0689b-08ff-bdd9-c14a-2cbcc0004d3b : Registered_MiB=0 Cache_MiB=0 max_pinned_MiB=1024
GPU 0000:47:00.0 uuid:1ef889d3-1332-a22e-88bc-54910f4e754a : Registered_MiB=0 Cache_MiB=0 max_pinned_MiB=1024
GPU 0000:bd:00.0 uuid:5b0c7adb-6993-9760-39d6-f0ac1fb3cf97 : Registered_MiB=0 Cache_MiB=0 max_pinned_MiB=1024
GPU 0000:07:00.0 uuid:fbcffbe0-c1fa-e698-919e-feefc9323321 : Registered_MiB=0 Cache_MiB=0 max_pinned_MiB=1024
GPU 0000:90:00.0 uuid:72f859cf-3444-d3e5-bbb3-8fd8204f7dc1 : Registered_MiB=0 Cache_MiB=0 max_pinned_MiB=1024
GPU 0000:0f:00.0 uuid:07b5e8e7-7cd3-3e22-9300-35bae8829939 : Registered_MiB=0 Cache_MiB=0 max_pinned_MiB=1024
GPU 0000:87:00.0 uuid:61738c82-fd4e-4799-9c10-fee20c7b63da : Registered_MiB=0 Cache_MiB=0 max_pinned_MiB=1024
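For anyone who wants to track these global counters programmatically, here is a minimal Python sketch that parses the output above into a dictionary. It assumes the file keeps the `Name : key=value ...` layout shown in this post. The function names `parse_nvfs_stats` and `read_nvfs_stats` are my own and are not part of any GDS tooling. Note that it only exposes what the file already contains (global and per-GPU lines), so it does not by itself give a per-SSD breakdown.

```python
import re
from pathlib import Path

# Path taken from the post above.
STATS_PATH = Path("/proc/driver/nvidia-fs/stats")

# Counter lines look like "Name : key=value ...". The colon must be followed
# by whitespace and then a key=value token, so PCI addresses such as
# 0000:4e:00.0 and header lines such as "Logging level: info" do not match.
_LINE = re.compile(r"^(.*?)\s*:\s+(\S+=.*)$")
_PAIR = re.compile(r"(\S+)=(\d+)")


def parse_nvfs_stats(text):
    """Parse counter lines into {section: {counter: int}}.

    Sections that span two lines (e.g. "Reads") are merged into one dict.
    """
    stats = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match is None:
            continue  # not a counter line
        section, rest = match.group(1), match.group(2)
        counters = stats.setdefault(section, {})
        for key, value in _PAIR.findall(rest):
            counters[key] = int(value)
    return stats


def read_nvfs_stats(path=STATS_PATH):
    """Read and parse the stats file (hypothetical helper, not a GDS API)."""
    return parse_nvfs_stats(Path(path).read_text())
```

With the output above, `read_nvfs_stats()["Reads"]` would contain keys such as `n`, `err`, `Bandwidth(MiB/s)`, and `Avg-Latency(usec)`, and each `GPU ...` line becomes its own section keyed by the PCI address and UUID.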