Hi, the inference speed on my pod (16x H100) dropped from 30 tokens/s to 10 tokens/s. At the same time, dmesg is flooded with messages like:
[235715.381986] NVRM: knvlinkDiscoverPostRxDetLinks_GH100: Input mask contains a GPU on which NVLink is disabled.
[235715.382022] NVRM: knvlinkDiscoverPostRxDetLinks_GH100: Input mask contains a GPU on which NVLink is disabled.
[235715.382053] NVRM: knvlinkDiscoverPostRxDetLinks_GH100: Input mask contains a GPU on which NVLink is disabled.
[235715.382085] NVRM: knvlinkDiscoverPostRxDetLinks_GH100: Input mask contains a GPU on which NVLink is disabled.
[235716.650295] NVRM: knvlinkDiscoverPostRxDetLinks_GH100: Input mask contains a GPU on which NVLink is disabled.
[235716.650348] NVRM: knvlinkDiscoverPostRxDetLinks_GH100: Input mask contains a GPU on which NVLink is disabled.
[235716.650380] NVRM: knvlinkDiscoverPostRxDetLinks_GH100: Input mask contains a GPU on which NVLink is disabled.
[235716.650411] NVRM: knvlinkDiscoverPostRxDetLinks_GH100: Input mask contains a GPU on which NVLink is disabled.
It seems about five more of these messages appear every second. I can't find any information about this on Google. Please help me.
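In case it helps with diagnosis, this is a small sketch of the checks I plan to run on the pod to gather per-GPU NVLink state. It assumes nvidia-smi is on the PATH; the "inactive" string match is just my guess at what to look for, not an official output format.

#!/usr/bin/env python3
# Sketch of NVLink diagnostics to gather more info, assuming nvidia-smi is available.
import subprocess

def run(cmd):
    # Run a command and return its stdout as text (raises if the tool is missing).
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Per-GPU NVLink link state.
nvlink_status = run(["nvidia-smi", "nvlink", "--status"])
print(nvlink_status)

# GPU topology matrix: shows whether GPUs still reach each other over NVLink (NV#)
# or have fallen back to PCIe/SYS paths.
print(run(["nvidia-smi", "topo", "-m"]))

# Flag any link lines reported as inactive, which might correspond to the
# "NVLink is disabled" messages in dmesg.
for line in nvlink_status.splitlines():
    if "inactive" in line.lower():
        print("Possible disabled link:", line.strip())

I can post the output of these commands if that would help narrow it down.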