NSight Systems 2024.4.1 fails to profile on multi-CPU target machine

After installing NSight Systems 2024.4.1, I ran into an issue: when I start a profiling session by creating a project and selecting my dual-CPU workstation as the target, a communication error message is shown.

After pressing the “More Info …” button, I get a message box with the following contents:

— start of message box content —

RuntimeError (120) {
OriginalExceptionClass: struct boost::wrapexcept
OriginalFile: C:\dvs\p4\build\sw\devtools\Agora\Rel\QuadD_Main\QuadD\Host\Analysis\LocalWindowsDevice.cpp
OriginalLine: 341
OriginalFunction: class std::shared_ptr __cdecl QuadDAnalysis::LocalWindowsDevice::GetAgentCaps(void)
ErrorText: Agent capabilities request failed: GetLogicalProcessorInformationEx num of cores 72 does not match GetSystemInfo num of cores 36.
}

— end of message box content —

It seems that this version of NSight Systems has a problem determining the correct number of cores/threads on this machine, which has two 18-core/36-thread CPUs (36 cores and 72 threads in total).
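
One plausible reading of the two numbers in the error text (an assumption on my part, not something confirmed in this thread): Windows limits a processor group to 64 logical processors, and Microsoft documents GetSystemInfo as reporting only the logical processors in the current group. With 72 logical processors this machine cannot fit in one group, so a group-local query could return 36 while an enumeration across all groups returns 72. A minimal Python sketch of that arithmetic follows; the function names and the one-group-per-socket split are hypothetical and only illustrate the idea.

```python
# Illustrative model only. The grouping rule is an assumption made for this
# sketch; it is not taken from Nsight Systems or confirmed in this thread.

MAX_GROUP_SIZE = 64  # Windows allows at most 64 logical processors per group


def logical_processors(sockets, cores_per_socket, threads_per_core):
    """Total logical processors (hardware threads) in the machine."""
    return sockets * cores_per_socket * threads_per_core


def split_into_groups(sockets, cores_per_socket, threads_per_core):
    """Return the logical-processor count of each processor group.

    Hypothetical rule: a single group if everything fits into 64 logical
    processors, otherwise one group per socket.
    """
    total = logical_processors(sockets, cores_per_socket, threads_per_core)
    if total <= MAX_GROUP_SIZE:
        return [total]
    per_socket = cores_per_socket * threads_per_core
    return [per_socket] * sockets


# Dual Xeon Gold 6154 (2 x 18 cores x 2 threads), as in the HP Z8 above.
z8_groups = split_into_groups(2, 18, 2)
print(sum(z8_groups), z8_groups[0])  # 72 36

# Single 8-core/16-thread CPU: everything fits into one group.
z4_groups = split_into_groups(1, 8, 2)
print(sum(z4_groups), z4_groups[0])  # 16 16
```

Under this model, the all-groups count and the first-group count only differ once the machine exceeds 64 logical processors, which would match the mismatch appearing on the 72-thread machine and not on a 16-thread one.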

I had to go back as far as NSight Systems 2023.4.1 to find a version that still identifies this workstation correctly and lets me profile our CUDA-based application.

This workstation is an HP Z8 with two Intel Xeon Gold 6154 CPUs.

On another workstation (an HP Z4 with one Intel Xeon W-2145 8-core/16-thread CPU), 2024.4.1 works without this issue, so the problem seems to be related to multi-socket CPU systems.
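
To compare the two workstations, a small diagnostic along these lines could show how many processor groups Windows reports on each. This is a sketch and not part of Nsight Systems: it calls the Win32 functions GetActiveProcessorGroupCount and GetActiveProcessorCount through ctypes, and the helper name `processor_counts` is hypothetical. On non-Windows systems it falls back to `os.cpu_count()` and treats the machine as a single group.

```python
import ctypes
import os
import sys

ALL_PROCESSOR_GROUPS = 0xFFFF  # Win32 constant: query across all groups


def processor_counts():
    """Return (total_logical_processors, [count_per_group]).

    Hypothetical helper for diagnosis only.
    """
    if sys.platform != "win32":
        # Fallback: no processor groups, report a single group.
        n = os.cpu_count() or 1
        return n, [n]

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetActiveProcessorGroupCount.argtypes = []
    kernel32.GetActiveProcessorGroupCount.restype = ctypes.c_ushort
    kernel32.GetActiveProcessorCount.argtypes = [ctypes.c_ushort]
    kernel32.GetActiveProcessorCount.restype = ctypes.c_ulong

    group_count = kernel32.GetActiveProcessorGroupCount()
    per_group = [kernel32.GetActiveProcessorCount(g) for g in range(group_count)]
    total = kernel32.GetActiveProcessorCount(ALL_PROCESSOR_GROUPS)
    return total, per_group


total, per_group = processor_counts()
print(f"total logical processors: {total}")
print(f"logical processors per group: {per_group}")
```

If the dual-socket machine reports more than one group while the single-socket machine reports exactly one, that would be consistent with the failure depending on the processor-group layout; it would not by itself prove the cause.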

@Andrey_Trachenko to respond or reassign.

Thank you for reporting this issue! We are tracking it under internal bug number 4745040. It looks like most of our multi-socket testing was done on Linux.

Hi Andrey,

Any updates regarding a fix?

I would like to keep all the systems on the same version of NSight Systems for profiling purposes.