NVIDIA Sync 0.64.24 marks direct LAN DGX Spark disconnected while nvsync tunnel remains RUNNING

NVIDIA Sync Escalation Report

Summary

NVIDIA Sync 0.64.24 on Windows repeatedly marks a DGX Spark direct LAN connection as disconnected after roughly 45-70 seconds, even though the underlying nvsync-amd64.exe tunnel process remains running and its local forwarded ports remain reachable.

This was reproduced after a clean NVIDIA Sync profile reset, with no Tailscale device, no .local hosts-file workaround, and a direct LAN device added by IP address.

Environment

  • Client: Windows 11 Pro laptop NEELTJE
  • NVIDIA Sync: 0.64.24
  • DGX Spark hostname: spark-fab4
  • DGX Spark LAN IP: 192.168.20.10
  • Laptop LAN IP observed by Spark: 192.168.40.117
  • Connection type tested: Direct LAN SSH
  • Tailscale integration: disabled / not used for the clean reproduction
  • Device entry used for clean reproduction:
    • Name: spark-fab4
    • Hostname/IP: 192.168.20.10
    • User: brad
    • Port: 22

Expected Behavior

Per NVIDIA Sync documentation, after adding and connecting a device, Sync should remain connected and allow custom applications to stay bound until stopped or disconnected. Opening another custom app should not cause the device to reconnect and tear down existing app tunnels.

Actual Behavior

  1. NVIDIA Sync connects to 192.168.20.10.
  2. The desktop log shows Sync calling:
    • connect 192.168.20.10 --detach
    • open 192.168.20.10 11000
  3. nvsync-amd64.exe status 192.168.20.10 reports RUNNING.
  4. Local ports remain reachable:
    • 127.0.0.1:11000
    • 127.0.0.1:11002
  5. After about 45-70 seconds, the NVIDIA Sync desktop app flips the device state back to disconnected.
  6. Clicking the tray icon often shows Connect again.
  7. Reconnecting or opening another custom app can tear down previously opened custom app tunnels. DGX Dashboard often survives.

Key Evidence

nvsync.log shows the tunnel starts:

hostname="192.168.20.10" message="Dialing remote machine"
duration="45s" message="Starting keepalive goroutine"
port=11002 message="Port tunnel opened and ready"
port=11000 message="Port tunnel opened and ready"

Then the health/status path fails:

error="ssh: rejected: connect failed (Connection refused)"
message="Could not dial localhost:11002"
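To pull every such failure out of nvsync.log, the key="value" fields can be parsed mechanically. The sketch below assumes the log uses straight-quoted key="value" pairs like the excerpts shown; the full line layout (timestamps, levels, whether error and message share a line) is not confirmed here, and the function names are hypothetical.

```python
import re

# Matches key="quoted value" or key=bare_value.
FIELD_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_fields(line: str) -> dict:
    """Parse key=value pairs from one log line into a dict of strings."""
    return {
        key: (quoted if bare == "" else bare)
        for key, quoted, bare in FIELD_RE.findall(line)
    }


def dial_failures(lines) -> list:
    """Return parsed fields for every line whose message reports a failed dial."""
    failures = []
    for line in lines:
        fields = parse_fields(line)
        if fields.get("message", "").startswith("Could not dial"):
            failures.append(fields)
    return failures
```

Running dial_failures over the log lines and comparing the results with the 45s keepalive interval would show whether each failure coincides with a keepalive tick.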

At the same time, CLI status still reports the tunnel as running:

{
  "status": "RUNNING",
  "error": "",
  "dashboard_installed": true,
  "tailscale": {
    "enrolled": false
  },
  "ports": {
    "11000": {
      "status": "OPENED",
      "error": ""
    },
    "11002": {
      "status": "OPENED",
      "error": ""
    }
  }
}
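The CLI output can be reduced to the few fields that matter for comparison with the desktop state. This is a sketch under stated assumptions: it assumes nvsync-amd64.exe is on PATH and that the status subcommand prints exactly this JSON shape to stdout, and the function names and summary keys are hypothetical.

```python
import json
import subprocess


def summarize_status(raw_json: str) -> dict:
    """Reduce the CLI status JSON to tunnel state, opened ports, and port errors."""
    status = json.loads(raw_json)
    ports = status.get("ports", {})
    return {
        "tunnel_running": status.get("status") == "RUNNING",
        "opened_ports": sorted(
            int(port) for port, info in ports.items()
            if info.get("status") == "OPENED"
        ),
        "errors": {port: info["error"] for port, info in ports.items()
                   if info.get("error")},
    }


def query_status(host: str, exe: str = "nvsync-amd64.exe") -> dict:
    """Run `<exe> status <host>` and summarize its output.

    Assumption: the command writes the JSON document to stdout.
    """
    result = subprocess.run([exe, "status", host],
                            capture_output=True, text=True, check=True)
    return summarize_status(result.stdout)
```

Logging query_status("192.168.20.10") at intervals alongside the desktop state makes the mismatch between the two views easy to show in one timeline.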

But state-store.json is updated to:

device status: disconnected
hostname: 192.168.20.10
tailscaleEnrolled: false
dashboardInstalled: true

Spark-Side Checks

Spark services are stable and listening:

127.0.0.1:11000 DGX Dashboard
0.0.0.0:12000 Open WebUI
127.0.0.1:12001 Spark LLM Control
0.0.0.0:12002 Spark Files
0.0.0.0:13001 Spark LLM Ops

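Spark does not appear to have a listener on its own localhost:11002 (the port is absent from the list above), and this seems to be the path NVIDIA Sync uses for connection health/status. The "connect failed (Connection refused)" error is consistent with the SSH server on Spark trying to reach that port on behalf of the client and being refused. That refusal appears to cause the desktop app to mark the device disconnected even though the local tunnel remains running.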
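The missing listener can be confirmed on the Spark itself by parsing the output of ss -ltnH (listening TCP sockets, numeric, no header). This is a minimal sketch: the function names are hypothetical, and it assumes the usual iproute2 column order in which the local address is the fourth column.

```python
import subprocess


def listening_ports(ss_output: str) -> set:
    """Extract listening TCP port numbers from `ss -ltnH` output."""
    ports = set()
    for line in ss_output.splitlines():
        cols = line.split()
        if len(cols) < 4 or cols[0] != "LISTEN":
            continue
        # Local address looks like 127.0.0.1:11000 or [::]:22.
        port = cols[3].rsplit(":", 1)[-1]
        if port.isdigit():
            ports.add(int(port))
    return ports


def spark_listens_on(port: int) -> bool:
    """Return True if any local TCP listener on this machine uses the port."""
    result = subprocess.run(["ss", "-ltnH"],
                            capture_output=True, text=True, check=True)
    return port in listening_ports(result.stdout)
```

On the Spark, spark_listens_on(11002) returning False while spark_listens_on(11000) returns True would match the behavior described above.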

Steps Already Tried

  • Reset NVIDIA Sync profile completely.
  • Reinstalled/repaired NVIDIA Sync 0.64.24 from cached installer.
  • Removed stale FQDN/Tailscale device entries.
  • Removed spark-fab4.local hosts-file workaround.
  • Added device fresh using direct LAN IP 192.168.20.10.
  • Confirmed LAN SSH works independently.
  • Confirmed 192.168.20.10:22 is reachable from Windows.
  • Confirmed Tailscale is not part of the clean reproduction.
  • Confirmed nvsync-amd64.exe status 192.168.20.10 reports RUNNING while desktop state says disconnected.

Diagnostic Bundle

Local diagnostic zip:

C:\Users\BradleyMarshall\AppData\Local\Temp\nvidia-sync-reconnect-bug-20260525-161226.zip

The bundle contains:

  • NVIDIA Sync logs
  • NVIDIA Sync config snapshot
  • Process command lines
  • CLI status output
  • Relevant hosts-file lines

Request

Please advise whether this is a known NVIDIA Sync 0.64.24 issue with the localhost:11002 health/status path, or whether a Spark-side Sync/Workbench service is expected to be listening on remote 11002.

The main question:

Why does NVIDIA Sync desktop mark a direct LAN device disconnected when nvsync-amd64.exe status still reports the tunnel as RUNNING and local forwarded ports remain reachable?

nvidia-sync-reconnect-bug-20260525-161226.zip (24.0 KB)