DGX Spark: Ethernet connection unstable after November 2025 update – EEE (Energy Efficient Ethernet) workaround

Summary

After applying a recent OTA update (end of November 2025), the Ethernet connection on my NVIDIA DGX Spark became unstable:

  • Ethernet link comes up normally

  • Works for a few minutes

  • Then traffic silently stalls or packets drop

  • NetworkManager retries repeatedly

  • SSH and NVIDIA Sync disconnect

A reboot temporarily restores connectivity, but the issue reliably returns after a short time.

The same hardware and setup worked without issues for weeks before this update.


Setup

  • Device: NVIDIA DGX Spark

  • Connection: Direct PC ↔ Spark via Ethernet (no router, no switch)

  • NIC: enP7s7

  • Use case: SSH access, NVIDIA Sync, development workloads


Software Versions

cat /etc/dgx-release
uname -r
nmcli --version

DGX Release:

DGX_NAME="DGX Spark"
DGX_PRETTY_NAME="NVIDIA DGX Spark"
DGX_SWBUILD_VERSION="7.2.3"
DGX_SWBUILD_DATE="2025-09-10"
DGX_COMMIT_ID="833b4a7"

OTA Update Applied:

DGX_OTA_VERSION="7.3.1"
DGX_OTA_DATE="Sun Nov 30 16:46:02 CST 2025"

Kernel:

6.14.0-1015-nvidia

NetworkManager:

nmcli tool, version 1.46.0


Symptoms

  • Ethernet reports connected, but traffic freezes after a few minutes

  • Ping starts dropping or stalls completely

  • NetworkManager logs show repeated activation failures

  • NVIDIA Sync cannot reliably discover or maintain a connection to the Spark

Notable observation:
SSH sometimes continued to work via IPv6 link-local (fe80::/64), even when IPv4 connectivity appeared broken. This suggested a lower-layer issue, not a routing or application-level problem.


Root Cause (Very Likely)

Energy Efficient Ethernet (EEE) was enabled on the Spark Ethernet interface.

After the OTA update (7.3.1), EEE appears to enter low-power states too aggressively, leading to:

  • Link instability

  • Packet loss

  • Silent connection freezes after a short time

This strongly points to a PHY / power-management regression, not a DHCP or NetworkManager misconfiguration.


How to Verify

Check EEE status:

ethtool --show-eee enP7s7

In my case:

EEE status: active


Immediate Fix (Test)

Disable EEE manually:

sudo ethtool --set-eee enP7s7 eee off

Result:

  • Ethernet connection becomes stable

  • No further freezes or packet loss

  • SSH remains reliable

  • NVIDIA Sync works again


Permanent Fix (Survives Reboot)

Disable EEE on every boot using systemd:

sudo tee /etc/systemd/system/disable-eee.service << 'EOF'
[Unit]
Description=Disable EEE on Ethernet
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart=/sbin/ethtool --set-eee enP7s7 eee off

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable --now disable-eee.service

After reboot, verify:

ethtool --show-eee enP7s7

EEE should remain disabled and Ethernet stability is restored.
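
As a rough sketch of that post-reboot check, the helper below reads `ethtool --show-eee` output on stdin and succeeds only if the status line starts with "disabled". The function name `eee_is_disabled` is made up for illustration, and the match assumes ethtool prints `EEE status: disabled` once EEE is turned off.

```shell
# Hypothetical helper: succeed if `ethtool --show-eee` output (on stdin)
# reports EEE as disabled. Assumes the status line reads "EEE status: disabled".
eee_is_disabled() {
    grep -Eq '^[ 	]*EEE status:[ 	]*disabled'
}

# Possible use on the Spark after a reboot (interface name from this thread):
#   systemctl is-enabled disable-eee.service
#   ethtool --show-eee enP7s7 | eee_is_disabled && echo "EEE is off"
```

If the check fails after a reboot, `systemctl status disable-eee.service` should show whether the oneshot unit actually ran.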


Notes

  • Issue occurs regardless of DHCP vs static IP
  • Reproducible on direct PC ↔ Spark connections
  • IPv6 link-local may continue working and mask the underlying issue
  • Disabling EEE fully resolves the problem in this environment

Conclusion

If your DGX Spark Ethernet:

  • works initially
  • then freezes or drops connections after a few minutes
  • especially after OTA update 7.3.1

Check and disable Energy Efficient Ethernet (EEE) first. This fully resolved the issue in my case.


In my case:

➜ ~ ethtool --show-eee enP7s7
EEE settings for enP7s7:
        EEE status: enabled - inactive
        Tx LPI: 19 (us)
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
                                   10000baseT/Full
                                   2500baseT/Full
                                   5000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
                                    10000baseT/Full
                                    2500baseT/Full
                                    5000baseT/Full
        Link partner advertised EEE link modes:  Not reported

I’m not sure if it’s useful info, but I haven’t had any issues like this. Mine is connected to a NETGEAR GS208-100UKS 8 Port Gigabit (through ~100ft of CAT6 cable). I have all patches installed.

danny@toad:~$ ethtool --show-eee enP7s7
EEE settings for enP7s7:
        EEE status: enabled - active
        Tx LPI: 12 (us)
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
                                   10000baseT/Full
                                   2500baseT/Full
                                   5000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
                                    10000baseT/Full
                                    2500baseT/Full
                                    5000baseT/Full
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full
danny@toad:~$ cat /etc/dgx-release
DGX_NAME="DGX Spark"
DGX_PRETTY_NAME="NVIDIA DGX Spark"
DGX_SWBUILD_DATE="2025-09-10-13-50-03"
DGX_SWBUILD_VERSION="7.2.3"
DGX_COMMIT_ID="833b4a7"
DGX_PLATFORM="DGX Server for KVM"
DGX_SERIAL_NUMBER="1983725000686"

DGX_OTA_VERSION="7.3.1"
DGX_OTA_DATE="Wed Nov 19 16:06:37 GMT 2025"
danny@toad:~$ uname -r
6.14.0-1013-nvidia
danny@toad:~$ nmcli --version
nmcli tool, version 1.46.0

@mirko.kubin what OS are you using on your PC? You might want to look at the PC side instead of the Spark.

Good point, agreed – sender and receiver both matter.

PC is running Windows 11, direct Ethernet connection (no switch/router in between).
The same PC ↔ Spark setup worked reliably for weeks before the recent Spark OTA update.

The instability only appeared after the Spark update and disappeared immediately once EEE was disabled on the Spark NIC.

I can’t fully rule out PC-side influence, but the behavior was reproducible on the Spark side and fully resolved there.

Check the upgrade instructions here: OS and Component Update Guide — DGX Spark User Guide.

I use the manual update via SSH.

I wonder if your firmware is updated?

Firmware was updated via fwupdmgr during the OTA (DGX OS 7.3.1). fwupdmgr get-devices shows EC/UEFI/USB-C PD updates succeeded. Ethernet management NIC is Realtek r8127 (enP7s7). After the update I saw intermittent link resets (r8127: enP7s7: link down/up) and the connection would freeze after a few minutes. Disabling EEE on enP7s7 stabilized it.

PC is Windows (SSH client works reliably). The Spark-side kernel log shows the Ethernet NIC driver (r8127) reporting repeated link down/up events around the time of failures, so this looks like a link/driver/power-management issue on the Spark NIC path rather than an SSH client issue.
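
For anyone who wants to put a number on the flapping, here is a minimal sketch that counts "link down" lines for one interface in kernel log text. The helper name `count_link_down` is hypothetical, and it assumes the driver logs lines containing the interface name and the words "link down", as in the `r8127: enP7s7: link down/up` messages quoted above. The exact message format may differ on your system.

```shell
# Hypothetical helper: count "link down" events for one interface in
# kernel log text read on stdin. $1 is the interface name.
count_link_down() {
    iface=$1
    # First keep only lines that mention the interface, then count the
    # ones that contain "link down" (case-insensitive).
    grep -iF -- "$iface" | grep -ci 'link down'
}

# Possible use on the Spark:
#   sudo dmesg | count_link_down enP7s7
#   journalctl -k -b | count_link_down enP7s7
```

Comparing the count before and after `ethtool --set-eee enP7s7 eee off` over the same time window gives a rough before/after picture.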

@mirko.kubin what is your Ethernet driver version? Here is output using kernel 6.14.0-1015-nvidia:

elsaco@spark1:~$ ethtool -i enP7s7
driver: r8127
version: 11.014.00-NAPI
firmware-version:
expansion-rom-version:
bus-info: 0007:01:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

Looks like you’re experiencing link flapping!

elsaco@spark1:~$ ethtool --show-eee enP7s7
EEE settings for enP7s7:
        EEE status: enabled - active
        Tx LPI: 39 (us)
        Supported EEE link modes:  100baseT/Full
                                   1000baseT/Full
                                   10000baseT/Full
                                   2500baseT/Full
                                   5000baseT/Full
        Advertised EEE link modes:  100baseT/Full
                                    1000baseT/Full
                                    10000baseT/Full
                                    2500baseT/Full
                                    5000baseT/Full
        Link partner advertised EEE link modes:  100baseT/Full
                                                 1000baseT/Full
                                                 2500baseT/Full

Check whether the Tx LPI value shows zero, and check your link partner's EEE modes. Both ends must agree on a common setting in order to synchronize. I'm connected to a switch and haven't experienced link flapping with any Spark.
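
A small sketch for pulling out just the link partner's EEE modes from the `ethtool --show-eee` output, so the two ends are easier to compare. The function name `link_partner_eee_modes` is made up for illustration; it only parses text in the layout shown in this thread.

```shell
# Hypothetical helper: print the "Link partner advertised EEE link modes"
# entries from `ethtool --show-eee` output read on stdin, one per line.
link_partner_eee_modes() {
    awk '
        /Link partner advertised EEE link modes:/ {
            grab = 1
            sub(/^.*link modes:[ \t]*/, "")   # keep only the first mode (or "Not reported")
            print
            next
        }
        grab && /^[ \t]*[0-9]+base/ {          # continuation lines such as "1000baseT/Full"
            sub(/^[ \t]+/, "")
            print
            next
        }
        { grab = 0 }
    '
}

# Possible use on the Spark:
#   ethtool --show-eee enP7s7 | link_partner_eee_modes
```

An answer of "Not reported" means the Spark did not see any EEE advertisement from the other end of the cable.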

Try booting with an older kernel, the one you were using before the updates, and see if the issue reproduces, e.g. vmlinuz-6.14.0-1013-nvidia or vmlinuz-6.11.0-1016-nvidia, if you didn't remove them already. Reseating the network cable might also help!
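
To see which older kernels are still installed, something like the sketch below may help. The helper name `kernel_versions` is hypothetical; it assumes kernel images live under /boot as vmlinuz-<version> and that GNU `sort -V` is available, as on a stock Ubuntu-based install.

```shell
# Hypothetical helper: turn vmlinuz paths (on stdin) into a sorted list of
# kernel versions, oldest first. Relies on GNU sort's -V (version sort).
kernel_versions() {
    sed -n 's#^.*/vmlinuz-##p' | sort -V
}

# Possible use on the Spark:
#   ls -1 /boot/vmlinuz-* | kernel_versions
#   uname -r    # the kernel currently running, for comparison
```

An older entry can then usually be picked from the boot loader's advanced options menu, assuming that menu is reachable on your setup.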

Thanks, that matches my setup.

Yes, same driver and kernel on my side (r8127 11.014.00-NAPI, 6.14.0-1015-nvidia).
The key difference in my case seems to be the link partner:

  • Earlier, my Spark reported
    Link partner advertised EEE link modes: Not reported

  • EEE showed as enabled but inactive

  • I’m running direct PC ↔ Spark, no switch in between

With that setup I observed repeated link down / link up events in dmesg.
Disabling EEE on the Spark (ethtool --set-eee enP7s7 eee off) stopped the flapping completely.

I agree that with a switch negotiating EEE properly this likely doesn’t reproduce. The direct-link case seems to be the edge condition here.

Hi @mirko.kubin, thanks for reporting this. Can you give more details of your host NIC so we can attempt to reproduce this issue?

Hi, thanks for the follow-up. Here are the exact host NIC details.

Host system (PC side):

  • OS: Windows 11 Pro

  • Host NIC: Intel(R) Ethernet Controller (3) I225-V (onboard, 2.5GbE)

  • There are two I225-V interfaces present on the system (both tested)

  • Connection type: direct PC ↔ DGX Spark (no switch, no router in between)

Topology at time of issue (Scenario A – original):

  • Direct Ethernet cable between PC (Intel I225-V) and DGX Spark (r8127)

  • No DHCP server on the link

  • ISP router is powered off nightly (hard power cut)

  • After the Spark update, the link started flapping or ended up with broken IPv4 after power-on

  • Communication often fell back to IPv6 link-local (fe80::)

Topology at time of issue (Scenario B – current):

  • Spark LAN + PC LAN → ASUS RT-BE880 → ISP router (Kölbi/Huawei)

  • ISP router is powered off nightly, ASUS stays powered

  • In the morning, Spark sometimes ends up with IPv4 link-local (169.254.x.x) until link reset / reconnect

  • This behavior became reproducible after recent updates
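
A minimal sketch for spotting the 169.254.x.x fallback described above. Both helper names (`ipv4_addrs`, `is_ipv4_link_local`) are made up for illustration, and the parsing assumes the one-line format printed by `ip -o -4 addr show`.

```shell
# Hypothetical helper: print the IPv4 addresses found in
# `ip -o -4 addr show` output read on stdin, without the prefix length.
ipv4_addrs() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "inet") { split($(i + 1), a, "/"); print a[1] }
    }'
}

# Hypothetical helper: succeed if the address is in 169.254.0.0/16.
is_ipv4_link_local() {
    case "$1" in
        169.254.*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Possible use on the Spark:
#   for addr in $(ip -o -4 addr show dev enP7s7 | ipv4_addrs); do
#       is_ipv4_link_local "$addr" && echo "enP7s7 fell back to link-local: $addr"
#   done
```

This only reports the symptom; it does not tell a lost DHCP lease apart from a link-level problem.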

Observations:

  • Spark NIC: Realtek r8127, driver 11.014.00-NAPI (in-tree), kernel 6.14.0-1015-nvidia

  • ethtool --show-eee enP7s7 shows EEE enabled but often inactive, with link partner EEE modes not reported

  • Disabling EEE on the Spark side stabilizes the link in direct-link scenarios

  • Issue does not reproduce consistently when both sides are behind the ASUS router with stable upstream connectivity

This suggests a potential EEE / link negotiation / state recovery edge case between Realtek r8127 and Intel I225-V, especially in power-cycle and DHCP-loss scenarios.