ASUS GX10 ConnectX-7 will not recognize QSFP112 cable plug-in

I’m still having issues with ASUS GX10s not recognizing QSFP cables when I plug them in to bring the ConnectX-7 ports online. It’s as if the system never sees the cable being inserted. I’ve tried officially supported cables from both Amphenol and NADDOD. This is happening on three separate GX10s, all with the latest BIOS, OS kernel, and drivers. Is anyone else seeing this issue on GX10s?

ASUS support has been less than stellar: they asked me to run Windows diagnostic tools, repeatedly asked me to verify the BIOS, and then repeatedly told me they were escalating to their BIOS/driver team and would contact me in 1-2 business days. I never got anything back.

All three units run clean with no errors on the fieldiag tests, and all three come from different production lots, so I highly doubt a hardware failure. NADDOD RMA’d their cable, tested it clean, tested a new one on their DGX, and shipped me a known-good cable.

Any advice or suggestions greatly appreciated

Hi @robert287,

Thanks for all the detail here – I know this is a painful one to debug.

First step that will really help on our side is a full nvidia-bug-report bundle, since that pulls in dmesg, PCI info, driver versions, etc.:

sudo nvidia-bug-report.sh

This will produce a .gz file in your current directory; you can attach that archive to the thread.

A couple of additional details that are useful alongside the bug report:

  1. Environment

    • DGX Spark OS / Ubuntu version and kernel version you’re running.

    • Confirmation that these are stock DGX Spark GX10 systems (no custom OFED/DOCA or kernel modules added).

  2. Link / module behavior on the CX7 port

    • Output of:

      ip a
      
      

      and then:

      sudo ethtool enP2p1s0f1np1
      
      

      (replace enP2p1s0f1np1 with whatever the CX7 interface name is on your system) before and after you plug in the QSFP cable, so we can see whether the OS ever reports module presence or link changes.

  3. Cables

    • Exact part numbers for the Amphenol and NADDOD cables you’ve tried.

    • For DGX Spark stacking we currently validate against the cables listed in the DGX Spark User Guide:

      • Amphenol NJAAKK‑N911 (QSFP to QSFP112, 32AWG, 400 mm, LSZH) and NJAAKK‑0006 (0.5 m version)

      • Luxshare LMTQF022‑SD‑R (QSFP112 400G DAC cable, 400 mm, 30 AWG)

      Ref: “Spark Stacking” in the DGX Spark User Guide. If your Amphenol cable is one of these PNs (or you have access to one of the listed parts), that data point is very helpful when we look at the logs.

With the nvidia-bug-report plus the interface / cable details, we can see whether the issue is “module never detected” vs. “module detected but link never comes up” and route it appropriately.
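
For anyone collecting the before/after ethtool captures requested above, a minimal Python sketch for comparing the two is shown here. It assumes the captures were saved as plain text; the function names `parse_ethtool` and `diff_fields` are made up for this illustration, and the AFTER sample is hypothetical (it is what a healthy link might report, not output from this thread).

```python
def parse_ethtool(text):
    """Parse `ethtool <iface>` text into {field: [values]}."""
    fields, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("Settings for"):
            continue
        if ":" in line:
            key, _, value = line.partition(":")
            current = key.strip()
            fields[current] = [value.strip()] if value.strip() else []
        elif current is not None:
            # Continuation line, e.g. an extra entry under "Supported link modes".
            fields[current].append(line)
    return fields


def diff_fields(before, after):
    """Return {field: (before, after)} for every field whose values differ."""
    changed = {}
    for key in sorted(set(before) | set(after)):
        if before.get(key) != after.get(key):
            changed[key] = (before.get(key), after.get(key))
    return changed


# BEFORE uses values of the kind ethtool prints with no link.
# AFTER is a hypothetical capture, included only to exercise the diff.
BEFORE = """Settings for enP2p1s0f1np1:
    Speed: Unknown!
    Port: Other
    Link detected: no (No cable)
"""
AFTER = """Settings for enP2p1s0f1np1:
    Speed: 200000Mb/s
    Port: Other
    Link detected: yes
"""

changes = diff_fields(parse_ethtool(BEFORE), parse_ethtool(AFTER))
for field, (old, new) in changes.items():
    print(f"{field}: {old} -> {new}")
```

If the diff comes back empty after plugging in the cable, that points toward the "module never detected" case rather than "module detected but link never comes up".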

So yes stock units no mods

One note: to even get the interfaces to show up, I had to disable the hotplug by moving it to .bak and touching an empty file.

Linux node-01 6.17.0-1014-nvidia #14-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 17 19:01:40 UTC 2026 aarch64 aarch64 aarch64 GNU/Linux

Distributor ID: Ubuntu

Description: Ubuntu 24.04.4 LTS

Release: 24.04

Codename: noble

ip a

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: enP7s7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 30:c5:99:3f:0f:17 brd ff:ff:ff:ff:ff:ff
    altname enP7p1s0
    inet 10.0.0.1/24 brd 10.0.0.255 scope global noprefixroute enP7s7
       valid_lft forever preferred_lft forever
    inet6 fe80::4e6b:11eb:95d2:e621/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
3: enp1s0f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 30:c5:99:3f:0f:18 brd ff:ff:ff:ff:ff:ff
4: enp1s0f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 30:c5:99:3f:0f:19 brd ff:ff:ff:ff:ff:ff
5: enP2p1s0f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 30:c5:99:3f:0f:1c brd ff:ff:ff:ff:ff:ff
6: enP2p1s0f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 30:c5:99:3f:0f:1d brd ff:ff:ff:ff:ff:ff
7: wlP9s9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 50:bb:b5:a4:5e:06 brd ff:ff:ff:ff:ff:ff
    altname wlP9p1s0
    inet 192.168.68.56/22 brd 192.168.71.255 scope global dynamic noprefixroute wlP9s9
       valid_lft 4617sec preferred_lft 4617sec
    inet6 fdca:d7e:9794:455d:287d:1dbc:9064:8810/64 scope global temporary dynamic
       valid_lft 1667sec preferred_lft 1667sec
    inet6 fdca:d7e:9794:455d:552:d94a:ded0:d890/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 1667sec preferred_lft 1667sec
    inet6 fe80::6800:1ed4:ff30:5aa9/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
8: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 2a:e3:ef:7f:00:55 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::28e3:efff:fe7f:55/64 scope link
       valid_lft forever preferred_lft forever
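
To pull the interfaces without carrier out of a listing like the one above, a short Python sketch along these lines may help. It is only an illustration: the regular expression and the name `no_carrier_interfaces` are assumptions of this example, and the sample lines are copied from the output above.

```python
import re

# Matches interface header lines such as "3: enp1s0f0np0: <NO-CARRIER,...> mtu 1500 ...".
HEADER = re.compile(r"^(\d+):\s+([^:@\s]+)[^:]*:\s+<([^>]*)>")


def no_carrier_interfaces(ip_output):
    """Return names of interfaces whose flag list contains NO-CARRIER."""
    names = []
    for line in ip_output.splitlines():
        match = HEADER.match(line.strip())
        if match and "NO-CARRIER" in match.group(3).split(","):
            names.append(match.group(2))
    return names


SAMPLE = """2: enP7s7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 30:c5:99:3f:0f:17 brd ff:ff:ff:ff:ff:ff
3: enp1s0f0np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 30:c5:99:3f:0f:18 brd ff:ff:ff:ff:ff:ff
6: enP2p1s0f1np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
    link/ether 30:c5:99:3f:0f:1d brd ff:ff:ff:ff:ff:ff
"""

down = no_carrier_interfaces(SAMPLE)
print(down)
```

On the full listing this would also report docker0, so the result still needs to be narrowed to the CX7 ports by name.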

#######################################

sudo ethtool enP2p1s0f1np1

Settings for enP2p1s0f1np1:
    Supported ports: [ ]
    Supported link modes:   1000baseT/Full
                            10000baseT/Full
                            1000baseKX/Full
                            10000baseKR/Full
                            10000baseR_FEC
                            40000baseKR4/Full
                            40000baseCR4/Full
                            40000baseSR4/Full
                            40000baseLR4/Full
                            25000baseCR/Full
                            25000baseKR/Full
                            25000baseSR/Full
                            50000baseCR2/Full
                            50000baseKR2/Full
                            100000baseKR4/Full
                            100000baseSR4/Full
                            100000baseCR4/Full
                            100000baseLR4_ER4/Full
                            50000baseSR2/Full
                            1000baseX/Full
                            10000baseCR/Full
                            10000baseSR/Full
                            10000baseLR/Full
                            10000baseER/Full
                            50000baseKR/Full
                            50000baseSR/Full
                            50000baseCR/Full
                            50000baseLR_ER_FR/Full
                            50000baseDR/Full
                            100000baseKR2/Full
                            100000baseSR2/Full
                            100000baseCR2/Full
                            100000baseLR2_ER2_FR2/Full
                            100000baseDR2/Full
                            200000baseKR4/Full
                            200000baseSR4/Full
                            200000baseLR4_ER4_FR4/Full
                            200000baseDR4/Full
                            200000baseCR4/Full
                            100000baseKR/Full
                            100000baseSR/Full
                            100000baseLR_ER_FR/Full
                            100000baseCR/Full
                            100000baseDR/Full
                            200000baseKR2/Full
                            200000baseSR2/Full
                            200000baseLR2_ER2_FR2/Full
                            200000baseDR2/Full
                            200000baseCR2/Full
    Supported pause frame use: Symmetric
    Supports auto-negotiation: Yes
    Supported FEC modes: None RS BASER
    Advertised link modes:  1000baseT/Full
                            10000baseT/Full
                            1000baseKX/Full
                            10000baseKR/Full
                            10000baseR_FEC
                            40000baseKR4/Full
                            40000baseCR4/Full
                            40000baseSR4/Full
                            40000baseLR4/Full
                            25000baseCR/Full
                            25000baseKR/Full
                            25000baseSR/Full
                            50000baseCR2/Full
                            50000baseKR2/Full
                            100000baseKR4/Full
                            100000baseSR4/Full
                            100000baseCR4/Full
                            100000baseLR4_ER4/Full
                            50000baseSR2/Full
                            1000baseX/Full
                            10000baseCR/Full
                            10000baseSR/Full
                            10000baseLR/Full
                            10000baseER/Full
                            50000baseKR/Full
                            50000baseSR/Full
                            50000baseCR/Full
                            50000baseLR_ER_FR/Full
                            50000baseDR/Full
                            100000baseKR2/Full
                            100000baseSR2/Full
                            100000baseCR2/Full
                            100000baseLR2_ER2_FR2/Full
                            100000baseDR2/Full
                            200000baseKR4/Full
                            200000baseSR4/Full
                            200000baseLR4_ER4_FR4/Full
                            200000baseDR4/Full
                            200000baseCR4/Full
                            100000baseKR/Full
                            100000baseSR/Full
                            100000baseLR_ER_FR/Full
                            100000baseCR/Full
                            100000baseDR/Full
                            200000baseKR2/Full
                            200000baseSR2/Full
                            200000baseLR2_ER2_FR2/Full
                            200000baseDR2/Full
                            200000baseCR2/Full
    Advertised pause frame use: Symmetric
    Advertised auto-negotiation: Yes
    Advertised FEC modes: Not reported
    Speed: Unknown!
    Duplex: Unknown! (255)
    Auto-negotiation: on
    Port: Other
    PHYAD: 0
    Transceiver: internal
    Supports Wake-on: d
    Wake-on: d
    Link detected: no (No cable)

###########################################################
Amphenol – NJAAKK-N911

NDD Q112-400G-CU0-5

##############################################
The system never recognizes the cable being plugged in. I’ve tried disconnecting power and holding the power button to discharge any residual power. NADDOD tested the original cable they provided, then tested a new cable on their pair of DGX Sparks, where it worked fine.
###################################################
I also have the output from a fieldiag run in single-user mode, though everything was flagged as passed and it doesn’t look like it does any network device testing.

##################################################

I ran some probes comparing /proc/iomem against the addresses that appear to be used when the system probes the PCI bus, and it looks like there is a memory address space mismatch:

  • 0x05170000-0x051cffff (NVDA8800)

  • 0xc8000000-0xd7ffffff (NVDA8900)

Looking closely at the iomem dump, neither of those addresses exists in the kernel’s memory map:

  • There is a massive gap between 0b39ffff and 1002d000.

  • The 0x05170000 address is completely missing from the physical memory space.

  • The 0xc8000000 address is also nowhere to be found (the dump ends around 647fffff).

It looks like the ASUS BIOS ACPI tables are instructing the ConnectX-7 security enclave to map its management mailboxes (DOE) to physical memory addresses that do not exist on the motherboard.

My theory is that when the CX7 tries to reach that non-existent memory, it returns the -5 (EIO) error, enters a hard panic, and permanently cuts power to the QSFP cages. That would explain why no pci=realloc or IOMMU bypass works: you can’t reallocate memory that the physical silicon lacks.
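
The address check described above can be sketched in Python as follows. This is a rough illustration rather than a definitive tool: `parse_iomem` and `owners` are names invented for this example, and the sample lines are taken from the dump in this post. On a live system the text would come from reading /proc/iomem as root, since unprivileged reads show zeroed addresses.

```python
def parse_iomem(text):
    """Parse /proc/iomem-style lines into (start, end, name) tuples."""
    ranges = []
    for line in text.splitlines():
        span, sep, name = line.partition(" : ")
        if not sep:
            continue
        start, dash, end = span.strip().partition("-")
        if not dash:
            continue
        try:
            ranges.append((int(start, 16), int(end, 16), name.strip()))
        except ValueError:
            continue  # Skip anything that is not a hex range.
    return ranges


def owners(ranges, address):
    """Return the names of every range that contains the given address."""
    return [name for start, end, name in ranges if start <= address <= end]


SAMPLE = """0b39f600-0b39ffff : MTKW9002:00
1002d000-1002dfff : NVDA9221:00
5d100000-5d2fffff : PCI Bus 0002:01
5d100000-5d1fffff : 0002:01:00.0
"""

ranges = parse_iomem(SAMPLE)
for addr in (0x05170000, 0xC8000000, 0x5D100000):
    print(hex(addr), owners(ranges, addr) or "not mapped in this sample")
```

Running this against the complete /proc/iomem rather than a grep-filtered excerpt avoids mistaking a filtered-out line for a missing region.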

*********START iomem dump *******************************

rob@spark-node1:~$ sudo cat /proc/iomem | grep -A 5 -B 5 -iE "05170000|c8000000|NVDA"

0b316000-0b316003 : MTKW9002:00
0b316004-0b316007 : MTKW9002:00
0b316010-0b316013 : MTKW9002:00
0b316018-0b31601b : MTKW9002:00
0b39f600-0b39ffff : MTKW9002:00
1002d000-1002dfff : NVDA9221:00
1002d000-1002dfff : NVDA9221:00 NVDA9221:00
10200000-10201fff : DRAM8901:00
10206000-10207fff : DRAM8901:00
10208000-10209fff : DRAM8901:00
12410000-12410fff : NVDA9221:00
12440000-12440fff : NVDA9221:00
12440000-12440fff : NVDA9221:00 NVDA9221:00
12460000-12460fff : NVDA9221:00
12460000-12460fff : NVDA9221:00 NVDA9221:00
12800000-12800fff : NVDA9221:00
12830000-12830fff : NVDA9221:00
12830000-12830fff : NVDA9221:00 NVDA9221:00
12850000-12850fff : NVDA9221:00
12850000-12850fff : NVDA9221:00 NVDA9221:00
12870000-12870fff : NVDA9221:00
12870000-12870fff : NVDA9221:00 NVDA9221:00
12890000-12890fff : NVDA9221:00
12890000-12890fff : NVDA9221:00 NVDA9221:00
128b0000-128b0fff : NVDA9221:00
128b0000-128b0fff : NVDA9221:00 NVDA9221:00
12a50000-12a50fff : NVDA9221:00
12a50000-12a50fff : NVDA9221:00 NVDA9221:00
12e00000-12e00fff : NVDA9221:00
12e30000-12e30fff : NVDA9221:00
12e30000-12e30fff : NVDA9221:00 NVDA9221:00
13000000-1301ffff : arm-smmu-v3.1.auto
13000000-13000dff : arm-smmu-v3.1.auto
13002000-13002fff : arm-smmu-v3-pmcg.10.auto
13002000-13002fff : arm-smmu-v3-pmcg.10.auto arm-smmu-v3-pmcg.10.auto
13010000-13010dff : arm-smmu-v3.1.auto

130d2000-130d2fff : arm-smmu-v3-pmcg.15.auto arm-smmu-v3-pmcg.15.auto

130e2000-130e2fff : arm-smmu-v3-pmcg.16.auto
130e2000-130e2fff : arm-smmu-v3-pmcg.16.auto arm-smmu-v3-pmcg.16.auto
130f2000-130f2fff : arm-smmu-v3-pmcg.16.auto
130f2000-130f2fff : arm-smmu-v3-pmcg.16.auto arm-smmu-v3-pmcg.16.auto
13630000-13630fff : NVDA9221:00
13630000-13630fff : NVDA9221:00 NVDA9221:00
13800000-1381ffff : arm-smmu-v3.0.auto
13800000-13800dff : arm-smmu-v3.0.auto
13802000-13802fff : arm-smmu-v3-pmcg.3.auto
13802000-13802fff : arm-smmu-v3-pmcg.3.auto arm-smmu-v3-pmcg.3.auto
13810000-13810dff : arm-smmu-v3.0.auto

138d2000-138d2fff : arm-smmu-v3-pmcg.8.auto arm-smmu-v3-pmcg.8.auto

138e2000-138e2fff : arm-smmu-v3-pmcg.9.auto
138e2000-138e2fff : arm-smmu-v3-pmcg.9.auto arm-smmu-v3-pmcg.9.auto
138f2000-138f2fff : arm-smmu-v3-pmcg.9.auto
138f2000-138f2fff : arm-smmu-v3-pmcg.9.auto arm-smmu-v3-pmcg.9.auto
14200000-14200fff : NVDA2861:00
14900000-1491ffff : arm-smmu-v3.2.auto
14900000-14900dff : arm-smmu-v3.2.auto
14902000-14902fff : arm-smmu-v3-pmcg.17.auto
14902000-14902fff : arm-smmu-v3-pmcg.17.auto arm-smmu-v3-pmcg.17.auto
14910000-14910dff : arm-smmu-v3.2.auto

14952000-14952fff : arm-smmu-v3-pmcg.18.auto arm-smmu-v3-pmcg.18.auto

14962000-14962fff : arm-smmu-v3-pmcg.19.auto
14962000-14962fff : arm-smmu-v3-pmcg.19.auto arm-smmu-v3-pmcg.19.auto
14972000-14972fff : arm-smmu-v3-pmcg.19.auto
14972000-14972fff : arm-smmu-v3-pmcg.19.auto arm-smmu-v3-pmcg.19.auto
16050000-16050fff : NVDA0310:00
16a00000-16a00fff : MTKI0511:00
16a00000-16a0001f : serial
16b20000-16b2ffff : MIPI0100:02
16b30000-16b3ffff : MIPI0100:00
16bd0000-16bd0fff : pnp 00:00
16c10000-16c10fff : NVDA0210:00
16c50000-16c50fff : NVDA0210:01
16c70000-16c70fff : NVDA0210:02
16d10000-16d1ffff : MIPI0100:01
18010000-18010fff : NVDA8600:00
18020000-18020fff : NVDA8601:00
18020000-1802000f : NVDA8601:00
1a000000-1a000fff : NVDA8301:00
1a001000-1a001003 : NVDA8302:00
1a00f000-1a00ffff : NVDA8301:00
1a010000-1a010003 : NVDA8302:00
1a020000-1a07ffff : NVDA8301:00
1a080000-1a09ffff : NVDA8301:00
1a0a0000-1a0fffff : NVDA8302:00
1a100000-1a11ffff : NVDA8302:00
1a120000-1a15ffff : NVDA8301:00
1a160000-1a19ffff : NVDA8302:00
1a350000-1a350fff : NVDA8301:00
1a360000-1a360fff : NVDA8302:00
1a400000-1a400fff : NVDA8303:00
1a40f000-1a40ffff : NVDA8303:00
1a420000-1a47ffff : NVDA8303:00
1a480000-1a4bffff : NVDA8303:00
1a560000-1a5dffff : NVDA8303:00
1a750000-1a750fff : NVDA8303:00
1a800000-1a87ffff : NVDA8200:00
1aaa0000-1aaa0fff : NVDA8200:00
1aab0000-1aab0fff : NVDA8200:00
1ab20000-1ab20fff : NVDA8200:00
1c004000-1c005fff : DRAM8901:00
1c041000-1c041fff : sbsa-gwdt.0
1c041000-1c041fff : sbsa-gwdt.0 sbsa-gwdt.0
1c042000-1c042fff : sbsa-gwdt.0
1c042000-1c042fff : sbsa-gwdt.0 sbsa-gwdt.0
1c440000-1c440fff : arch_mem_timer
1c548000-1c548133 : SPMI0001:00
1c548200-1c548333 : SPMI0002:00
1c548400-1c548533 : SPMI0003:00
1c54a000-1c54afff : NVDA9221:00
1c570000-1c5700ff : SPMI0001:00
1c5c0000-1c5c00ff : SPMI0002:00
1c610000-1c6100ff : SPMI0003:00
1c660000-1c6600ff : SPMI0004:00
1c6a0000-1c6a08fe : SPMI0001:00
1c6b0000-1c6b08fe : SPMI0002:00
1c6c0000-1c6c08fe : SPMI0003:00
1c6d0000-1c6d08fe : SPMI0004:00
1c8b0000-1c8b00ff : NVDA6210:00
1c8c0000-1c8c00ff : NVDA6210:00
1c900000-1c900fff : NVDA6210:00
1d600000-1d600fff : pnp 00:00
1d640000-1d640fff : pnp 00:00
1d690000-1d690fff : pnp 00:00
1d790000-1d790fff : pnp 00:00
1d860000-1d8677ff : NVDA8001:00
1d860000-1d8677ff : NVDA8001:00 NVDA8001:00
1d868000-1d8680ff : NVDA8001:00
1d870000-1d8777ff : NVDA8000:04
1d870000-1d8777ff : NVDA8000:04 NVDA8000:04
1d878000-1d8780ff : NVDA8000:04
1d880000-1d883fff : NVDA8001:00
1d890000-1d894fff : NVDA8000:04
1d8e0560-1d8e0577 : NVDA8001:00
1d8e0578-1d8e0593 : NVDA8000:04
1db60000-1db677ff : NVDA8000:00
1db60000-1db677ff : NVDA8000:00 NVDA8000:00
1db68000-1db680ff : NVDA8000:00
1db70000-1db73fff : NVDA8000:00
1db90000-1db977ff : NVDA8000:01
1db90000-1db977ff : NVDA8000:01 NVDA8000:01
1db98000-1db980ff : NVDA8000:01
1dba0000-1dba3fff : NVDA8000:01
1dbd012c-1dbd0143 : NVDA8000:00
1dbd0144-1dbd015b : NVDA8000:01
1dde0000-1dde77ff : NVDA8000:02
1dde0000-1dde77ff : NVDA8000:02 NVDA8000:02
1dde8000-1dde80ff : NVDA8000:02
1ddf0000-1ddf3fff : NVDA8000:02
1de10000-1de177ff : NVDA8000:03
1de10000-1de177ff : NVDA8000:03 NVDA8000:03
1de18000-1de180ff : NVDA8000:03
1de20000-1de23fff : NVDA8000:03
1de5012c-1de50143 : NVDA8000:02
1de50144-1de5015b : NVDA8000:03
24000000-281fffff : PCI Bus 000f:00
24000000-27ffffff : PCI Bus 000f:01
24000000-27ffffff : 000f:01:00.0
24000000-27ffffff : nvidia
29000000-291fffff : PCI ECAM

311c0100-311c0103 : MTK00055:00
311c0104-311c0107 : MTK00055:00
311c0108-311c010b : MTK00055:00
311c010c-311c010f : MTK00055:00
31270074-31270077 : MTK00055:00
36078000-3607ffff : NVDA2014:00
36078000-3607ffff : NVDA2014:00 NVDA2014:00
3e900000-3e90ffff : NVDA3000:00
5d010000-5f7fffff : PCI Bus 0002:00
5d100000-5d2fffff : PCI Bus 0002:01
5d100000-5d1fffff : 0002:01:00.0
5d200000-5d2fffff : 0002:01:00.1
62010000-647fffff : PCI Bus 0004:00

nvidia-bug-report.log.gz (2.1 MB)

mlxlink details

Physical state                : ETH_AN_FSM_ENABLE       ← stuck in auto-neg FSM

Supported Cable Speed (Ext.) : 0x00000000 () ← NIC sees ZERO cable speeds

Status Opcode : 1024

Group Opcode : MNG FW ← firmware-side assertion

Recommendation : Cable is unplugged ← FW's conclusion, wrong

Identical on both CX-7 ASICs. This gives the NVIDIA dev a specific firmware opcode to trace: 1024 in the MNG FW group. Much more actionable than just “status: 0x3”.
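
For comparing these fields across ports or nodes, a small Python sketch that strips the "←" notes and flags an all-zero cable speed mask is shown below. It assumes the annotated "Name : Value ← note" layout used in this post, which is not mlxlink's raw output format, and the helper names are hypothetical.

```python
def parse_annotated(text):
    """Parse 'Name : Value <- note' lines into {name: value}, dropping notes."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        # Everything after the arrow is a human annotation, not tool output.
        fields[key.strip()] = rest.split("←", 1)[0].strip()
    return fields


def cable_speed_mask_is_zero(fields):
    """True when the extended cable speed bitmask parses to 0."""
    raw = fields.get("Supported Cable Speed (Ext.)", "")
    parts = raw.split()
    return bool(parts) and int(parts[0], 16) == 0


SAMPLE = """Physical state                : ETH_AN_FSM_ENABLE       ← stuck in auto-neg FSM
Supported Cable Speed (Ext.) : 0x00000000 () ← NIC sees ZERO cable speeds
Status Opcode : 1024
Group Opcode : MNG FW ← firmware-side assertion
Recommendation : Cable is unplugged ← FW's conclusion, wrong
"""

fields = parse_annotated(SAMPLE)
print(fields["Status Opcode"], fields["Group Opcode"], cable_speed_mask_is_zero(fields))
```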

ethtool -m — same root cause:

netlink error: mlx5_core: Query module eeprom by page failed, read 0 bytes, err -5

All 4 ports. err -5 = EIO. The firmware’s I²C/MCIA path to the QSFP EEPROM is just dead. That’s what mlx5_query_mcia 0x3 is reporting at a higher level.
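
As a quick cross-check of the errno mapping, the following Python sketch extracts the code from the log line above and looks it up in the standard library's errno table. The function names are made up for this example, and the human-readable message can vary by platform.

```python
import errno
import os
import re


def extract_err(line):
    """Return the integer after 'err' in a log line, or None if absent."""
    match = re.search(r"\berr\s+(-?\d+)", line)
    return int(match.group(1)) if match else None


def describe_errno(code):
    """Map a kernel-style negative error code to (symbol, message)."""
    number = abs(code)
    return errno.errorcode.get(number, "UNKNOWN"), os.strerror(number)


LINE = "netlink error: mlx5_core: Query module eeprom by page failed, read 0 bytes, err -5"

code = extract_err(LINE)
symbol, message = describe_errno(code)
print(code, symbol, message)
```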

  • mst_pci kernel module fails to build for 6.17.0-1014-nvidia — MFT 4.34.1-12 doesn’t install it. Diagnostics still work via PCI address so not blocking, but worth NVIDIA knowing.

Firmware triangle captured cleanly:

  • FW: 28.45.4028

  • MFT: 4.34.1-12

  • amBER: 5.75