ConnectX-3 CX312A-XCBT and Thunderbolt

Hi. I am endeavouring to port the Linux device driver, and I was assuming I could plug the card into a Thunderbolt adapter, namely the Sonnet Technologies Echo Express Pro.

I can plug in a spare Broadcom dual-port 1Gig ethernet card I have lying about, and it shows up as a pair of PCI devices. There is no driver for this device; it just appears in the System Information “PCI Cards” list.

The CX312A does not show up, which leads me to believe there is some issue.

Well, the obvious one is: is the card DOA? I put it into a PCIe2 x16 slot in an HP ProLiant ML110 G7, and it shows up under NotBSD:

004:00:0: Mellanox Technologies product 0x1003 (ethernet network)

The Echo Express Pro claims compatibility with various 10gigE cards, and the compat list is mainly for those with drivers already written, as they have some driver requirements (for sleep/wakeup and hot plug).

Sonnet - Support - PCIe Card Thunderbolt Compatibility Chart http://www.sonnettech.com/support/charts/thunderbolt/index.html

Any ideas as to why this card might not be working in this expansion chassis? The slots are (claimed to be) PCIe 2.0 x16.

The Mellanox ConnectX-3 should not be physically aware that it is connected to a Thunderbolt adapter, and should link up over PCIe with the adapter.

So I see two possible reasons why this is not working:

  1. The Sonnet adapter is running some sort of enforcement on the device ID or type as reported in the PCI configuration header.

  2. The ConnectX-3 did link up, but you just can’t see it on your machine.

To check whether it is #2, you should try an lspci tool, as suggested in the thread.

One other possibility is that the ConnectX-3 device you have supports PCIe Gen3; we have seen legacy PCIe root ports in the past that didn’t have forward compatibility with Gen3 devices.

Let’s start by checking whether the device is recognized using an lspci-like tool, and take it from there…

My MacPro (2008) finally died for the last time earlier this year. It was getting machine check faults at boot, and I had replaced the motherboard (under warranty) and the power supply (out of warranty, the real cause of the problems I was originally seeing). The new power supply worked a treat until suddenly the machine-check issue arose, and I finally decided I had had enough.

My neighbour has the same MacPro 2008, so I have been giving him all the good bits from mine. I can ask him to allow me to try installing the card, but if in the process his aged MacPro dies, it might interfere with my ability to borrow gardening implements — a risk I am still evaluating.

I guess I will have to see about renting one in order to complete this project, unless I can get the card to successfully negotiate some link-level access in the Sonnet.

I wonder if this has anything to do with it:

"Once the firmware image has been loaded to the device’s internal memory and component initialization has completed, the device is ready to respond to PCI enumeration. Prior to this condition, the device responds with the ‘configuration retry status’ completion status to type0 configuration cycles targeting the device."

Whenever I have added it to the system, it has been via sleep/plug/wake, by which point most of the PCI configuration has already been done, so the hot-plug events from the Thunderbolt will happen pretty fast. If OSX is not willing to retry the config-space reads for long enough, it will not find the device.
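To make the concern concrete, here is a rough simulation of a host that polls the Vendor ID while the device is still loading firmware. Everything in it is a stand-in: `FakeDevice`, `probe`, and the retry counts are hypothetical, and none of it is an OSX or Mellanox API. It assumes the usual convention that, with CRS Software Visibility enabled, a Vendor ID read answered with configuration retry status comes back as 0x0001, and that an all-ones read means nothing responded.

```python
import time

CRS_VENDOR_ID = 0x0001   # Vendor ID value seen while the device answers with CRS
NO_DEVICE = 0xFFFF       # all-ones read: nothing responded at this address


class FakeDevice:
    """Hypothetical device that needs `ready_after` reads before it enumerates."""

    def __init__(self, vendor_id, ready_after):
        self.vendor_id = vendor_id
        self.ready_after = ready_after
        self.reads = 0

    def read_vendor_id(self):
        self.reads += 1
        if self.reads <= self.ready_after:
            return CRS_VENDOR_ID   # firmware still loading
        return self.vendor_id


def probe(read_vendor_id, max_tries, delay_s=0.0):
    """Poll the Vendor ID until it is neither CRS nor all-ones, or give up."""
    for _ in range(max_tries):
        vid = read_vendor_id()
        if vid not in (CRS_VENDOR_ID, NO_DEVICE):
            return vid
        time.sleep(delay_s)
    return None


# 0x15B3 is the Mellanox vendor ID; the "ready after 5 reads" figure is invented.
impatient = probe(FakeDevice(0x15B3, ready_after=5).read_vendor_id, max_tries=3)
patient = probe(FakeDevice(0x15B3, ready_after=5).read_vendor_id, max_tries=10)
print(impatient, patient)
```

The impatient host gives up before the device is ready and never sees it, while the patient one finds the vendor ID. Whether OSX actually behaves like the impatient case is exactly the open question.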

There is no real ‘lspci’ on OSX. Your view of PCI config space is a “this is what we found”, not “this is what is currently there”.

I will try to allow the device to be powered up and then boot OSX to see whether that gets me past this…

I also plugged the CX312A into an x4-electrical slot in the HP ProLiant.

It comes up in NotBSD:

PCI Express Capabilities Register
  Capability version: 2
  Device type: PCI Express Endpoint device
  Interrupt Message Number: 0
  Link Capabilities Register: 0x0843f483
    Maximum Link Speed: unknown 3 value
    Maximum Link Width: x8 lanes
    Port Number: 8
  Link Status Register: 0x1041
    Negotiated Link Speed: 2.5Gb/s
    Negotiated Link Width: x4 lanes

I know Thunderbolt will be a bottleneck, but I was expecting the card to just run slower while still allowing more or less 10Gig ethernet.
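As a rough check on that expectation, the Link Status value reported in the HP (0x1041) can be decoded and turned into a bandwidth figure. This is only a sketch: the field positions (bits 3:0 for speed, bits 9:4 for width) are my reading of the PCIe Link Status layout, the function names are made up, and the only overhead modelled is line encoding (8b/10b at 2.5 and 5.0 GT/s, 128b/130b at 8.0 GT/s). Packet and protocol overhead are ignored, so real throughput will be lower.

```python
# (transfer rate in GT/s, line-encoding efficiency) per link speed code
LINK_SPEEDS = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def decode_link_status(value):
    """Return (speed code, lane count) from a PCIe Link Status value."""
    speed_code = value & 0xF        # bits 3:0, current link speed
    width = (value >> 4) & 0x3F     # bits 9:4, negotiated link width
    return speed_code, width


def raw_gbps(speed_code, width):
    """Per-direction bandwidth in Gb/s after line encoding only."""
    rate, efficiency = LINK_SPEEDS[speed_code]
    return rate * efficiency * width


speed_code, width = decode_link_status(0x1041)
print("negotiated:", speed_code, width, round(raw_gbps(speed_code, width), 2))
print("5.0 GT/s x4:", round(raw_gbps(2, 4), 2))
```

At 2.5 GT/s over four lanes this works out to about 8 Gb/s per direction, which is already under 10GigE line rate, whereas 5.0 GT/s over four lanes would give about 16 Gb/s.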

As a thought, I remember ages ago looking for lspci for OSX too.

Finally found one, and it worked well. I think it was by some of the guys that work on the Hackintosh side of things, so if you look through the various Hackintosh sites for lspci you should find it.

If you don’t, let me know and I’ll see if I can dig it up. It’s been a while though.

Note - it did work well (perfectly), so definitely recommended.

As a thought, do you know anyone with a Mac Pro that you could plug it into?

Just thinking that might be a way to see if it’s the Sonnet causing the issues or not.

Note - Those Sonnet things look interesting. Hopefully you get this figured out, as I have Thunderbolt ports on my Mac Mini but no PCIe ones.

On the Sonnet page, it says for a PCIe device to work with it:

“2) it must properly support hot-plug/unplug, recover from sleep, etc. as defined in the Thunderbolt device interconnect paradigm.”

Maybe the ConnectX-3 card you have doesn’t support hot-plug/unplug? The Mellanox guys should be able to answer that definitively.

One of the Sonnet pages also mentions it only does PCIe x4:

“The PCIe slot will physically accommodate up to a x16 PCIe card, however the actual electrical bandwidth of Thunderbolt is x4.”

https://secure1.sonnettech.com/product_info.php?products_id=423

That doesn’t sound good.

(note - edited to fix typo and add PCIe x4 info)

AFAIK, Mellanox ConnectX3 (and previous models) supports all PCI-e standards

(From http://www.mellanox.com/related-docs/prod_adapter_cards/ConnectX3_VPI_Card.pdf )

PCI EXPRESS INTERFACE

– PCIe Base 3.0 compliant, 1.1 and 2.0 compatible

– 2.5, 5.0, or 8.0GT/s link rate x8

– Auto-negotiates to x8, x4, x2, or x1

– Support for MSI/MSI-X mechanisms

I think that on the above HP system it showed up correctly; I am not sure about the Thunderbolt adapter.

As a thought, there might be some useful ideas in this guy’s post:

http://forum.notebookreview.com/e-gpu-external-graphics-discussion/688931-thunderbolt-e-gpu-setup-sonnet-echo-express-pro-review-tomshardware.html#post8864685

He’s using a Sonnet for different purposes (graphics card), but it shows the things he tried and various problems he hit to get it detected.

Hopefully that’s of some use.

I found all that. Too much effort to get it all to work with what else I have going on right now, but I am planning on resolving the mlx4_core “nub” class compile issues so that it loads, pointing it at the driver-less GigE card in the other slot as the device it binds to, and figuring out how to do direct PCI config-space reads, to see if I can enumerate the bus.

I do not believe I will discover it successfully negotiated link particulars, though.
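For what it is worth, the enumeration itself is simple once some way of reading config space exists. A minimal sketch, assuming a hypothetical `read_config16(bus, dev, fn, offset)` accessor (here backed by a fake table, since how to get such reads on OSX is the unsolved part). The fake entry reuses the 004:00:0 location and product 0x1003 from the NotBSD output earlier in the thread; 0x15B3 is the Mellanox vendor ID.

```python
def enumerate_bus(read_config16, bus):
    """Yield (device, function, vendor, device_id) for every function that responds."""
    for dev in range(32):
        for fn in range(8):
            vendor = read_config16(bus, dev, fn, 0x00)
            if vendor == 0xFFFF:
                if fn == 0:
                    break        # no function 0: nothing at this device number
                continue
            device_id = read_config16(bus, dev, fn, 0x02)
            yield dev, fn, vendor, device_id
            if fn == 0:
                # Header Type is the low byte at 0x0E; bit 7 marks multi-function
                header = read_config16(bus, dev, fn, 0x0E) & 0xFF
                if not header & 0x80:
                    break        # single-function device


# Fake config space for illustration only: one single-function device on bus 4.
FAKE = {(4, 0, 0): {0x00: 0x15B3, 0x02: 0x1003, 0x0E: 0x0000}}


def fake_read(bus, dev, fn, offset):
    return FAKE.get((bus, dev, fn), {}).get(offset, 0xFFFF)


found = list(enumerate_bus(fake_read, 4))
print([(d, f, hex(v), hex(i)) for d, f, v, i in found])
```

A device that never links up would simply read back as all-ones here, so this can only confirm absence, not explain it.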

I have some more info re the Link Capabilities of the EchoExpress PCIe slots, and it may be that the EchoExpress rejects it due to a lack of capability.

The hot plug and power management stuff referred, I believe, to the driver, and I will definitely have to address those issues, which might present a challenge.

As for the x4 electrical and PCIe 2.0 issues, I thought the HP comparison shed some light on them, if only as an existence proof.

I have also tried a pre-production Intel card with 2 10Gig ports, 4 1G ethernet devices (although with no external connectors) and a honking big crypto accelerator. This shows up in the Sonnet, although all the device IDs looked suss, but it did manage to negotiate links to the various devices.

With the device in an HP, I can view the PCIe Link Capability register.

I find that it has a value of 843f483.

Ignoring the ‘port #’ field, I find that this indicates:

supported link speeds: 3 (no idea how to interpret)

max link width: 8

Active State PM support: <11:10> = 1

Data Link Layer Active Reporting Capable: <20> = 0

Link B/W Notification Capable: <21> = 0

Bit 22 Reserved: = 1

The EchoExpressPro Link Capability register is 333fc41

Bit 22 Reserved = 0

<21> = 1

<20> = 1

<11:10> = 3

Don’t know if it enforces any of these to be certain values, in particular <11:10>.
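For anyone wanting to repeat the decode, here is a small sketch that splits both Link Capabilities values into the fields listed above. The bit positions and the ASPM encoding are my reading of the PCIe Link Capabilities layout, and the function and key names are made up for illustration. Bit 22 is reported raw, without interpretation.

```python
ASPM = {0: "none", 1: "L0s", 2: "L1", 3: "L0s and L1"}


def decode_link_caps(value):
    """Split a PCIe Link Capabilities value into the fields discussed above."""
    return {
        "max_speed_code": value & 0xF,               # bits 3:0
        "max_width": (value >> 4) & 0x3F,            # bits 9:4
        "aspm": ASPM[(value >> 10) & 0x3],           # bits 11:10
        "dll_active_reporting": (value >> 20) & 1,   # bit 20
        "link_bw_notification": (value >> 21) & 1,   # bit 21
        "bit22": (value >> 22) & 1,                  # bit 22, left uninterpreted
        "port": (value >> 24) & 0xFF,                # bits 31:24
    }


for name, raw in (("CX312A", 0x0843F483), ("Echo Express Pro", 0x0333FC41)):
    print(name, decode_link_caps(raw))
```

Run as is, it reports ASPM support of L0s only for the CX312A (matching the <11:10> = 1 above) and L0s and L1 for the Echo Express Pro. It also gives a maximum link width of 8 and speed code 3 for the card, against a width of 4 and speed code 1 for the Echo slot.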

I can find no clear description of the PCI config space settings in any of the Mellanox docs.

I am used to having an explicit (often wrong) enumeration of the PCI config space settings for a device.

Is there some other manual I am missing?

The Mellanox card supports L0s, but not L1 ASPM.

This sounds like it might be the crux of the biscuit, but I note that the Intel 82599EB I have only supports L0s, and I was under the impression that at least some Intel 10Gig cards were certified as working: the SmallTree 82599-based product is on the supported list.

@yairi - There’s more info being requested here:

permezel wrote:

The EchoExpressPro Link Capability register is 333fc41

Bit 22 Reserved = 0

<21> = 1

<20> = 1

<11:10> = 3

Don’t know if it enforces any of these to be certain values, in particular <11:10>.

I can find no clear description of the PCI config space settings in any of the Mellanox docs.

I am used to having an explicit (often wrong) enumeration of the PCI config space settings for a device.

Is there some other manual I am missing?

According to Sonnet Tech support:

"Feedback from our engineering dept…

We’re going to have to investigate this. It looks like a problem with PCIe 3.0 to PCIe 2.0 …"

That was back on 30 March. I have not heard anything else, but I no longer care that much about it. Well, I sort of do, but a Mac Pro has been obtained for the porting effort, and I have moved on.

I would have liked to use the Mellanox in the Sonnet personally, but I will just port a 10Gig Intel driver for a card which is known to work, and use the Mellanox in my Dell server. Either way, I just want 10Gig connectivity to the server from the Mac mini, so it doesn’t really matter which end has which adapter.

There is another issue which precludes the use of the ConnectX adapter in any Thunderbolt enclosure, which I discovered relatively recently.

Thunderbolt device driver guidelines require MSI interrupts. Not MSI-X. Not legacy INTx.

The ConnectX does not support MSI. Therefore, no Thunderbolt. Unless I wrote a polled-mode-only driver. How much fun would that be?
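Whether a given card advertises MSI, MSI-X, or both can be checked by walking the capability list in its config space. A minimal sketch, run here against a synthetic config-space image rather than a dump from a real ConnectX: the layout of `cfg` is invented for illustration, while the capability IDs (0x05 for MSI, 0x11 for MSI-X) are the standard PCI ones.

```python
CAP_MSI = 0x05
CAP_MSIX = 0x11
STATUS_CAP_LIST = 0x10     # Status register bit 4: capability list present


def capability_ids(cfg):
    """Return the capability IDs found by walking the list in a config-space image."""
    status = cfg[0x06] | (cfg[0x07] << 8)
    if not status & STATUS_CAP_LIST:
        return []
    ids = []
    seen = set()
    ptr = cfg[0x34] & 0xFC          # Capabilities Pointer, low two bits reserved
    while ptr and ptr not in seen:  # stop on a null pointer or a looped list
        seen.add(ptr)
        ids.append(cfg[ptr])
        ptr = cfg[ptr + 1] & 0xFC
    return ids


# Synthetic 256-byte image (NOT from a real card): PM -> MSI-X -> PCI Express.
cfg = bytearray(256)
cfg[0x06] = STATUS_CAP_LIST
cfg[0x34] = 0x40
cfg[0x40:0x42] = bytes([0x01, 0x48])   # Power Management, next at 0x48
cfg[0x48:0x4A] = bytes([0x11, 0x60])   # MSI-X, next at 0x60
cfg[0x60:0x62] = bytes([0x10, 0x00])   # PCI Express, end of list

ids = capability_ids(cfg)
print("MSI:", CAP_MSI in ids, "MSI-X:", CAP_MSIX in ids)
```

Fed a real config-space dump from the card, the same walk would show directly which of the two interrupt capabilities it exposes.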