I have a chain of XDP programs that tail-call each other, and the last one in the chain can either return XDP_PASS or redirect via an xskmap. When XDP_PASS is returned, this works just fine: the frames are processed by BPF, they are confirmed to be valid and to contain valid frame data, and in the end they are passed to the OS, where they can be examined with tcpdump on the interface.

The problem arises when an AF_XDP socket is bound. If the XDP program redirects to the socket, the AF_XDP program receives the frame, and the frame is valid and contains the expected data (this is also confirmed). But if the XDP program returns XDP_PASS, the frame comes out of the netdev with a size matching the original frame, yet the contents are garbled: there is no trace of the original data in these frames; they contain either all zeroes or some random data. The AF_XDP processor shows zero processed frames, as expected. If the AF_XDP processor is terminated, the frames coming out of the netdev become valid again.

This happens in both zero-copy and copy AF_XDP bind modes when the XDP program is attached to the interface in DRV mode. If the XDP program is attached to the interface in SKB mode, the problem goes away in both AF_XDP bind modes. The problem is also absent when the whole setup is replicated on virtio interfaces.
XDP is a pure software solution which is provided by your OS vendor.
From your description, I don’t see any issue related to Nvidia products.
Please contact your OS vendor for further help.
Sorry for bringing up the topic again, but I have additional details.
I have switched to the in-kernel mlx5_core module and rerun the same tests. No garbled packets. If I take this to my OS vendor, they will instantly send me back here, because the only difference between the working and non-working cases is MLNX_EN, which is not part of the OS distribution. The kernel is 5.15.85. I also forgot to mention in the original message that this is a Mellanox ConnectX-5:
3b:00.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5]
Subsystem: Mellanox Technologies ConnectX®-5 EN network interface card, 100GbE single-port QSFP28, PCIe3.0 x16, tall bracket; MCX515A-CCAT
Flags: bus master, fast devsel, latency 0, IRQ 116, NUMA node 0
Memory at ac000000 (64-bit, prefetchable) [size=32M]
Expansion ROM at ab000000 [disabled] [size=1M]
Capabilities: [60] Express Endpoint, MSI 00
Capabilities: [48] Vital Product Data
Capabilities: [9c] MSI-X: Enable+ Count=64 Masked-
Capabilities: [c0] Vendor Specific Information: Len=18 <?>
Capabilities: [40] Power Management version 3
Capabilities: [100] Advanced Error Reporting
Capabilities: [150] Alternative Routing-ID Interpretation (ARI)
Capabilities: [1c0] Secondary PCI Express
Kernel driver in use: mlx5_core
Kernel modules: mlx5_core
BTW, I have noticed that the garbled frames do not always contain zeroes. There were a couple of frames carrying parts of a system log from systemd’s journal (or maybe from the application that originally supplied this log). The UMEM buffer comes from mmap, so it is supposed to be zeroed, and there is no way it should be able to contain such data even if I am doing something wrong in the AF_XDP part.
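To tell these cases apart when going through a capture, a small userspace check along the following lines might help. This is only a sketch, not part of my actual setup: `classify_frame` and the `frame_verdict` enum are hypothetical names, and it assumes the test traffic carries a payload that is known in advance, so a captured frame can be compared byte for byte with what was sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical verdicts for one captured frame. */
enum frame_verdict {
    FRAME_INTACT,       /* captured bytes equal the bytes that were sent */
    FRAME_ALL_ZERO,     /* differs from what was sent and is all zeroes  */
    FRAME_FOREIGN_DATA  /* differs from what was sent and is not zeroes  */
};

/*
 * Compare a captured frame with the frame that was sent.
 * Both buffers must hold at least `len` bytes.
 */
enum frame_verdict classify_frame(const uint8_t *expected,
                                  const uint8_t *captured,
                                  size_t len)
{
    assert(expected != NULL && captured != NULL);

    /* Identical contents: the frame survived the XDP_PASS path. */
    if (memcmp(expected, captured, len) == 0)
        return FRAME_INTACT;

    /* Any non-zero byte means the buffer holds unrelated data. */
    for (size_t i = 0; i < len; i++) {
        if (captured[i] != 0)
            return FRAME_FOREIGN_DATA;
    }

    return FRAME_ALL_ZERO;
}
```

Counting the three verdicts per run (MLNX_EN versus in-kernel mlx5_core, DRV versus SKB mode) would put numbers behind the observations above; frames classified as `FRAME_FOREIGN_DATA` are the ones worth dumping in full.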