I’m trying to set up IPsec between two nodes and communicate via RoCEv2 RDMA.
I don’t need OVS or strongSwan for VLAN setup / key exchange; for this setup I can configure keys manually, and I just want to verify that IPsec works for now. The ultimate goal is near line-rate performance for RDMA traffic encrypted with IPsec.
In short, I want to set up IPsec, run perftest between the two nodes, and actually see the RoCE packets encapsulated in ESP packets on the switch.
The issues I’m having are:
performance is very poor when I set up offloading
I’m seeing plain (unencrypted) RoCE packets even after following the directions for IPsec setup.
My setup:
two bare-metal servers connected via an L2 switch, each with a BF-3 (Node A and Node B)
another node whose NIC is port-mirrored via the switch, used to verify the packets going to one of the BF-3 servers (Node C)
using only one port, no VFs, just the PF (is the use of VFs mandatory for IPsec?)
My Experimentation:
SW only attempt:
Performed SW-only IPsec, saw ESP packets with iperf, and saw greatly reduced performance compared with line rate (expected).
For SW-only IPsec, I set up the xfrm states and policies on the host only, with the BF-3 in DPU mode, and I was able to see ESP packets in the packet capture from Node C.
HW offload attempt:
Attempted to offload IPsec to hardware on the BF-3 by setting up the xfrm states and policies directly on the DPU, in the BF-3’s Ubuntu image.
With iperf I saw much better performance than with SW-only IPsec, but still far short of line rate (somewhere in the 10–20% range of line rate). With perftest the performance was very poor (less than 10% of line rate).
Did not see any ESP packets in the packet capture on Node C for either iperf or perftest.
My Method:
set up the xfrm states and policies with ip xfrm state/policy ... on both nodes (on the host for SW-only, on the DPU for HW-offloaded)
Question:
What is the expected performance penalty of using IPsec for RDMA traffic?
What am I missing / doing wrong in my setup?
Why is performance bad when I’m not even seeing encrypted packets? Where is the performance drop coming from? If I set it up incorrectly, I would expect either encrypted packets with bad performance or plaintext packets with good performance, not plaintext packets with bad performance.
Additional comments:
It was unclear from the docs which commands should be executed on the host and which on the DPU.
The docs often mention the use of strongSwan or OVS, but I am under the impression that a bare-bones setup between two nodes requires only xfrm states and policies; please correct me if I’m wrong.
There are many typos in the IPsec offload for RDMA traffic documentation, which adds to the confusion.
Welcome, and thank you for posting your inquiry to the NVIDIA Developer Forums.
If you require more in-depth assistance after reading the below, and you have a valid NVIDIA Enterprise Support Entitlement, we highly recommend opening a support ticket with NVIDIA Enterprise Experience for further triage and assistance.
Configuration location: Commands must be executed on the correct side (host vs. DPU).
For hardware offload, the xfrm state/policy configuration belongs on the DPU (Arm) side
Traffic steering must be properly configured on the host
VF configuration: While not strictly mandatory, VFs are recommended for proper isolation and performance (a minimal sketch follows this list):
Create and configure VFs for the BlueField-3 device
Bind VFs to the RDMA applications
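As a minimal sketch (the PF netdev name enp1s0f0 and the VF count are placeholders for your environment), VFs can be created through the standard sysfs SR-IOV interface:
# Create 2 VFs on the BlueField-3 PF (PF name and count are examples)
echo 2 > /sys/class/net/enp1s0f0/device/sriov_numvfs
# List the resulting RDMA devices and their link state
rdma link show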
ESP encapsulation issue: The lack of ESP packets indicates that the offload (or the IPsec configuration itself) is not taking effect. Verify the following (a quick check follows this list):
HW crypto capabilities match algorithm selection
Proper Traffic Class marking for RDMA packets
Compatible IPsec parameters are used (supported cipher/auth combinations)
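One quick sanity check (the interface name is an example) is to confirm that the kernel exposes ESP offload features on the uplink:
# Check whether ESP hardware offload is advertised and enabled
ethtool -k enp1s0f0 | grep -i esp
# Expect features such as esp-hw-offload and esp-tx-csum-hw-offload to be "on"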
Performance optimization (an example follows this list):
Use larger MTU (jumbo frames) if network supports it
Configure proper memory registration for RDMA with IPsec
Verify QoS settings aren’t limiting throughput
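For example, assuming the switch and the peer can carry the same frame size (interface name and MTU value are placeholders):
# Enable jumbo frames on the uplink
ip link set dev enp1s0f0 mtu 9000
# Confirm the change
ip link show dev enp1s0f0 | grep mtu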
Required xfrm flags:
Add 'offload dev <interface_name> dir <in|out>' to the xfrm state commands
Add 'reqid <matching_id>' to link policies with states
Implementation steps:
Reset all configurations (flush any existing xfrm states and policies, as shown below)
Configure IPsec with proper offload flags on DPU side
Verify hardware capabilities are detected
Follow NVIDIA’s optimized configuration patterns for BlueField-3 IPsec+RDMA
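Before reconfiguring, the existing IPsec state can be cleared on both nodes (and on the DPU for the offloaded case):
# Remove all existing IPsec states and policies
ip xfrm state flush
ip xfrm policy flush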
A simple example configuration follows.
Basic xfrm configuration example (without strongSwan/LibreSwan):
# Configure xfrm state for outbound traffic
ip xfrm state add src 192.168.1.1 dst 192.168.1.2 proto esp spi 0x12345678 \
reqid 1 mode transport \
aead "rfc4106(gcm(aes))" 0x0102030405060708090a0b0c0d0e0f1011121314 128 \
offload dev enp1s0f0 dir out
# Note: AES-GCM is an AEAD algorithm, so it is configured with 'aead' (key material =
# AES key + 4-byte salt, here 160 bits total, with a 128-bit ICV), not with 'enc'.
# 'offload dev ... dir ...' requests crypto offload; newer kernels/iproute2 also accept
# 'offload packet dev ... dir ...' for full packet offload where supported.
# Configure matching state for inbound
ip xfrm state add src 192.168.1.2 dst 192.168.1.1 proto esp spi 0x87654321 \
reqid 1 mode transport \
aead "rfc4106(gcm(aes))" 0x0102030405060708090a0b0c0d0e0f1011121314 128 \
offload dev enp1s0f0 dir in
# Configure policies
ip xfrm policy add src 192.168.1.1 dst 192.168.1.2 dir out tmpl proto esp mode transport reqid 1
ip xfrm policy add src 192.168.1.2 dst 192.168.1.1 dir in tmpl proto esp mode transport reqid 1
Repeat on the second node with the directions reversed (the outbound state and policy on one node correspond to the inbound state and policy on the other).
Troubleshooting Steps
Verify offload capability:
ip -d xfrm state
Look for an offload line on each SA (e.g. "crypto offload parameters: dev <interface> dir out"); if it is missing, the state was not offloaded to the hardware.
Check for errors:
dmesg | grep -i xfrm
dmesg | grep -i mlx
Verify RDMA traffic marking:
Configure outgoing RDMA traffic with proper TOS/TC marking so that it lines up with your IPsec policies and QoS configuration (one possible approach is sketched below).
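As one hedged example, for connections established through RDMA CM (e.g. perftest with the -R option), a default RoCE ToS can be set via the rdma_cm configfs interface; the device name mlx5_0, port 1, and ToS value 106 below are placeholders, and the configfs support must be available on your system:
# Mount configfs if it is not already mounted
mount -t configfs none /sys/kernel/config 2>/dev/null || true
# Create a per-device rdma_cm configuration group (device name is an example)
mkdir -p /sys/kernel/config/rdma_cm/mlx5_0
# Set the default ToS used for RoCE CM connections on port 1
echo 106 > /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_tos
cat /sys/kernel/config/rdma_cm/mlx5_0/ports/1/default_roce_tos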