RU Emulator Payload Validation ERROR with Case F08 1C

Hello,

I tried running the E2E test case (F08 1C with testMAC + cuPHYController_SCF + RU Emulator) on cuBB 25.2, following the "Running cuBB End-to-End" page of the Aerial CUDA-Accelerated RAN documentation.

After all components started, the RU Emulator's console was flooded with payload validation errors. I examined the configuration files for testMAC, the cuPHY controller, and the RU Emulator, and they all appear to be correct.

Here is the environment setup:
Server 1: testMAC + cuPHY controller, with an MT42822 BlueField-2 integrated ConnectX-6 Dx network controller and an NVIDIA 6000 Ada GPU.
Server 2: RU Emulator, with the same NIC.

In addition, I have attached the configuration files and log files for reference.

Thank you!

cuphycontroller_F08.yaml.txt (26.7 KB)

l2_adapter_config_F08.yaml.txt (4.7 KB)

phy.log (719.1 KB)

ru.log (1.8 MB)

ru_config.yaml.txt (12.4 KB)

test_mac_config.yaml.txt (9.8 KB)

testmac.log (8.6 KB)

Hi @justin.jy.huang ,

Please note that this is not a recommended HW configuration. From the logs, it can be seen that there are issues in reception and transmission of packets. Even if they are transmitted, it is likely that they are not on time.

I noticed the following eAxCid setting on the cuphycontroller side:

      eAxC_id_pusch: [8, 0]
      eAxC_id_pucch: [8, 0]

while the UL eAxCid setting on the RU emulator side was defined as follows:

   eAxC_UL: [8,0,1,2]

Can you use this instead on the cuphycontroller config?

      eAxC_id_pusch: [8, 0, 1, 2]
      eAxC_id_pucch: [8, 0, 1, 2]
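
A mismatch like this can also be checked mechanically. The following is a minimal sketch, assuming the eAxC ID lists have already been copied out of the two config files by hand. The function name `eaxc_mismatches` and the dictionary layout are hypothetical and not part of cuBB, and the comparison treats the lists as sets, so it does not check ordering.

```python
def eaxc_mismatches(du_lists, ru_ul):
    """Compare each DU-side eAxC ID list against the RU-side UL list.

    Returns {channel: (missing_on_du, extra_on_du)} for every DU list
    whose set of IDs differs from the RU list. Order is ignored.
    """
    ru_set = set(ru_ul)
    report = {}
    for channel, ids in du_lists.items():
        du_set = set(ids)
        missing = sorted(ru_set - du_set)  # IDs the RU uses but the DU lacks
        extra = sorted(du_set - ru_set)    # IDs the DU lists but the RU lacks
        if missing or extra:
            report[channel] = (missing, extra)
    return report


# Values taken from the configs quoted in this thread.
du = {"eAxC_id_pusch": [8, 0], "eAxC_id_pucch": [8, 0]}
ru = [8, 0, 1, 2]
print(eaxc_mismatches(du, ru))
# {'eAxC_id_pusch': ([1, 2], []), 'eAxC_id_pucch': ([1, 2], [])}
```

An empty result means the DU and RU lists contain the same IDs.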

Can you also set the following in the cuphycontroller config file:

enable_cpu_task_tracing: 1
enable_compression_tracing: 1

and update the tags on the nvlog config file as follows for better tracing?

- 216: "DRV.MAP_DL"
  shm_level: 5 # Example: overlay shm_log_level for a tag
  console_level: 3 # Example: overlay console_log_level for a tag
- 217: "DRV.FUNC_DL"
  shm_level: 5 # Example: overlay shm_log_level for a tag
  console_level: 3 # Example: overlay console_log_level for a tag

- 224: "DRV.MAP_UL"
  shm_level: 5 # Example: overlay shm_log_level for a tag
  console_level: 3 # Example: overlay console_log_level for a tag
- 225: "DRV.FUNC_UL"
  shm_level: 5 # Example: overlay shm_log_level for a tag
  console_level: 3 # Example: overlay console_log_level for a tag

- 311: "L2A.PROCESSING_TIMES"
  shm_level: 5 # Example: overlay shm_log_level for a tag
  console_level: 3 # Example: overlay console_log_level for a tag

- 503: "RU.LATE_PACKETS"
  shm_level: 5 # Example: overlay shm_log_level for a tag
  console_level: 3 # Example: overlay console_log_level for a tag

Thank you.

Hi @bkecicioglu ,

I updated the eAxC_id configuration as suggested, but the result was the same.

I noticed two abnormal points:

  1. The UL throughput at the PHY side is zero, with CRC errors.
  2. The RU Emulator compares data from cuPHY and the test vector, but the values don’t match, which leads to the payload validation error.

The following attachments include the configuration files and log files for easier tracing.

Thank you for your reply!

ru_config.yaml.txt (12.4 KB)

ru.log (11.0 MB)

phy.log (61.3 MB)

l2_adapter_config_F08.yaml.txt (4.7 KB)

cuphycontroller_F08.yaml.txt (26.9 KB)

testmac.log (7.7 KB)

@justin.jy.huang ,

Can you share the commands you are using to run ru_emulator and test_mac? Which launch pattern are you using?

The RU logs show that all packets are received on time, but there are validation errors.

The PHY log also indicates that UL packets are received and processed:

08:19:57.458414 INF UlPhyDriver06 0 [DRV.MAP_UL] [PHYDRV] SFN 597.14 PUSCH Aggr 18678cfa57e8f051 Map 342 times ===> [CPU CUDA Setup] { START: 1758529197456402700 END: 1758529197456468055 DURATION: 65 us Setup STATUS: 1 } [CPU CUDA Run] { START: 1758529197456468055 END: 1758529197456511035 DURATION: 42 us Run STATUS: 1 } [GPU Setup Ph1] { DURATION: 21.50 us } [GPU Setup Ph2] { DURATION: 102.14 us } [GPU Run] { DURATION: 1273.38 us } [GPU Run EH] { DURATION: 0.00 us } [GPU Run Post EH] { DURATION: 0.00 us } [GPU Run Phase 2] { DURATION: 0.00 us } [GPU Run Gap] { DURATION: 0.00 us } [Max GPU Time] { START: 1758529197456402700 END: 1758529197457917639 DURATION: 1514 us } [CPU wait GPU Time] { START: 1758529197457911205 END: 1758529197457917639 DURATION: 6 us } [Callback] { START: 1758529197457938712 END: 1758529197457956271 DURATION: 17 us }
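
To compare stage timings across many such lines, the `DURATION` fields can be pulled out with a regular expression. This is a minimal sketch, assuming every stage is logged as `[Stage Name] { ... DURATION: <number> us ... }` as in the line above; the helper name `stage_durations` is hypothetical, and the sample string is an abbreviated excerpt of that line.

```python
import re

# A bracketed stage name immediately followed by a { ... } body containing DURATION.
STAGE_RE = re.compile(r"\[([^\]]+)\]\s*\{[^}]*?DURATION:\s*([\d.]+)\s*us")


def stage_durations(line):
    """Return {stage name: duration in microseconds} for one DRV.MAP_UL log line."""
    return {name: float(value) for name, value in STAGE_RE.findall(line)}


# Abbreviated excerpt of the PHY log line quoted above.
line = (
    "[DRV.MAP_UL] [PHYDRV] SFN 597.14 PUSCH ===> "
    "[CPU CUDA Setup] { START: 1758529197456402700 END: 1758529197456468055 "
    "DURATION: 65 us Setup STATUS: 1 } "
    "[GPU Run] { DURATION: 1273.38 us } "
    "[Max GPU Time] { START: 1758529197456402700 END: 1758529197457917639 "
    "DURATION: 1514 us }"
)
print(stage_durations(line))
# {'CPU CUDA Setup': 65.0, 'GPU Run': 1273.38, 'Max GPU Time': 1514.0}
```

Tags such as `[DRV.MAP_UL]` and `[PHYDRV]` are skipped because they are not followed by a `{ ... }` body.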

As you also observed, there are CRC errors in UL and that is why UL throughput as seen on the testMAC console is 0.

Thank you.

Hi @bkecicioglu ,

The script to run test_mac:

test_mac_bin=$cuBB_SDK/build/cuPHY-CP/testMAC/testMAC/test_mac

sudo -E $test_mac_bin F08 1C

The script to run ru_emulator:

export CUDA_VISIBLE_DEVICES=""

ru_bin=${cuBB_SDK}/build/cuPHY-CP/ru-emulator/ru_emulator/ru_emulator

sudo -E $ru_bin F08 1C

The launch pattern I use is ./testVectors/multi-cell/launch_pattern_F08_1C.yaml.

BTW, I noticed that RU Emulator’s log includes:

08:19:43.619897 CON 7667 0 [RU] YAML invalid key: INIT, Init launch pattern missing, using only normal launch pattern

Could this cause testMAC and the RU Emulator to use different launch patterns?

Thank you!

Hi @justin.jy.huang ,

Can you please try with the attached launch pattern?

Thank you.

launch_pattern_F08_1C_59c.yaml.zip (1.1 KB)

Hi @bkecicioglu ,

Sure, let me try that case. It will take some time because I need to generate the test vectors first.

I will let you know the result as soon as possible.

Thank you!

Hi @bkecicioglu ,

I tried the launch pattern F08_1C_59c and it ran successfully. :)

I am wondering what the difference is between these two patterns (F08_1C & F08_1C_59c) and what might cause the payload validation error.

The following attachments include the configuration files and log files. I hope they can help others as well.

Thank you!

cuphycontroller_F08_CG1.yaml.txt (39.0 KB)

ru_config.yaml.txt (12.4 KB)

ru.log (20.1 MB)

phy.log (78.4 MB)

l2_adapter_config_F08_CG1.yaml.txt (4.9 KB)

testmac.log (15.3 KB)

test_mac_config.yaml.txt (9.8 KB)

@justin.jy.huang

It is great to hear this! Thank you for sharing your test files.

The basic F08 launch patterns are legacy patterns and are not actively maintained. It is very likely that their test vectors (TVs) are not aligned with the current code, which was the reason for the validation errors.

Please use pattern 59c for 4T4R tests.

Thank you.

Hi @bkecicioglu,

Thanks for your explanation.

I have another question:
I noticed that the rx_packets counter in cuPHY is 0. Could this be caused by DPDK?

[FH.NIC] NIC 0000:d1:00.0 stats:
[FH.NIC] tx_packets: 1079768
[FH.NIC] rx_packets: 0
[FH.NIC] tx_bytes: 629131028
[FH.NIC] rx_bytes: 0
[FH.NIC] tx_errors: 0
[FH.NIC] rx_errors: 0
[FH.NIC] rx_missed: 0
[FH.NIC] rx_nombuf: 0

Thank you!

Hi @justin.jy.huang ,

These are counters from the DPDK API. On the DU side, DPDK is only used to transmit C-plane packets. U-plane packet transmission and reception happen with DOCA. Therefore, what you are seeing on the cuphycontroller side are the counters for the C-plane messages only.

On the other hand, DPDK is used for all transmission and reception on the RU-emulator side.
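
For anyone who wants to check this pattern in their own logs, here is a minimal sketch that parses `[FH.NIC]` counter lines like the ones quoted above. The names `parse_nic_stats` and `looks_like_cplane_only` are hypothetical helpers, not part of cuBB, and the interpretation (TX only, no RX) simply follows the explanation above.

```python
import re

# Matches counter lines such as "[FH.NIC] tx_packets: 1079768".
STAT_RE = re.compile(r"\[FH\.NIC\]\s+(\w+):\s+(\d+)")


def parse_nic_stats(text):
    """Return {counter name: value} for all [FH.NIC] counter lines in text."""
    return {name: int(value) for name, value in STAT_RE.findall(text)}


def looks_like_cplane_only(stats):
    """True when the counters show transmission but no reception and no errors."""
    return (
        stats.get("tx_packets", 0) > 0
        and stats.get("rx_packets", 0) == 0
        and stats.get("tx_errors", 0) == 0
        and stats.get("rx_errors", 0) == 0
    )


# Counter lines copied from the question above.
sample = """\
[FH.NIC] NIC 0000:d1:00.0 stats:
[FH.NIC] tx_packets: 1079768
[FH.NIC] rx_packets: 0
[FH.NIC] tx_bytes: 629131028
[FH.NIC] rx_bytes: 0
[FH.NIC] tx_errors: 0
[FH.NIC] rx_errors: 0
[FH.NIC] rx_missed: 0
[FH.NIC] rx_nombuf: 0
"""
stats = parse_nic_stats(sample)
print(stats["tx_packets"], stats["rx_packets"])  # 1079768 0
print(looks_like_cplane_only(stats))             # True
```

The header line with the PCI address is ignored because it does not have the `name: value` shape.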

Thank you.

Hi @bkecicioglu ,

Got it.

Thanks for your reply. :)