F08 59 Pattern Multi-cell Test cuphycontroller down

Hello,

I am currently testing on the F08 59 Test pattern with RU-emulator by gradually increasing the number of cells starting from 1.
However, I have noticed that kernel timeouts occasionally occur from 3 cells onward, and from 4 cells, the kernel timeout becomes more frequent, causing the cuphycontroller to go down.
(For your reference, I have attached the PHY logs for each case.)

Could you advise on what configurations I should check or adjust to resolve this issue?
I am using a GH200-based server, and there is a 100Gb connection with the RU-emulator server.

Any insights or suggestions from those with similar experiences would be greatly appreciated.

Thank you!

phy_3cell.log (62.6 KB)
phy_4cell.log (67.7 KB)

Hi @tojsm,

I noticed that the core assignments do not match the release documentation and vary between test runs.

Can you explain how the core assignment is done in your system?

Thank you.

08:34:41.862382 WRN phy_init 0 [CTL.YAML] UL cores: 
08:34:41.862383 WRN phy_init 0 [CTL.YAML] 	- 4
08:34:41.862383 WRN phy_init 0 [CTL.YAML] 	- 5
08:34:41.862383 WRN phy_init 0 [CTL.YAML] 	- 6
08:34:41.862383 WRN phy_init 0 [CTL.YAML] 	- 7
08:34:41.862383 WRN phy_init 0 [CTL.YAML] 	- 8
08:34:41.862383 WRN phy_init 0 [CTL.YAML] 	- 9
08:34:41.862383 WRN phy_init 0 [CTL.YAML] DL cores: 
08:34:41.862383 WRN phy_init 0 [CTL.YAML] 	- 14
08:34:41.862384 WRN phy_init 0 [CTL.YAML] 	- 15
08:34:41.862384 WRN phy_init 0 [CTL.YAML] 	- 16
08:34:41.862384 WRN phy_init 0 [CTL.YAML] 	- 17
08:34:41.862384 WRN phy_init 0 [CTL.YAML] 	- 18
08:34:41.862384 WRN phy_init 0 [CTL.YAML] 	- 19
08:28:20.573420 WRN phy_init 0 [CTL.YAML] UL cores: 
08:28:20.573420 WRN phy_init 0 [CTL.YAML] 	- 4
08:28:20.573420 WRN phy_init 0 [CTL.YAML] 	- 5
08:28:20.573420 WRN phy_init 0 [CTL.YAML] 	- 6
08:28:20.573420 WRN phy_init 0 [CTL.YAML] 	- 7
08:28:20.573421 WRN phy_init 0 [CTL.YAML] DL cores: 
08:28:20.573421 WRN phy_init 0 [CTL.YAML] 	- 14
08:28:20.573421 WRN phy_init 0 [CTL.YAML] 	- 15
08:28:20.573421 WRN phy_init 0 [CTL.YAML] 	- 16
08:28:20.573421 WRN phy_init 0 [CTL.YAML] 	- 17
08:28:20.573421 WRN phy_init 0 [CTL.YAML] 	- 18
08:28:20.573421 WRN phy_init 0 [CTL.YAML] 	- 19
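
To compare runs without reading the excerpts by eye, the core counts can be pulled out of a PHY log. This is a minimal sketch that assumes the `[CTL.YAML]` line format shown in the excerpts above; `count_cores` is a hypothetical helper name, not part of cuBB.

```shell
#!/usr/bin/env bash
# Count the cores listed under "UL cores:" or "DL cores:" in a PHY log.
# Assumption: list entries look like "... [CTL.YAML] <tab>- <core id>".
count_cores() {
    local direction="$1" logfile="$2"
    awk -v hdr="${direction} cores:" '
        /\[CTL\.YAML\]/ {
            # The header line opens the list we are interested in.
            if (index($0, hdr)) { in_list = 1; next }
            # A list entry ends with "- <number>".
            if (in_list && $NF ~ /^[0-9]+$/ && $(NF-1) == "-") { n++; next }
            # Any other CTL.YAML line closes the list.
            in_list = 0
        }
        END { print n + 0 }
    ' "$logfile"
}

# Example usage (log file names taken from the attachments in this thread):
#   count_cores UL phy_3cell.log
#   count_cores DL phy_4cell.log
```

If the same header appears more than once in a single log, the counts are summed, so run it on one log per test run.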

@tojsm to clarify my previous comment, the number of cores assigned to DL and UL workers should not be changed from our default configuration. I see that more cores are assigned and they also differ between the test runs.

You can change the core IDs, but the number of cores assigned should not be changed. We need 2 cores for UL and 3 cores for DL.

  workers_ul: [5,6]
  workers_dl: [11,12,13]

Thanks for your reply,

I initially suspected that the issue might be related to computational resources as the number of cells increases, so I tried increasing the number of workers.

Following the suggestion, I also reverted to the original core settings and attached the new logs below. However, the issue at 4 cells remains the same.

phy_4cell_1101.log (47.1 KB)
phy_3cell_1101.log (63.1 KB)

Any additional insights or suggestions would be greatly appreciated!

Can you please share the cuphycontroller, l2adapter, testMAC and ru_emulator config yaml files and also the test commands you use?

Thanks.

Can you please check the config files by following the steps in running_cuBB_test?

Please make sure that peerethaddr and nic_interface are set correctly in the ru_emulator config and that the NIC PCIe address is also correct in the cuphycontroller yaml file.

The order kernel timeout is related to UL traffic only. It means the UL packets are not being received.

You can also check your NIC & PTP settings.
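
For the NIC checks above, the MAC and PCIe address of an interface can be read from sysfs and compared with the values in the yaml files. This is a rough sketch: `nic_identity` is a hypothetical helper, and the interface name in the usage comment is a placeholder for whatever your system uses.

```shell
#!/usr/bin/env bash
# Print the MAC address and PCIe address of a network interface from sysfs.
# SYSFS_NET can be overridden; that exists only to make the function testable.
nic_identity() {
    local ifname="$1" root="${SYSFS_NET:-/sys/class/net}"
    local mac pci
    mac=$(cat "${root}/${ifname}/address") || return 1
    # The "device" symlink resolves to a directory named after the PCIe address.
    pci=$(basename "$(readlink -f "${root}/${ifname}/device")")
    printf '%s mac=%s pci=%s\n' "$ifname" "$mac" "$pci"
}

# Example usage (placeholder interface name, replace with your own):
#   nic_identity <interface>
```

Run it on both the DU server and the RU-emulator server, then compare the printed values against the MAC and PCIe addresses configured in the ru_emulator and cuphycontroller yaml files.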

Sure, I attached the yaml config files.

I also followed the steps from the link, so there was no problem in the 1-cell and 2-cell cases.
In the 3-cell and 4-cell cases, traffic also flowed for a few slots.

yamls.zip (12.1 KB)

Thanks!

Please disable the Enhanced L1-L2 Interface feature, since it is intended only for a single cell.

Copying the related reference from our release documentation, section "Running the F08 Test Cases":

# For Enhanced L1 - L2 Interface
sed -i 's/uciIndPerSlot :.*/uciIndPerSlot : 2/' ${cuBB_SDK}/cuPHY-CP/testMAC/testMAC/test_mac_config.yaml
sed -i "s/pusch_subSlotProcEn:.*/pusch_subSlotProcEn: 1/" ${cuBB_SDK}/cuPHY-CP/cuphycontroller/config/cuphycontroller_F08_R750.yaml

sed -i "s/ mCh_segment_proc_enable:.*/ mCh_segment_proc_enable: 1/" ${cuBB_SDK}/cuPHY-CP/cuphycontroller/config/cuphycontroller_F08_R750.yaml
sed -i "s/ channel_segment_timelines:.*/ channel_segment_timelines: 1/" ${cuBB_SDK}/cuPHY-CP/testMAC/testMAC/test_mac_config.yaml

# Run F08 1C only as Enhanced L1 - L2 Interface is intended for 1 Cell.
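
To see whether these settings are currently enabled, the relevant keys can be listed from each config file. The thread does not state the values to use when the feature is disabled, so this sketch only reports what is set; `show_enhanced_l1l2_keys` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# List the Enhanced L1-L2 Interface keys (with line numbers) found in a config file.
# The key names are the ones touched by the sed commands quoted above.
show_enhanced_l1l2_keys() {
    local file="$1"
    grep -nE '^[[:space:]]*(uciIndPerSlot|pusch_subSlotProcEn|mCh_segment_proc_enable|channel_segment_timelines)[[:space:]]*:' "$file"
}

# Example usage (paths from the quoted commands; your cuphycontroller yaml
# may have a different file name on a non-R750 server):
#   show_enhanced_l1l2_keys "${cuBB_SDK}/cuPHY-CP/testMAC/testMAC/test_mac_config.yaml"
#   show_enhanced_l1l2_keys "${cuBB_SDK}/cuPHY-CP/cuphycontroller/config/cuphycontroller_F08_R750.yaml"
```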

Thank you.

@tojsm

From your other thread regarding PTP sync, I learned that the BF3 is on numa1, so please try the UL and DL core lists below.

ul_core_list: [5,7,9,11,13,15,17,19,21]
dl_core_list: [23,25,27,29,31,33,35,37,39,41,43]
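
Which cores are local to the NIC depends on the server's CPU numbering, so it is worth checking on the machine itself. A minimal sketch using standard Linux sysfs files follows; `nic_numa_cpus` is a hypothetical helper, and the interface name in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# Report the NUMA node a NIC is attached to and the CPU cores local to that node.
# SYSFS_ROOT can be overridden; that exists only to make the function testable.
nic_numa_cpus() {
    local ifname="$1" root="${SYSFS_ROOT:-/sys}"
    local node
    node=$(cat "${root}/class/net/${ifname}/device/numa_node") || return 1
    # The kernel reports -1 when the device has no NUMA affinity.
    if [ "$node" -lt 0 ]; then
        echo "${ifname}: no NUMA affinity reported"
        return 0
    fi
    printf '%s: numa_node=%s cpus=%s\n' "$ifname" "$node" \
        "$(cat "${root}/devices/system/node/node${node}/cpulist")"
}

# Example usage (placeholder interface name, replace with your own):
#   nic_numa_cpus <interface>
```

The core IDs chosen for the UL and DL lists can then be checked against the printed `cpus` range.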

Thanks, @bkecicioglu, this solution works well!

I missed the note "Run F08 1C only as Enhanced L1 - L2 Interface is intended for 1 Cell."

@jixu Thanks, I will try this config on my server!

No problem!
