[DRV.FUNC_UL] Slot Map 103, SFN 104 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!

Hello,

I am trying to run OAI with cuBB. After starting the OAI nr-softmodem, the following error keeps flooding the Docker logs:

05:02:35.952954 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 189, SFN 190 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
05:02:35.962940 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 190, SFN 191 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
05:02:35.972946 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 191, SFN 192 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
05:02:35.982945 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 192, SFN 193 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
05:02:35.992943 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 193, SFN 194 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
05:02:36.002946 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 194, SFN 195 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
05:02:36.012950 ERR UlPhyDriver03 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 195, SFN 196 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
05:02:36.022945 ERR UlPhyDriver03 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 196, SFN 197 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
05:02:36.032944 ERR UlPhyDriver03 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 197, SFN 198 Slot 19 Order kernel timeout error or Exit error for cell index 0 Dyn index -1!
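
The excerpt above follows a regular pattern that is easy to check with a script. The following is a minimal Python sketch, not an Aerial tool: the regular expression and the name `parse_order_timeouts` are assumptions based only on the lines quoted here. It extracts the timestamp, SFN and slot from each error line and measures the spacing between consecutive errors.

```python
import re
from datetime import datetime

# Pattern derived from the error lines quoted in this post (an assumption,
# not an official log format definition).
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{6}) ERR .*SFN (\d+) Slot (\d+) Order kernel timeout"
)

def parse_order_timeouts(lines):
    """Return a list of (timestamp, sfn, slot) tuples for matching lines."""
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
            events.append((ts, int(m.group(2)), int(m.group(3))))
    return events

# First three error lines from the log excerpt above.
TAIL = "Order kernel timeout error or Exit error for cell index 0 Dyn index -1!"
sample = [
    "05:02:35.952954 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 189, SFN 190 Slot 19 " + TAIL,
    "05:02:35.962940 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 190, SFN 191 Slot 19 " + TAIL,
    "05:02:35.972946 ERR UlPhyDriver02 0 [AERIAL_CUPHY_API_EVENT] [DRV.FUNC_UL] Slot Map 191, SFN 192 Slot 19 " + TAIL,
]

events = parse_order_timeouts(sample)
# Gap between consecutive errors, in milliseconds.
gaps_ms = [(b[0] - a[0]).total_seconds() * 1000 for a, b in zip(events, events[1:])]
# Set of distinct slot numbers in which the error was reported.
slots = {slot for _, _, slot in events}
print("slots:", slots, "gaps (ms):", [round(g, 3) for g in gaps_ms])
```

The errors are about 10 ms apart, the SFN increases by one each time, and the slot is always 19. With `mu: 1` (20 slots of 0.5 ms per frame) this means the timeout is reported once per frame, for the last slot of each frame.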

This is the sample output of nr-softmodem:

5053.481992 [NR_MAC] I Frame.Slot 384.0

5054.761997 [NR_MAC] I Frame.Slot 512.0

5056.041992 [NR_MAC] I Frame.Slot 640.0

5057.321991 [NR_MAC] I Frame.Slot 768.0

5058.601991 [NR_MAC] I Frame.Slot 896.0

5059.881997 [NR_MAC] I Frame.Slot 0.0

5061.161992 [NR_MAC] I Frame.Slot 128.0

5062.441992 [NR_MAC] I Frame.Slot 256.0

Does anyone know the reason for this error? Do you have any suggestions for solving this? Can we ignore this?

Hi @vantuan_ngo
Please check and confirm the following three things:

  1. Run the following two commands on the gNB server to see whether PTP is synced with the GM (grandmaster).
    systemctl status ptp4l.service
    systemctl status phc2sys.service

    The rms values in the output of both services should be single-digit or low two-digit numbers.

  2. Check the O-RU logs to confirm that the O-RU is PTP synced.

  3. Check and confirm that the MAC address (dst_mac_addr) in cuphycontroller_P5G_FXN.yaml matches the O-RU MAC address, and that the dst-mac configured in the O-RU matches the MAC address of the NIC port on the gNB.
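
For the MAC address check above, note that tools print MAC addresses in different notations (upper or lower case, colons or dashes), so a plain string comparison can report a false mismatch. The following is a minimal Python sketch with hypothetical helper names and placeholder addresses, not values from any real setup:

```python
def normalize_mac(mac: str) -> str:
    """Return the MAC address in lower case with ':' separators."""
    hex_digits = mac.replace(":", "").replace("-", "").replace(".", "").lower()
    if len(hex_digits) != 12 or any(c not in "0123456789abcdef" for c in hex_digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return ":".join(hex_digits[i:i + 2] for i in range(0, 12, 2))

def macs_match(a: str, b: str) -> bool:
    """Compare two MAC addresses regardless of case or separator style."""
    return normalize_mac(a) == normalize_mac(b)

# Placeholder values: one as it might appear in a YAML file,
# one as another tool might print it.
yaml_dst_mac = "aa:bb:cc:00:11:22"
reported_mac = "AA-BB-CC-00-11-22"
print(macs_match(yaml_dst_mac, reported_mac))
```

The same comparison applies in the other direction, between the destination MAC configured on the O-RU and the MAC address of the gNB NIC port.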

Hi @jixu ,

The error messages are still flooding the logs.

  1. For PTP:
➜  ~ systemctl status ptp4l.service
systemctl status phc2sys.service
● ptp4l.service - Precision Time Protocol (PTP) service
     Loaded: loaded (/lib/systemd/system/ptp4l.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2024-07-04 07:46:56 UTC; 4h 44min ago
       Docs: man:ptp4l
    Process: 886416 ExecStartPre=ifconfig aerial00 up (code=exited, status=0/SUCCESS)
    Process: 886417 ExecStartPre=ethtool --set-priv-flags aerial00 tx_port_ts on (code=exited, status=0/SUCCESS)
    Process: 886418 ExecStartPre=ethtool -A aerial00 rx off tx off (code=exited, status=0/SUCCESS)
    Process: 886419 ExecStartPre=ifconfig aerial01 up (code=exited, status=0/SUCCESS)
    Process: 886420 ExecStartPre=ethtool --set-priv-flags aerial01 tx_port_ts on (code=exited, status=0/SUCCESS)
    Process: 886421 ExecStartPre=ethtool -A aerial01 rx off tx off (code=exited, status=0/SUCCESS)
   Main PID: 886422 (ptp4l)
      Tasks: 1 (limit: 598939)
     Memory: 404.0K
        CPU: 1min 37.889s
     CGroup: /system.slice/ptp4l.service
             └─886422 /usr/sbin/ptp4l -f /etc/ptp.conf
● phc2sys.service - Synchronize system clock or PTP hardware clock (PHC)
     Loaded: loaded (/lib/systemd/system/phc2sys.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2024-07-04 07:46:58 UTC; 4h 44min ago
       Docs: man:phc2sys
    Process: 886423 ExecStartPre=sleep 2 (code=exited, status=0/SUCCESS)
   Main PID: 886431 (sh)
      Tasks: 2 (limit: 598939)
     Memory: 436.0K
        CPU: 1min 43.149s
     CGroup: /system.slice/phc2sys.service
             ├─886431 /bin/sh -c "taskset -c 41 /usr/sbin/phc2sys -s /dev/ptp1 -c CLOCK_REALTIME -n 24 -O 0 -R 25>
             └─886432 /usr/sbin/phc2sys -s /dev/ptp1 -c CLOCK_REALTIME -n 24 -O 0 -R 256 -u 256
➜  ~ sudo tail -f /var/log/syslog
Jul  4 12:32:41 smo3 ptp4l: [118457.014] rms   29 max   58 freq   -124 +/- 136 delay   787 +/-   7
Jul  4 12:32:41 smo3 phc2sys: [118457.490] CLOCK_REALTIME rms   37 max   72 freq  -7314 +/- 342 delay   494 +/-  13
Jul  4 12:32:42 smo3 ptp4l: [118458.139] rms   16 max   33 freq   -152 +/-  76 delay   789 +/-   1
Jul  4 12:32:42 smo3 phc2sys: [118458.509] CLOCK_REALTIME rms   41 max   77 freq  -7251 +/- 339 delay   494 +/-  13
Jul  4 12:32:43 smo3 ptp4l: [118459.264] rms   22 max   46 freq   -136 +/- 101 delay   794 +/-   6
Jul  4 12:32:43 smo3 phc2sys: [118459.529] CLOCK_REALTIME rms   41 max   72 freq  -7153 +/- 370 delay   493 +/-  13
Jul  4 12:32:44 smo3 ptp4l: [118460.390] rms   24 max   53 freq   -154 +/- 112 delay   794 +/-   4
Jul  4 12:32:44 smo3 phc2sys: [118460.549] CLOCK_REALTIME rms   39 max   72 freq  -7344 +/- 309 delay   493 +/-  13
Jul  4 12:32:45 smo3 ptp4l: [118461.514] rms   22 max   39 freq   -124 +/- 104 delay   794 +/-   7
Jul  4 12:32:45 smo3 phc2sys: [118461.569] CLOCK_REALTIME rms   25 max   55 freq  -7193 +/- 202 delay   495 +/-  13
Jul  4 12:32:46 smo3 phc2sys: [118462.588] CLOCK_REALTIME rms   28 max   51 freq  -7209 +/- 236 delay   493 +/-  13
Jul  4 12:32:47 smo3 ptp4l: [118462.639] rms   22 max   52 freq   -127 +/- 105 delay   795 +/-   4
  2. The RU is synchronized.
error -17 ns, e_variation 14.221 ns, slope 121.1, s_variation 0, delay 1010, synchronized
  3. This is a pcap file of the traffic I captured between the DU and the RU. The MAC addresses match.
    OFH.zip (5.8 MB)
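
The rms values in the syslog excerpt above (reported in nanoseconds) can also be extracted with a script instead of being read by eye. The following is a minimal Python sketch: the regular expression is derived only from the lines quoted here, and `RMS_LIMIT_NS` is an arbitrary stand-in for "single-digit or low two-digit", not an official threshold.

```python
import re

# Matches the periodic summary lines of ptp4l and phc2sys as quoted above.
RMS_RE = re.compile(r"(ptp4l|phc2sys): \[\s*[\d.]+\].*?rms\s+(\d+)\s+max\s+(\d+)")

def worst_rms(lines):
    """Return {service: largest rms value seen} for the given syslog lines."""
    worst = {}
    for line in lines:
        m = RMS_RE.search(line)
        if m:
            service, rms = m.group(1), int(m.group(2))
            worst[service] = max(worst.get(service, 0), rms)
    return worst

# Four lines copied from the syslog excerpt above.
sample = [
    "Jul  4 12:32:41 smo3 ptp4l: [118457.014] rms   29 max   58 freq   -124 +/- 136 delay   787 +/-   7",
    "Jul  4 12:32:41 smo3 phc2sys: [118457.490] CLOCK_REALTIME rms   37 max   72 freq  -7314 +/- 342 delay   494 +/-  13",
    "Jul  4 12:32:42 smo3 ptp4l: [118458.139] rms   16 max   33 freq   -152 +/-  76 delay   789 +/-   1",
    "Jul  4 12:32:42 smo3 phc2sys: [118458.509] CLOCK_REALTIME rms   41 max   77 freq  -7251 +/- 339 delay   494 +/-  13",
]

RMS_LIMIT_NS = 50  # assumption: illustrative limit only

result = worst_rms(sample)
for service, rms in sorted(result.items()):
    status = "ok" if rms <= RMS_LIMIT_NS else "check sync"
    print(f"{service}: worst rms {rms} ns ({status})")
```

On a live system the input could be the output of `tail` on /var/log/syslog, as in the excerpt above.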

This is my current config.

cuphycontroller_P5G_FXN.yaml

# Copyright (c) 2017-2024, NVIDIA CORPORATION.  All rights reserved.
#
# Redistribution and use in source and binary forms, with or without modification, are permitted
# provided that the following conditions are met:
#     * Redistributions of source code must retain the above copyright notice, this list of
#       conditions and the following disclaimer.
#     * Redistributions in binary form must reproduce the above copyright notice, this list of
#       conditions and the following disclaimer in the documentation and/or other materials
#       provided with the distribution.
#     * Neither the name of the NVIDIA CORPORATION nor the names of its contributors may be used
#       to endorse or promote products derived from this software without specific prior written
#       permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
# FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NVIDIA CORPORATION BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
# STRICT LIABILITY, OR TOR (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
---
l2adapter_filename: l2_adapter_config_P5G.yaml
aerial_metrics_backend_address: 127.0.0.1:8081

# CPU core shared by all low-priority threads
low_priority_core: 8
nic_tput_alert_threshold_mbps: 85000

cuphydriver_config:
  standalone: 0
  validation: 0
  num_slots: 8
  profiler_sec: 0
  log_level: DBG
  dpdk_thread: 8
  dpdk_verbose_logs: 0
  accu_tx_sched_res_ns: 500
  accu_tx_sched_disable: 0
  fh_stats_dump_cpu_core: 8
  pdump_client_thread: -1
  mps_sm_pusch: 84
  mps_sm_pucch: 16
  mps_sm_prach: 16
  mps_sm_pdsch: 82
  mps_sm_pdcch: 28
  mps_sm_pbch: 14
  mps_sm_srs: 16
  pdsch_fallback: 0
  dpdk_file_prefix: cuphycontroller
  nics:
    - nic: 0000:ab:00.0
      mtu: 8192
      cpu_mbufs: 196608
      uplane_tx_handles: 64
      txq_count: 48
      rxq_count: 16
      txq_size: 8192
      rxq_size: 16384
      gpu: 1
  gpus:
    - 1
    # Set GPUID to the GPU sharing the PCIe switch as NIC
    # run nvidia-smi topo -m to find out which GPU
  workers_ul:
    - 2
    - 3
  workers_dl:
    - 4
    - 5
    - 6
  workers_sched_priority: 95
  prometheus_thread: -1
  start_section_id_srs: 3072
  start_section_id_prach: 2048
  enable_ul_cuphy_graphs: 1
  enable_dl_cuphy_graphs: 1
  # Both RF and eLSU eCPRI configs
  ul_order_timeout_cpu_ns: 4000000
  ul_order_timeout_gpu_ns: 4000000
  cplane_disable: 0
  gpu_init_comms_dl: 1
  cell_group: 1
  cell_group_num: 1
  pusch_sinr: 1
  pusch_rssi: 1
  pusch_tdi: 0
  pusch_cfo: 0
  pusch_dftsofdm: 0
  pusch_to:  0
  pusch_select_eqcoeffalgo: 1
  pusch_select_chestalgo: 1
  pusch_tbsizecheck: 1
  enable_cpu_task_tracing: 0
  enable_compression_tracing: 0
  enable_prepare_tracing: 0
  enable_dl_cqe_tracing: 0
  mMIMO_enable: 0
  pusch_forcedNumCsi2Bits: 0
  enable_srs: 0
  mCh_segment_proc_enable: 0
  enable_csip2_v3: 0
  cells:
    - name: O-RU 0
      cell_id: 101
      ru_type: 1
      # set to 00:00:00:00:00:00 to use the MAC address of the NIC port to use
      src_mac_addr: 9c:63:c0:a7:08:32
      # dst_mac_addr: e8:c7:4f:25:89:40 # MAC address of Foxconn O-RU #1 # Liteon
      dst_mac_addr: 6c:b9:c5:22:92:cc
      nic: 0000:ab:00.0
      vlan: 564
      pcp: 0
      txq_count_uplane: 1
      eAxC_id_ssb_pbch: [0, 1, 2, 3]
      eAxC_id_pdcch: [0, 1, 2, 3]
      eAxC_id_pdsch: [0, 1, 2, 3]
      eAxC_id_csirs: [0, 1, 2, 3]
      eAxC_id_pusch: [0, 1, 2, 3]
      eAxC_id_pucch: [0, 1, 2, 3]
      eAxC_id_srs: [0, 1, 2, 3]
      eAxC_id_prach: [0, 1, 2, 3]
      dl_iq_data_fmt: {comp_meth: 1, bit_width: 9}
      ul_iq_data_fmt: {comp_meth: 1, bit_width: 9}
      section_3_time_offset: 484
      fs_offset_dl: 15
      exponent_dl: 4
      ref_dl: 0
      fs_offset_ul: -5
      exponent_ul: 4
      max_amp_ul: 65504
      mu: 1
      T1a_max_up_ns: 280000
      T1a_max_cp_ul_ns: 405000
      Ta4_min_ns: 50000
      Ta4_max_ns: 331000
      Tcp_adv_dl_ns: 125000
      fh_len_range: 0
      pusch_prb_stride: 273
      prach_prb_stride: 12
      srs_prb_stride: 273
      pusch_ldpc_max_num_itr_algo_type: 1
      pusch_fixed_max_num_ldpc_itrs: 10
      pusch_ldpc_n_iterations: 10
      pusch_ldpc_early_termination: 0
      pusch_ldpc_algo_index: 0
      pusch_ldpc_flags: 2
      pusch_ldpc_use_half: 1
      ul_gain_calibration: 48.68
      lower_guard_bw: 845
      tv_pusch: cuPhyChEstCoeffs.h5

l2_adapter_config_P5G.yaml

# Copyright (c) 2017-2024, NVIDIA CORPORATION.  All rights reserved.
#
# Redistribution and use in source and binary forms, with or without modification, are permitted
# provided that the following conditions are met:
#     * Redistributions of source code must retain the above copyright notice, this list of
#       conditions and the following disclaimer.
#     * Redistributions in binary form must reproduce the above copyright notice, this list of
#       conditions and the following disclaimer in the documentation and/or other materials
#       provided with the distribution.
#     * Neither the name of the NVIDIA CORPORATION nor the names of its contributors may be used
#       to endorse or promote products derived from this software without specific prior written
#       permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
# IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
# FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NVIDIA CORPORATION BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
# BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
# OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
# STRICT LIABILITY, OR TOR (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

---
#gnb_module
msg_type: scf_5g_fapi
phy_class: scf_5g_fapi
slot_advance: 3

# tick_generator_mode: 0 - poll + sleep; 1 - sleep; 2 - timer_fd
tick_generator_mode: 1

# Allowed maximum latency of SLOT FAPI messages which send from L2 to L1. Unit: slot
allowed_fapi_latency: 0

# Allowed tick interval error. Unit: us
allowed_tick_error: 10

timer_thread_config:
  name: timer_thread
  cpu_affinity: 7
  sched_priority: 99
message_thread_config:
  name: msg_processing
  #core assignment
  cpu_affinity: 7
  # thread priority
  sched_priority: 95
# Lowest TTI for Ticking
mu_highest: 1
dl_tb_loc: 1
instances:
  # PHY 0
  -
    name: scf_gnb_configure_module_0_instance_0
    prach_ta_offset_usec: 2.5
  -
    name: scf_gnb_configure_module_0_instance_1
    prach_ta_offset_usec: 2.5
  -
    name: scf_gnb_configure_module_0_instance_2
    prach_ta_offset_usec: 2.5
  -
    name: scf_gnb_configure_module_0_instance_3
    prach_ta_offset_usec: 2.5
  -
    name: scf_gnb_configure_module_0_instance_4
    prach_ta_offset_usec: 2.5
  -
    name: scf_gnb_configure_module_0_instance_5
    prach_ta_offset_usec: 2.5
  -
    name: scf_gnb_configure_module_0_instance_6
    prach_ta_offset_usec: 2.5
  -
    name: scf_gnb_configure_module_0_instance_7
    prach_ta_offset_usec: 2.5

# Config dedicated yaml file for nvipc. Example: nvipc_multi_instances.yaml
nvipc_config_file: null

# Transport settings for nvIPC
transport:
  type: shm
  udp_config:
    local_port: 38556
    remort_port: 38555
  shm_config:
    primary: 1
    prefix: nvipc
    cuda_device_id: 0
    ring_len: 8192
    mempool_size:
      cpu_msg:
        buf_size: 8192
        pool_len: 4096
      cpu_data:
        buf_size: 576000
        pool_len: 1024
      cuda_data:
        buf_size: 307200
        pool_len: 0
      gpu_data:
        buf_size: 576000
        pool_len: 0
  dpdk_config:
    primary: 1
    prefix: nvipc
    local_nic_pci: 0000:ab:00.0
    peer_nic_mac: 00:00:00:00:00:00
    cuda_device_id: 0
    need_eal_init: 0
    lcore_id: 11
    mempool_size:
      cpu_msg:
        buf_size: 8192
        pool_len: 4096
      cpu_data:
        buf_size: 576000
        pool_len: 1024
      cuda_data:
        buf_size: 307200
        pool_len: 0
  app_config:
    grpc_forward: 0
    debug_timing: 0
    pcap_enable: 1
    pcap_cpu_core: 8 # CPU core of background pcap log save thread
    pcap_cache_size_bits: 29 # 2^29 = 512MB, size of /dev/shm/${prefix}_pcap
    pcap_file_size_bits: 31 # 2^31 = 2GB, max size of /var/log/aerial/${prefix}_pcap. Requires pcap_file_size_bits > pcap_cache_size_bits.
    pcap_max_data_size: 8000 # Max DL/UL FAPI data size to capture reduce pcap size.

cell_group: 1
prepone_h2d_copy: 1
pucch_dtx_thresholds: [-100.0, -100.0, -100.0, -100.0, -100.0]
ptp: {gps_alpha: 0, gps_beta: 0}
enableTickDynamicSfnSlot: 1
...

Hi @vantuan_ngo

Which O-RU and NIC are you using in your setup?

Also please try changing the pcp value to '7' for the cell in cuphycontroller config yaml file, as shown below.

cells:
  - name: O-RU 0
    cell_id: 101
    ru_type: 1
    # set to 00:00:00:00:00:00 to use the MAC address of the NIC port to use
    src_mac_addr: 9c:63:c0:a7:08:32
    # dst_mac_addr: e8:c7:4f:25:89:40 # MAC address of Foxconn O-RU #1 # Liteon
    dst_mac_addr: 6c:b9:c5:22:92:cc
    nic: 0000:ab:00.0
    vlan: 564
    pcp: 7 # it was '0'

This is my topology:


The NIC is a ConnectX-6 Dx. For the RU, I am trying both Delta and Liteon RUs.

After changing PCP to 7, the ERR messages still appear periodically but far less often, about once every several minutes. This is the current output:


Is this output normal? Could you explain the reason? Why does changing PCP help?

Oh, I see: my RU sends VLAN packets with PCP 7.
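
For reference, PCP is the 3-bit priority field of the 802.1Q VLAN tag. It shares a 16-bit word (the TCI) with the 1-bit DEI flag and the 12-bit VLAN ID, so two frames on the same VLAN with different PCP values carry different tag bytes. The following is a minimal Python sketch with illustrative function names. Whether the NIC or cuPHY matches on these bits is not confirmed in this thread, so the sketch only shows the tag layout.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag

def build_vlan_tag(vid: int, pcp: int, dei: int = 0) -> bytes:
    """Pack TPID and TCI (PCP: 3 bits, DEI: 1 bit, VID: 12 bits) in network order."""
    if not (0 <= vid < 4096 and 0 <= pcp < 8 and dei in (0, 1)):
        raise ValueError("field out of range")
    tci = (pcp << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)

def parse_vlan_tag(tag: bytes) -> dict:
    """Split a 4-byte 802.1Q tag back into its PCP, DEI and VID fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"pcp": tci >> 13, "dei": (tci >> 12) & 1, "vid": tci & 0x0FFF}

# VLAN 564 as in the config above, with the old and the new PCP value.
print(build_vlan_tag(564, 0).hex())  # 81000234
print(build_vlan_tag(564, 7).hex())  # 8100e234
print(parse_vlan_tag(build_vlan_tag(564, 7)))
```

In a packet capture such as the one attached earlier, the PCP used by the RU appears in the "Priority" field of the 802.1Q header in Wireshark and can be compared with the `pcp` value in the cell config.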
