An error occurs when executing the cuPHY example

The following error occurs when executing the cuPHY example.
Can you give me some advice on how to solve the problem?

aerial@134servwe:/opt/nvidia/cuBB$ cuPHY/build/examples/pusch_rx_multi_pipe/cuphy_ex_pusch_rx_multi_pipe -i 5GModel/aerial_mcore/examples/GPU_test_input/TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5
AERIAL_LOG_PATH unset
Using default log path
Log file set to /tmp/pusch.log
10:34:19.177932 WRN 39 0 [NVLOG.CPP] Using /opt/nvidia/cuBB/cuPHY/nvlog/config/nvlog_config.yaml for nvlog configuration
10:34:19.178477 WRN 39 0 [NVLOG.CPP] Output log file path /tmp/pusch.log
File extension: h5
PuschRx Pipeline[0]: Failed to set scheduling algo pid 43, prio 99, return code -1: err Operation not permitted
PuschRx Pipeline[0]: pid 43 set affinity to CPU 0
10:34:19.763916 WRN 43 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
10:34:19.763916 WRN 43 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
10:34:19.766198 ERR 43 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] Error creating algorithm instance for CC 30064771072
10:34:19.767169 ERR 43 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] EXCEPTION: Internal error
10:34:19.766702 ERR 43 0 [AERIAL_CUPHY_EVENT] [CUPHY] CUPHY FUNC EXCEPTION: Function cuphyCreateLDPCDecoder() returned CUPHY_STATUS_INTERNAL_ERROR: Internal error
Segmentation fault (core dumped)

=====================================================

aerial@134servwe:/opt/nvidia/cuBB$ cuPHY/build/examples/pusch_rx_multi_pipe/cuphy_ex_pusch_rx_multi_pipe -i 5GModel/aerial_mcore/examples/GPU_test_input/TVnr_7202_PUSCH_gNB_CUPHY_s0p0.h5
AERIAL_LOG_PATH unset
Using default log path
Log file set to /tmp/pusch.log
10:41:06.967330 WRN 50 0 [NVLOG.CPP] Using /opt/nvidia/cuBB/cuPHY/nvlog/config/nvlog_config.yaml for nvlog configuration
10:41:06.967881 WRN 50 0 [NVLOG.CPP] Output log file path /tmp/pusch.log
File extension: h5
PuschRx Pipeline[0]: Failed to set scheduling algo pid 54, prio 99, return code -1: err Operation not permitted
PuschRx Pipeline[0]: pid 54 set affinity to CPU 0
10:41:07.124841 WRN 54 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
10:41:07.127112 ERR 54 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] Error creating algorithm instance for CC 30064771072
10:41:07.127617 ERR 54 0 [AERIAL_CUPHY_EVENT] [CUPHY] CUPHY FUNC EXCEPTION: Function cuphyCreateLDPCDecoder() returned CUPHY_STATUS_INTERNAL_ERROR: Internal error
10:41:07.128066 ERR 54 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] EXCEPTION: Internal error
Segmentation fault (core dumped)

I found the error print in Pusch_rx_test.cpp in Aerial-cuBB-source-24.03.0-Rel-24-3.277.
I would like to add printf statements, rebuild cuPHY with the commands below, and use the build for debugging.
$ cmake -Bbuild -GNinja -DCMAKE_TOOLCHAIN_FILE=cmake/toolchains/native -DCMAKE_INSTALL_PREFIX=./install
$ cmake --build build

But I couldn’t find Pusch_rx_test.cpp in the container with the find command.
$ find ./ -name Pusch_rx_test.cpp
Is Pusch_rx_test.cpp present in the container, and is my approach feasible?
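For what it’s worth, `find -name` is case-sensitive, so a capitalized query will not match a lowercase filename; `-iname` matches case-insensitively. A quick illustration with a throwaway directory (not the actual container):

```shell
# find -name is case-sensitive; -iname is not.
mkdir -p /tmp/find_demo && touch /tmp/find_demo/pusch_rx_test.cpp

# Case-sensitive query with a capital P prints nothing:
find /tmp/find_demo -name 'Pusch_rx_test.cpp'

# Case-insensitive query finds the file:
find /tmp/find_demo -iname 'Pusch_rx_test.cpp'
# → /tmp/find_demo/pusch_rx_test.cpp

rm -rf /tmp/find_demo
```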

Hi,
That test runs for me with sudo:

/opt/nvidia/cuBB$ AERIAL_LOG_PATH=./ sudo -E ./build/cuPHY/examples/pusch_rx_multi_pipe/cuphy_ex_pusch_rx_multi_pipe -i testVectors/TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5
AERIAL_LOG_PATH set to ./
Log file set to .//pusch.log
12:41:29.484971 WRN 95 0 [NVLOG.CPP] Using /opt/nvidia/cuBB/cuPHY/nvlog/config/nvlog_config.yaml for nvlog configuration
12:41:29.485739 WRN 95 0 [NVLOG.CPP] Output log file path .//pusch.log
File extension: h5
PuschRx Pipeline[0]: pid 99 policy 2 prio 99
PuschRx Pipeline[0]: pid 99 set affinity to CPU 0
PuschRx Pipeline[0]: Wait for start-sync point
12:41:29.647921 WRN 99 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
12:41:29.655534 WRN 99 0 [CUPHY.MEMFOOT] cuphyMemoryFootprint - GPU allocation: 674.455 MiB for cuPHY PUSCH channel object (0xfff864057df0).
12:41:29.655535 WRN 99 0 [CUPHY.PUSCH_RX] PuschRx: Running with eqCoeffAlgo 3
Run config: GPU Id 0, # of pipelines 1
PuschRx Pipeline[0]: start-syncpoint hit
PuschRx Pipeline[0]: For transmission 1 using filename testVectors/TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5
slot 0 ---------------------------------------------------------------
PuschRx Pipeline[00]: Metric - Throughput (w/ GPU runtime only): 00.0506 Gbps (encoded input bits 16136)
PuschRx Pipeline[00]: Metric - GPU Time usec (using CUDA events, over 1000 runs): Run-P1 01.9683 (01.7600, 02.2720) Run-P2 299.1046 (297.0240, 305.0240) Run-P3 17.6674 (14.0800, 22.8480) Setup-P1 06.9497 (05.7280, 08.4480) Setup-P2 22.0651 (16.5760, 25.6320) Total 347.7551
PuschRx Pipeline[00]: Metric - CPU Time usec (using wall clock w/ 1 ms delay kernel, over 1000 runs): Run-P1 00.0365 (00.0000, 00.6080) Run-P2 21.5313 (18.4960, 43.5520) Run-P3 08.9029 (08.0000, 13.1200) Setup-P1 04.2066 (03.2960, 12.1600) Setup-P2 06.5745 (05.1200, 41.0240) Total 41.2519
PuschRx Pipeline[00]: Total time usec GPU (CUDA event) 347.7551 CPU (wall clock) 41.2519
PuschRx Pipeline[00]: Debug - start-event record to notify delay in usec (wall clock) 00.0551 (00.0000, 00.4800)
PuschRx Pipeline[00]: Debug - start-event notify to pipelne launch start delay in usec (wall clock) 14.1377 (11.7120, 55.5520)
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler 41.174561 cuphyScaler 41.174561
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler -0.004972 cuphyScaler -0.004972
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler 39.678135 cuphyScaler 39.678139
PuschRx Pipeline[0]: Joining worker thread
12:41:31.143759 WRN 99 0 [CUPHY.PUSCH_RX] Cell # 0 : TbIdx: 0 Metric - Block Error Rate      : 0.0000 (Error CBs 0, Mismatched CBs 0, MismatchedCRC CBs 0, Total CBs 5)
12:41:31.143761 WRN 99 0 [CUPHY.PUSCH_RX] Cell # 0 :          Metric - TB CRC Error      :(MismatchedCRC TBs 0, Total TBs 1)
12:41:31.143766 WRN 99 0 [CUPHY.PUSCH_RX] Cell # 0 : No UCI
12:41:31.143839 WRN 99 0 [CUPHY.PUSCH_RX] RSSI reference comparison SNR inf     for UEGRP[0]
12:41:31.143857 WRN 99 0 [CUPHY.PUSCH_RX] RSRP reference comparison SNR 82.7601
12:41:31.143867 WRN 99 0 [CUPHY.PUSCH_RX] RSRP reference comparison difference (dB) 3.618188202381134e-07
12:41:31.143878 WRN 99 0 [CUPHY.PUSCH_RX] NoiseVarPreEqIntf interface reference comparison SNR 140.3429
12:41:31.143891 WRN 99 0 [CUPHY.PUSCH_RX] SINR PreEq interface reference comparison SNR 140.3418
12:41:31.143902 WRN 99 0 [CUPHY.PUSCH_RX] NoiseVarPostEqIntf interface reference comparison SNR inf
12:41:31.143911 WRN 99 0 [CUPHY.PUSCH_RX] SINR PostEq interface reference comparison SNR inf
Exiting bg_fmtlog_collector - log queue ever was full: 0

And I found that file here (note the lowercase filename):

/opt/nvidia/cuBB$ find ./ -name pusch_rx_test.cpp
./cuPHY/examples/pusch_rx_multi_pipe/pusch_rx_test.cpp

Hope that helps

Do I have to add “AERIAL_LOG_PATH=./” and “sudo -E”?
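(As far as I understand, the `VAR=value cmd` form sets the variable only for that single command, and `sudo -E` preserves the caller’s environment; a quick check without sudo:)

```shell
# VAR=value cmd exports VAR only for that one command invocation:
AERIAL_LOG_PATH=./ env | grep AERIAL_LOG_PATH
# → AERIAL_LOG_PATH=./

# The variable is not set in the surrounding shell afterwards:
echo "${AERIAL_LOG_PATH:-unset}"
# → unset
```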

When “AERIAL_LOG_PATH=./ sudo -E” was used, the log changed as shown below.

aerial@134servwe:/opt/nvidia/cuBB$ AERIAL_LOG_PATH=./ sudo -E cuPHY/build/examples/pusch_rx_multi_pipe/cuphy_ex_pusch_rx_multi_pipe -i testVectors/TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5
AERIAL_LOG_PATH set to ./
Log file set to .//pusch.log
01:48:01.663651 WRN 104 0 [NVLOG.CPP] Using /opt/nvidia/cuBB/cuPHY/nvlog/config/nvlog_config.yaml for nvlog configuration
01:48:01.664318 WRN 104 0 [NVLOG.CPP] Output log file path .//pusch.log
File extension: h5
PuschRx Pipeline[0]: pid 108 policy 2 prio 99
PuschRx Pipeline[0]: pid 108 set affinity to CPU 0
01:48:01.874392 WRN 108 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
01:48:01.877350 ERR 108 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] Error creating algorithm instance for CC 30064771072
01:48:01.878130 ERR 108 0 [AERIAL_CUPHY_EVENT] [CUPHY] CUPHY FUNC EXCEPTION: Function cuphyCreateLDPCDecoder() returned CUPHY_STATUS_INTERNAL_ERROR: Internal error
01:48:01.878826 ERR 108 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] EXCEPTION: Internal error

I compared it against your log and found the following differences.

  • “start-syncpoint hit” is not displayed.
    In pusch_rx_test.cpp, I found that statement 1 below executed, but statement 2 did not complete:

    1. staticApiDataset.puschStatPrms.enableDebugEqOutput = (m_debugEqualizer) ? 1 : 0;
    2. cuphy::pusch_rx puschRxPipe(staticApiDataset.puschStatPrms, cuStrm);
  • ERR 1589 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] LogChanged Error creating algorithm instance for CC 30064771072
    The following case is executed in ldpc.cpp:
    case CC_7_0:
    // Volta (My server GPU is Tesla V100 PCIe 32GB.)
    algo_factory<LDPC_ALGO_REG_INDEX_FP_DESC_DYN>::create(*this, algos_);

    and I found that the error occurred in algos[static_cast<int>(TAlgo)].reset(new algo_t(dec)).

    static void create(ldpc::decoder& dec, std::vector<decode_algo_ptr_t>& algos)
    {
        // Guard: the algos vector must already have a slot for index TAlgo.
        assert(algos.size() > static_cast<int>(TAlgo));
        // Constructing algo_t(dec) is where the failure occurred in my case.
        algos[static_cast<int>(TAlgo)].reset(new algo_t(dec));
    }

I found the cause of the following two issues:

  • “start-syncpoint hit” is not displayed.
  • ERR 1589 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] LogChanged Error creating algorithm instance for CC 30064771072

I’m using a GV100GL (Tesla V100 PCIe 32GB), which has compute capability 7.0.
So I added -DCMAKE_CUDA_ARCHITECTURES="70" to the cuPHY build options.
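Concretely, the configure step then becomes the earlier command plus the architecture flag (the `compute_cap` query needs a reasonably recent driver):

```shell
# Optional: confirm the GPU's compute capability first:
nvidia-smi --query-gpu=name,compute_cap --format=csv

# Rebuild cuPHY with the Volta (CC 7.0) architecture added:
cmake -Bbuild -GNinja -DCMAKE_TOOLCHAIN_FILE=cmake/toolchains/native \
      -DCMAKE_INSTALL_PREFIX=./install -DCMAKE_CUDA_ARCHITECTURES="70"
cmake --build build
```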

The following passage in the pyAerial documentation was helpful.
It would be nice if the same note were included in the cuPHY build documentation.

Note that pyAerial, similarly to Aerial cuPHY, is by default built for GPUs with compute capabilities 8.0 or 9.0, and these are also what pyAerial has been tested against.
There is no guarantee that pyAerial will work correctly with other GPUs. However, pyAerial can be built for other compute capabilities with an additional cmake option, for example for CC 8.9:
cmake -Bbuild -GNinja -DCMAKE_TOOLCHAIN_FILE=cuPHY/cmake/toolchains/native -DNVIPC_FMTLOG_ENABLE=OFF -DCMAKE_CUDA_ARCHITECTURES="89"

After I solved the above problem, another problem occurred, as shown below.
I will continue the analysis.
Please let me know if you have any good ideas for solving this problem.

aerial@134servwe:/opt/nvidia/cuBB$ AERIAL_LOG_PATH=./ sudo -E cuPHY/build/examples/pusch_rx_multi_pipe/cuphy_ex_pusch_rx_multi_pipe -i testVectors/TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5
AERIAL_LOG_PATH set to ./
Log file set to .//pusch.log
04:00:28.914910 WRN 47751 0 [NVLOG.CPP] Using /opt/nvidia/cuBB/cuPHY/nvlog/config/nvlog_config.yaml for nvlog configuration
04:00:28.915605 WRN 47751 0 [NVLOG.CPP] Output log file path .//pusch.log
File extension: h5
PuschRx Pipeline[0]: pid 47755 policy 2 prio 99
PuschRx Pipeline[0]: pid 47755 set affinity to CPU 0
PuschRx Pipeline[0]: Wait for start-sync point
Run config: GPU Id 0, # of pipelines 1
PuschRx Pipeline[0]: start-syncpoint hit
PuschRx Pipeline[0]: For transmission 1 using filename testVectors/TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5
04:00:29.265932 WRN 47755 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
04:00:29.273036 WRN 47755 0 [CUPHY.MEMFOOT] cuphyMemoryFootprint - GPU allocation: 674.455 MiB for cuPHY PUSCH channel object (0x7f94a0057dc0).
04:00:29.273037 WRN 47755 0 [CUPHY.PUSCH_RX] PuschRx: Running with eqCoeffAlgo 3
04:00:29.402496 ERR 47755 0 [AERIAL_CUPHY_EVENT] [CUPHY] [/opt/nvidia/cuBB/cuPHY/src/cuphy_channels/pusch_rx.cpp:5103] CUDA driver error invalid argument
04:00:29.402591 ERR 47755 0 [AERIAL_CUPHY_EVENT] [CUPHY] CUDA DRIVER EXCEPTION: CUDA driver error: CUDA_ERROR_INVALID_VALUE - invalid argument
04:00:29.406951 ERR 47755 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] EXCEPTION: Internal error

kernelNodeParams at pusch_rx.cpp:5103 has the following values.

  • code
        if(m_LDPCkernelLaunchMode & PUSCH_RX_ENABLE_DRIVER_LDPC_LAUNCH)
        {
            for(int i = 0; i < m_LDPCDecodeDescSet.count(); ++i)
            {
                const CUDA_KERNEL_NODE_PARAMS& kernel_node_params_driver = m_ldpcLaunchCfgs[i].kernel_node_params_driver;
                CU_CHECK_EXCEPTION(launch_kernel(kernel_node_params_driver, phase1Stream));
                // return (CUDA_SUCCESS == e) ? CUPHY_STATUS_SUCCESS : CUPHY_STATUS_INTERNAL_ERROR;
            }
        }
  • value
m_LDPCDecodeDescSet.count()=1, i=0
func=0xc0848ec0, gridDimX=3, gridDimY=1, gridDimZ=1, blockDimX=352, blockDimY=1, blockDimZ=1, sharedMemBytes=90112, kernelParams=0xc00be058, extra=0x0
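One thing I noticed while reading these values (my own guess, not a confirmed root cause): sharedMemBytes=90112 is 88 KiB, above the 48 KiB of dynamic shared memory a kernel may use without an explicit opt-in. If the CC 7.0 LDPC kernel path skips that opt-in, the driver launch would fail with exactly this invalid-argument error. A minimal sketch with a generic kernel, not cuPHY code:

```cuda
// Hypothetical repro: a launch that asks for 88 KiB of dynamic shared
// memory, like sharedMemBytes=90112 in the failing log above.
#include <cuda_runtime.h>
#include <cstdio>

__global__ void bigSmemKernel()
{
    extern __shared__ char buf[];
    buf[threadIdx.x] = 0;
}

int main()
{
    const int smemBytes = 90112; // 88 KiB, same value as the failing launch

    // Kernels may use at most 48 KiB of dynamic shared memory by default.
    // On CC 7.0+ a larger request needs this explicit opt-in; without it
    // the launch fails with "invalid argument" (CUDA_ERROR_INVALID_VALUE).
    cudaError_t e = cudaFuncSetAttribute(
        bigSmemKernel, cudaFuncAttributeMaxDynamicSharedMemorySize, smemBytes);
    printf("opt-in : %s\n", cudaGetErrorString(e));

    bigSmemKernel<<<3, 352, smemBytes>>>(); // same grid/block as the log
    printf("launch : %s\n", cudaGetErrorString(cudaGetLastError()));
    cudaDeviceSynchronize();
    return 0;
}
```

On a V100 the per-block opt-in limit is 96 KiB, so an 88 KiB request is at least representable there once the opt-in is made.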

The following error occurs for TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5.

aerial@134servwe:/opt/nvidia/cuBB$ AERIAL_LOG_PATH=./ sudo -E cuPHY/build/examples/pusch_rx_multi_pipe/cuphy_ex_pusch_rx_multi_pipe -i testVectors/TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5
AERIAL_LOG_PATH set to ./
Log file set to .//pusch.log
10:25:11.948843 WRN 3038 0 [NVLOG.CPP] Using /opt/nvidia/cuBB/cuPHY/nvlog/config/nvlog_config.yaml for nvlog configuration
10:25:11.949461 WRN 3038 0 [NVLOG.CPP] Output log file path .//pusch.log
File extension: h5
PuschRx Pipeline[0]: pid 3042 policy 2 prio 99
PuschRx Pipeline[0]: pid 3042 set affinity to CPU 0
PuschRx Pipeline[0]: Wait for start-sync point
10:25:12.330007 WRN 3042 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
10:25:12.337202 WRN 3042 0 [CUPHY.MEMFOOT] cuphyMemoryFootprint - GPU allocation: 674.455 MiB for cuPHY PUSCH channel object (0x7faffc057dc0).
10:25:12.337204 WRN 3042 0 [CUPHY.PUSCH_RX] PuschRx: Running with eqCoeffAlgo 3
Run config: GPU Id 0, # of pipelines 1
PuschRx Pipeline[0]: start-syncpoint hit
PuschRx Pipeline[0]: For transmission 1 using filename testVectors/TVnr_7201_PUSCH_gNB_CUPHY_s0p0.h5
m_LDPCDecodeDescSet.count()=1, i=0
func=0xfdd8e230, gridDimX=3, gridDimY=1, gridDimZ=1, blockDimX=352, blockDimY=1, blockDimZ=1, sharedMemBytes=90112, kernelParams=0xfc0bdfb8, extra=0x0
10:25:12.565838 ERR 3042 0 [AERIAL_CUPHY_EVENT] [CUPHY] [/opt/nvidia/cuBB/cuPHY/src/cuphy_channels/pusch_rx.cpp:5112] CUDA driver error invalid argument
10:25:12.565922 ERR 3042 0 [AERIAL_CUPHY_EVENT] [CUPHY] CUDA DRIVER EXCEPTION: CUDA driver error: CUDA_ERROR_INVALID_VALUE - invalid argument
10:25:12.569655 ERR 3042 0 [AERIAL_CUPHY_EVENT] [CUPHY.PUSCH_RX] EXCEPTION: Internal error

There is no error in the logs for TVnr_7202_PUSCH_gNB_CUPHY_s0p0.h5 and TVnr_7203_PUSCH_gNB_CUPHY_s0p0.h5.
Did 7202 and 7203 complete normally?

aerial@134servwe:/opt/nvidia/cuBB$ AERIAL_LOG_PATH=./ sudo -E cuPHY/build/examples/pusch_rx_multi_pipe/cuphy_ex_pusch_rx_multi_pipe -i testVectors/TVnr_7202_PUSCH_gNB_CUPHY_s0p0.h5

AERIAL_LOG_PATH set to ./
Log file set to .//pusch.log
10:25:28.969841 WRN 3045 0 [NVLOG.CPP] Using /opt/nvidia/cuBB/cuPHY/nvlog/config/nvlog_config.yaml for nvlog configuration
10:25:28.970454 WRN 3045 0 [NVLOG.CPP] Output log file path .//pusch.log
File extension: h5
PuschRx Pipeline[0]: pid 3049 policy 2 prio 99
PuschRx Pipeline[0]: pid 3049 set affinity to CPU 0
PuschRx Pipeline[0]: Wait for start-sync point
10:25:29.347674 WRN 3049 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
10:25:29.354263 WRN 3049 0 [CUPHY.MEMFOOT] cuphyMemoryFootprint - GPU allocation: 674.455 MiB for cuPHY PUSCH channel object (0x7fb110057dc0).
10:25:29.354264 WRN 3049 0 [CUPHY.PUSCH_RX] PuschRx: Running with eqCoeffAlgo 3
Run config: GPU Id 0, # of pipelines 1
PuschRx Pipeline[0]: start-syncpoint hit
PuschRx Pipeline[0]: For transmission 1 using filename testVectors/TVnr_7202_PUSCH_gNB_CUPHY_s0p0.h5
slot 0 ---------------------------------------------------------------
PuschRx Pipeline[00]: Metric - Throughput (w/ GPU runtime only): 01.2064 Gbps (encoded input bits 237776)
PuschRx Pipeline[00]: Metric - GPU Time usec (using CUDA events, over 1000 runs): Run-P1 01.9983 (01.9200, 02.3360) Run-P2 176.5055 (170.1760, 210.5280) Run-P3 18.5971 (14.0160, 23.8080) Setup-P1 07.2827 (06.1440, 08.1920) Setup-P2 144.7576 (138.7520, 149.6960) Total 349.1411
PuschRx Pipeline[00]: Metric - CPU Time usec (using wall clock w/ 1 ms delay kernel, over 1000 runs): Run-P1 00.0542 (00.0310, 00.5100) Run-P2 34.8279 (32.8100, 85.7660) Run-P3 21.6226 (20.5670, 27.9540) Setup-P1 08.0172 (06.8730, 13.0490) Setup-P2 13.7549 (12.9420, 62.9960) Total 78.2767
PuschRx Pipeline[00]: Total time usec GPU (CUDA event) 349.1411 CPU (wall clock) 78.2767
PuschRx Pipeline[00]: Debug - start-event record to notify delay in usec (wall clock) 00.0436 (00.0290, 00.5510)
PuschRx Pipeline[00]: Debug - start-event notify to pipelne launch start delay in usec (wall clock) 28.6370 (26.6910, 86.9010)
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler 41.175293 cuphyScaler 41.175293
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler -0.004264 cuphyScaler -0.004264
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler 39.719410 cuphyScaler 39.719406
PuschRx Pipeline[0]: Joining worker thread
10:25:30.988214 WRN 3049 0 [CUPHY.PUSCH_RX] Cell # 0 : TbIdx: 0 Metric - Block Error Rate      : 0.0000 (Error CBs 0, Mismatched CBs 0, MismatchedCRC CBs 0, Total CBs 29)
10:25:30.988215 WRN 3049 0 [CUPHY.PUSCH_RX] Cell # 0 :          Metric - TB CRC Error      :(MismatchedCRC TBs 0, Total TBs 1)
10:25:30.988216 WRN 3049 0 [CUPHY.PUSCH_RX] Cell # 0 : No UCI
10:25:30.988299 WRN 3049 0 [CUPHY.PUSCH_RX] RSSI reference comparison SNR inf     for UEGRP[0]
10:25:30.988327 WRN 3049 0 [CUPHY.PUSCH_RX] RSRP reference comparison SNR 75.1353
10:25:30.988347 WRN 3049 0 [CUPHY.PUSCH_RX] RSRP reference comparison difference (dB) 7.46455043554306e-07
10:25:30.988363 WRN 3049 0 [CUPHY.PUSCH_RX] NoiseVarPreEqIntf interface reference comparison SNR 140.3518
10:25:30.988393 WRN 3049 0 [CUPHY.PUSCH_RX] SINR PreEq interface reference comparison SNR 140.3509
10:25:30.988413 WRN 3049 0 [CUPHY.PUSCH_RX] NoiseVarPostEqIntf interface reference comparison SNR inf
10:25:30.988431 WRN 3049 0 [CUPHY.PUSCH_RX] SINR PostEq interface reference comparison SNR inf
Exiting bg_fmtlog_collector - log queue ever was full: 0

aerial@134servwe:/opt/nvidia/cuBB$ AERIAL_LOG_PATH=./ sudo -E cuPHY/build/examples/pusch_rx_multi_pipe/cuphy_ex_pusch_rx_multi_pipe -i testVectors/TVnr_7203_PUSCH_gNB_CUPHY_s0p0.h5
AERIAL_LOG_PATH set to ./
Log file set to .//pusch.log
10:26:00.375658 WRN 3052 0 [NVLOG.CPP] Using /opt/nvidia/cuBB/cuPHY/nvlog/config/nvlog_config.yaml for nvlog configuration
10:26:00.376279 WRN 3052 0 [NVLOG.CPP] Output log file path .//pusch.log
File extension: h5
PuschRx Pipeline[0]: pid 3056 policy 2 prio 99
PuschRx Pipeline[0]: pid 3056 set affinity to CPU 0
PuschRx Pipeline[0]: Wait for start-sync point
10:26:00.752233 WRN 3056 0 [CUPHY.PUSCH_RX] LDPC throughput mode enabled
10:26:00.758837 WRN 3056 0 [CUPHY.MEMFOOT] cuphyMemoryFootprint - GPU allocation: 674.455 MiB for cuPHY PUSCH channel object (0x7efdc4057dc0).
10:26:00.758838 WRN 3056 0 [CUPHY.PUSCH_RX] PuschRx: Running with eqCoeffAlgo 3
Run config: GPU Id 0, # of pipelines 1
PuschRx Pipeline[0]: start-syncpoint hit
PuschRx Pipeline[0]: For transmission 1 using filename testVectors/TVnr_7203_PUSCH_gNB_CUPHY_s0p0.h5
slot 0 ---------------------------------------------------------------
PuschRx Pipeline[00]: Metric - Throughput (w/ GPU runtime only): 00.8442 Gbps (encoded input bits 192624)
PuschRx Pipeline[00]: Metric - GPU Time usec (using CUDA events, over 1000 runs): Run-P1 02.0001 (01.9200, 02.5280) Run-P2 208.5940 (204.8960, 244.5760) Run-P3 17.5762 (13.6320, 23.4560) Setup-P1 07.1987 (06.1440, 08.1920) Setup-P2 143.5190 (138.4960, 151.8400) Total 378.8880
PuschRx Pipeline[00]: Metric - CPU Time usec (using wall clock w/ 1 ms delay kernel, over 1000 runs): Run-P1 00.0401 (00.0300, 00.4860) Run-P2 34.5776 (32.8050, 85.0520) Run-P3 21.6843 (20.7090, 30.9810) Setup-P1 08.0810 (07.1420, 14.3830) Setup-P2 13.9223 (13.3760, 63.9790) Total 78.3053
PuschRx Pipeline[00]: Total time usec GPU (CUDA event) 378.8880 CPU (wall clock) 78.3053
PuschRx Pipeline[00]: Debug - start-event record to notify delay in usec (wall clock) 00.0615 (00.0320, 04.2390)
PuschRx Pipeline[00]: Debug - start-event notify to pipelne launch start delay in usec (wall clock) 28.8919 (27.2680, 85.0110)
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler 41.174583 cuphyScaler 41.174583
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler -0.004991 cuphyScaler -0.004991
dim[0]: resDim 1 refDim 1
[0][0][0][0] refScaler 39.668068 cuphyScaler 39.668064
PuschRx Pipeline[0]: Joining worker thread
10:26:02.417133 WRN 3056 0 [CUPHY.PUSCH_RX] Cell # 0 : TbIdx: 0 Metric - Block Error Rate      : 0.0000 (Error CBs 0, Mismatched CBs 0, MismatchedCRC CBs 0, Total CBs 23)
10:26:02.417135 WRN 3056 0 [CUPHY.PUSCH_RX] Cell # 0 :          Metric - TB CRC Error      :(MismatchedCRC TBs 0, Total TBs 1)
10:26:02.417136 WRN 3056 0 [CUPHY.PUSCH_RX] Cell # 0 : No UCI
10:26:02.417219 WRN 3056 0 [CUPHY.PUSCH_RX] RSSI reference comparison SNR inf     for UEGRP[0]
10:26:02.417247 WRN 3056 0 [CUPHY.PUSCH_RX] RSRP reference comparison SNR 82.3899
10:26:02.417264 WRN 3056 0 [CUPHY.PUSCH_RX] RSRP reference comparison difference (dB) 3.7904828786849976e-07
10:26:02.417281 WRN 3056 0 [CUPHY.PUSCH_RX] NoiseVarPreEqIntf interface reference comparison SNR 140.3407
10:26:02.417304 WRN 3056 0 [CUPHY.PUSCH_RX] SINR PreEq interface reference comparison SNR 140.3396
10:26:02.417322 WRN 3056 0 [CUPHY.PUSCH_RX] NoiseVarPostEqIntf interface reference comparison SNR inf
10:26:02.417339 WRN 3056 0 [CUPHY.PUSCH_RX] SINR PostEq interface reference comparison SNR inf
Exiting bg_fmtlog_collector - log queue ever was full: 0

Hi @twoheons ,

TC 7202 and 7203 should pass. The TCs listed here (Supported Test Vector Configurations - NVIDIA Docs) have been verified in each release.

The LDPC kernel is optimized per compute capability based on the LDPC decoder configuration (base graph, number of parity nodes, lifting size) to achieve higher performance. However, we have tested only CC 8.0 and above (Rel-24-3 qualifies only CC 8.6 and CC 9.0, with limited support for pyAerial, as noted in the release notes) and have never tested CC 7.0 with the recent codebase, even though a CC 7.0 branch still exists in the code (it has not been updated for some years). The V100 is already EOL, so we would encourage you to consider a newer GPU.

Thank you.