cuPHY Performance Test Results Differ from User Guide (F08 4T4R, 14 Cells)

Hi,

We are currently running the cuPHY performance test cases (F08 4T4R).

We followed the user guide to configure a case including PDSCH (4 streams), PDCCH, PUSCH (2 streams), PUCCH, CSI-RS, SSB, and PRACH, with no SRS.

In the test, we increased the number of cells one at a time. The processing time thresholds (defaults: UL 1500 us, DL 375 us) were not exceeded until we reached 14 cells.
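A minimal sketch of how such a sweep can be evaluated, assuming the worst-case UL and DL processing times have already been measured for each cell count. The function name and the input format are hypothetical and are not part of cuPHY or its test scripts; only the two default thresholds come from this post.

```python
from typing import Dict, Tuple

UL_BUDGET_US = 1500.0  # default UL threshold quoted in this post
DL_BUDGET_US = 375.0   # default DL threshold quoted in this post


def max_cells_within_budget(
    worst_case_us: Dict[int, Tuple[float, float]],
    ul_budget_us: float = UL_BUDGET_US,
    dl_budget_us: float = DL_BUDGET_US,
) -> int:
    """Return the largest cell count whose measured times stay within budget.

    worst_case_us maps a cell count to (UL time, DL time) in microseconds,
    taken from your own measurements (hypothetical input format).
    The sweep stops at the first cell count that exceeds either budget.
    """
    supported = 0
    for cells in sorted(worst_case_us):
        ul_us, dl_us = worst_case_us[cells]
        if ul_us > ul_budget_us or dl_us > dl_budget_us:
            break  # first violation ends the sweep
        supported = cells
    return supported
```

The function returns 0 if even the smallest tested cell count exceeds a threshold.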

However, the user guide says: A100X supports 5 4T4R peak cells.

Why is there such a big difference between our test and the user guide? Are we missing something?

The test runs on the following system:

Hardware: x86 server (Intel Xeon W-3175X CPU) + A100 GPU + Mellanox MCX653105A-HDAT VPI adapter card

Software: Aerial CUDA-Accelerated RAN Release 24-3

Hi @fltang,

We have not benchmarked the 24-3 release with A100X or A100.

A100X supports 5 4T4R peak cells.

Can you please share the link to this information in the user guide?

Thank you,
Balkan

We found the following description in 24-2.1, page 69, but it is gone in 24-3.

Then, can we use an A100 to run cuBB version 24-3? We have run some cuPHY cases, and they still work.

By the way, when we run cuPHY cases, we find that Tensor Core usage is less than 1%.

Is this normal? Why doesn’t cuPHY use Tensor Cores for acceleration?