Public Key Accelerator performance on BlueField-2

Hi there, I’m new to BlueField-2 and am currently using the card’s hardware Public Key Accelerator (PKA) to perform ECDSA signing for certain network packets. I’m using libPKA (GitHub) and have read its architectural documentation.

However, I would like some clarification on the following points regarding the accelerator’s performance:

  1. How many physical accelerators are there on the BlueField-2 card? My card’s model seems to be MBF2M516A-CEEO. From what I have tried so far, it seems that I can get at most 8 rings from libPKA, each with a dedicated underlying accelerator; is that correct?
  2. How do the accelerators connect to the card’s CPUs? From the user-mode libPKA sources, it seems to memory-map devices, but how do the accelerators actually interface with the CPUs (e.g. as CPU ISA extensions or as PCIe devices)? Do they have CPU core affinity?
  3. Could anyone provide some “raw” performance benchmarks for the accelerator? In particular, I am looking at ECDSA signature generation on the NIST P-256 curve. The libPKA architectural documentation mentioned above does give some numbers, but it is not clear whether they are clock cycles or throughput; if they are cycles, what is the clock frequency? What I currently get from libPKA is around 120 us per P-256 signature per hardware ring, which does not seem fast. If available, I would like to know how fast the actual hardware can run; I may write a custom driver for it if necessary.
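
To make the numbers in point 3 concrete, here is a rough sketch of the arithmetic I am doing. The 120 us latency and the 8 rings are what I observed; linear scaling across rings and the 1 GHz clock are only placeholder assumptions for illustration (the real PKA frequency is exactly what I am asking about), and the function names are my own, not part of libPKA.

```python
# Back-of-the-envelope conversions between latency, throughput and cycles.
# Observed inputs: ~120 us per P-256 signature per ring, at most 8 rings.
# Assumptions: rings scale linearly; 1 GHz is a placeholder clock,
# NOT the documented PKA frequency.

def signatures_per_second(latency_us: float, rings: int = 1) -> float:
    """Throughput implied by a per-ring latency, assuming linear scaling."""
    return rings * 1_000_000 / latency_us

def cycles_to_us(cycles: float, freq_hz: float) -> float:
    """Time in microseconds that a cycle count takes at a given clock."""
    return cycles / freq_hz * 1_000_000

def us_to_cycles(latency_us: float, freq_hz: float) -> float:
    """Cycle count that a latency corresponds to at a given clock."""
    return latency_us * freq_hz / 1_000_000

if __name__ == "__main__":
    measured_us = 120.0          # observed per-signature latency on one ring
    rings = 8                    # maximum number of rings I could obtain
    placeholder_hz = 1e9         # hypothetical clock, for illustration only

    print(f"one ring : {signatures_per_second(measured_us):.0f} sig/s")
    print(f"{rings} rings  : {signatures_per_second(measured_us, rings):.0f} sig/s")
    print(f"cycles   : {us_to_cycles(measured_us, placeholder_hz):.0f} "
          f"at {placeholder_hz:.0e} Hz")
```

So 120 us per signature works out to roughly 8,300 signatures per second per ring, or about 66,700 per second across 8 rings if they really do scale linearly. If the documented figures are cycle counts, the same conversion with the true clock frequency would tell me how far my measurement is from what the hardware can do.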

Thanks a lot.