Encryption/decryption speeds on Orin NX

Hi,

We enabled the crypto driver on Orin NX after setting up disk encryption:

test@test:~# find /proc/device-tree/ -iname '*crypto*'
/proc/device-tree/bus@0/host1x@13e00000/crypto@15820000
/proc/device-tree/bus@0/host1x@13e00000/crypto@15840000
test@test:~# 
test@test:~# cat /proc/device-tree/bus@0/host1x@13e00000/crypto@15820000/status
okay
test@test:~# 
test@test:~# cat /proc/device-tree/bus@0/host1x@13e00000/crypto@15840000/status
okay

But we noticed that the encryption/decryption speeds with and without the tegra-crypto driver are noticeably different.
Without the tegra-crypto driver enabled:

test@test:~# cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1       842906 iterations per second for 256-bit key
PBKDF2-sha256    1524093 iterations per second for 256-bit key
PBKDF2-sha512     688041 iterations per second for 256-bit key
PBKDF2-ripemd160  411529 iterations per second for 256-bit key
PBKDF2-whirlpool  238312 iterations per second for 256-bit key
argon2i       4 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      4 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b       701.7 MiB/s      1682.6 MiB/s
    serpent-cbc        128b               N/A               N/A
    twofish-cbc        128b               N/A               N/A
        aes-cbc        256b       561.4 MiB/s      1452.1 MiB/s
    serpent-cbc        256b               N/A               N/A
    twofish-cbc        256b               N/A               N/A
        aes-xts        256b      1152.9 MiB/s      1161.6 MiB/s
    serpent-xts        256b               N/A               N/A
    twofish-xts        256b               N/A               N/A
        aes-xts        512b      1016.7 MiB/s      1016.3 MiB/s
    serpent-xts        512b               N/A               N/A
    twofish-xts        512b               N/A               N/A

With the tegra-crypto driver enabled:

test@test:~# cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1       841553 iterations per second for 256-bit key
PBKDF2-sha256    1524093 iterations per second for 256-bit key
PBKDF2-sha512     688946 iterations per second for 256-bit key
PBKDF2-ripemd160  411529 iterations per second for 256-bit key
PBKDF2-whirlpool  238746 iterations per second for 256-bit key
argon2i       4 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      4 iterations, 1048576 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b        92.8 MiB/s       211.6 MiB/s
    serpent-cbc        128b               N/A               N/A
    twofish-cbc        128b               N/A               N/A
        aes-cbc        256b        71.7 MiB/s       172.0 MiB/s
    serpent-cbc        256b               N/A               N/A
    twofish-cbc        256b               N/A               N/A
        aes-xts        256b       210.4 MiB/s       210.9 MiB/s
    serpent-xts        256b               N/A               N/A
    twofish-xts        256b               N/A               N/A
        aes-xts        512b       171.4 MiB/s       170.0 MiB/s
    serpent-xts        512b               N/A               N/A
    twofish-xts        512b               N/A               N/A
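
To put a number on the gap, the two runs can be compared row by row. The following is a minimal sketch, assuming each `cryptsetup benchmark` output was saved to a file; the function name `slowdown` and the file names `sw.txt` and `hw.txt` are hypothetical. It only looks at rows that report a throughput in MiB/s and skips the N/A rows.

```shell
#!/bin/sh
# slowdown SW_FILE HW_FILE
# SW_FILE: saved "cryptsetup benchmark" output without tegra-crypto (hypothetical file)
# HW_FILE: saved output with tegra-crypto enabled (hypothetical file)
# Prints how many times slower each cipher row is in HW_FILE.
slowdown() {
    awk '
        # Data rows look like: aes-xts 256b 1152.9 MiB/s 1161.6 MiB/s
        $2 ~ /^[0-9]+b$/ && $4 == "MiB/s" {
            key = $1 " " $2
            if (FILENAME == ARGV[1]) {
                # First file: remember the software-only figures.
                enc[key] = $3
                dec[key] = $5
            } else if (key in enc) {
                # Second file: report the ratio against the first.
                printf "%s: encryption %.1fx, decryption %.1fx slower\n",
                       key, enc[key] / $3, dec[key] / $5
            }
        }
    ' "$1" "$2"
}

# Example: slowdown sw.txt hw.txt
```

On the figures above this works out to roughly 5.5x to 6x for aes-xts and 7.5x to 8.5x for aes-cbc.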

Is there something I’ve missed? Is there a better way to check why the speed is slower?

Thanks,
K

hello heartsystem,

the device tree defines these crypto accelerators:

                        crypto@15820000 {
                                compatible = "nvidia,tegra234-se-aes";
                        crypto@15840000 {
                                compatible = "nvidia,tegra234-se-hash";

please check that the kernel supports the crypto function and that the Jetson HW SE (Security Engine) driver has the highest priority.
for instance, $ sudo cat /proc/crypto | grep -A 11 "xts(aes)"
the “xts-aes-tegra” driver should show up with the highest priority.
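
The priority check can also be scripted so that every driver registered for one algorithm is listed, highest priority first. This is only a sketch: the function name `top_driver` is hypothetical, and it assumes the usual `/proc/crypto` layout where each entry lists `name`, then `driver`, then `priority`.

```shell
#!/bin/sh
# top_driver ALG [FILE]
# Lists "priority driver" pairs for algorithm ALG, highest priority first.
# FILE defaults to /proc/crypto; passing another file is handy for saved dumps.
top_driver() {
    awk -v alg="$1" '
        # Lines look like "name         : xts(aes)", so the value is field 3.
        $1 == "name"     { name = $3 }
        $1 == "driver"   { driver = $3 }
        # priority comes after name and driver within each entry.
        $1 == "priority" && name == alg { print $3, driver }
    ' "${2:-/proc/crypto}" | sort -rn
}

# Example: top_driver 'xts(aes)'
```

The first line of the output is the implementation the kernel picks by default for that algorithm name.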

BTW,
please also check the “clk_enable_count” of the se clock node.
for instance, $ sudo cat /sys/kernel/debug/clk/se/clk_enable_count
the SE is enabled when the count is greater than 0. currently, it’s 2 after the kernel boots up.

hi,

I’ve already checked the priorities:

test@test:~# awk '/^driver/{d=$3} /^priority/{printf "driver:=%s  priority=%s\n", d, $3}' /proc/crypto | grep xts
test@test:~#
driver:=xts-aes-tegra  priority=500
driver:=cryptd(__xts-aes-ce)  priority=350
driver:=xts-aes-ce  priority=300
driver:=__xts-aes-ce  priority=300
test@test:~# 

as well as the count:

test@test:~# cat /sys/kernel/debug/clk/se/clk_enable_count
2

and the accelerators in the device tree:

test in output/crypto_bringup_tegra_se_hash/bin 
➜ ~ grep -i 'crypto' out.dts -A 1
                        crypto@15820000 {
                                compatible = "nvidia,tegra234-se2-aes\0nvidia,tegra234-se-aes";
--
                        crypto@15840000 {
                                compatible = "nvidia,tegra234-se4-hash\0nvidia,tegra234-se-hash";

Let me know what I’m missing.

Thanks,
K

hello heartsystem,

just an FYI,
if the crypto driver is configured as disabled, the kernel uses its software implementation for encryption and decryption. the SE hardware is not used, but encryption and decryption still happen through the software alternative available in the kernel.

the SE hardware runs at its minimum clock rate by default.
you may try tuning the SE clock to see whether it makes a difference.
for instance (replace “rate” with the desired frequency in Hz),
$ echo rate > /sys/kernel/debug/bpmp/debug/clk/nafll_se/rate
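
A small wrapper makes it easier to see whether the requested rate was actually applied. This is a sketch rather than an official procedure: the function name `set_clk_rate` is hypothetical, it needs to run as root because the node lives in debugfs, and it assumes the `rate` file can be both read and written. Reading the value back matters because the requested rate may not be the one that ends up in effect.

```shell
#!/bin/sh
# set_clk_rate CLK_DIR HZ
# CLK_DIR: a clock directory containing a "rate" file,
#          e.g. /sys/kernel/debug/bpmp/debug/clk/nafll_se
# HZ:      requested frequency in Hz
set_clk_rate() {
    clk_dir=$1
    hz=$2
    # Record the current rate before changing anything.
    before=$(cat "$clk_dir/rate") || return 1
    # Request the new rate.
    echo "$hz" > "$clk_dir/rate" || return 1
    # Read back what is actually in effect.
    after=$(cat "$clk_dir/rate")
    echo "rate: $before -> $after Hz (requested $hz)"
}

# Example (as root):
# set_clk_rate /sys/kernel/debug/bpmp/debug/clk/nafll_se 1000000000
```

After changing the rate, rerun `cryptsetup benchmark` to see whether the throughput moves.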

anyways,
whether the hardware or the software mechanism is used has no functional impact.

hi,

yes, I’m aware of how the configuration determines whether the SE hardware is used. my question is: why is the speed slower with the hardware? shouldn’t it be faster?

Thanks,
K

did you check the SE clock? it should be running at the minimum clock rate by default.

yes:

test@test:~# cat /sys/kernel/debug/clk/se/clk_rate
473600000
test@test:~#
test@test:~#
test@test:~# cat /sys/kernel/debug/bpmp/debug/clk/nafll_se/rate
473600000
test@test:~#

FYI,
the max rate of the SE clock is 1 GHz.
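
For a rough sense of what the higher clock could buy, the measured figure can be extrapolated. This is a back-of-the-envelope sketch only: it assumes throughput scales linearly with the SE clock, which this thread does not confirm, and the function name `scaled_throughput` is hypothetical. The inputs are the aes-xts 256b result with the driver enabled (210.4 MiB/s), the reported rate (473600000 Hz), and the 1 GHz maximum.

```shell
#!/bin/sh
# scaled_throughput MIBS CUR_HZ MAX_HZ
# Linear extrapolation: MIBS measured at CUR_HZ, scaled to MAX_HZ.
# Linear scaling is an assumption, not a measured property of the SE.
scaled_throughput() {
    awk -v t="$1" -v cur="$2" -v max="$3" \
        'BEGIN { printf "%.1f\n", t * max / cur }'
}

# aes-xts 256b with tegra-crypto enabled, scaled from 473.6 MHz to 1 GHz
scaled_throughput 210.4 473600000 1000000000
```

Under that assumption the estimate is about 444 MiB/s, which is still below the 1152.9 MiB/s measured without the driver, so the clock rate alone may not account for the whole gap.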

