Issues with crypto IP core
We have encountered a problem with the crypto IP core on the STM32MP157C.
Our setup is an STM32MP157C-DK2 with the latest image/SDK installed
(openstlinux-5.10-dunfell-mp1-21-11-17).
1. First of all, our IPSec solution based on strongSwan doesn't work at
all when stm32-cryp.ko is loaded: after processing several packets, the
IPSec connection hangs. The only message we get in the kernel ring
buffer is:
```
[ 102.064269] NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!
```
The connection hangs no matter which cipher is selected or which
settings are used. However, if we don't use stm32-cryp at all (e.g. if
we unload this module), the IPSec connection works perfectly.
We haven't found a simple way to reproduce this bug without deploying
an IPSec infrastructure (which is very simple to do with AlgoVPN [1]),
so we can give you access to our test environment or provide more
details on request.
2. Moreover, we ran a performance test using cryptodev-tests
([2]; this package is also available in the Yocto SDK) and `openssl
speed`, and it looks like the software implementation is much faster
than the hardware-accelerated one.
The first test was performed with the userspace software implementation
(as evidence, the CPU spent almost the whole run in userspace, 18.02s
out of 18.38s):
```
root@stm32mp1:~# cat /proc/crypto | grep cbc
root@stm32mp1:~# time openssl speed -evp aes-256-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 1829060 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 548756 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 145037 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 36751 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 4614 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 2304 aes-256-cbc's in 3.00s
...
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-cbc 9754.99k 11706.79k 12376.49k 12544.34k 12599.30k 12582.91k
real 0m 18.38s
user 0m 18.02s
sys 0m 0.00s
```
The second test uses the hardware-accelerated algorithm:
```
root@stm32mp1:~# insmod stm32-cryp.ko
root@stm32mp1:~# cat /proc/crypto | grep cbc
name : cbc(des3_ede)
driver : stm32-cbc-des3
name : cbc(des)
driver : stm32-cbc-des
name : cbc(aes)
driver : stm32-cbc-aes
root@stm32mp1:~# time openssl speed -evp aes-256-cbc -elapsed
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 32666 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 26338 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 15378 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 5661 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 818 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 16384 size blocks: 408 aes-256-cbc's in 3.01s
...
type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
aes-256-cbc 174.22k 561.88k 1312.26k 1932.29k 2233.69k 2220.82k
real 0m 18.13s
user 0m 0.07s
sys 0m 2.59s
```
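As a sanity check, the summary rows of both runs can be recomputed from the operation counts: the reported figure is count × block size / elapsed time, in thousands of bytes per second. A minimal Python sketch of that arithmetic follows; the helper names `throughput_k` and `slowdowns` are ours, and the numbers are copied from the two outputs above.

```python
# Recompute the `openssl speed` summary rows from the raw counts above.
# throughput_k and slowdowns are hypothetical helper names, not part of OpenSSL.

def throughput_k(count, block_size, elapsed_s):
    """Throughput in thousands of bytes per second."""
    return count * block_size / elapsed_s / 1000.0

# (block size in bytes, operations, elapsed seconds), copied from the two runs
software = [(16, 1829060, 3.00), (64, 548756, 3.00), (256, 145037, 3.00),
            (1024, 36751, 3.00), (8192, 4614, 3.00), (16384, 2304, 3.00)]
hardware = [(16, 32666, 3.00), (64, 26338, 3.00), (256, 15378, 3.00),
            (1024, 5661, 3.00), (8192, 818, 3.00), (16384, 408, 3.01)]

def slowdowns(sw_rows, hw_rows):
    """Map block size -> software throughput / hardware throughput."""
    return {
        size: throughput_k(sw_n, size, sw_t) / throughput_k(hw_n, size, hw_t)
        for (size, sw_n, sw_t), (_, hw_n, hw_t) in zip(sw_rows, hw_rows)
    }

if __name__ == "__main__":
    for size, ratio in slowdowns(software, hardware).items():
        print(f"{size:>6} bytes: software is {ratio:.1f}x faster")
```

With these counts the software path comes out roughly 56 times faster for 16-byte blocks and still more than 5 times faster for 16384-byte blocks.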
We got very similar results with the speed test from the cryptodev source code:
```
root@stm32mp1:~# insmod stm32-cryp.ko
root@stm32mp1:~# time ./speed
Testing AES-128-CBC cipher:
Encrypting in chunks of 65536 bytes: done. 11.47 MB in 5.00 secs: 2.29 MB/sec
real 0m 5.00s
user 0m 0.00s
sys 0m 0.01s
```
So the question is: is this the real performance of the crypto IP core
(about 2 MB/s for chunks of 16 KB and larger), or is it caused by the
drivers or some other hw/sw interaction problem? As far as we know, we
are not the only ones who have run into this issue ([3], last answer).
It's worth noting that during the hardware-accelerated test the CPU was
heavily used (95.4%) in kernel space by the irq/60-54001000 task, so
this method can't even be used to reduce CPU load by offloading the
work to the crypto IP.
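To help narrow the question down, the hardware-accelerated `openssl speed` numbers above can be split into a fixed per-request cost and a per-byte cost. The sketch below assumes a simple linear model, time per request = overhead + size / bandwidth, fitted to the smallest and largest block sizes only. The model and the helper name `fit_two_points` are our assumptions, not a measured breakdown of the driver.

```python
# Two-point cost model for the hardware-accelerated run (an assumption of
# ours, not a profile of the driver):
#   time_per_request = overhead + size / bandwidth

def fit_two_points(small, large):
    """Each point is (block_size_bytes, operations, elapsed_seconds).

    Returns (overhead_seconds, bandwidth_bytes_per_second).
    """
    (s1, n1, t1), (s2, n2, t2) = small, large
    per_op_1 = t1 / n1                      # seconds per request, small blocks
    per_op_2 = t2 / n2                      # seconds per request, large blocks
    per_byte = (per_op_2 - per_op_1) / (s2 - s1)
    overhead = per_op_1 - s1 * per_byte
    return overhead, 1.0 / per_byte

# Copied from the hardware-accelerated output above
hw_small = (16, 32666, 3.00)
hw_large = (16384, 408, 3.01)

if __name__ == "__main__":
    overhead, bandwidth = fit_two_points(hw_small, hw_large)
    print(f"fixed overhead per request: {overhead * 1e6:.0f} us")
    print(f"asymptotic bandwidth: {bandwidth / 1e6:.2f} MB/s")
```

Under this model the fixed cost is on the order of 85 µs per request and the per-byte rate levels off at roughly 2.2 MB/s, which would suggest that larger chunks alone cannot recover the throughput. This is only a fit to two data points.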
P.S. We have added these lines to local.conf to build strongSwan
and OpenSSL with cryptodev support:
```
PACKAGECONFIG_append_pn-openssl = " cryptodev-linux"
IMAGE_INSTALL_append = " strongswan cryptodev-module cryptodev-tests"
```
Thank you in advance!
[1]: https://github.com/trailofbits/algo
[2]: https://github.com/cryptodev-linux/cryptodev-linux/tree/master/tests
[3]: https://community.st.com/s/question/0D50X0000C4POdo/crypto-api
