Visitor II
March 25, 2020
Question

Cryptodev/af_alg engines in OpenSSL on STM32MP157C-DK2 do not provide hash algorithms for hardware acceleration


Hello,

I have successfully compiled and deployed the distribution package (trusted) onto the STM32MP1. I am using OpenSSL in a program for signing data, with SHA256 as the hash, and I would like to take advantage of hardware acceleration through either the cryptodev or the af_alg engine.

The distribution provides Yocto recipes for cryptodev-linux (header) and cryptodev-module (kernel module), both version 1.9, and for OpenSSL 1.1.1b. In the OpenSSL recipe I added the cryptodev-linux option to the PACKAGECONFIG line so that OpenSSL compiles with cryptodev support. All three recipes were built and deployed using devtool. OpenSSL also provides a built-in AF_ALG engine by default.

However, both engines list only ciphers among their supported algorithms; there are no hash algorithms at all. I also tried building OpenSSL with -DUSE_CRYPTODEV_DIGESTS, but that didn't change anything. OpenSSL 1.0.2, by contrast, provides some hash algorithms such as MD5 and SHA1, but not SHA256. I have also posted an issue on the OpenSSL GitHub, but maybe the problem is on the STM side. The output of cat /proc/crypto lists SHA256 among many other hashes, which means it should be supported.

I would appreciate any help regarding this issue.

    14 replies

    Technical Moderator
    March 31, 2020

    Hello,

    A few things to double-check first:

    • OpenSSL is part of the OpenSTLinux distribution; the architecture is depicted in this article: https://wiki.st.com/stm32mpu/wiki/Crypto_API_overview
    • Although the HW crypto drivers are enabled by default in the kernel config, the corresponding nodes are not enabled in the device tree, so you need to activate them there (status = "okay") for the drivers to probe (see the related wiki articles). After that, you will see them (stm32....) in /proc/crypto.

    But if you already see a SW implementation of SHA256, it should work with the openssl application.

    Technical Moderator
    March 31, 2020

    Here is what I did on my DK2 with V1.2.0 to check that it was working:

    1/ Add the "openssl" application to the board:

    In the Yocto build directory, add the following line at the end of the "conf/local.conf" file (note the leading space inside the quotes, which _append needs because it does not insert one itself):

    IMAGE_INSTALL_append = " openssl-bin"

    2/ Flash the new image and check that SHA256 works (SW):

    root@stm32mp1:~# openssl sha256 README-CHECK-GPU
    SHA256(README-CHECK-GPU)= 3a6e03abefa513077bbd323c425972c8193e7349430f17df17a40137894a9a86

    root@stm32mp1:~# cat /proc/crypto
    name         : sha256
    driver       : sha256-generic
    module       : kernel
    priority     : 100
    refcnt       : 3
    selftest     : passed
    internal     : no
    type         : shash
    blocksize    : 64
    digestsize   : 32

    3/ Update your device tree and check that SHA256 works (HW):

    3.1/ Add the following lines at the end of your board device tree and update your board with it:

    &hash1 {
        status = "okay";
    };

    3.2/ Check HW sha256:

    root@stm32mp1:~# openssl sha256 README-CHECK-GPU
    SHA256(README-CHECK-GPU)= 3a6e03abefa513077bbd323c425972c8193e7349430f17df17a40137894a9a86

    root@stm32mp1:~# cat /proc/crypto
    name         : sha256
    driver       : stm32-sha256
    module       : kernel
    priority     : 200
    refcnt       : 1
    selftest     : passed
    internal     : no
    type         : ahash
    async        : yes
    blocksize    : 64
    digestsize   : 32
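
The two /proc/crypto excerpts above differ in driver and priority. The kernel crypto API generally hands out the highest-priority implementation registered under a given algorithm name, which is why stm32-sha256 (priority 200) takes over from sha256-generic (priority 100) once the hash1 node is enabled. As a rough way to check this on a board, the sketch below parses /proc/crypto-style text and reports the preferred driver. The helper names parse_proc_crypto and preferred_driver are made up for illustration, the sample text is copied from the excerpts in this post rather than from a complete dump, and the selection ignores the shash/ahash type distinction.

```python
# Sample entries copied from the two excerpts above (not a full /proc/crypto dump).
SAMPLE = """\
name         : sha256
driver       : sha256-generic
module       : kernel
priority     : 100
type         : shash

name         : sha256
driver       : stm32-sha256
module       : kernel
priority     : 200
type         : ahash
"""


def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per algorithm entry."""
    entries = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def preferred_driver(entries, name):
    """Return the highest-priority driver registered for `name`, or None."""
    candidates = [e for e in entries if e.get("name") == name]
    if not candidates:
        return None
    best = max(candidates, key=lambda e: int(e.get("priority", 0)))
    return best.get("driver")


print(preferred_driver(parse_proc_crypto(SAMPLE), "sha256"))
```

On the target, the same helpers can be fed the real file, for example preferred_driver(parse_proc_crypto(open("/proc/crypto").read()), "sha256").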

    ParadoxAuthor
    Visitor II
    April 1, 2020

    Hello Bernard PUEL,

    First of all, thank you very much for your reply. However, this is not the issue I described. I have already managed to load the STM32 drivers for both crypto and hash, but I am not able to use them with OpenSSL. Only a software implementation of SHA256 is working, which is a lot slower than HW acceleration. Here are the algorithms supported by the OpenSSL devcrypto engine after enabling the drivers:

    root@stm32mp1:~# openssl engine -t -c
    (devcrypto) /dev/crypto engine
     [DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB]
     [ available ]
    (dynamic) Dynamic engine loading support
     [ unavailable ]

    As you can see, there are no hash algorithms available at all, only encryption, which is not what I need. Interestingly, even those encryption algorithms are NOT using the STM32 drivers, since they were available even before I enabled the crypto module. A speed test also confirms that hashing is not accelerated:

    root@stm32mp1:~# time openssl speed -evp sha256 -engine devcrypto
    engine "devcrypto" set.
    Doing sha256 for 3s on 16 size blocks: 470605 sha256's in 2.98s
    Doing sha256 for 3s on 64 size blocks: 328102 sha256's in 3.00s
    Doing sha256 for 3s on 256 size blocks: 170421 sha256's in 2.98s
    Doing sha256 for 3s on 1024 size blocks: 58893 sha256's in 3.00s
    Doing sha256 for 3s on 8192 size blocks: 8253 sha256's in 3.00s
    Doing sha256 for 3s on 16384 size blocks: 4164 sha256's in 3.00s
    OpenSSL 1.1.1b 26 Feb 2019
    built on: Tue Feb 26 14:15:30 2019 UTC
    options:bn(64,32) rc4(char) des(long) aes(partial) idea(int) blowfish(ptr)
    compiler: arm-ostl-linux-gnueabi-gcc -march=armv7ve -mthumb -mfpu=neon-vfpv4 -mfloat-abi=hard -mcpu=cortex-a7 --sysroot=recipe-sysroot -O2 -pipe -g -feliminate-unused-debug-types -fdebug-prefix-map= -fdebug-prefix-map= -fdebug-prefix-map= -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG
    The 'numbers' are in 1000s of bytes per second processed.
    type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
    sha256 2526.74k 6999.51k 14640.19k 20102.14k 22536.19k 22740.99k
    real 0m 18.12s
    user 0m 17.97s
    sys 0m 0.02s

    As you can see from the output, the speed is around 22 MB/s with 16 KB chunks, which is quite slow compared to HW acceleration. I managed to make my program work with AF_ALG sockets, which give me 53 MB/s and much lower CPU utilization: only around 25%, compared to full load with the SW implementation, which is a pretty big difference in my opinion. Also, if the acceleration worked, the time displayed for "user" would be below 1 s and nearly all the time would be spent in "sys".
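
For readers unfamiliar with the AF_ALG route mentioned above, here is a minimal Python sketch of hashing through the kernel crypto API over an AF_ALG socket. It is not the program described in this post; the function name sha256_af_alg and the chunk size are illustrative assumptions. The kernel decides which sha256 driver serves the request, so the hardware is only used if its driver is the preferred one in /proc/crypto.

```python
import hashlib
import socket


def sha256_af_alg(data, chunk_size=16384):
    """Return the SHA-256 digest of `data`, computed by the kernel via AF_ALG."""
    # AF_ALG is Linux-only; elsewhere this raises AttributeError or OSError.
    with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as alg:
        alg.bind(("hash", "sha256"))
        op, _ = alg.accept()
        with op:
            # MSG_MORE tells the kernel that more data follows, so the hash
            # state is kept across chunks; recv() then finalizes the digest.
            for start in range(0, len(data), chunk_size):
                op.sendall(data[start:start + chunk_size], socket.MSG_MORE)
            return op.recv(32)


if __name__ == "__main__":
    message = b"abc" * 100000
    try:
        digest = sha256_af_alg(message)
    except (AttributeError, OSError) as exc:
        print("AF_ALG not usable here:", exc)
    else:
        print(digest.hex())
        print("matches hashlib:", digest == hashlib.sha256(message).digest())
```

Comparing the result against hashlib is a cheap sanity check that the kernel path returns the same digest as a software implementation.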

    Technical Moderator
    April 3, 2020

    Could you please detail how you activated the cryptodev engine in OpenSSL? (I know you described it in your first post, but I did not succeed in doing the same.)

    In the meantime, I discussed this with experts:

    • A big gap between the SW and HW implementations is not expected on the current platform (it depends on the use case and data size)
    • The root cause is that the kernel framework and the ST driver have not been optimized for that
    • The current implementation (kernel framework + driver) helps to save CPU load rather than to improve performance

    ParadoxAuthor
    Visitor II
    April 3, 2020

    Hello,

    Since I already had an image built, I used the following commands to get the cryptodev module onto the board:

    devtool modify cryptodev-linux
    devtool build cryptodev-linux
    devtool deploy-target cryptodev-linux root@<ip_address>:/
     
    devtool modify cryptodev-module
    devtool build cryptodev-module
    devtool deploy-target cryptodev-module root@<ip_address>:/
     

    The first recipe only generates the cryptodev header file, but it is a prerequisite for cryptodev-module, the kernel module that OpenSSL requires. Once both had been deployed successfully, I updated the module dependencies using depmod and loaded the kernel module using modprobe cryptodev.

    Afterwards, the command openssl engine -t -c lists the available engines (in this case only cryptodev) and their supported algorithms. As mentioned before, no hashes are available with the newest OpenSSL version, but the older 1.0.2 version offers at least MD5 and SHA1 (which are outdated). Perhaps I should add the cryptodev module to the local.conf file and bitbake the whole image with it?

    EDIT: I bitbaked the image with the options "openssl-bin cryptodev-module", and both were installed, but there are still no hashes.
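
To make the "no hashes" check repeatable after each rebuild, the output of openssl engine -t -c can be scanned for digest names. The sketch below is one hedged way to do that in Python. The function names and the DIGEST_NAMES set are assumptions made for illustration (the set is a guess at how digests would be spelled in that listing), and the sample text is the engine listing quoted earlier in this thread.

```python
import re

# Assumption: digest names as they would appear in the capability list.
DIGEST_NAMES = {"MD5", "SHA1", "SHA224", "SHA256", "SHA384", "SHA512", "RIPEMD160"}

SAMPLE = """\
(devcrypto) /dev/crypto engine
 [DES-CBC, DES-EDE3-CBC, AES-128-CBC, AES-192-CBC, AES-256-CBC, AES-128-CTR, AES-192-CTR, AES-256-CTR, AES-128-ECB, AES-192-ECB, AES-256-ECB]
 [ available ]
(dynamic) Dynamic engine loading support
 [ unavailable ]
"""


def engine_algorithms(output):
    """Map each engine id to the list of algorithms printed under it."""
    engines = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        header = re.match(r"\(([^)]+)\)\s", line)
        if header:
            current = header.group(1)
            engines[current] = []
        elif current is not None and line.startswith("[") and line.endswith("]"):
            inner = line[1:-1].strip()
            # Skip the availability line printed because of -t.
            if inner not in ("available", "unavailable"):
                engines[current].extend(a.strip() for a in inner.split(","))
    return engines


def offered_digests(algorithms):
    """Return the entries of `algorithms` that look like digest names."""
    return [a for a in algorithms if a.upper() in DIGEST_NAMES]


caps = engine_algorithms(SAMPLE)
print(len(caps["devcrypto"]), "algorithms, digests:", offered_digests(caps["devcrypto"]))
```

For the listing in this thread the digest list comes back empty, which matches the observation that the engine advertises ciphers only.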

    Technical Moderator
    April 3, 2020

    OK. On my side I tried to work with the openssl recipe (to make sure the application was configured correctly), but without success: I don't see the devcrypto engine.

    ParadoxAuthor
    Visitor II
    April 3, 2020

    The OpenSSL recipe alone does not come with cryptodev; it only supports it. You need to install the cryptodev module separately and load it. Once it is loaded successfully, it should appear as /dev/crypto. OpenSSL then displays it as shown two posts above when you run openssl engine -t -c. The newest OpenSSL should support it automatically, but if you still have trouble, you can also edit the line PACKAGECONFIG ?= "" in the openssl_1.1.1b.bb recipe and add cryptodev-linux to it. That way the engine will definitely be displayed, as long as the module is properly loaded.

    Technical Moderator
    April 6, 2020

    Yes. The cryptodev-linux package was removed from the openssl recipe some time ago, so I thought that simply adding cryptodev-linux back would be enough (which is what I did), but it seems not. The problem probably comes from the installation, since on your side the "manual" installation with devtool works well.

    ParadoxAuthor
    Visitor II
    April 6, 2020

    The installation is working; there wasn't any problem with devtool or bitbake. But I might still be missing something, as hash acceleration is not supported by the engine. I know for sure that the acceleration itself works, as I have confirmed using AF_ALG sockets. Currently I am using those in combination with OpenSSL to make sure hashing is accelerated, but the problem with complete integration persists, which is why I am keeping the question "unanswered". What is also interesting is that there is an AF_ALG engine for OpenSSL, and even that does not show any hashes. I am starting to think that HW acceleration of hashes in OpenSSL was disabled on purpose, since it might be inconsistent and could even produce wrong output in certain scenarios.

    Technical Moderator
    April 6, 2020

    I have tried the same as you did, but the cryptodev engine never appeared in openssl. Very frustrating indeed ...

    I will open an internal ticket for the integration team and let you know. If you get any updates on your side, please let me know.

    ParadoxAuthor
    Visitor II
    April 6, 2020

    Now that's interesting. It should not differ from my case, since it is the same board. Are you sure you are running the newest version of the st-image-weston image? I have literally just built that image and flashed the FlashLayout_sdcard_stm32mp157c-dk2-trusted version using STM32CubeProgrammer. Afterwards I used devtool to build the three recipes that I mentioned in my first post and deployed them onto the board. If that does not work, I don't know what would. I don't think I will spend any more time finding out why it does not show any hashes; I will proceed with what I have, but I am curious how this works out. Thank you for your time and support though, I very much appreciate it.

    Technical Moderator
    April 20, 2020

    Hello,

    Our integration team continued investigating this issue. Here is the outcome:

    On Dunfell, if OpenSSL is compiled with 'cryptodev-linux' (PACKAGECONFIG_append_pn-openssl = " cryptodev-linux"), you only need to run 'modprobe cryptodev' and HW acceleration works.

    root@stm32mp1:~# time openssl speed -evp sha256 -engine devcrypto
    engine "devcrypto" set.
    Doing sha256 for 3s on 16 size blocks: 27948 sha256's in 0.27s
    Doing sha256 for 3s on 64 size blocks: 26880 sha256's in 0.24s
    Doing sha256 for 3s on 256 size blocks: 21500 sha256's in 0.14s
    Doing sha256 for 3s on 1024 size blocks: 20334 sha256's in 0.19s
    Doing sha256 for 3s on 8192 size blocks: 12768 sha256's in 0.14s
    Doing sha256 for 3s on 16384 size blocks: 8919 sha256's in 0.10s
    OpenSSL 1.1.1f 31 Mar 2020
    built on: Tue Mar 31 12:17:45 2020 UTC
    options:bn(64,32) rc4(char) des(long) aes(partial) idea(int) blowfish(ptr)
    compiler: arm-ostl-linux-gnueabi-gcc -mthumb -mfpu=neon-vfpv4 -mfloat-abi=hard -mcpu=cortex-a7 --sysroot=recipe-sysroot -O2 -pipe -g -feliminate-unused-debug-types -fmacro-prefix-map= -fdebug-prefix-map= -fdebug-prefix-map= -fdebug-prefix-map= -DOPENSSL_USE_NODELETE -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DPOLY1305_ASM -DNDEBUG
    The 'numbers' are in 1000s of bytes per second processed.
    type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes 16384 bytes
    sha256 1656.18k 7168.00k 39314.29k 109589.56k 747110.40k 1461288.96k
    real 0m 18.23s
    user 0m 1.08s
    sys 0m 7.36s
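
A note on reading these figures, offered as a back-of-the-envelope check rather than a definitive analysis. Each throughput value in the table is the block count times the block size, divided by the time printed on the corresponding "Doing sha256" line. Without the -elapsed option, openssl speed uses user CPU time rather than wall-clock time as the divisor, so when most of the work happens in the kernel or in hardware the divisor shrinks and the figure is inflated. The sketch below reproduces the 16384-byte column of both runs in this thread, then recomputes the hardware run assuming the measurement took its nominal 3 s of wall-clock time (an assumption; the log only shows 18.23 s real for all six runs together). The function name throughput_k is illustrative.

```python
def throughput_k(count, block_size, seconds):
    """Throughput in units of 1000 bytes per second, as openssl speed prints it."""
    return count * block_size / seconds / 1000


# Software run (OpenSSL 1.1.1b): 4164 blocks of 16384 bytes in 3.00 s
sw = throughput_k(4164, 16384, 3.00)

# Hardware run (OpenSSL 1.1.1f): 8919 blocks of 16384 bytes in 0.10 s of user CPU time
hw_reported = throughput_k(8919, 16384, 0.10)

# Same hardware run, assuming it really lasted 3 s of wall-clock time
hw_wall = throughput_k(8919, 16384, 3.0)

print(f"software:            {sw:.2f}k")
print(f"hardware (reported): {hw_reported:.2f}k")
print(f"hardware (3 s wall): {hw_wall:.2f}k")
```

Under that assumption the hardware path comes out near 48.7 MB/s, in line with the 53 MB/s measured earlier in the thread with AF_ALG sockets, rather than the roughly 1.4 GB/s the table suggests. Re-running the benchmark as openssl speed -elapsed -evp sha256 -engine devcrypto should give the wall-clock figure directly.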