Stedgeai tool fails to generate outputs for the STM32N6's Neural-ART NPU
Hello,
I am trying to run a Transformer model with Multi-head self attention on the STM32N6570-DK board.
The model is this one: https://github.com/arx7ti/cold-nilm
Here is what I did so far:
- Convert the .ckpt I got from the above repo into a .onnx format.
- Import and Quantize the model using your ST Edge AI Developer Cloud tool.
- Try to use stedgeai `generate` command:
- From the standalone tool on my Linux machine
- From the STM32CubeMX plugin
- From the ST Edge AI Developer Cloud
--> None of those attempts was successful. Here is the command I used in the first case:
stedgeai analyze --model cold_stm32_float32_final_ready_PerChannel_quant_random_1.onnx --st-neural-art default@user_neuralart_STM32N6570-DK.json --target stm32n6 --name network --workspace workspace --output output

I got several issues that I tried to fix so that the Developer Cloud tool would quantize my initial ONNX model:
- Removing Constant operations
- Converting LayerNormalization layers into compatible layers
- Removing Pow operations
- Removing Reshape allowzero attributes
- Removing ReduceMean noop_with_empty_axes attributes
- ...
I cannot guarantee that every change was necessary. To be honest, I did not fully understand each fix; I just tried to clear the error messages one by one.
But I did not succeed in making it work. I always got this message (as in this post):

ST Edge AI Core v2.2.0-20266 2adc00962
INTERNAL ERROR: Exported ONNX could be malformed since ONNX shape inference fails

Now, I am pretty confident that the issue is that the model type (Transformer-based) is not compatible with the stedgeai tool. Moreover, I have read here:
"It is specifically designed to accelerate the inference execution of a wide range of quantized convolutional neural network (CNN) models in area-constrained embedded and IoT devices."
Here is my question: is there absolutely no way to run such a model on the NPU?
PS: I have added the original .ckpt model, the generated ONNX and the quantized one in an attached archive. Do not hesitate to have a look at them!
