Associate II
June 23, 2025
Solved

STM32N6: CubeAI ?? Epoch Issue and Why PReLU Runs in Software After Quantization


Hello everyone,

I am using CubeAI 10.1.0 and ST Edge AI 2.1 to analyze my model for the STM32N6, and I encountered an issue where some epochs show ?? instead of the expected result. Here is the log:

epoch ID   HW/SW/EC   Operation (SW only)
epoch 1    EC
epoch 2    EC
epoch 3    -SW-       (DequantizeLinear)
epoch 4    -SW-       (PRelu)
epoch 5    -SW-       (QuantizeLinear)
epoch 6    -SW-       (MaxPool)
epoch 7    EC
epoch 8    EC
epoch 9    -SW-       (DequantizeLinear)
epoch 10   -SW-       (PRelu)
epoch 11   -SW-       (QuantizeLinear)
epoch 12   EC
epoch 13   EC
epoch 14   -SW-       (DequantizeLinear)
epoch 15   -SW-       (PRelu)
epoch 16   EC
epoch 17   -SW-       (Conv)
epoch 18   -SW-       (Add)
epoch 19   EC
epoch 20   -SW-       (Conv)
epoch 21   -SW-       (Add)
epoch 22   ??
epoch 23   -SW-       (Add)
epoch 24   ??
epoch 25   -SW-       (Add)
epoch 26   EC

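For context, my reading of the log above is that each DequantizeLinear → PRelu → QuantizeLinear triplet (epochs 3–5, 9–11) means the activation is computed in float on the CPU, with int8 conversions wrapped around it. Below is a minimal NumPy sketch of what I believe such a software epoch computes; the scale and zero-point values are invented for illustration:

```python
import numpy as np

# Sketch of a software Dequantize -> PReLU -> Quantize epoch.
# Scale/zero-point values are made up for this example.

def dequantize(q, scale, zero_point):
    # int8 -> float: (q - zero_point) * scale
    return (q.astype(np.int32) - zero_point) * scale

def prelu(x, slope):
    # PReLU: identity for x >= 0, slope * x for x < 0
    return np.where(x >= 0, x, slope * x)

def quantize(x, scale, zero_point):
    # float -> int8: round(x / scale) + zero_point, clipped to int8 range
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

q_in = np.array([-100, -10, 0, 25, 120], dtype=np.int8)
x = dequantize(q_in, scale=0.05, zero_point=0)
y = prelu(x, slope=0.25)
q_out = quantize(y, scale=0.05, zero_point=0)
```

If that reading is right, every such triplet costs two extra tensor conversions on top of the float PReLU itself, which would explain part of the overhead.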
In epochs 22 and 24 the result is shown as ??, and I could not retrieve any computation results. I have a few questions:

1. What does ?? mean?

  • Does ?? mean that some operators failed to execute during these epochs? Does it imply those operators are not supported on the STM32N6, or could it be due to hardware resource limitations?

2. Will this affect model results?

  • If an epoch shows ??, will it impact the final recognition or inference accuracy of the model? Should I be concerned that this issue may lead to unreliable results?

3. Why is the PReLU operator still executed in software after quantization?

  • The official documentation states that PReLU is supported on the STM32N6, yet after quantization the PReLU computation is still executed in software rather than on the hardware. Why is that? Is it a hardware limitation, is hardware acceleration for this operator not fully optimized, or is there some other reason PReLU still runs in software?

4. How can I optimize the model to avoid these issues?

  • If these issues occur, are there any recommended optimization methods or adjustment strategies to address them and ensure the model runs smoothly and gives accurate results? Should I consider simplifying the model, or replacing PReLU with another activation function so the operator is not executed in software?
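To see what swapping the activation would change, here is a small NumPy comparison (the slope value is invented for illustration). ReLU and PReLU agree for non-negative inputs and only diverge on the negative side, so I assume replacing PReLU with ReLU would need some fine-tuning to recover accuracy:

```python
import numpy as np

# Comparing ReLU with PReLU to estimate the impact of swapping
# the activation. The slope value is made up for this example.

def relu(x):
    return np.maximum(x, 0.0)

def prelu(x, slope):
    return np.where(x >= 0, x, slope * x)

x = np.linspace(-3.0, 3.0, 7)   # [-3, -2, -1, 0, 1, 2, 3]
slope = 0.2

# The two activations differ only on negative inputs.
diff = np.abs(relu(x) - prelu(x, slope))
```

The divergence on the negative side is exactly `slope * |x|`, so the smaller the learned slopes in the trained model, the cheaper the swap should be in accuracy terms.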

Thank you in advance for your help and suggestions!

Best answer by Julian E.

Hello @qiqi,

 

The PReLU falling back to software is a bug in the CLI front end; the operator itself is supported by the ATON compiler.

The bug is fixed and will be part of the next version (2.2), planned for beginning/mid-July.

 

Concerning the ?? bug, I opened an internal ticket, and I will update you.

Until I know more, I would suggest either not using the option causing the issue, or running validate-on-target with and without the option to compare the results and make sure they are correct.

 

Have a good day,

Julian

1 reply

Julian E.
Technical Moderator
June 24, 2025

Hello @qiqi,

 

Could you please share your model in a .zip file?

 

Concerning the PReLU, it is indeed supported. As for why it ends up in a SW epoch, it could be that the compiler decided it is faster to execute in software. I will look at it in more detail if you share your model.

 

Have a good day,

Julian

In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.
qiqi (Author)
Associate II
June 24, 2025

Dear Julian,

Thank you so much for your help! I have packed the models into a .zip file and attached it for your review. The zip file contains three models: mobilefacenet.onnx, ONet.onnx, and RNet.onnx, all of which are quantized. During analysis, both ONet.onnx and RNet.onnx showed ?? epochs. Could you kindly take a look, identify any issues, and suggest possible solutions?

Additionally, if you don't mind, I would like to ask one more question. The mobilefacenet.onnx feature-extraction model is relatively large, and the analysis shows a total of 164 epochs, of which 111 are implemented in software. In empirical testing the inference time is around 100 ms, which feels a bit long. Is there a way to move more epochs to hardware execution instead of software?

Furthermore, the model's activations total 3.062 MB, so apart from npuRAM3, npuRAM4, npuRAM5, and npuRAM6, they must also occupy some space in hyperRAM. According to the official documentation I reviewed, this might affect inference speed. Is that the case? If so, can it be optimized by adjusting the options in the user_neuralart.json file?

Apologies for all the questions, and I really appreciate your help in answering them and optimizing the model.

Thanks again for your support, and I look forward to your reply!

Best regards,
QiQi

Julian E.
Technical Moderator
June 24, 2025

Hello @qiqi,

 

Thank you for the models, I will first take a look at this ?? issue.

 

Regarding optimization: if the activations do not fit into internal RAM, it will indeed have a big impact on inference time. The weights are in external flash, but because each weight is read only once when needed, the impact is limited. Activations, however, require multiple reads and writes, and each access to external memory adds to the inference time.

 

I will take a look with my colleague to see if we can provide you with some tips to help you.

In the meantime, you can look at this piece of information, if you have not already seen it:
https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_neural_art_compiler.html#tips-variations-around-the-basic-use-case 

 

Have a good day,

Julian
