Associate II
December 12, 2025
Solved

MatMul operation is converted to a Convolution on the MCU target

  • December 12, 2025
  • 1 reply
  • 552 views

Hello. I am currently trying to execute a MatMul operation on the NPU. I have implemented a simple TFLite model containing a MatMul operation, as shown in the image below.

[Screenshot: TFLite model graph containing a single MatMul operation]
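For reference, a minimal sketch of how a model like this can be produced (assuming TensorFlow; the shapes below are only illustrative, not my exact model):

```python
import tensorflow as tf

# Two runtime inputs multiplied together; on batched (3-D) tensors the
# TFLite converter lowers tf.matmul to the BATCH_MATMUL operator.
# Shapes are illustrative only (batch = 4).
@tf.function(input_signature=[
    tf.TensorSpec([4, 8, 16], tf.float32, name="a"),
    tf.TensorSpec([4, 16, 32], tf.float32, name="b"),
])
def batch_matmul(a, b):
    return tf.matmul(a, b)

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [batch_matmul.get_concrete_function()])
with open("batch_matmul.tflite", "wb") as f:
    f.write(converter.convert())
```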

When converting the model with ST Edge AI, I observed that the MatMul operator is mapped to the MCU instead of the NPU, and that the operation is converted into a Convolution. Furthermore, checking the generated network.c file reveals that it becomes a convolution layer with an extremely large stride.

[Screenshot: ST Edge AI analysis output showing the MatMul mapped to the MCU as a Convolution]

[Screenshot: generated network.c showing a convolution layer with a very large stride]

 

https://stm32ai-cs.st.com/assets/embedded-docs/stneuralart_operator_support.html states that the MatMul (batch MatMul) operator is supported on the ST Neural-ART Accelerator.

How can I resolve this issue? Thank you.

Best answer by Julian E.


1 reply

Julian E. (Best answer)
Technical Moderator
January 12, 2026

Hi @mincho00,

 

In the documentation, the MatMul operator has this requirement:


HW: the second input should be constant, otherwise a SW fallback is considered.

https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_operator_support.html

 

In your case, you are using BatchMatMul. BatchMatMul and MatMul are two different operators: a BatchMatMul can be seen as a concatenation of N MatMuls with N = batch, which is why you end up with 4 different convolutions. For the moment, it has the same constraint as MatMul on the second input, which should be constant.
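To illustrate the equivalence, here is a small NumPy sketch (the shapes are hypothetical, with N = 4):

```python
import numpy as np

# A BatchMatMul with batch N is numerically just N independent MatMuls,
# one per batch slice (hypothetical shapes, N = 4).
N, M, K, P = 4, 8, 16, 32
a = np.random.rand(N, M, K).astype(np.float32)
b = np.random.rand(N, K, P).astype(np.float32)

batched = np.matmul(a, b)                               # one BatchMatMul
per_slice = np.stack([a[i] @ b[i] for i in range(N)])   # N separate MatMuls

assert np.allclose(batched, per_slice)
```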

 

So you should try using a constant second input to see if that helps.
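As a minimal sketch (assuming TensorFlow; the weight values and shapes below are hypothetical), capturing the second operand as a graph constant would look like this:

```python
import numpy as np
import tensorflow as tf

# The second operand is a constant captured by the graph, so the converter
# sees a weight tensor rather than a second runtime input.
# Values and shapes are hypothetical.
W = tf.constant(np.random.rand(16, 32).astype(np.float32))

@tf.function(input_signature=[tf.TensorSpec([1, 8, 16], tf.float32, name="a")])
def matmul_const(a):
    return tf.matmul(a, W)

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [matmul_const.get_concrete_function()])
with open("matmul_const.tflite", "wb") as f:
    f.write(converter.convert())
```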

 

Have a good day,

Julian

In order to give better visibility on the answered topics, please click on 'Accept as Solution' on the reply which solved your issue or answered your question.