Question
What's the best practice for porting a pretrained PyTorch model to an STM32 MCU?
Dear support-team,
I've run into several problems when loading a pretrained PyTorch MobileNetV2 model in X-CUBE-AI.
I have tried the following conversion paths with the stm32ai command-line application located in the installation directory of X-Cube-AI/7.0.0:
- torch fp32 --> onnx fp32 --> onnx int8 (external)
- torch fp32 --> onnx fp32 --> keras (h5) fp32
- converted keras (h5) fp32 --> tflite fp32
I found that:
- The quantized ONNX model cannot be imported and analyzed directly in X-CUBE-AI. The quantization tool only targets Keras models. Does this mean that we cannot use a quantized int8 ONNX model on the STM32 platform?
- Since only channel-last (NHWC) models are supported on STM32 MCUs, the torch model has to be converted from NCHW to NHWC. However, the Transpose operator does not seem to be supported in X-CUBE-AI. If that is the case, how would you recommend porting a PyTorch model to the STM32 platform? Should we use the TorchScript (torch JIT) format directly? Is a TorchScript model supported on the STM32 platform?
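To make the layout issue concrete, here is a minimal NumPy sketch of what I mean by the NCHW to NHWC conversion. It is only an illustration of the axis reordering, not X-CUBE-AI or converter code; the helper names nchw_to_nhwc and oihw_to_hwio are hypothetical, and the HWIO weight layout is my assumption based on how Keras stores Conv2D kernels.

```python
import numpy as np


def nchw_to_nhwc(x):
    """Reorder an activation tensor from (N, C, H, W) to (N, H, W, C)."""
    # Hypothetical helper: moves the channel axis to the last position.
    return np.ascontiguousarray(np.transpose(x, (0, 2, 3, 1)))


def oihw_to_hwio(w):
    """Reorder conv weights from PyTorch (O, I, H, W) to Keras-style (H, W, I, O)."""
    # Assumption: the target framework expects kernels as (H, W, in, out).
    return np.ascontiguousarray(np.transpose(w, (2, 3, 1, 0)))


# Dummy channel-first input: batch 1, 3 channels, 4x4 pixels.
x = np.arange(1 * 3 * 4 * 4, dtype=np.float32).reshape(1, 3, 4, 4)
y = nchw_to_nhwc(x)

# Dummy conv kernel: 8 output channels, 3 input channels, 5x5 window.
w = np.zeros((8, 3, 5, 5), dtype=np.float32)
k = oihw_to_hwio(w)

print(x.shape, "->", y.shape)
print(w.shape, "->", k.shape)
```

Every such reordering inside the converted graph shows up as an explicit Transpose node, which is where the import fails for me.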
Looking forward to your answer,
Best regards,
Zhen
