Associate III
January 30, 2025
Solved

Issue Running a GRU/LSTM Model on STM32 with Neural-ART


Hello everyone,

I’m trying to run a model on an STM32 MCU with Neural-ART using the ST Edge AI Developer Cloud.

The final model includes GRU layers, but when I attempt to quantize and then optimize it, I encounter issues. To debug this, I created a minimal test model, which is as follows:

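import torch
import torch.nn as nn

class GRUTEST(nn.Module):
    def __init__(self, hidden_layer_size, n_layers, output_size, dropout, device):
        super(GRUTEST, self).__init__()
        self.device = device
        self.hidden_layer_size = hidden_layer_size
        self.n_layers = n_layers

        # Input feature size is 256; batch_first=True expects (batch, seq_len, 256).
        self.rnn = nn.GRU(256, hidden_layer_size, n_layers, batch_first=True, dropout=dropout, bidirectional=False)
        # in_features is hard-coded to 64, so hidden_layer_size must be 64.
        self.fc = nn.Linear(64, output_size)

    def forward(self, x):
        x, _ = self.rnn(x)
        # Keep only the last time step before the linear layer.
        x = self.fc(x[:, -1, :])
        return x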

I then convert this model to ONNX as follows:

torch_model = GRUTEST(
    hidden_layer_size=64,
    n_layers=2,
    output_size=1,
    dropout=0.2,
    device='cpu'
)
torch_model.eval()

torch.onnx.export(
    torch_model,
    torch.randn(1, 1, 256),  # example input: (batch, seq_len, features)
    "modelGRU.onnx",
    opset_version=15,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
    export_params=True,
    keep_initializers_as_inputs=False
)

I successfully perform per-channel quantization, but as soon as I reach the optimization step (with default optimization options), I get the following error—both with the quantized and non-quantized versions of the model:

TOOL ERROR: operands could not be broadcast together with shapes (256,) (128,).
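The wording of this message matches the error NumPy raises when two arrays cannot be broadcast, which suggests (as an assumption, since the thread does not say which tensors are involved) that the tool is combining one array of length 256 with another of length 128. A minimal sketch of the broadcasting rule itself, using a hypothetical helper named `broadcastable`:

```python
from itertools import zip_longest


def broadcastable(a, b):
    """Return True if shapes a and b are compatible under NumPy-style broadcasting.

    Dimensions are compared from the last axis backwards; each pair must be
    equal, or one of the two must be 1. Missing leading axes count as 1.
    """
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            return False
    return True


# The two shapes from the error message are not compatible.
print(broadcastable((256,), (128,)))
# A shape with a leading axis of size 1 is compatible with a matching vector.
print(broadcastable((1, 256), (256,)))
```

This only shows why the two shapes clash; it does not identify which layer or weight inside the tool produced them.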

If I replace the GRU with an LSTM model:

class LSTMTEST(nn.Module):
    def __init__(self, hidden_layer_size, n_layers, output_size, dropout, device):
        super(LSTMTEST, self).__init__()
        self.device = device
        self.hidden_layer_size = hidden_layer_size
        self.n_layers = n_layers

        self.rnn = nn.LSTM(256, hidden_layer_size, n_layers, batch_first=True, dropout=dropout, bidirectional=False)
        self.fc = nn.Linear(64, output_size)

    def forward(self, x):
        # Initial states are sized from the runtime batch dimension of x.
        batch_size = x.size(0)
        h_0 = torch.zeros(self.n_layers, batch_size, self.hidden_layer_size, device=x.device)
        c_0 = torch.zeros(self.n_layers, batch_size, self.hidden_layer_size, device=x.device)

        x, _ = self.rnn(x, (h_0, c_0))
        x = self.fc(x[:, -1, :])
        return x

Even though I explicitly define h_0 and c_0, I still get the following error:

NOT IMPLEMENTED: Sixth input (initial_h) of LSTM _rnn_LSTM_output_0_forward is not constant or constant propagation was not able to compute it
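In the ONNX LSTM operator, the inputs come in the order X, W, R, B, sequence_lens, initial_h, initial_c (plus an optional peephole tensor), so the "sixth input" is indeed initial_h. Its shape includes the batch size, which the export above marks as dynamic. The sketch below lists those shapes for the sizes used here and flags which ones depend on a symbolic axis. The helper names are hypothetical, and the idea that the dynamic batch axis is what prevents constant propagation is an assumption, not something confirmed in this thread.

```python
from typing import Dict, Tuple, Union

# An int is a fixed size; a str names a symbolic (dynamic) axis.
Dim = Union[int, str]


def lstm_input_shapes(seq_len: Dim, batch: Dim, input_size: int,
                      hidden_size: int, num_directions: int = 1) -> Dict[str, Tuple[Dim, ...]]:
    """Shapes of the ONNX LSTM inputs in spec order (default layout, peepholes omitted)."""
    return {
        "X": (seq_len, batch, input_size),
        "W": (num_directions, 4 * hidden_size, input_size),
        "R": (num_directions, 4 * hidden_size, hidden_size),
        "B": (num_directions, 8 * hidden_size),
        "sequence_lens": (batch,),
        "initial_h": (num_directions, batch, hidden_size),
        "initial_c": (num_directions, batch, hidden_size),
    }


def has_symbolic_dim(shape: Tuple[Dim, ...]) -> bool:
    """True if any dimension is a named (dynamic) axis rather than a fixed int."""
    return any(isinstance(d, str) for d in shape)


# Sizes from the post: 256 input features, 64 hidden units, dynamic batch axis.
shapes = lstm_input_shapes(seq_len=1, batch="batch_size", input_size=256, hidden_size=64)
dynamic_inputs = [name for name, shape in shapes.items() if has_symbolic_dim(shape)]
print(dynamic_inputs)
```

If this reading is right, exporting with a fixed batch size (without `dynamic_axes`) might let the exporter fold the zero initial states into constants. That would only address this particular message, not whether the target supports the LSTM operator at all.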

Would anyone be able to point out what I’m doing wrong?

Thanks in advance!

Julian E. (Best answer)
Technical Moderator
January 31, 2025

Hello @Dresult,

Neural-ART does not support GRU or LSTM layers. You can find the list of supported layers here:

https://stedgeai-dc.st.com/assets/embedded-docs/stneuralart_operator_support.html

Have a good day,

Julian
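A check against the operator support list can be scripted before uploading a model. The sketch below is a minimal illustration: `unsupported_ops` is a hypothetical helper, and `SUPPORTED_EXAMPLE` is a made-up placeholder set, not the actual Neural-ART list, which has to be taken from the ST page linked above.

```python
def unsupported_ops(model_ops, supported_ops):
    """Return the op types used by the model that are absent from the supported set, sorted."""
    return sorted(set(model_ops) - set(supported_ops))


# Placeholder set for illustration only; NOT the real Neural-ART operator list.
SUPPORTED_EXAMPLE = {"Gemm", "Gather", "Transpose", "Squeeze"}

# Assumed op types for an exported GRU model; the real graph may differ.
model_ops = ["Transpose", "GRU", "Gather", "Gemm"]

print(unsupported_ops(model_ops, SUPPORTED_EXAMPLE))
```

With the `onnx` package installed, the op types of an exported file can be collected with `{n.op_type for n in onnx.load("modelGRU.onnx").graph.node}` and passed in as `model_ops`.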

Dresult (Author)
Associate III
February 3, 2025

Hello @Julian E. 

Thanks for your answer. I hadn't checked the page dedicated to the NPU; now it makes sense.

However, I am also trying to deploy the model on STM32 MCUs and MPUs, but the process hangs at the optimization step. I tried both ST Edge AI Core 2.0 and STM32Cube.AI 9.0. Are GRU and LSTM layers also unsupported on these platforms?