Project Feasibility: Multimodal Medical Robot on STM32H723ZG (Audio/Vision AI + USB Host)
Hello ST Community,
I am working on my final-year project, an autonomous "Multimodal Medical Assistant Robot" based on the NUCLEO-H723ZG (Cortex-M7 @ 550 MHz), and I would appreciate your expert opinion on its feasibility and technical architecture.
Key Features:
100% Offline AI (Edge AI):
Audio: Keyword Spotting (7 commands) from an INMP441 MEMS microphone (I2S), with MFCC feature extraction done in CMSIS-DSP.
Vision: fall detection from an OV2640 camera (DCMI), running a lightweight MobileNet model.
Software Stack: X-CUBE-AI, TensorFlow Lite Micro, FatFS.
Innovation (Dynamic Loading): We plan to load the AI models (.tflite) and voice responses (.wav) dynamically from a USB Flash Drive (USB Host MSC) into RAM at boot time.
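For context, the boot-time loading step we have in mind looks roughly like the sketch below: mount the MSC volume with FatFS and read the whole .tflite blob into a buffer in AXI SRAM. This is only a sketch under assumptions — the drive string "0:", the `.axisram` section name, the 128 KB cap, and the file path are placeholders that depend on our USB Host/linker setup:

```c
#include "ff.h"      /* FatFS */
#include <stdint.h>

static FATFS fs;

/* Model buffer placed in AXI SRAM (section name from our linker script),
 * 32-byte aligned to match the Cortex-M7 D-cache line. */
__attribute__((aligned(32), section(".axisram")))
static uint8_t model_buf[128 * 1024];

/* Read an entire file from the USB MSC volume into model_buf.
 * Returns 0 on success and writes the byte count to *out_len. */
int load_model(const char *path, uint32_t *out_len)
{
    FIL file;
    UINT br;

    if (f_mount(&fs, "0:", 1) != FR_OK) return -1;       /* mount USB drive */
    if (f_open(&file, path, FA_READ) != FR_OK) return -2;

    FSIZE_t sz = f_size(&file);
    if (sz > sizeof(model_buf)) { f_close(&file); return -3; }

    FRESULT res = f_read(&file, model_buf, (UINT)sz, &br);
    f_close(&file);
    if (res != FR_OK || br != (UINT)sz) return -4;

    *out_len = (uint32_t)br;
    return 0;
}
```

The same pattern would serve the .wav responses, just with a separate (or streamed) buffer.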
Specific Questions:
Is the NUCLEO-H723ZG powerful enough to run both Audio and Vision inference concurrently while managing motor control (PWM) and USB Host?
Regarding the USB Host / RAM loading: Does X-CUBE-AI support "relocatable weights" loaded into RAM from an external storage device via FatFS? Any specific memory alignment tips for the H7 AXI SRAM?
Any advice on DMA priorities between DCMI (Vision) and I2S (Audio) to avoid frame or sample loss?
Thank you for your help!
