SWenn.1
Senior III
September 27, 2024
Question

low latency streaming audio over IP --- NUCLEO-H723ZG


I am looking to stream audio over an ethernet connection to a cellular hot spot.  This will be low latency.  Can someone tell me what STM32 firmware library I should use and if this is an appropriate hardware for that task?

 

Thanks

6 replies

Andrew Neil
Super User
September 28, 2024


Define "audio": how many channels? bit rate? bit depth?

Define "low latency".

When you say "ethernet", is that purely at the physical layer, or do you also need to be IP-based?

Especially with IP, it is a non-trivial exercise - doubly so with wireless links.

https://en.wikipedia.org/wiki/Audio_over_Ethernet

 

A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work.
SWenn.1 (Author)
Senior III
September 28, 2024

Hello Andrew.

Great questions!

This will be up to about 10 kHz to 12 kHz.

Low latency will be sub second.

IP-based.

Here is my vision right now (please feel free to jump in):

I have a NUCLEO-H723ZG on the way. I am looking at integrating the X-CUBE-OPUS and X-CUBE-ETHERNET libraries. I 'hope' I can take A/D samples, turn them into PCM for the Opus encoder, and send the result to the Ethernet library. Initially I want to get the pieces in place and do a proof of concept. For the final solution I am still on the fence as to whether to use UDP/multicast or TCP to a hosted website. I guess it comes down to reliability vs latency.

Thoughts?

 

Steve
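
For the UDP option, a receiver usually needs some way to detect lost or reordered packets, since UDP itself gives no such guarantee. Below is a minimal sketch of one way to frame an encoded audio block before handing it to the network stack. The 8-byte header layout, the name `audio_packet_build`, and its parameters are hypothetical illustrations, not part of X-CUBE-OPUS or any ST library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header: 4-byte sequence number + 4-byte sample timestamp. */
#define AUDIO_HDR_LEN 8u

/* Write a 32-bit value in network (big-endian) byte order. */
static void put_be32(uint8_t *dst, uint32_t v)
{
    dst[0] = (uint8_t)(v >> 24);
    dst[1] = (uint8_t)(v >> 16);
    dst[2] = (uint8_t)(v >> 8);
    dst[3] = (uint8_t)v;
}

/*
 * Build one datagram laid out as [seq][timestamp][encoded frame].
 * Returns the total length, or 0 if the output buffer is too small.
 */
size_t audio_packet_build(uint8_t *out, size_t out_cap,
                          uint32_t seq, uint32_t timestamp,
                          const uint8_t *frame, size_t frame_len)
{
    assert(out != NULL && frame != NULL);
    if (out_cap < AUDIO_HDR_LEN || frame_len > out_cap - AUDIO_HDR_LEN) {
        return 0;
    }
    put_be32(out, seq);
    put_be32(out + 4, timestamp);
    memcpy(out + AUDIO_HDR_LEN, frame, frame_len);
    return AUDIO_HDR_LEN + frame_len;
}
```

The sequence number lets the receiver spot gaps and reordering, and the timestamp lets it schedule playback. With TCP neither field is strictly needed, at the cost of retransmission delays when packets are lost.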

 

Andrew Neil
Super User
September 30, 2024

@SWenn.1 wrote:

This will be up to about 10 kHz to 12 kHz.


Is that your sample rate?

What bit-depth?

Mono or stereo?

 


@SWenn.1 wrote:

Low latency will be sub second.


eg, 950ms would be OK ?

Another issue to consider on IP networks is the allowable "jitter" on that ...

LCE
Principal II
September 30, 2024

I'm doing that with the H723 / H733 / H735.

8 audio channels (I2S from the 2 SAIs) at 32 bit / 200 kHz (51.2 Mbit/s) via DMA to internal SRAM buffers to HyperRAM via DMA to ETH via TCP to PC host.

I use lwIP (some minor parts tuned a little) and self-made ETH-driver, no OS.

CPU at 400 MHz, no audio processing, CPU sleeping about 50%.

Works like a charm.

If you have a client that's taking data quickly and reliably (so just do not plug into "a network" but directly to host) you probably don't need extra RAM, and latency will be about 2 packet times (guessing: < 5 ms).
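
The figures above can be checked with a little arithmetic. The sketch below computes the raw PCM bit rate and the time the stream needs to fill one packet payload. The function names and the 1460-byte payload (a typical TCP payload on a 1500-byte MTU link) are assumptions for illustration, not values taken from the setup described here.

```c
#include <assert.h>
#include <stdint.h>

/* Raw PCM bit rate in bit/s. */
uint64_t pcm_bit_rate(uint32_t channels, uint32_t bits_per_sample,
                      uint32_t sample_rate_hz)
{
    return (uint64_t)channels * bits_per_sample * sample_rate_hz;
}

/* Time in microseconds (rounded down) for the stream to produce
   payload_bytes bytes at the given bit rate. */
uint32_t packet_fill_time_us(uint32_t payload_bytes, uint64_t bit_rate)
{
    assert(bit_rate > 0);
    return (uint32_t)(((uint64_t)payload_bytes * 8u * 1000000u) / bit_rate);
}
```

For 8 channels at 32 bit / 200 kHz, `pcm_bit_rate(8, 32, 200000)` gives 51,200,000 bit/s, matching the 51.2 Mbit/s quoted above, and a 1460-byte payload fills in about 228 µs. For a hypothetical single 16-bit channel at 48 kHz the rate drops to 768,000 bit/s and the same payload takes roughly 15.2 ms to fill, so at low bit rates the packet size chosen has a large effect on latency.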

So how many audio channels do you have, just this one?
Anyway, at that sampling rate it shouldn't be a problem - BUT it will be lots of work.

Spoiler: CubeMX, and even many HAL functions will not do all the work for you.
In all of my "signal path" relevant code there's only HAL_GetTick() left from HAL. :D

SWenn.1 (Author)
Senior III
September 30, 2024

Just one audio channel. I too was hoping to avoid an OS. That is really low latency! Can you explain this part:

"(so just do not plug into "a network" but directly to host)"?

My vision would be to send to an IP address at, say, AWS, and users could stream from that...

LCE
Principal II
September 30, 2024

My vision would be to send to an IP address at, say, AWS, and users could stream from that...

Live streaming to or from AWS? Anyway, I have no idea about that.

Sometimes Windows 10 alone has other stuff in mind, and the more nodes in the network, the more your latency goes up - and more than the inbuilt RAM might be needed (prolly not with 1 channel at 16b/48k).

So, my short latency is in a super small network: STM32 - PoE switch - PC.
I have seen latency go up in our company office network, depending on the traffic.
I have tweaked lwIP PCBs by adding DiffServ / QoS priorities, but I have no idea if that actually helped.
There's some more testing to do on my side...
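
For reference, a DiffServ priority is carried in the upper six bits of the IP header's TOS / Traffic Class byte, with the lower two bits used for ECN. The sketch below shows that mapping. The helper name `dscp_to_tos` is hypothetical. In lwIP the resulting byte would, as far as I know, be written to the PCB's `tos` field. Whether switches and routers along the path honor it depends on their configuration.

```c
#include <assert.h>
#include <stdint.h>

#define DSCP_EF   46u  /* Expedited Forwarding, commonly used for voice */
#define DSCP_AF41 34u  /* Assured Forwarding class 4, low drop precedence */

/* Combine a 6-bit DSCP code point and a 2-bit ECN field into the
   8-bit IPv4 TOS byte. */
uint8_t dscp_to_tos(uint8_t dscp, uint8_t ecn)
{
    assert(dscp < 64u && ecn < 4u);
    return (uint8_t)((dscp << 2) | ecn);
}
```

For example, `dscp_to_tos(DSCP_EF, 0)` yields 0xB8, which is the TOS value often seen on VoIP traffic.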

 

LCE
Principal II
October 1, 2024

Sorry, I don't.

SWenn.1 (Author)
Senior III
October 3, 2024

@LCE ...

Firstly, I apologize for the length here. In short, I am just trying to get a ping TX and RX working from a Nucleo. Most of this is sharing what I have done and set up, hoping you can point out something I may have missed.

I received my nucleo board NUCLEO-H723ZG.  I followed the following video instructions:
https://www.youtube.com/watch?v=8r8w6mgSn1A&list=PLfIJKC1ud8ggZKVtytWAlOS63vifF5iJC&index=1 

I followed them as best I could, given that I use RMII rather than his MII and that the sizes differ because the H723 is smaller. The code compiled and I do not get fault errors when running. However, I constantly get Destination Host Unreachable.

My CubeMX ethernet settings are:

 

SWenn1_1-1727985289250.png

LWIP settings:

SWenn1_2-1727985487034.png

SWenn1_4-1727985677325.png

The sizes keep everything within SRAM1 per the datasheet (0x30000000 - 0x30003FFF)....

I altered the xxx_FLASH.ld file with:

 .lwip_sec (NOLOAD) :
 {
 . = ABSOLUTE(0x30000000);
 *(.RxDecripSection)

 . = ABSOLUTE(0x30000060);
 *(.TxDecripSection)

 . = ABSOLUTE(0x300000C0);
 *(.RxArraySection)
 } >RAM_D2

and finally the ethernetif.c file to :

ETH_DMADescTypeDef DMARxDscrTab[ETH_RX_DESC_CNT] __attribute__((section(".RxDecripSection"))); /* Ethernet Rx DMA Descriptors */
ETH_DMADescTypeDef DMATxDscrTab[ETH_TX_DESC_CNT] __attribute__((section(".TxDecripSection"))); /* Ethernet Tx DMA Descriptors */

#endif

#if defined ( __ICCARM__ ) /*!< IAR Compiler */
#pragma location = 0x30000100
extern u8_t memp_memory_RX_POOL_base[];

#elif defined ( __CC_ARM ) /* MDK ARM Compiler */
__attribute__((section(".Rx_PoolSection"))) extern u8_t memp_memory_RX_POOL_base[];

#elif defined ( __GNUC__ ) /* GNU */
__attribute__((section(".Rx_PoolSection"))) extern u8_t memp_memory_RX_POOL_base[];
#endif

/* USER CODE BEGIN 2 */
RxBuff_t rxBuff[ETH_RX_BUFFER_SIZE] __attribute__((section(".RxArraySection")));
/* USER CODE END 2 */

so that the sections exist.
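
A quick way to sanity-check a layout like this is to confirm on paper that each region fits inside the SRAM1 window quoted above and that no two regions overlap. The sketch below does that check. The function names are hypothetical, the address bounds are the ones stated in this post, and the region sizes passed in would have to come from the actual `sizeof` of each array in the build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SRAM1 window as quoted in the post (first and last valid byte). */
#define SRAM1_FIRST_BYTE 0x30000000u
#define SRAM1_LAST_BYTE  0x30003FFFu

/* True if [start, start + size) lies entirely inside the SRAM1 window. */
bool region_in_sram1(uint32_t start, uint32_t size)
{
    assert(size > 0u);
    if (start < SRAM1_FIRST_BYTE || start > SRAM1_LAST_BYTE) {
        return false;
    }
    return (size - 1u) <= (SRAM1_LAST_BYTE - start);
}

/* True if the two half-open regions share at least one byte.
   64-bit arithmetic avoids overflow near the top of the address space. */
bool regions_overlap(uint32_t a_start, uint32_t a_size,
                     uint32_t b_start, uint32_t b_size)
{
    uint64_t a_end = (uint64_t)a_start + a_size;
    uint64_t b_end = (uint64_t)b_start + b_size;
    return (a_start < b_end) && (b_start < a_end);
}
```

With the offsets from the linker fragment, a 0x60-byte region at 0x30000000 and another at 0x30000060 pass both checks, while anything placed at 0x300000C0 has at most 0x3F40 bytes left before the end of the window. An address in the 0x24000000 range fails the first check, which matches the symptom reported by the Build Analyzer.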

When I run, I don't crash or hang, but I notice that in the Build Analyzer I do NOT see .RxArraySection. And here is the confusing part: RAM_D1 seems to show .RxDecripSection, etc. in the 0x24000000 address space instead of RAM_D2. From your experience, can you shed any light on this?

SWenn1_5-1727986236627.png

Thanks

Steve

 

 

LCE
Principal II
October 4, 2024

Just seeing this "RxDecripSection" from Cube gives me the creeps... :D But I'm a little oversensitive about things like that.

Anyway,

0) RMII should be no problem, setup is handled well by Cube.

 

1) It's important that the descriptors are in SRAM1, the TX / RX buffers can also be in AXI SRAM - where you actually placed the lwip pool (used for lwip packet buffers "pbuf"), which is okay.

Check RM0468, 2 Memory and bus architecture, it starts with a table which bus master (incl. ETH DMA) can access which RAM or peripheral.

2) rxBuff: the declaration looks good, so is that actually used somewhere? Should be... but I don't use the ETH HAL driver.

I would debug with a breakpoint somewhere in "ethernetif.c" where ethernet data is received and given to lwip.

 

As I said, it will be lots of work, with lots of frustration - but the H7 can do it!
