I am using custom hardware based on the STM32H750 to interface with an external UART device at 38400 baud (see STM32_block.png). UART2 is configured in asynchronous mode.
It sends the following 31 bytes of data to the external device:
ff 03 fe f0 00 1e 80 03 fe 00
16 00 12 42 50 00 10 01 00 00
00 00 00 00 00 00 20 01 00 83
89
UART2 then expects a 32-byte response from the external device:
ff 03 fe f0 00 1f 80 03 fe 00
17 00 13 c2 54 20 01 00 00 00
00 00 00 00 00 00 10 01 00 00
d1 25
The software is based on Zephyr OS.
Below, I use the async APIs uart_rx_enable and uart_tx for data transmission and reception, along with a semaphore to implement the timeout.
void port1_uart_callback(const struct device *dev, struct uart_event *evt, void *user_data)
{
    switch (evt->type) {
    case UART_TX_DONE:
        //LOG_INF("Handle UART_TX_DONE: %d\n", evt->type);
        k_sem_give(&group_ports[GROUP_1_PORT].data_sem);
        break;
    case UART_TX_ABORTED:
        //LOG_INF("Handle UART_TX_ABORT: %d\n", evt->type);
        break;
    case UART_RX_RDY:
        //LOG_INF("Handle UART_RX_RDY: %d\n", evt->type);
        group_ports[GROUP_1_PORT].data.rcvPacket.len += evt->data.rx.len;
        response_received = true; // Flag response as received
        k_sem_give(&group_ports[GROUP_1_PORT].rcv_data_sem);
        break;
    default:
        LOG_WRN("Unhandled UART event: %d\n", evt->type);
        break;
    }
}
int port1_serial_process(void)
{
    uint32_t events;
    int status;
    // ----- Initial timer for USART performance measurement
    timing_t start_time, end_time; // Variables to store timing information
    uint64_t total_cycles;
    uint64_t total_ns;

    timing_init();
    timing_start();
    // timing_freq_get_mhz() returns a uint32_t value in MHz
    LOG_INF("Timing Frequency: %u MHz\n", timing_freq_get_mhz());
    LOG_INF("Grp port 1 running!");
    //----- End of timer for USART performance measurement

    // ----- Get system clock setting
    uint32_t sys_clk, hclk, pclk1, pclk2;
    // Retrieve clock frequencies using STM32 HAL APIs
    sys_clk = HAL_RCC_GetSysClockFreq(); // System clock frequency
    hclk = HAL_RCC_GetHCLKFreq(); // AHB clock frequency
    pclk1 = HAL_RCC_GetPCLK1Freq(); // APB1 clock frequency
    pclk2 = HAL_RCC_GetPCLK2Freq(); // APB2 clock frequency
    // Print the clock frequencies
    printk("System Clock Frequency: %u Hz\n", sys_clk);
    printk("AHB Clock Frequency: %u Hz\n", hclk);
    printk("APB1 Clock Frequency: %u Hz\n", pclk1);
    printk("APB2 Clock Frequency: %u Hz\n", pclk2);
    // ----- end Get system clock setting

    LOG_INF("Grp port 1 running!");
    uart_callback_set(group_ports[GROUP_1_PORT].uart, port1_uart_callback, NULL);
    k_event_init(&group_ports[GROUP_1_PORT].events);

    while (true) {
        /* Wait for events on this port's event object */
        events = k_event_wait(&group_ports[GROUP_1_PORT].events,
                              (uint32_t)(PORT_SYNC | PORT_SEND | PORT_SHUT_DOWN | PORT_RECEIVE |
                                         PORT_SEND_RECV | PORT_TIMEOUT | PORT_ALWAYS_RECV),
                              false,
                              K_FOREVER);
        LOG_INF("processGroup1PortEvent: Port %d, Events 0x%x\n", GROUP_1_PORT, events);
        if (events != 0U) {
            if (events & PORT_SEND_RECV) {
                start_time = timing_counter_get(); // Start timing
                // RX inactivity timeout is in microseconds; 520 us is about
                // two character times (20 bits) at 38400 baud
                if (uart_rx_enable(group_ports[GROUP_1_PORT].uart,
                                   (uint8_t *) &group_ports[GROUP_1_PORT].data.rcvPacket.pkt.buffer,
                                   PALCOMX_TRANSPORT_PACKET_MAX_SIZE, 520) != 0) {
                    LOG_ERR("Port 1 failed to enable UART RX\n");
                }
                // uart_tx() takes its timeout in microseconds, hence SYS_FOREVER_US
                status = uart_tx(group_ports[GROUP_1_PORT].uart,
                                 (uint8_t *) &group_ports[GROUP_1_PORT].data.txPacket.pkt.buffer,
                                 group_ports[GROUP_1_PORT].data.txPacket.len, SYS_FOREVER_US);
                if (status < 0) {
                    LOG_WRN("Failed to send data: %d\n", status);
                }
                // Wait for response or timeout
                status = k_sem_take(&group_ports[GROUP_1_PORT].rcv_data_sem, K_MSEC(50));
                // End timing on every path so the duration below is never
                // computed from an uninitialized end_time
                end_time = timing_counter_get();
                if (status == 0) {
                    if (response_received) {
                        LOG_INF("Resp rcv within %d ms\n", 50);
                        k_sem_give(&group_ports[GROUP_1_PORT].udp_resp_sem);
                        response_received = false; // Reset the flag for next use
                        /* response packet received */
                        LOG_INF("Port 1 recv bytes: %d\n", group_ports[GROUP_1_PORT].data.rcvPacket.len);
                    } else {
                        LOG_INF("UART Resp handling error\n");
                    }
                } else {
                    LOG_INF("UART Timeout waiting for resp\n");
                    // housekeeping
                    group_ports[GROUP_1_PORT].data.rcvPacket.len = 0;
                }
                // Disable UART RX after operation
                uart_rx_disable(group_ports[GROUP_1_PORT].uart);
                // Clear the event flag
                k_event_clear(&group_ports[GROUP_1_PORT].events, PORT_SEND_RECV);
                total_cycles = timing_cycles_get(&start_time, &end_time);
                total_ns = timing_cycles_to_ns(total_cycles);
                LOG_INF("Serial send-receive duration: %llu ns", total_ns);
            }
        } // end of if (events != 0U) condition
    } // end of while (true) loop
}
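A note on the 520 passed to uart_rx_enable: that argument is an inactivity timeout in microseconds. One character takes about 260 us at 38400 baud if the frame is 8N1 (10 bits per character, which is an assumption; I have not stated the frame format above), so 520 us is roughly two character times. A minimal sketch of deriving the value instead of hard-coding it follows. `rx_idle_timeout_us` is a hypothetical helper, not a Zephyr API.

```c
#include <stdint.h>

/* Hypothetical helper (not part of Zephyr): inactivity timeout in
 * microseconds for n_chars character times, rounded up.
 * bits_per_char counts start + data + parity + stop bits (10 for 8N1). */
static inline int32_t rx_idle_timeout_us(uint32_t baud, uint32_t bits_per_char,
                                         uint32_t n_chars)
{
    uint64_t bits = (uint64_t)bits_per_char * n_chars;

    /* Ceiling division so the timeout is never shorter than requested. */
    return (int32_t)((bits * 1000000ULL + baud - 1U) / baud);
}
```

With 38400 baud and 10 bits per character, two character times come out at 521 us, which agrees with the 520 used in the code.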
The following is a log extract from the code above.
System Clock Frequency: 400000000 Hz
AHB Clock Frequency: 200000000 Hz
APB1 Clock Frequency: 100000000 Hz
APB2 Clock Frequency: 100000000 Hz
<inf> com_test: processGroup1PortEvent: Port 0, Events 0x10
<inf> com_test: Resp rcv within 50 ms
<inf> com_test: Port 1 recv bytes: 32
<inf> com_test: Serial send-receive duration: 31375580 ns
From the STM32's perspective, sending 31 bytes and then receiving 32 bytes of data at 38400 baud took around 31 ms. The measured duration (31375580 ns in this run) fluctuates between 29 ms and 39 ms.
I used an oscilloscope to measure the actual time between packet transmission and reception (see UART2_txrx.jpg). CH1 is the data coming out of the STM32, and CH2 is the data from the UART device. The entire transfer took around 16.5 ms.
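As a sanity check on the oscilloscope figure, the expected time on the wire can be worked out from the byte counts and the baud rate. This is a minimal sketch assuming 10 bits per character (8N1) and back-to-back characters; the function names are hypothetical, and the 360 us term is the device response time I measured separately (UART_dev_resp_time.jpg).

```c
#include <stdint.h>

/* Time on the wire, in microseconds, for n_bytes back-to-back characters.
 * Assumes 10 bits per character (8N1); the result is truncated. */
static inline uint32_t uart_wire_time_us(uint32_t n_bytes, uint32_t baud)
{
    return (uint32_t)(((uint64_t)n_bytes * 10U * 1000000ULL) / baud);
}

/* Expected request + device turnaround + response time for this exchange. */
static inline uint32_t expected_exchange_us(void)
{
    return uart_wire_time_us(31U, 38400U)   /* request:  8072 us */
         + 360U                             /* device turnaround from the scope */
         + uart_wire_time_us(32U, 38400U);  /* response: 8333 us */
}
```

This gives 16765 us, or about 16.8 ms, which is close to the 16.5 ms read from the scope.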
Comparing the two measurements, the duration seen by the software is about 12 to 22 ms longer than the transfer time on the wire. That is quite a long time.
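To narrow down where the extra time is spent, the single start/end measurement could be split into stages. This is a minimal sketch, assuming timestamps are also captured inside the UART callback and converted to a common unit; `xfer_stamps`, `xfer_breakdown` and `xfer_split` are hypothetical names, not part of my code or of Zephyr.

```c
#include <stdint.h>

/* Hypothetical per-transfer timestamps, all in the same unit (for example
 * nanoseconds converted from timing_counter_get() readings). */
struct xfer_stamps {
    uint64_t start;       /* just before uart_rx_enable() and uart_tx() */
    uint64_t tx_done;     /* taken in the callback on UART_TX_DONE */
    uint64_t rx_rdy;      /* taken in the callback on UART_RX_RDY */
    uint64_t thread_wake; /* taken right after k_sem_take() returns */
};

struct xfer_breakdown {
    uint64_t tx_phase;     /* start -> TX done event */
    uint64_t rx_phase;     /* TX done event -> RX ready event */
    uint64_t wake_latency; /* RX ready event -> thread running again */
};

/* Split one transfer into its three stages. */
static inline struct xfer_breakdown xfer_split(const struct xfer_stamps *s)
{
    struct xfer_breakdown b = {
        .tx_phase = s->tx_done - s->start,
        .rx_phase = s->rx_rdy - s->tx_done,
        .wake_latency = s->thread_wake - s->rx_rdy,
    };

    return b;
}
```

Comparing each stage against the oscilloscope trace would show whether the extra time sits before the TX done event, before the RX ready event, or between the callback and the thread waking up.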
For comparison, the UART device itself (see UART_dev_resp_time.jpg) took only around 360 us to process the incoming data and start sending its response. (The device was not doing a straight echo.)
I have seen this explanation regarding the bus architecture:
https://community.st.com/t5/stm32-mcus-products/stm32h7-gpio-togle-max-frequency/m-p/336687
I understand that some latency is expected. Have I reached the peak performance for this processor (I know I can try clocking the CPU faster), or am I missing something that could yield better results?