Optimizing instructions for using a very frequent ISR along with USB CDC
Hello everyone,
I have a complex problem which I would like help from the memory experts and instruction optimization, if possible.
I am trying to receive data using CDC_Receive_FS() call back, and, concurrently, process the data using an ISR which is running every 300 clock cycles (the chip has a 48MHz clock).
The issue arises when this ISR becomes too fast. Anything below ~350 clock cycles overwhelms the USB callback/USB in general, and the communication fails. It works well with 500 cycle ISR.
I have a few questions, firstly, is it that the processing in the ISR consumes too many of the 350 cycles, or is it that the FREQUENCY of the ISR is INTERRUPTING the USB too much?
Here is the full code, which has a circular buffer for the incoming data, and flow control. The flow control is a synchronous packet that gets sent every received packet (this might be what is causing the larger consumption of cycles in the USB side):
static int8_t CDC_Receive_FS(uint8_t* Buf, uint32_t *Len)
{
/* USER CODE BEGIN 6 */
USBD_CDC_SetRxBuffer(&hUsbDeviceFS, &Buf[0]);
uint8_t len = (uint8_t) *Len; // Get length
if(transmissionEnable == 1){
uint16_t tempHeadPos = rxBufferHeadPos; // Increment temp head pos while writing, then update main variable when complete
for (uint32_t i = 0; i < len; i++) {
rxBuffer[tempHeadPos] = Buf[i];
tempHeadPos = (uint16_t)((uint16_t)(tempHeadPos + 1) % HL_RX_BUFFER_SIZE);
if (tempHeadPos == rxBufferTailPos) {
return USBD_FAIL;
}
}
rxBufferHeadPos = tempHeadPos;
CDC_SendRxBufferStatus_FS();
}
else{
memcpy(bulk_packet, Buf, len);
}
USBD_CDC_ReceivePacket(&hUsbDeviceFS); //indicates that the device is ready to receive more packets
return (USBD_OK);
/* USER CODE END 6 */
}
uint8_t CDC_ReadRxBuffer_FS(uint8_t* Buf, uint16_t Len) {
uint16_t bytesAvailable = CDC_GetRxBufferBytesAvailable_FS();
if (bytesAvailable < Len)
return USB_CDC_RX_BUFFER_NO_DATA;
for (uint8_t i = 0; i < Len; i++) {
Buf[i] = rxBuffer[rxBufferTailPos];
rxBufferTailPos = (uint16_t)((uint16_t)(rxBufferTailPos + 1) % HL_RX_BUFFER_SIZE);
}
return USB_CDC_RX_BUFFER_OK;
}
uint16_t CDC_GetRxBufferBytesAvailable_FS() {
return (uint16_t)(rxBufferHeadPos - rxBufferTailPos) % HL_RX_BUFFER_SIZE;
}
void CDC_SendRxBufferStatus_FS(){
uint8_t packet[64];
uint16_t bytesAvailable;
// Header
packet[0] = 'B'; // Example header byte 1
packet[1] = 'S'; // Example header byte 2
// Container Size
bytesAvailable = CDC_GetRxBufferBytesAvailable_FS();
packet[2] = (uint8_t)(bytesAvailable >> 8); // High byte
packet[3] = (uint8_t)(bytesAvailable & 0xFF); // Low byte
// Padding
for (int i = 4; i < 64; i++) {
packet[i] = 0x00;
}
// Send Packet
CDC_Transmit_FS(packet, 64);
}
void ISR_Sampler_writting() {
static uint8_t bitIndex = 7; // Start from the MSB
static uint8_t currentByte = 0; // The current byte to be written to the pin
// Check if there is data available in the buffer
if (CDC_GetRxBufferBytesAvailable_FS() > 0) {
// If this is the first bit of the byte, read the next byte from the buffer
if (bitIndex == 7) {
CDC_ReadRxBuffer_FS(¤tByte, 1); // Assume function returns only 1 byte
}
// Write the current bit to the pin
if (currentByte & (1 << bitIndex)) {
//HAL_GPIO_WritePin(GPIOA, GDO0_Pin, GPIO_PIN_SET); // Set pin high
GPIOA->BSRR = (1 << 3); // Set pin 3 (GDO0_Pin) high
} else {
//HAL_GPIO_WritePin(GPIOA, GDO0_Pin, GPIO_PIN_RESET);
GPIOA->BSRR = (1 << (3 + 16));// Reset pin 3 (GDO0_Pin) low
}
// Decrement bit index and check if we have processed the whole byte
if (bitIndex == 0) {
bitIndex = 7; // Reset bit index back to the MSB for the next byte
} else {
bitIndex--;
}
}
}
void HAL_TIM_PeriodElapsedCallback(TIM_HandleTypeDef *htim){
if(htim == &htim2){
if(transmissionEnable == 1){
ISR_Sampler_writting();
}
else if(transmissionEnable == 2){
ISR_Sampler_raw();
}
}
}
static void MX_TIM2_Init(void)
{
/* USER CODE BEGIN TIM2_Init 0 */
/* USER CODE END TIM2_Init 0 */
TIM_ClockConfigTypeDef sClockSourceConfig = {0};
TIM_MasterConfigTypeDef sMasterConfig = {0};
/* USER CODE BEGIN TIM2_Init 1 */
/* USER CODE END TIM2_Init 1 */
htim2.Instance = TIM2;
htim2.Init.Prescaler = 0;
htim2.Init.CounterMode = TIM_COUNTERMODE_UP;
htim2.Init.Period = 350-1;
htim2.Init.ClockDivision = TIM_CLOCKDIVISION_DIV1;
htim2.Init.AutoReloadPreload = TIM_AUTORELOAD_PRELOAD_DISABLE;
if (HAL_TIM_Base_Init(&htim2) != HAL_OK)
{
Error_Handler();
}
sClockSourceConfig.ClockSource = TIM_CLOCKSOURCE_INTERNAL;
if (HAL_TIM_ConfigClockSource(&htim2, &sClockSourceConfig) != HAL_OK)
{
Error_Handler();
}
sMasterConfig.MasterOutputTrigger = TIM_TRGO_RESET;
sMasterConfig.MasterSlaveMode = TIM_MASTERSLAVEMODE_DISABLE;
if (HAL_TIMEx_MasterConfigSynchronization(&htim2, &sMasterConfig) != HAL_OK)
{
Error_Handler();
}
/* USER CODE BEGIN TIM2_Init 2 */
/* USER CODE END TIM2_Init 2 */
}I am not including the main(). Assume the TIM2 is running and with the callback ISR running. I am not including the code from the host, because it is an Android device, the code is in Java, and it's NOT relevant. Assume the host is putting out a bulkPacket of some bytes on every frame, according to the required flow. Assume the flow control works and the buffer never underflows or overflows. This has all been tested.
What I am asking for is help regarding optimizing this concurrence between the CDC callback when a new bulkTransfer packet is found, and the TIM2 processing happening fast, and them both fighting for the 350 cycles that are available between one ISR processing. Context about the processing being done: it's simply writing samples onto the GPIO 3. This whole project is about transfering signals as samples from an Android phone to a stm32 to be applied. So the PRIORITY of the ISR has to be ABOVE everything else, otherwise the signal is not procesed in time, and is CORRUPTED. I put the NVIC priority of the ISR above the USB priority.
Thank you and I await patiently for expertise regarding instruction optimization.
Luis
