I have a question about the CMSIS DSP library source (arm_mult_q31.c)
Hi
I found something strange in the CMSIS DSP source.
I am using an STM32F429 (little endian) with CMSIS DSP library V1.5.1.
In the CMSIS DSP library (arm_mult_q31.c):
void arm_mult_q31(q31_t * pSrcA, q31_t * pSrcB, q31_t * pDst, uint32_t blockSize)
{
  ...
  while (blkCnt > 0U)
  {
    /* C = A * B */
    /* Multiply the inputs and then store the results in the destination buffer. */
    inA1 = *pSrcA++;
    inA2 = *pSrcA++;
    inA3 = *pSrcA++;
    inA4 = *pSrcA++;
    inB1 = *pSrcB++;
    inB2 = *pSrcB++;
    inB3 = *pSrcB++;
    inB4 = *pSrcB++;
    out1 = ((q63_t)inA1 * inB1) >> 32;
    out2 = ((q63_t)inA2 * inB2) >> 32;
    out3 = ((q63_t)inA3 * inB3) >> 32;
    out4 = ((q63_t)inA4 * inB4) >> 32;
    out1 = __SSAT(out1, 31);
    out2 = __SSAT(out2, 31);
    out3 = __SSAT(out3, 31);
    out4 = __SSAT(out4, 31);
    *pDst++ = out1 << 1U;
    *pDst++ = out2 << 1U;
    *pDst++ = out3 << 1U;
    *pDst++ = out4 << 1U;
    /* Decrement the blockSize loop counter */
    blkCnt--;
  }
  ...
}
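To make the arithmetic of one loop iteration easier to follow, here is a minimal host-side sketch of a single element. It is not the library code: `ssat` is a hypothetical portable stand-in for the `__SSAT` intrinsic, `mult_q31_one` is a name I made up, and I am assuming that `>>` on a negative 64-bit value is an arithmetic shift (true for GCC, Clang and Arm Compiler, but implementation-defined in C).

```c
#include <stdint.h>

typedef int32_t q31_t;
typedef int64_t q63_t;

/* Hypothetical stand-in for __SSAT: saturate x to a signed value
   that fits in `bits` bits (valid for bits in 1..32). */
static int32_t ssat(int32_t x, unsigned bits)
{
    const int32_t max = (int32_t)((1U << (bits - 1U)) - 1U);
    const int32_t min = -max - 1;
    if (x > max) return max;
    if (x < min) return min;
    return x;
}

/* One element of the unrolled loop: Q31 * Q31 -> Q31. */
static q31_t mult_q31_one(q31_t a, q31_t b)
{
    /* The 64-bit product is in Q62; keep its upper 32 bits.
       Assumes arithmetic right shift for negative values. */
    q31_t out = (q31_t)(((q63_t)a * b) >> 32);

    /* Saturate to 31 bits so that doubling cannot overflow. */
    out = ssat(out, 31);

    /* Same effect as `out << 1U`, written as a multiplication so
       that negative values are well-defined in portable C. */
    return out * 2;
}
```

With this sketch, 0.5 * 0.5 (`0x40000000` times `0x40000000`) gives `0x20000000`, which is 0.25 in Q31, and -1.0 * -1.0 (`INT32_MIN` times `INT32_MIN`) saturates to `0x7FFFFFFE`.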
I think that for little endian it should be modified as follows. Is my reasoning wrong?
Little-endian case:
//out1 = ((q63_t)inA1 * inB1) >> 32;
//out2 = ((q63_t)inA2 * inB2) >> 32;
//out3 = ((q63_t)inA3 * inB3) >> 32;
//out4 = ((q63_t)inA4 * inB4) >> 32;
out1 = ((q63_t)inA1 * inB1);
out2 = ((q63_t)inA2 * inB2);
out3 = ((q63_t)inA3 * inB3);
out4 = ((q63_t)inA4 * inB4);
out1 = __SSAT(out1, 31);
out2 = __SSAT(out2, 31);
out3 = __SSAT(out3, 31);
out4 = __SSAT(out4, 31);
//*pDst++ = out1 << 1U;
//*pDst++ = out2 << 1U;
//*pDst++ = out3 << 1U;
//*pDst++ = out4 << 1U;
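One way to check this assumption on a host PC is the sketch below. All function names are hypothetical and only for illustration. It relies on one fact about C: `>>` operates on the value of the 64-bit integer, not on the order of its bytes in memory, so `v >> 32` selects the upper word regardless of endianness. Byte order only matters when the word is read through memory, as `high_word_by_memory` does. The sketch also assumes arithmetic right shift and two's-complement wrap-around on narrowing conversions, which holds for GCC and Clang.

```c
#include <stdint.h>
#include <string.h>

/* Saturate to a signed 31-bit range (stand-in for __SSAT(x, 31)). */
static int32_t ssat31(int32_t x)
{
    if (x > 0x3FFFFFFF) return 0x3FFFFFFF;
    if (x < -0x40000000) return -0x40000000;
    return x;
}

/* Upper 32 bits taken by value: independent of byte order. */
static int32_t high_word_by_shift(int64_t v)
{
    return (int32_t)(v >> 32);
}

/* Returns 1 if the host stores the least significant byte first. */
static int host_is_little_endian(void)
{
    uint32_t probe = 1U;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1U;
}

/* Upper 32 bits taken from memory: here the byte order decides
   which of the two words is the upper one. */
static int32_t high_word_by_memory(int64_t v)
{
    int32_t words[2];
    memcpy(words, &v, sizeof v);
    return host_is_little_endian() ? words[1] : words[0];
}

/* The code as it is in the library: shift, saturate, double. */
static int32_t library_variant(int32_t a, int32_t b)
{
    return ssat31(high_word_by_shift((int64_t)a * b)) * 2;
}

/* The proposed change: no shifts. Assigning the 64-bit product to
   a 32-bit variable keeps only its lower 32 bits. */
static int32_t proposed_variant(int32_t a, int32_t b)
{
    int32_t low = (int32_t)(uint32_t)((int64_t)a * b);
    return ssat31(low);
}
```

For 0.5 * 0.5 (`0x40000000` times `0x40000000`) the product is 2^60, so `library_variant` returns `0x20000000` (0.25 in Q31) while `proposed_variant` returns 0, because the lower 32 bits of the product are all zero.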
Thanks in advance!
