Hello Lisa,
Thanks for providing the project: with what you sent, we could easily reproduce the issue.
Using what has been provided, the main issue is that the weights of the network are not initialized (I think we do not have access to the weights initializer file).
In your case, you requested that the weights be stored in internal RAM (684.601 kB, stored in npuRAM4 and npuRAM5). Since this memory is initialized neither by the code nor by an external debugger access, the weights used are "the random values sitting there when the RAM was powered up", which leads to seemingly random outputs at each restart.
When you run a "validation on target" using the tools, the debugger actually loads the weights onto the board after the RAMs are powered up, which ensures proper outputs.
For your example, you could either store the weights in flash (and fetch them from flash at inference time) or copy them from flash to the internal RAMs before running inference.
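To illustrate the second option, here is a minimal sketch of a copy step, not a definitive implementation. The helper name `copy_weights_to_ram` and the `WEIGHTS_ON_TARGET` switch are hypothetical, and the flash address and size of your weights blob are assumptions that depend on how your project lays out the weights in flash. On target, the cache maintenance call is the same CMSIS function used elsewhere in this reply; on a host build it compiles to nothing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical switch: define WEIGHTS_ON_TARGET in the firmware build and
 * include your CMSIS device header before this code, so that
 * SCB_CleanInvalidateDCache_by_Addr is available. Without it, the cache
 * step is a no-op, which lets the copy logic be checked on a host PC. */
#ifdef WEIGHTS_ON_TARGET
#define WEIGHTS_CACHE_FLUSH(addr, size) \
    SCB_CleanInvalidateDCache_by_Addr((void *)(addr), (int32_t)(size))
#else
#define WEIGHTS_CACHE_FLUSH(addr, size) ((void)(addr), (void)(size))
#endif

/* Hypothetical helper: copy a weights blob from flash into an internal RAM
 * region, then push the data out of the D-cache so it reaches physical memory.
 * Returns 0 on success, -1 if an argument is invalid or the blob does not fit. */
static int copy_weights_to_ram(void *ram_dst, size_t ram_size,
                               const void *flash_src, size_t weights_size)
{
    if (ram_dst == NULL || flash_src == NULL || weights_size > ram_size) {
        return -1;
    }
    memcpy(ram_dst, flash_src, weights_size);
    WEIGHTS_CACHE_FLUSH(ram_dst, weights_size);
    return 0;
}
```

On target, this would be called once per RAM region before the first inference, for example with `(void *)0x34270000` and `448*1024` as the destination for npuRAM4. How the weights are split between npuRAM4 and npuRAM5 has to follow the memory layout generated for your network, so the source pointers and sizes are left for you to fill in.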
Anyway, for testing purposes in your project, you can force all the weights to 0 by adding the following extra initialization step before your `for` loop that sets the inputs:
// memset requires #include <string.h> at the top of the file
memset((void *)0x34270000, 0, 448*1024); // Set npuRAM4 contents to 0
memset((void *)0x342E0000, 0, 448*1024); // Set npuRAM5 contents to 0
SCB_CleanInvalidateDCache_by_Addr((void *)0x34270000, 448*1024); // Ensure the zeros are all written to physical memory
SCB_CleanInvalidateDCache_by_Addr((void *)0x342E0000, 448*1024); // Ensure the zeros are all written to physical memory
You will see that the outputs are now the same at every reset.
Keep us informed if there is still an issue.
Best regards.