Setting the board in debug mode to profile the NPU on the NXP i.MX8M Plus board
Getting started with AI on NXP i.MX8M Plus RidgeRun documentation is currently under development.
i.MX8M Plus debug mode
Before starting an application that you know uses the NPU, you can set a few flags to put the board in verbose mode. The NPU driver then prints extra output that tells you how long a given model operation takes, or when execution falls back to the CPU because of an incompatible model operation.
To do this, export the following flags:
export CNN_PERF=1 NN_EXT_SHOW_PERF=1 VIV_VX_DEBUG_LEVEL=1 VIV_VX_PROFILE=1
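If you would rather not leave these flags set for the whole shell session, one option is a small wrapper that applies them to a single command only. This is a minimal sketch of a possible workflow, not something the driver provides: the function name `run_with_npu_debug` is hypothetical.

```shell
# Hypothetical helper: run one command with the NPU debug flags set,
# without exporting them into the current shell session.
run_with_npu_debug() {
    env CNN_PERF=1 NN_EXT_SHOW_PERF=1 VIV_VX_DEBUG_LEVEL=1 VIV_VX_PROFILE=1 "$@"
}
```

For example, `run_with_npu_debug ./my_npu_app 2>&1 | tee npu_profile.log` would run a placeholder application `./my_npu_app` and keep a copy of the output in `npu_profile.log` (both names are assumptions). Both output streams are redirected because this sketch does not assume which one the driver writes to.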
With the flags exported, run your accelerated application. Depending on the model, output like the following is shown:
...
layer id: 12 layer name:TensorAdd operation[0]:VXNNE_OPERATOR_TENSOR_ADD target:VXNNE_OPERATION_TARGET_SH.
uid: 4
op_abs_id: 10
layer id: 15 layer name:TensorCopy operation[0]:unkown operation type target:VXNNE_OPERATION_TARGET_SH.
uid: 15
op_abs_id: 19
shader kernel name: tensorCopy_F16toF32_2D
execution time: 120 us
prev_ptrs = 0xaaaafdaf9a40
prev_ptrs = 0xaaab003adc80
prev_ptrs = 0xaaab001cae00
prev_ptrs = 0xaaab001cee00
prev_ptrs = 0xaaab001d0e40
prev_ptrs = 0xaaab001d2e80
prev_ptrs = 0xaaab001d4280
prev_ptrs = 0xaaab001d5680
prev_ptrs = 0xaaab001d6480
Releasing object array 0xaaab001f3740
Releasing object array 0xaaab004a17c0
Releasing object array 0xaaab004b1890
Releasing object array 0xaaab004c1960
prev_ptrs = 0xaaab007f0400
prev_ptrs = 0xaaab007f3140
prev_ptrs = 0xaaab007f5f40
prev_ptrs = 0xaaab007f8d00
prev_ptrs = 0xaaab007fbac0
Releasing object array 0xaaab0080e3b0
Releasing object array 0xaaab0081f120
Exit VX Thread: 0x83695120
...
This output reports the execution time of individual operations, flags unknown operations that imply a CPU fallback, and lists the modified registers, giving you a deeper understanding of how your model performs on the NPU.
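When the log is long, it can help to reduce it to one line per timed operation. The following is a rough sketch that assumes the output was saved to a file and that each `execution time` entry belongs to the most recent `layer name` entry before it. That pairing is inferred from the excerpt above, not from driver documentation, and the function name `summarize_npu_times` is hypothetical.

```shell
# Hypothetical helper: print "<layer name> <time in us>" for every
# "execution time" entry found in a captured NPU debug log.
summarize_npu_times() {
    # Pull out only the two fields of interest, one match per line, so the
    # parsing does not depend on how the driver wraps its output.
    grep -oE 'layer name:[^ ]+|execution time: *[0-9]+ us' "$1" |
    awk '
        # Remember the most recent layer name.
        /^layer name:/     { sub(/^layer name:/, ""); layer = $0; next }
        # Attribute the time (second-to-last field) to that layer.
        /^execution time:/ { print layer, $(NF-1); total += $(NF-1) }
        END                { print "total", total + 0 }
    '
}
```

Calling `summarize_npu_times npu_profile.log` (the file name is an assumption) on a log containing the excerpt above would print `TensorCopy 120` followed by `total 120`. Layers without a reported execution time, such as `TensorAdd` in the excerpt, are left out.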