Exploring TensorFlow Lite delegates for prototyping
🚧 Documentation under development
The Getting started with AI on NXP i.MX8M Plus guide is currently under active development. Some sections may be incomplete or change without notice.
Questions? Contact RidgeRun or email support@ridgerun.com.
NNAPI Delegate
This delegate lets us offload the inference to the NPU accelerator. Make sure your model is quantized to 8 or 16 bits; otherwise, NNAPI sends the unsupported operations back to the CPU (a CPU fallback), which degrades the overall inference time, as we will see in later sections. This step also requires TensorFlow Lite to have been built with NNAPI support, that is, with the -DTFLITE_ENABLE_NNAPI=on flag.
As we saw in the minimal TensorFlow Lite example in Cross-compiling apps for GStreamer, TensorFlow_Lite, and OpenCV, delegating comes down to adding the following lines before allocating the tensors:
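To make the 8-bit requirement more concrete, the following minimal sketch shows the affine mapping that TensorFlow Lite's 8-bit quantization scheme uses between real values and integers, real = scale * (q - zero_point). The `QuantParams` struct and the `QuantizeInt8` / `DequantizeInt8` helpers are hypothetical names used only for illustration; they are not part of the TensorFlow Lite API, and the sketch covers the int8 case only.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical container for the parameters of an affine quantization.
struct QuantParams {
  float scale;         // real-valued step between two consecutive integers
  int32_t zero_point;  // integer that represents the real value 0.0
};

// Maps a real value to int8: q = round(x / scale) + zero_point,
// saturated to the int8 range [-128, 127].
inline int8_t QuantizeInt8(float x, const QuantParams& p) {
  assert(p.scale > 0.0f);
  const long q = std::lround(x / p.scale) + p.zero_point;
  return static_cast<int8_t>(std::max<long>(-128, std::min<long>(127, q)));
}

// Recovers the approximate real value: x = scale * (q - zero_point).
inline float DequantizeInt8(int8_t q, const QuantParams& p) {
  return p.scale * static_cast<float>(static_cast<int32_t>(q) - p.zero_point);
}
```

For example, with a scale of 0.5 and a zero point of 10, the real value 1.0 maps to the integer 12, and values far outside the representable range saturate at -128 or 127. In a real model, the scale and zero point are stored with each quantized tensor rather than chosen by hand.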
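// <Your includes>
#include <iostream>
#include <utility>
// The required includes for NNAPI:
#include "tensorflow/lite/delegates/nnapi/nnapi_delegate.h"
#include "tensorflow/lite/tools/delegates/delegate_provider.h"
// Declares tflite::evaluation::CreateNNAPIDelegate:
#include "tensorflow/lite/tools/evaluation/utils.h"

void inference() {
  // <Interpreter construction>

  // NNAPI construction:
  tflite::StatefulNnApiDelegate::Options options;
  options.allow_fp16 = true;                // allow fp32 computation to run in fp16
  options.allow_dynamic_dimensions = true;  // accept tensors with dynamic dimensions
  options.disallow_nnapi_cpu = false;       // keep the NNAPI CPU implementation available
  options.accelerator_name = "vsi-npu";     // select the NPU accelerator by name
  auto delegate = tflite::evaluation::CreateNNAPIDelegate(options);
  if (!delegate) {
    std::cerr << "Failed to create the NNAPI delegate" << std::endl;
    return;
  }
  // Modifying the graph to support NNAPI operations
  // (the interpreter takes ownership of the delegate):
  if (interpreter->ModifyGraphWithDelegate(std::move(delegate)) != kTfLiteOk) {
    std::cerr << "Failed to apply the NNAPI delegate" << std::endl;
    return;
  }
  // Allocating the tensors:
  TFLITE_MINIMAL_CHECK2(interpreter->AllocateTensors() == kTfLiteOk);
  // <Feed your tensors>
}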
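Since a CPU fallback shows up as a longer inference time, it can help to time the inference with and without the delegate. The helper below is a minimal sketch for that purpose and uses only the C++ standard library; `AverageMillis` is a hypothetical name that appears in neither the guide nor TensorFlow Lite, and the warm-up count is an assumption that should be tuned for your setup.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: returns the average wall-clock time, in milliseconds,
// of `runs` calls to `invoke`, after `warmup` untimed calls.
template <typename Fn>
double AverageMillis(Fn&& invoke, int runs, int warmup = 1) {
  assert(runs > 0);
  // Untimed warm-up calls, since the first call may include one-time setup.
  for (int i = 0; i < warmup; ++i) invoke();

  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < runs; ++i) invoke();
  const auto stop = std::chrono::steady_clock::now();

  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return elapsed.count() / runs;
}
```

Inside the function above, once the tensors are fed, this could be called as `AverageMillis([&] { interpreter->Invoke(); }, 50)`, first with the NNAPI delegate applied and then with the delegate lines removed, to compare both averages.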
Bonus XNNPACK Delegate