R2Inference - TensorFlow-Lite
TensorFlow Lite is an open-source software library that is part of TensorFlow™. It provides a deep learning framework for on-device inference. TensorFlow Lite models can be used on Android and iOS, as well as on systems such as the Raspberry Pi and Arm64-based boards. The TensorFlow Lite backend supports .tflite models, including quantized .tflite models.
Installation
The R2Inference TensorFlow Lite backend depends on the C/C++ TensorFlow API. The installation process consists of downloading the source code, building it, and installing it.
The TensorFlow Python API and utilities can be installed with Python pip. They are not needed by R2Inference, but they are highly recommended if you need to generate models.
X86
You can install the C/C++ TensorFlow API for x86 by following these steps:
- Build and install TensorFlow Lite
Download the TensorFlow source code:
git clone -b v2.0.1 https://github.com/tensorflow/tensorflow
export TensorflowPath=/PATH/TENSORFLOW/SRC
cd $TensorflowPath/tensorflow/lite/tools/make
Download dependencies:
./download_dependencies.sh
Build:
./build_lib.sh
Copy the static library to the libraries path:
cp gen/linux_x86_64/lib/libtensorflow-lite.a /usr/local/lib/
Install abseil dependency:
cd downloads/absl/
mkdir build && cd build
cmake ..
make && sudo make install
Cross-compile for ARM64
First, install the toolchain and the needed libraries:
sudo apt-get update
sudo apt-get install crossbuild-essential-arm64
Download the TensorFlow source code:
export TensorflowPath=/PATH/TENSORFLOW/SRC
git clone -b v2.0.1 https://github.com/tensorflow/tensorflow
cd $TensorflowPath/tensorflow/lite/tools/make
Download the build dependencies:
./download_dependencies.sh
Then compile:
./build_aarch64_lib.sh
The static library is generated in: tensorflow/lite/tools/make/gen/linux_aarch64/lib/libtensorflow-lite.a.
Nvidia Jetson (TX1, TX2, Xavier, Nano)
You can install the C/C++ TensorFlow API for Jetson devices by following these steps:
- Build and install TensorFlow Lite
Download the TensorFlow source code:
export TensorflowPath=/PATH/TENSORFLOW/SRC
git clone -b v2.0.1 https://github.com/tensorflow/tensorflow
cd $TensorflowPath/tensorflow/lite/tools/make
Download dependencies:
./download_dependencies.sh
Build:
./build_aarch64_lib.sh
Copy the static library to the libraries path:
cp gen/aarch64_armv8-a/lib/libtensorflow-lite.a /usr/local/lib/
Install abseil dependency:
cd downloads/absl/
mkdir build && cd build
cmake ..
make && sudo make install
Generating a model for R2I
In TensorFlow, all file formats are based on protocol buffers. In summary, protocol buffers (or protobufs, as they are referred to in the documentation) are data structures for which there is a set of tools to generate classes in C, Python, and other languages in order to load, save, and access the data between the supported APIs. More information about TensorFlow model files can be found here. The process of generating a graph model suitable for GstInference on the TensorFlow backend can be summarized in three main steps:
- Save the graph structure that describes your model
- Save a checkpoint of your model training session (session variables)
- Combine the graph structure with the checkpoint data (this step is typically referred to as freezing the graph)
Saving a session with the TensorFlow Python API
In TensorFlow, you can use a saver object to handle saving and restoring of your model graph metadata and checkpoint (variables) data. In general terms, outside a TensorFlow session a graph contains only the information about the mathematical operations that are performed, while the variables are given particular values inside a session. Typically, after training your model, you use a saver object to save both your graph structure and a data checkpoint. The following is an example when working with the TensorFlow default graph:
#! /usr/bin/env python3
# file name is model_graph.py
import tensorflow as tf
import os

dir = os.path.dirname(os.path.realpath(__file__))

# Perform your graph construction here (define the variables and operations)

default_saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Perform your training here
    default_saver.save(sess, dir + '/model_graph.chkp')
This will generate 4 files:
- model_graph.chkp.meta: Graph data and metadata (operations, configurations, etc.); it allows you to load a graph and retrain it.
- model_graph.chkp.index: A key-value table linking each tensor name to the location of the corresponding data in the chkp.data files.
- model_graph.chkp.data-00000-of-00001: Holds all the variables (including the weights of the graph) from the session at different timestamps.
- checkpoint: A file that keeps a record of the latest checkpoint files saved.
The most important files are the chkp.meta and chkp.data files. On a directory containing the files generated by the saver object, you can run the freeze_graph.py script provided with the TensorFlow resources in order to generate a protocol buffer file suitable for GstInference, as shown in the sketch below.
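The following is a minimal sketch of that freezing step done directly from Python with the TensorFlow 1.x compatibility API, as an alternative to calling the freeze_graph.py script. It assumes the model_graph.chkp checkpoint prefix from the example above; the output node name and the frozen_graph.pb file name are placeholders that you must adapt to your own graph.

import tensorflow as tf

# Graph-mode execution is required for checkpoint-based freezing.
tf.compat.v1.disable_eager_execution()

with tf.compat.v1.Session() as sess:
    # Load the graph structure and restore the checkpoint data.
    saver = tf.compat.v1.train.import_meta_graph('model_graph.chkp.meta')
    saver.restore(sess, 'model_graph.chkp')
    # Replace the session variables with constants ("freeze" the graph).
    frozen_graph_def = tf.compat.v1.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ['output_node_name'])  # placeholder output node name

# Write the frozen graph as a protocol buffer file.
with open('frozen_graph.pb', 'wb') as f:
    f.write(frozen_graph_def.SerializeToString())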
You can refer to R2Inference Model Zoo for pre-trained models suitable for evaluating GstInference.
Create a TensorFlow Lite model from a saved model
This example code takes a saved model from a directory and converts it to a TensorFlow Lite model with the .tflite extension.
import tensorflow as tf

# Construct a basic model.
root = tf.train.Checkpoint()
root.v1 = tf.Variable(3.)
root.v2 = tf.Variable(2.)
root.f = tf.function(lambda x: root.v1 * root.v2 * x)

# Save the model.
export_dir = "/tmp/test_saved_model"
input_data = tf.constant(1., shape=[1, 1])
to_save = root.f.get_concrete_function(input_data)
tf.saved_model.save(root, export_dir, to_save)

# Convert the model.
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()

# Write the converted model to a .tflite file.
with open("converted_model.tflite", "wb") as f:
    f.write(tflite_model)
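Since the backend also accepts quantized .tflite models, the same converter can apply post-training quantization. The following is a minimal sketch that assumes the SavedModel from the previous example is still in /tmp/test_saved_model; the output file name is only an example.

import tensorflow as tf

# Convert the saved model, enabling post-training (dynamic-range) quantization.
converter = tf.lite.TFLiteConverter.from_saved_model("/tmp/test_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
quantized_model = converter.convert()

# Write the quantized model to a .tflite file.
with open("converted_model_quant.tflite", "wb") as f:
    f.write(quantized_model)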
Tools
Convert a TensorFlow frozen graph to .tflite
If the TensorFlow Python API is installed on the system, you will find a tool to convert a TensorFlow frozen graph (.pb) to the TensorFlow Lite format (.tflite). To convert a model, run:
export OUTPUT_FILE=/PATH/TO/OUTPUT_FILE
export GRAPH_FILE=/PATH/TO/GRAPH_FILE
tflite_convert \
  --output_file=$OUTPUT_FILE \
  --graph_def_file=$GRAPH_FILE \
  --input_arrays=input \
  --output_arrays=output \
  --enable_v1_converter
Here, input_arrays is the name of the model's input node and output_arrays is the name of its output node.
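The same conversion can also be scripted from Python through the TF1 converter API. The following is a minimal sketch; graph.pb, model.tflite, and the input/output node names are placeholders for your own files and graph.

import tensorflow as tf

# Convert a frozen graph (.pb) to TensorFlow Lite from Python.
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file="graph.pb",   # path to the frozen graph
    input_arrays=["input"],      # name(s) of the input node(s)
    output_arrays=["output"])    # name(s) of the output node(s)
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)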
TensorBoard
TensorBoard is a visualization tool for TensorFlow. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data, such as images, that pass through it. To use TensorBoard you only need to install the TensorFlow core; installing TensorFlow via pip should also automatically install TensorBoard. This tool is especially useful to determine the input and output layer names of undocumented graphs. TensorBoard can load any TensorFlow checkpoint generated with the same version (loading a checkpoint generated with a different TensorFlow version will result in errors).
tensorboard --logdir=route/to/checkpoint/dir
You will get a message similar to this:
TensorBoard 1.10.0 at http://mtaylor-laptop:6006 (Press CTRL+C to quit)
Open that address in your browser, go to the graph view, and analyze the graph to determine the output node name. In this example the output node name is ArgMax, because its input is the resnet_model/final_dense signal.
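If launching TensorBoard is not an option, the node names of a frozen graph can also be inspected directly from Python. The following is a minimal sketch that assumes a frozen graph stored as frozen_graph.pb (the file name is only an example):

import tensorflow as tf

# Load a frozen graph and print the name and type of every node in it.
graph_def = tf.compat.v1.GraphDef()
with open("frozen_graph.pb", "rb") as f:
    graph_def.ParseFromString(f.read())

for node in graph_def.node:
    print(node.name, node.op)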
API
You can find the full documentation of the C API here and the Python API here. R2Inference uses only the C API and takes care of the session, loading the graph, and executing it. Because of this, we will only look at the options that you can change when using the C API through R2Inference.
R2Inference changes the options of the framework via the "IParameters" class. First, you need to create an object:
r2i::RuntimeError error;
std::shared_ptr<r2i::IParameters> parameters = factory->MakeParameters (error);
Then call the "Set" or "Get" virtual functions:
parameters->Set(<option>, <value>)
parameters->Get(<option>, <value>)
TensorFlow Lite Options
Property | C API Counterpart | Value | Operation | Description
---|---|---|---|---
number_of_threads | Interpreter->SetNumThreads | Integer | R/W | Number of threads used to run the inference
allow_fp16 | Interpreter->SetAllowFp16PrecisionForFp32 | Integer | R/W | Allow the use of 16-bit floating point instead of 32-bit