R2Inference - TensorFlow-Lite

Previous: Supported_backends/TensorFlow Index Next: Supported_backends/Caffe




TensorFlow Lite is an open source software library that is part of TensorFlow™. It provides a deep learning framework for on-device inference. TensorFlow Lite models can be used on Android and iOS, as well as on systems such as the Raspberry Pi and Arm64-based boards.

Installation

The R2Inference TensorFlow Lite backend depends on the C/C++ TensorFlow API. The installation process consists of downloading the source code, building it, and installing it.

The TensorFlow Python API and utilities can be installed with python pip. They are not needed by R2Inference, but they are highly recommended if you need to generate models.
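
For reference (assuming a Python 3 environment), the following command installs the TensorFlow Python package, which also provides the tflite_convert and tensorboard tools used in the sections below (pin the version if you need to match a specific TensorFlow release):

pip3 install tensorflow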

X86

You can install the C/C++ TensorFlow Lite API for x86 by following these steps:

  • Build and install Tensorflow Lite

Download Tensorflow source code:

git clone https://github.com/tensorflow/tensorflow
cd tensorflow/tensorflow/lite/tools/make

Download dependencies:

./download_dependencies.sh

Build:

./build_lib.sh

Copy the static library to the libraries path:

cp gen/linux_x86_64/lib/libtensorflow-lite.a /usr/lib/x86_64-linux-gnu/


Generating a model for R2I

In TensorFlow, all file formats are based on protocol buffers. As a summary, protocol buffers (or protobufs, as they are referred to in the documentation) are data structures for which there is a set of tools to generate classes in C, Python, and other languages in order to load, save, and access the data between the supported APIs. More information about TensorFlow model files can be found here. Generating a graph model suitable for GstInference can be summarized in three main steps (for the TensorFlow Lite backend, the resulting frozen graph must additionally be converted to the .tflite format, as described in the Tools section below):

  1. Save the graph structure that describes your model
  2. Save a checkpoint of your model training session (session variables)
  3. Combine the graph structure with the checkpoint data (this step is typically referred to as freezing the graph)

Saving a session with TensorFlow python API

In TensorFlow, you can use a saver object to handle saving and restoring of your model graph metadata and the checkpoint (variables) data. In general terms, outside a TensorFlow session a graph contains only the information regarding the mathematical operations that are performed, while the variables are given a particular value inside a session. Typically, after training your model, you can then use a saver object to save both your graph structure and your data checkpoint. The following is an example when working on the TensorFlow default graph:

#! /usr/bin/env python3

import tensorflow as tf
import os

# file name is model_graph.py
dir = os.path.dirname(os.path.realpath(__file__))

# Perform your graph construction here; the saver must be created
# after the variables it will save have been defined

default_saver = tf.train.Saver()

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())

  # Perform your training here

  # Save the graph metadata and the checkpoint data
  default_saver.save(sess, dir + '/model_graph.chkp')

This will generate 4 files:

  • model_graph.chkp.meta: Graph data and metadata (operations, configurations, etc.); allows you to load a graph and retrain it.
  • model_graph.chkp.index: A key-value table linking each tensor name to the location of the corresponding data in the chkp.data files.
  • model_graph.chkp.data-00000-of-00001: Holds all variables (which include the weights of the graph) from the session at different timestamps.
  • checkpoint: A file that keeps a record of the latest checkpoint files saved.

The most important files are the chkp.meta and chkp.data files. You can use the freeze_graph.py tool provided with the TensorFlow sources on a directory containing the files generated by the saver object in order to generate a protocol buffer file suitable for GstInference.
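
As an illustration, a typical invocation using the checkpoint files above would look roughly like the following. The output node name (output here) is only an example; the actual name depends on your model and can be determined with TensorBoard, as described below:

freeze_graph \
  --input_meta_graph=model_graph.chkp.meta \
  --input_checkpoint=model_graph.chkp \
  --input_binary=true \
  --output_graph=frozen_graph.pb \
  --output_node_names=output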

You can refer to R2Inference Model Zoo for pre-trained models suitable for evaluating GstInference.

Create a TensorFlow Lite model from a saved model

The following example code builds a basic model, exports it as a saved model, and then converts it to a TensorFlow Lite model with the .tflite extension.

import tensorflow as tf

# Construct a basic model.
root = tf.train.Checkpoint()
root.v1 = tf.Variable(3.)
root.v2 = tf.Variable(2.)
root.f = tf.function(lambda x: root.v1 * root.v2 * x)

# Save the model.
export_dir = "/tmp/test_saved_model"
input_data = tf.constant(1., shape=[1, 1])
to_save = root.f.get_concrete_function(input_data)
tf.saved_model.save(root, export_dir, to_save)

# Convert the model.
converter = tf.lite.TFLiteConverter.from_saved_model(export_dir)
tflite_model = converter.convert()

# Write the converted model to disk.
with open("model.tflite", "wb") as f:
  f.write(tflite_model)

Tools

Convert a TensorFlow frozen graph to tflite

If the TensorFlow Python API is installed on the system, you will find a tool to convert TensorFlow frozen graphs (.pb) to the TensorFlow Lite format (.tflite). To convert a model run:

tflite_convert \
  --output_file=/models/model.tflite \
  --graph_def_file=/models/frozen_graph.pb \
  --input_arrays=input \
  --output_arrays=output

Where input_arrays is the name of the model's input node and output_arrays is the name of its output node.
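
The same conversion can also be performed from the Python API (TensorFlow 1.x); the following is a minimal sketch assuming the same file and node names as above:

import tensorflow as tf

# Build a converter from the frozen graph, naming the input and output nodes.
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    graph_def_file='/models/frozen_graph.pb',
    input_arrays=['input'],
    output_arrays=['output'])

tflite_model = converter.convert()

# Write the converted model to disk.
with open('/models/model.tflite', 'wb') as f:
  f.write(tflite_model)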

Tensorboard

TensorBoard is a visualization tool for TensorFlow. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data such as images that pass through it. To use TensorBoard you only need to install the TensorFlow core; installing TensorFlow via pip should also automatically install TensorBoard. This tool is especially useful to determine the input and output layer names of undocumented graphs. TensorBoard can load any TensorFlow checkpoint generated with the same version (loading a checkpoint generated with a different TensorFlow version will result in errors).

tensorboard --logdir=path/to/checkpoint/dir

You will get a message similar to this:

TensorBoard 1.10.0 at http://mtaylor-laptop:6006 (Press CTRL+C to quit)

Open that address in your browser, go to the graph section, and analyze the graph to determine the output node name. In this example the output node name is ArgMax because its input is the resnet_model/final_dense signal.

[Image: ResNet output node]
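
If you prefer not to use TensorBoard, a quick way to list candidate node names is to load the frozen graph directly with the Python API. The following is a minimal sketch (TensorFlow 1.x style, assuming the frozen graph path used above):

import tensorflow as tf

# Load the frozen graph and print every node name and operation type;
# input and output nodes can usually be spotted at the start and end of the list.
graph_def = tf.GraphDef()
with open('/models/frozen_graph.pb', 'rb') as f:
  graph_def.ParseFromString(f.read())

for node in graph_def.node:
  print(node.name, node.op)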

API

You can find the full documentation of the C/C++ API here and the Python API here. R2Inference uses only the C/C++ API, and it takes care of the session, loading the graph, and executing it. Because of this, we will only take a look at the options that you can change when using the C/C++ API through R2Inference.

R2Inference changes the options of the framework via the "IParameters" class. First you need to create an object:

r2i::RuntimeError error;
std::shared_ptr<r2i::IParameters> parameters = factory->MakeParameters (error);

Then call the "Set" or "Get" virtual functions:

parameters->Set(<option>, <value>)
parameters->Get(<option>, <value>)

TensorFlow Lite Options

Property          | C/C++ API Counterpart                      | Value   | Operation | Description
number_of_threads | Interpreter->SetNumThreads                 | Integer | R/W       | Set the number of threads used to run inference
allow_fp16        | Interpreter->SetAllowFp16PrecisionForFp32  | Integer | R/W       | Allow the use of 16-bit floating point instead of 32-bit
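
As a sketch of how these options are used (following the Set/Get pattern shown above and assuming the integer overloads of IParameters; the surrounding factory comes from the previous snippets):

r2i::RuntimeError error;
std::shared_ptr<r2i::IParameters> parameters = factory->MakeParameters (error);

/* Note: in practice the parameters object is typically associated with an
   engine and a model before setting options (not shown here) */

/* Write an option: run the inference with 4 threads */
error = parameters->Set ("number_of_threads", 4);

/* Read the option back into an integer variable */
int threads = 0;
error = parameters->Get ("number_of_threads", threads);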




Previous: Supported_backends/Tensorflow Index Next: Supported_backends/Caffe