GstInference/Benchmarks: Difference between revisions

From RidgeRun Developer Wiki
No edit summary
No edit summary
Line 153: Line 153:


[[File:CPU Benchmarks gst-inference.png|1024px|frameless|thumb|center]]
[[File:CPU Benchmarks gst-inference.png|1024px|frameless|thumb|center]]
== Introduction ==
=== Test benchmark video ===
The following video was used to perform the benchmark tests.
<br>
To download the video, right-click on it, select 'Save video as', and save it to your computer.
[[File:Test benchmark video.mp4|500px|thumb|center|Test benchmark video]]


== TensorFlow Lite Benchmarks ==
== TensorFlow Lite Benchmarks ==
Line 1,666: Line 1,676:
</html>
</html>


=== CPU Measurements ===
=== CPU Load Measurements ===


<html>
<html>
Line 1,709: Line 1,719:
           width: 900,
           width: 900,
           chart: {
           chart: {
             title: 'Model Vs CPU per backend',
             title: 'Model Vs CPU Load per backend',
           },
           },
           series: {
           series: {
Line 1,732: Line 1,742:
           view_coral_cpu.setColumns([0,1, 2]);
           view_coral_cpu.setColumns([0,1, 2]);
           materialChart_coral_cpu.draw(view_coral_cpu, Coral_materialOptions_cpu);
           materialChart_coral_cpu.draw(view_coral_cpu, Coral_materialOptions_cpu);
        }
        drawMaterialChart();
        }
</script>
</html>
== x86 ==
=== FPS Measurements ===
<html>
<style>
    .button {
    background-color: #008CBA;
    border: none;
    color: white;
    padding: 15px 32px;
    text-align: center;
    text-decoration: none;
    display: inline-block;
    font-size: 16px;
    margin: 4px 2px;
    cursor: pointer;
  }
</style>
<div id="chart_fps_x86" style="margin: auto; width: 800px; height: 500px;"></div>
<script>
      google.charts.load('current', {'packages':['corechart', 'bar']});
      google.charts.setOnLoadCallback(drawStuffx86Fps);
     
      function drawStuffx86Fps() {
        var chartDiv_Fps_x86 = document.getElementById('chart_fps_x86');
        var table_models_fps_x86 = google.visualization.arrayToDataTable([
          ['Model',                      //Column 0
          'ONNXRT \n x86',
          'TensorFlow \n x86',
          'TensorFlow Lite \n x86'],        //Column 1
          ['InceptionV1', 47.9, 63.7, 22.8], //row 1
          ['InceptionV2', 32.7, 48.4, 14.2], //row 2
          ['InceptionV3', 12.1, 20.5, 12.2], //row 3
          ['InceptionV4', 5.26, 10.3, 10.2], //row 4
          ['TinyYoloV2',  16, 24.3, 12.2], //row 5
          ['TinyYoloV3',  18.4, 27.1, 10.2]  //row 6
        ]);
        var x86_materialOptions_fps = {
          width: 900,
          chart: {
            title: 'Model Vs FPS per backend',
          },
          series: {
          },
          axes: {
            y: {
              distance: {side: 'left',label: 'FPS'}, // Left y-axis.
            }
          }
        };
        var materialChart_x86_fps = new google.charts.Bar(chartDiv_Fps_x86);
        view_x86_fps = new google.visualization.DataView(table_models_fps_x86);
        function drawMaterialChart() {
          var materialChart_x86_fps = new google.charts.Bar(chartDiv_Fps_x86);
          materialChart_x86_fps.draw(table_models_fps_x86, google.charts.Bar.convertOptions(x86_materialOptions_fps));
          init_charts();
        }
        function init_charts(){
          view_x86_fps.setColumns([0,1, 2, 3]);
          materialChart_x86_fps.draw(view_x86_fps, x86_materialOptions_fps);
        }
        drawMaterialChart();
        }
</script>
</html>
=== CPU Load Measurements ===
<html>
<style>
    .button {
    background-color: #008CBA;
    border: none;
    color: white;
    padding: 15px 32px;
    text-align: center;
    text-decoration: none;
    display: inline-block;
    font-size: 16px;
    margin: 4px 2px;
    cursor: pointer;
  }
</style>
<div id="chart_cpu_x86" style="margin: auto; width: 800px; height: 500px;"></div>
<script>
      google.charts.load('current', {'packages':['corechart', 'bar']});
      google.charts.setOnLoadCallback(drawStuffx86Cpu);
     
      function drawStuffx86Cpu() {
        var chartDiv_Cpu_x86 = document.getElementById('chart_cpu_x86');
        var table_models_cpu_x86 = google.visualization.arrayToDataTable([
          ['Model',                      //Column 0
          'ONNXRT \n x86',
          'TensorFlow \n x86',
          'TensorFlow Lite \n x86'],        //Column 1
          ['InceptionV1', 94.6, 74, 46], //row 1
          ['InceptionV2', 100, 75, 43], //row 2
          ['InceptionV3', 95.2, 79, 54], //row 3
          ['InceptionV4', 88.8, 84, 50], //row 4
          ['TinyYoloV2',  94, 79, 45], //row 5
          ['TinyYoloV3',  91.4, 76, 44]  //row 6
        ]);
        var x86_materialOptions_cpu = {
          width: 900,
          chart: {
            title: 'Model Vs CPU Load per backend',
          },
          series: {
          },
          axes: {
            y: {
              distance: {side: 'left',label: 'CPU Load'}, // Left y-axis.
            }
          }
        };
        var materialChart_x86_cpu = new google.charts.Bar(chartDiv_Cpu_x86);
        view_x86_cpu = new google.visualization.DataView(table_models_cpu_x86);
        function drawMaterialChart() {
          var materialChart_x86_cpu = new google.charts.Bar(chartDiv_Cpu_x86);
          materialChart_x86_cpu.draw(table_models_cpu_x86, google.charts.Bar.convertOptions(x86_materialOptions_cpu));
          init_charts();
        }
        function init_charts(){
          view_x86_cpu.setColumns([0,1, 2, 3]);
          materialChart_x86_cpu.draw(view_x86_cpu, x86_materialOptions_cpu);
         }
         }
         drawMaterialChart();
         drawMaterialChart();

Revision as of 15:37, 21 July 2020




Previous: Example Applications/DispTec Index Next: Model Zoo





GstInference Benchmarks

The following benchmarks were run with a source video (1920x1080@60). With the following base GStreamer pipeline, and environment variables:

$ VIDEO_FILE='video.mp4'
$ MODEL_LOCATION='graph_inceptionv1_tensorflow.pb'
$ INPUT_LAYER='input'
$ OUTPUT_LAYER='InceptionV1/Logits/Predictions/Reshape_1'

The environment variables were changed according to the model used (Inception V1, V2, V3, or V4)

GST_DEBUG=inception1:1 gst-launch-1.0 filesrc location=$VIDEO_FILE ! decodebin ! videoconvert ! videoscale ! queue ! net.sink_model inceptionv1 name=net model-location=$MODEL_LOCATION backend=tensorflow backend::input-layer=$INPUT_LAYER  backend::output-layer=$OUTPUT_LAYER net.src_model ! perf ! fakesink -v

The Desktop PC had the following specifications:

  • Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
  • 8 GB RAM
  • Cedar [Radeon HD 5000/6000/7350/8350 Series]
  • Linux 4.15.0-54-generic x86_64 (Ubuntu 16.04)

The Jetson Xavier power modes used were 2 and 6 (more information: Supported Modes and Power Efficiency)

  • View current power mode:
$ sudo /usr/sbin/nvpmodel -q
  • Change current power mode:
sudo /usr/sbin/nvpmodel -m x

Where x is the power mode ID (e.g. 0, 1, 2, 3, 4, 5, 6).

Summary

Desktop PC CPU Library
Model Framerate CPU Usage
Inception V1 11.89 48
Inception V2 10.33 65
Inception V3 5.41 90
Inception V4 3.81 94
Jetson Xavier (15W) CPU Library GPU Library
Model Framerate CPU Usage Framerate CPU Usage
Inception V1 8.24 86 52.3 43
Inception V2 6.58 88 39.6 42
Inception V3 2.54 92 17.8 25
Inception V4 1.22 94 9.4 20
Jetson Xavier (30W) CPU Library GPU Library
Model Framerate CPU Usage Framerate CPU Usage
Inception V1 6.41 93 66.27 72
Inception V2 5.11 95 50.59 62
Inception V3 1.96 98 22.95 44
Inception V4 0.98 99 12.14 32

Framerate

thumb
thumb

CPU Usage

thumb
thumb


Introduction

Test benchmark video

The following video was used to perform the benchmark tests.
To download the video, right-click on it, select 'Save video as', and save it to your computer.

Test benchmark video

TensorFlow Lite Benchmarks

FPS measurement







CPU usage measurement







Test benchmark video

The following video was used to perform the benchmark tests.
To download the video, right-click on it, select 'Save video as', and save it to your computer.

Test benchmark video

ONNXRT Benchmarks

The Desktop PC had the following specifications:

  • Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
  • 12 GB RAM
  • Linux 4.15.0-106-generic x86_64 (Ubuntu 16.04)
  • GStreamer 1.8.3

The following was the GStreamer pipeline used to obtain the results:

# MODELS_PATH has the following structure
#/path/to/models/
#├── InceptionV1_onnxrt
#│   ├── graph_inceptionv1_info.txt
#│   ├── graph_inceptionv1.onnx
#│   └── labels.txt
#├── InceptionV2_onnxrt
#│   ├── graph_inceptionv2_info.txt
#│   ├── graph_inceptionv2.onnx
#│   └── labels.txt
#├── InceptionV3_onnxrt
#│   ├── graph_inceptionv3_info.txt
#│   ├── graph_inceptionv3.onnx
#│   └── labels.txt
#├── InceptionV4_onnxrt
#│   ├── graph_inceptionv4_info.txt
#│   ├── graph_inceptionv4.onnx
#│   └── labels.txt
#├── TinyYoloV2_onnxrt
#│   ├── graph_tinyyolov2_info.txt
#│   ├── graph_tinyyolov2.onnx
#│   └── labels.txt
#└── TinyYoloV3_onnxrt
#    ├── graph_tinyyolov3_info.txt
#    ├── graph_tinyyolov3.onnx
#    └── labels.txt

model_array=(inceptionv1 inceptionv2 inceptionv3 inceptionv4 tinyyolov2 tinyyolov3)
model_upper_array=(InceptionV1 InceptionV2 InceptionV3 InceptionV4 TinyYoloV2 TinyYoloV3)
MODELS_PATH=/path/to/models/
INTERNAL_PATH=onnxrt
EXTENSION=".onnx"

gst-launch-1.0 \
filesrc location=$VIDEO_PATH num-buffers=600 ! decodebin ! videoconvert ! \
perf print-arm-load=true name=inputperf ! tee name=t t. ! videoscale ! queue ! net.sink_model t. ! queue ! net.sink_bypass \
${model_array[i]} backend=onnxrt name=net \
model-location="${MODELS_PATH}${model_upper_array[i]}_${INTERNAL_PATH}/graph_${model_array[i]}${EXTENSION}" \
net.src_bypass ! perf print-arm-load=true name=outputperf ! videoconvert ! fakesink sync=false

FPS Measurements







CPU Load Measurements







Test benchmark video

The following video was used to perform the benchmark tests.
To download the video, right-click on it, select 'Save video as', and save it to your computer.

Test benchmark video


Jetson AGX Xavier

FPS Measurements

CPU Load Measurements

Jetson TX2

FPS Measurements

CPU Load Measurements

Jetson Nano

FPS Measurements

CPU Load Measurements

Google Coral

FPS Measurements

CPU Load Measurements

x86

FPS Measurements

CPU Load Measurements


Previous: Example Applications/DispTec Index Next: Model Zoo