= GstInference Benchmarks =
 
== Introduction ==
 
This wiki summarizes a series of benchmarks on different hardware platforms, based on the [https://github.com/RidgeRun/gst-inference/blob/master/tests/benchmark/run_benchmark.sh run_benchmark.sh] bash script found in the official [https://github.com/RidgeRun/gst-inference GstInference repository]. For each model, the script runs the following GStreamer pipeline:
 


<source lang="bash">
#Script to run each model
run_all_models(){

  model_array=(inceptionv1 inceptionv2 inceptionv3 inceptionv4 tinyyolov2 tinyyolov3)
  model_upper_array=(InceptionV1 InceptionV2 InceptionV3 InceptionV4 TinyYoloV2 TinyYoloV3)
  input_array=(input input input input input/Placeholder inputs )
  output_array=(InceptionV1/Logits/Predictions/Reshape_1 Softmax InceptionV3/Predictions/Reshape_1
  InceptionV4/Logits/Predictions add_8 output_boxes )
 
  mkdir -p logs/
  rm -f logs/*
 
  for ((i=0;i<${#model_array[@]};++i)); do
    echo Perf ${model_array[i]}
    gst-launch-1.0 \
    filesrc location=$VIDEO_PATH num-buffers=600 ! decodebin ! videoconvert ! \
    perf print-arm-load=true name=inputperf ! tee name=t t. ! videoscale ! queue ! net.sink_model t. ! queue ! net.sink_bypass \
    ${model_array[i]} backend=$BACKEND name=net backend::input-layer=${input_array[i]} backend::output-layer=${output_array[i]} \
    model-location="${MODELS_PATH}${model_upper_array[i]}_${INTERNAL_PATH}/graph_${model_array[i]}${EXTENSION}" \
    net.src_bypass ! perf print-arm-load=true name=outputperf ! videoconvert ! fakesink sync=false > logs/${model_array[i]}.log
  done
}
</source>  
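The function above expects a few variables to be defined beforehand: VIDEO_PATH (the test video), BACKEND (the GstInference backend to benchmark), and MODELS_PATH, INTERNAL_PATH and EXTENSION (which together form the path to each model graph). The following is only a minimal sketch of such a setup; the paths, the tensorflow backend and the folder layout are illustrative assumptions, not values taken from the script.

<source lang="bash">
# Hypothetical setup before calling run_all_models.
# Every value below is a placeholder; adjust it to your own installation.
VIDEO_PATH=/home/user/video.mp4       # test video downloaded from this wiki
BACKEND=tensorflow                    # GstInference backend to benchmark
MODELS_PATH=/home/user/models/        # base directory with one folder per model
INTERNAL_PATH=TensorFlow              # folder suffix, e.g. InceptionV1_TensorFlow
EXTENSION=.pb                         # graph file extension for the chosen backend

run_all_models
</source>

When the function finishes, the logs/ directory contains one log file per model with the output of the perf elements.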
 
=== Test benchmark video ===
The following video (1920x1080@60) was used to perform the benchmark tests.
<br>
To download the video, right-click on it, select 'Save video as', and save it to your computer.
 
[[File:Test benchmark video.mp4|500px|thumb|center|Test benchmark video]]
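To quickly check that the downloaded file plays on your system before running the benchmark, a simple playback pipeline can be used (the path below is just a placeholder):

<source lang="bash">
# Preview the downloaded test video; replace the path with the actual location
gst-launch-1.0 playbin uri=file:///home/user/video.mp4
</source>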


== x86 ==

The Desktop PC had the following specifications:
*Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
*8 GB RAM
*Cedar [Radeon HD 5000/6000/7350/8350 Series]
*Linux 4.15.0-54-generic x86_64 (Ubuntu 16.04)
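For reference, the same information can be gathered on any test machine with standard Linux tools; a minimal sketch:

<source lang="bash">
# Collect host information comparable to the list above
lscpu | grep 'Model name'   # CPU model
free -h | grep Mem          # installed RAM
lspci | grep -i vga         # graphics card
uname -srm                  # kernel, release and architecture
</source>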


=== FPS Measurements ===
 
<html>
 
<style>
    .button {
    background-color: #008CBA;
    border: none;
    color: white;
    padding: 15px 32px;
    text-align: center;
    text-decoration: none;
    display: inline-block;
    font-size: 16px;
    margin: 4px 2px;
    cursor: pointer;
  }
</style>
 
<div id="chart_fps_x86" style="margin: auto; width: 800px; height: 500px;"></div>
 
<script>
      google.charts.load('current', {'packages':['corechart', 'bar']});
      google.charts.setOnLoadCallback(drawStuffx86Fps);
     
      function drawStuffx86Fps() {
 
        var chartDiv_Fps_x86 = document.getElementById('chart_fps_x86');
 
        var table_models_fps_x86 = google.visualization.arrayToDataTable([
          ['Model',                      //Column 0
          'ONNXRT \n x86',
          'TensorFlow \n x86',
          'TensorFlow Lite \n x86'],        //Column 1
          ['InceptionV1', 47.9, 63.7, 22.8], //row 1
          ['InceptionV2', 32.7, 48.4, 14.2], //row 2
          ['InceptionV3', 12.1, 20.5, 12.2], //row 3
          ['InceptionV4', 5.26, 10.3, 10.2], //row 4
          ['TinyYoloV2',  16, 24.3, 12.2], //row 5
          ['TinyYoloV3',  18.4, 27.1, 10.2]  //row 6
        ]);
        var x86_materialOptions_fps = {
          width: 900,
          chart: {
            title: 'Model Vs FPS per backend',
          },
          series: {
          },
          axes: {
            y: {
              distance: {side: 'left',label: 'FPS'}, // Left y-axis.
            }
          }
        };
 
        var materialChart_x86_fps = new google.charts.Bar(chartDiv_Fps_x86);
        view_x86_fps = new google.visualization.DataView(table_models_fps_x86);


        function drawMaterialChart() {
          var materialChart_x86_fps = new google.charts.Bar(chartDiv_Fps_x86);
          materialChart_x86_fps.draw(table_models_fps_x86, google.charts.Bar.convertOptions(x86_materialOptions_fps));


          init_charts();
        }
        function init_charts(){
          view_x86_fps.setColumns([0,1, 2, 3]);
          materialChart_x86_fps.draw(view_x86_fps, x86_materialOptions_fps);
        }
        drawMaterialChart();
        }


</script>


</html>


=== CPU Load Measurements ===


{| class="wikitable" style="display: inline-table;"
<html>
! style="font-weight:bold; background-color:#efefef; color:#000000;" | Desktop PC
! colspan="2" style="text-align: center; font-weight:bold; background-color:#efefef; color:#000000;" | CPU Library
|-
| style="background-color:#e98d44; color:#000000;" | Model
| style="background-color:#e98d44; color:#000000;" | Framerate
| style="background-color:#e98d44; color:#000000;" | CPU Usage
|-
| style="background-color:#e98d44; color:#000000;" | Inception V1
| style="background-color:#fee3cd; color:#000000;" | 11.89
| style="background-color:#fee3cd; color:#000000;" | 48
|-
| style="background-color:#e98d44; color:#000000;" | Inception V2
| style="background-color:#fee3cd; color:#000000;" | 10.33
| style="background-color:#fee3cd; color:#000000;" | 65
|-
| style="background-color:#e98d44; color:#000000;" | Inception V3
| style="background-color:#fee3cd; color:#000000;" | 5.41
| style="background-color:#fee3cd; color:#000000;" | 90
|-
| style="background-color:#e98d44; color:#000000;" | Inception V4
| style="background-color:#fee3cd; color:#000000;" | 3.81
| style="background-color:#fee3cd; color:#000000;" | 94
|}


{| class="wikitable" style="display: inline-table;"
<style>
! style="font-weight:bold; background-color:#efefef; color:#000000;" | Jetson Xavier (15W)
    .button {
! colspan="2" style="text-align: center; font-weight:bold; background-color:#efefef; color:#000000;" | CPU Library
    background-color: #008CBA;
! colspan="2" style="text-align: center; font-weight:bold; background-color:#efefef; color:#000000;" | GPU Library
    border: none;
|-
    color: white;
| style="background-color:#2c79d3; color:#000000;" | Model
    padding: 15px 32px;
| style="background-color:#2c79d3; color:#000000;" | Framerate
    text-align: center;
| style="background-color:#2c79d3; color:#000000;" | CPU Usage
    text-decoration: none;
| style="background-color:#2c79d3; color:#000000;" | Framerate
    display: inline-block;
| style="background-color:#2c79d3; color:#000000;" | CPU Usage
    font-size: 16px;
|-
    margin: 4px 2px;
| style="background-color:#2c79d3; color:#000000;" | Inception V1
    cursor: pointer;
| style="background-color:#c5daf6; color:#000000;" | 8.24
  }
| style="background-color:#c5daf6; color:#000000;" | 86
</style>
| style="background-color:#c5daf6; color:#000000;" | 52.3
| style="background-color:#c5daf6; color:#000000;" | 43
|-
| style="background-color:#2c79d3; color:#000000;" | Inception V2
| style="background-color:#c5daf6; color:#000000;" | 6.58
| style="background-color:#c5daf6; color:#000000;" | 88
| style="background-color:#c5daf6; color:#000000;" | 39.6
| style="background-color:#c5daf6; color:#000000;" | 42
|-
| style="background-color:#2c79d3; color:#000000;" | Inception V3
| style="background-color:#c5daf6; color:#000000;" | 2.54
| style="background-color:#c5daf6; color:#000000;" | 92
| style="background-color:#c5daf6; color:#000000;" | 17.8
| style="background-color:#c5daf6; color:#000000;" | 25
|-
| style="background-color:#2c79d3; color:#000000;" | Inception V4
| style="background-color:#c5daf6; color:#000000;" | 1.22
| style="background-color:#c5daf6; color:#000000;" | 94
| style="background-color:#c5daf6; color:#000000;" | 9.4
| style="background-color:#c5daf6; color:#000000;" | 20
|}


{| class="wikitable" style="display: inline-table;"
<div id="chart_cpu_x86" style="margin: auto; width: 800px; height: 500px;"></div>
! style="font-weight:bold; background-color:#efefef; color:#000000;" | Jetson Xavier (30W)
! colspan="2" style="text-align: center; font-weight:bold; background-color:#efefef; color:#000000;" | CPU Library
! colspan="2" style="text-align: center; font-weight:bold; background-color:#efefef; color:#000000;" | GPU Library
|-
| style="background-color:#6aa758; color:#000000;" | Model
| style="background-color:#6aa758; color:#000000;" | Framerate
| style="background-color:#6aa758; color:#000000;" | CPU Usage
| style="background-color:#6aa758; color:#000000;" | Framerate
| style="background-color:#6aa758; color:#000000;" | CPU Usage
|-
| style="background-color:#6aa758; color:#000000;" | Inception V1
| style="background-color:#d8e9d3; color:#000000;" | 6.41
| style="background-color:#d8e9d3; color:#000000;" | 93
| style="background-color:#d8e9d3; color:#000000;" | 66.27
| style="background-color:#d8e9d3; color:#000000;" | 72
|-
| style="background-color:#6aa758; color:#000000;" | Inception V2
| style="background-color:#d8e9d3; color:#000000;" | 5.11
| style="background-color:#d8e9d3; color:#000000;" | 95
| style="background-color:#d8e9d3; color:#000000;" | 50.59
| style="background-color:#d8e9d3; color:#000000;" | 62
|-
| style="background-color:#6aa758; color:#000000;" | Inception V3
| style="background-color:#d8e9d3; color:#000000;" | 1.96
| style="background-color:#d8e9d3; color:#000000;" | 98
| style="background-color:#d8e9d3; color:#000000;" | 22.95
| style="background-color:#d8e9d3; color:#000000;" | 44
|-
| style="background-color:#6aa758; color:#000000;" | Inception V4
| style="background-color:#d8e9d3; color:#000000;" | 0.98
| style="background-color:#d8e9d3; color:#000000;" | 99
| style="background-color:#d8e9d3; color:#000000;" | 12.14
| style="background-color:#d8e9d3; color:#000000;" | 32
|}


=== Framerate ===
<script>
      google.charts.load('current', {'packages':['corechart', 'bar']});
      google.charts.setOnLoadCallback(drawStuffx86Cpu);
     
      function drawStuffx86Cpu() {


        var chartDiv_Cpu_x86 = document.getElementById('chart_cpu_x86');


        var table_models_cpu_x86 = google.visualization.arrayToDataTable([
          ['Model',                      //Column 0
          'ONNXRT \n x86',
          'TensorFlow \n x86',
          'TensorFlow Lite \n x86'],        //Column 1
          ['InceptionV1', 94.6, 74, 46], //row 1
          ['InceptionV2', 100, 75, 43], //row 2
          ['InceptionV3', 95.2, 79, 54], //row 3
          ['InceptionV4', 88.8, 84, 50], //row 4
          ['TinyYoloV2',  94, 79, 45], //row 5
          ['TinyYoloV3',  91.4, 76, 44]  //row 6
        ]);
        var x86_materialOptions_cpu = {
          width: 900,
          chart: {
            title: 'Model Vs CPU Load per backend',
          },
          series: {
          },
          axes: {
            y: {
              distance: {side: 'left',label: 'CPU Load'}, // Left y-axis.
            }
          }
        };


        var materialChart_x86_cpu = new google.charts.Bar(chartDiv_Cpu_x86);
        view_x86_cpu = new google.visualization.DataView(table_models_cpu_x86);


        function drawMaterialChart() {
          var materialChart_x86_cpu = new google.charts.Bar(chartDiv_Cpu_x86);
          materialChart_x86_cpu.draw(table_models_cpu_x86, google.charts.Bar.convertOptions(x86_materialOptions_cpu));


          init_charts();
        }
        function init_charts(){
          view_x86_cpu.setColumns([0,1, 2, 3]);
          materialChart_x86_cpu.draw(view_x86_cpu, x86_materialOptions_cpu);
        }
        drawMaterialChart();
        }


</script>
</html>


== Jetson AGX Xavier ==
The Jetson Xavier power modes used were 2 (15 W) and 6 (30 W). For more information, see [https://docs.nvidia.com/jetson/l4t/index.html#page/Tegra%2520Linux%2520Driver%2520Package%2520Development%2520Guide%2Fpower_management_jetson_xavier.html%23wwpID0E0OM0HA Supported Modes and Power Efficiency].
*View current power mode:
<source lang="bash">
sudo /usr/sbin/nvpmodel -q
</source>
*Change current power mode:
<source lang="bash">
sudo /usr/sbin/nvpmodel -m x
</source>
Where x is the power mode ID (e.g. 0, 1, 2, 3, 4, 5, 6).
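For example, to select power mode 2 (the 15 W configuration used in these benchmarks) and confirm that it is active:

<source lang="bash">
# Switch to power mode 2 (15 W) and verify the active mode
sudo /usr/sbin/nvpmodel -m 2
sudo /usr/sbin/nvpmodel -q
</source>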


=== FPS Measurements ===