GstInference/Benchmarks: Difference between revisions

From RidgeRun Developer Wiki
<noinclude>
{{GstInference/Head|previous=Example Applications/DispTec|next=Model Zoo|metakeywords=GstInference gstreamer pipelines, Inference gstreamer pipelines, NCSDK Inference gstreamer pipelines, GoogLeNet, TinyYolo v2, GoogLeNet x2, TensorFlow backend|title=GstInference Benchmarks}}
</noinclude>


</html>


== GstInference Benchmarks - Introduction ==


This wiki summarizes a series of benchmarks on different hardware platforms based on the [https://github.com/RidgeRun/gst-inference/blob/master/tests/benchmark/run_benchmark.sh run_benchmark.sh] bash script that can be found in the official [https://github.com/RidgeRun/gst-inference GstInference repository]. The script is based on the following GStreamer pipeline:


<source lang="bash">
The following video was used to perform the benchmark tests.
<br>
To download the video, right-click on it, select 'Save video as', and save it on your computer.
<br>
[[File:Test benchmark video.mp4|thumb|border|center|500px|alt=Alt|Video 1. Test benchmark video]]


== x86 ==
           ['Model',                      //Column 0
           'ONNXRT \n x86',
           'ONNXRT OpenVINO (CPU_FP32) \n x86',
           'ONNXRT OpenVINO (GPU_FP32)\n x86',
           'ONNXRT OpenVINO (GPU_FP16)\n x86',
           'ONNXRT OpenVINO (MYRIAD_FP16)\n x86',
           'TensorFlow \n x86',
           'TensorFlow Lite \n x86'],
           ['InceptionV1', 47.9, 81.966, 70.580, 98.742, 46.294, 55.3182, 18.8422], //row 1
           ['InceptionV2', 32.7, 63.352, 54.159, 77.449, 34.613, 39.6438, 13.5714], //row 2
           ['InceptionV3', 12.1, 23.287, 20.878, 34.059, 11.999, 16.2488, 4.9924], //row 3
           ['InceptionV4', 5.26, 10.927, 6.160, 4.548, 6.494, 7.793, 2.583], //row 4
           ['TinyYoloV2', 33.559, 32.587, 0, 0, 0, 18.1846, 7.2708], //row 5
           ['TinyYoloV3', 35.092, 27.799, 0, 0, 0, 21.7334, 7.3042]  //row 6
         ]);
         var x86_materialOptions_fps = {
           width: 1000,
           chart: {
             title: 'Model Vs FPS per backend',
         }
         function init_charts(){
           view_x86_fps.setColumns([0,1, 2, 3, 4, 5, 6, 7]);
           materialChart_x86_fps.draw(view_x86_fps, x86_materialOptions_fps);
         }
           ['Model',                      //Column 0
           'ONNXRT \n x86',
           'ONNXRT OpenVINO (CPU_FP32) \n x86',
           'ONNXRT OpenVINO (GPU_FP32)\n x86',
           'ONNXRT OpenVINO (GPU_FP16)\n x86',
           'ONNXRT OpenVINO (MYRIAD_FP16)\n x86',
           'TensorFlow \n x86',
           'TensorFlow Lite \n x86'],        //Column 1
           ['InceptionV1', 94.6, 49, 31, 29, 14, 74.2, 47.6], //row 1
           ['InceptionV2', 100, 52, 28, 29, 11, 74.2, 43.6], //row 2
           ['InceptionV3', 95.2, 49, 28, 28, 13, 81, 60.2], //row 3
           ['InceptionV4', 88.8, 49, 33, 46, 11, 86, 50], //row 4
           ['TinyYoloV2',  94, 50, 0, 0, 0, 80.6, 46], //row 5
           ['TinyYoloV3',  91.4, 46, 0, 0, 0, 74.6, 42.4]  //row 6
         ]);
         var x86_materialOptions_cpu = {
           width: 1000,
           chart: {
             title: 'Model Vs CPU Load per backend',
         }
         function init_charts(){
           view_x86_cpu.setColumns([0,1, 2, 3, 4, 5, 6, 7]);
           materialChart_x86_cpu.draw(view_x86_cpu, x86_materialOptions_cpu);
         }
           'TensorFlow \n Xavier (30 W)',      //Column 3
           'TensorFlow (GPU) \n Xavier (30 W)', //Column 4
           'TensorRT \n Xavier',              //Column 5
           'ONNXRT ACL \n Xavier'],            //Column 6
           ['InceptionV1', 8.24, 52.3, 6.41, 66.27, 92.6, 17.566], //row 1
           ['InceptionV2', 6.58, 39.6, 5.11, 50.59, 0, 12.729], //row 2
           ['InceptionV3', 2.54, 17.8, 1.96, 22.95, 24.9, 5.709], //row 3
           ['InceptionV4', 1.22, 9.4, 0.98, 12.14, 13.6, 2.747], //row 4
           ['TinyYoloV2',  0, 0, 0, 0, 69.7, 9.367], //row 5
           ['TinyYoloV3',  0, 0, 0, 0, 0, 10.520]  //row 6
         ]);
         var xavier_materialOptions_fps = {
         }
         function init_charts(){
           view_xavier_fps.setColumns([0,1,2,3,4,5,6]);
           materialChart_xavier_fps.draw(view_xavier_fps, xavier_materialOptions_fps);
         }
           'TensorFlow \n Xavier (30 W)',      //Column 3
           'TensorFlow (GPU) \n Xavier (30 W)', //Column 4
           'TensorRT \n Xavier',              //Column 5
           'ONNXRT ACL \n Xavier'],            //Column 6
           ['InceptionV1', 86, 72, 93, 72, 32, 50], //row 1
           ['InceptionV2', 88, 62.6, 95, 62, 0, 49], //row 2
           ['InceptionV3', 92, 44, 98, 44, 6, 50], //row 3
           ['InceptionV4', 94, 32, 99, 32, 3, 50], //row 4
           ['TinyYoloV2',  0, 0, 0, 0, 16, 50], //row 5
           ['TinyYoloV3',  0, 0, 0, 0, 0, 50]  //row 6
         ]);
         var xavier_materialOptions_cpu = {
         }
         function init_charts(){
           view_xavier_cpu.setColumns([0,1,2,3,4,5,6]);
           materialChart_xavier_cpu.draw(view_xavier_cpu, xavier_materialOptions_cpu);
         }


== Google Coral ==
The following benchmarks were performed on the Coral Dev Board.


=== FPS Measurements ===
           'TensorFlow Lite \n Coral',
           'TensorFlow Lite EdgeTPU \n Coral'],        //Column 1
           ['InceptionV1', 3.11, 41.6], //row 1
           ['InceptionV2', 2.31, 42.8], //row 2
           ['InceptionV3', 0.9, 15.02], //row 3
           ['InceptionV4', 0, 8.56], //row 4
           ['MobileNetV2',  0, 41.12], //row 5
           ['MobileNetV2 + SSD',  0, 38.64]  //row 6
         ]);
         var Coral_materialOptions_fps = {
           'TensorFlow Lite EdgeTPU \n Coral'],        //Column 1
           ['InceptionV1', 73, 32], //row 1
           ['InceptionV2', 72, 37], //row 2
           ['InceptionV3', 74, 14], //row 3
           ['InceptionV4', 0, 5], //row 4
           ['MobileNetV2',  0, 34], //row 5
           ['MobileNetV2 + SSD',  0, 45]  //row 6
         ]);
         var Coral_materialOptions_cpu = {

Latest revision as of 20:39, 4 September 2024










<source lang="bash">
# Script to run each model
run_all_models(){

  model_array=(inceptionv1 inceptionv2 inceptionv3 inceptionv4 tinyyolov2 tinyyolov3)
  model_upper_array=(InceptionV1 InceptionV2 InceptionV3 InceptionV4 TinyYoloV2 TinyYoloV3)
  input_array=(input input input input input/Placeholder inputs )
  output_array=(InceptionV1/Logits/Predictions/Reshape_1 Softmax InceptionV3/Predictions/Reshape_1
  InceptionV4/Logits/Predictions add_8 output_boxes )

  mkdir -p logs/
  rm -f logs/*

  for ((i=0;i<${#model_array[@]};++i)); do
    echo Perf ${model_array[i]}
    gst-launch-1.0 \
    filesrc location=$VIDEO_PATH num-buffers=600 ! decodebin ! videoconvert ! \
    perf print-arm-load=true name=inputperf ! tee name=t t. ! videoscale ! queue ! net.sink_model t. ! queue ! net.sink_bypass \
    ${model_array[i]} backend=$BACKEND name=net backend::input-layer=${input_array[i]} backend::output-layer=${output_array[i]} \
    model-location="${MODELS_PATH}${model_upper_array[i]}_${INTERNAL_PATH}/graph_${model_array[i]}${EXTENSION}" \
    net.src_bypass ! perf print-arm-load=true name=outputperf ! videoconvert ! fakesink sync=false > logs/${model_array[i]}.log
  done
}
</source>
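Each run writes the perf element's output to logs/<model>.log. A summary can then be pulled from those logs. The sketch below is a hypothetical helper (not part of run_benchmark.sh) and assumes the perf element prints lines containing a `mean_fps: <value>` field; the exact GstPerf log format may differ between versions, so adjust the sed pattern to match your logs.

```shell
# Hypothetical helper: print "<model> <mean fps>" for every log produced by
# run_all_models. ASSUMES perf lines contain a "mean_fps: <value>" field;
# verify against the actual GstPerf output before relying on it.
summarize_logs() {
  logdir="$1"
  for log in "$logdir"/*.log; do
    model=$(basename "$log" .log)
    # Keep only the last mean_fps value reported in the log
    fps=$(sed -n 's/.*mean_fps: \([0-9.]*\).*/\1/p' "$log" | tail -n 1)
    printf '%s %s\n' "$model" "${fps:-N/A}"
  done
}
```

For example, `summarize_logs logs/` would print one line per model, which is convenient when collecting the numbers for the tables above.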


x86

The Desktop PC had the following specifications:

  • Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
  • 12 GB RAM
  • Linux 4.15.0-106-generic x86_64 (Ubuntu 16.04)
  • GStreamer 1.8.3
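For comparison on other machines, the platform details listed above can be gathered with standard Linux utilities. This is a convenience sketch, not part of the benchmark script; the /proc paths assume a Linux system, and the GStreamer check is guarded since gst-launch-1.0 may not be installed.

```shell
# Print the platform details used in the benchmark tables (Linux only).
grep -m1 'model name' /proc/cpuinfo || true              # CPU model (x86)
awk '/MemTotal/ {printf "%.1f GB RAM\n", $2/1048576}' /proc/meminfo
uname -r -m                                              # kernel release and architecture
command -v gst-launch-1.0 >/dev/null && gst-launch-1.0 --version || true
```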

FPS Measurements

CPU Load Measurements

Jetson AGX Xavier

The Jetson Xavier power modes used were 2 and 6 (for more information, see Supported Modes and Power Efficiency).

  • View current power mode:
$ sudo /usr/sbin/nvpmodel -q
  • Change current power mode:
$ sudo /usr/sbin/nvpmodel -m x

Where x is the power mode ID (e.g. 0, 1, 2, 3, 4, 5, 6).
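When scripting benchmark runs across power modes, it helps to read back the active mode ID. The sketch below uses a hypothetical `current_power_mode` helper and assumes `nvpmodel -q` prints an "NV Power Mode:" line followed by the numeric mode ID on the next line; the exact output format can vary between L4T releases, so check it on your board first.

```shell
# Hypothetical helper: extract the numeric power mode ID from the text
# produced by `sudo /usr/sbin/nvpmodel -q`. ASSUMES the ID appears on the
# line after "NV Power Mode:"; verify against your L4T release.
current_power_mode() {
  printf '%s\n' "$1" | awk '/NV Power Mode/ {getline; print; exit}'
}
# Usage: mode=$(current_power_mode "$(sudo /usr/sbin/nvpmodel -q)")
```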

FPS Measurements

CPU Load Measurements

Jetson TX2

FPS Measurements

CPU Load Measurements

Jetson Nano

FPS Measurements

CPU Load Measurements


