== Jetson AGX Xavier ==

=== FPS Measurements ===
<html>
<div id="chart_fps_xavier" style="margin: auto; width: 800px; height: 500px;"></div>
<script>
google.charts.load('current', {'packages':['corechart', 'bar']});
google.charts.setOnLoadCallback(drawStuffXavierFps);
function drawStuffXavierFps() {
    var chartDiv_Fps_Xavier = document.getElementById('chart_fps_xavier');
    /* ... FPS data table, chart options, and drawMaterialChart() definition not captured in this excerpt ... */
    function init_charts(){
        view_xavier_fps.setColumns([0,1,2,3,4,5]);
        materialChart_xavier_fps.draw(view_xavier_fps, xavier_materialOptions_fps);
    }
    drawMaterialChart();
}
</script>
</html>
=== CPU Load Measurements ===
<html>
<style>
.button {
    background-color: #008CBA;
    border: none;
    color: white;
    padding: 15px 32px;
    text-align: center;
    text-decoration: none;
    display: inline-block;
    font-size: 16px;
    margin: 4px 2px;
    cursor: pointer;
}
</style>
<div id="chart_cpu_xavier" style="margin: auto; width: 800px; height: 500px;"></div>
<script>
google.charts.load('current', {'packages':['corechart', 'bar']});
google.charts.setOnLoadCallback(drawStuffXavierCpu);
function drawStuffXavierCpu() {
    var chartDiv_Cpu_Xavier = document.getElementById('chart_cpu_xavier');
    var table_models_cpu_xavier = google.visualization.arrayToDataTable([
        ['Model',                             // Column 0
         'TensorFlow \n Xavier (15 W)',       // Column 1
         'TensorFlow (GPU) \n Xavier (15 W)', // Column 2
         'TensorFlow \n Xavier (30 W)',       // Column 3
         'TensorFlow (GPU) \n Xavier (30 W)', // Column 4
         'TensorRT \n Xavier'],               // Column 5
        ['InceptionV1', 86, 72, 93, 72, 32],  // row 1
        ['InceptionV2', 88, 62.6, 95, 62, 0], // row 2
        ['InceptionV3', 92, 44, 98, 44, 6],   // row 3
        ['InceptionV4', 94, 32, 99, 32, 3],   // row 4
        ['TinyYoloV2', 0, 0, 0, 0, 16],       // row 5
        ['TinyYoloV3', 0, 0, 0, 0, 0]         // row 6
    ]);
    var xavier_materialOptions_cpu = {
        width: 900,
        chart: {
            title: 'Model Vs CPU load per backend',
        },
        series: {
        },
        axes: {
            y: {
                distance: {side: 'left', label: 'CPU Load'}, // Left y-axis.
            }
        }
    };
    var materialChart_xavier_cpu = new google.charts.Bar(chartDiv_Cpu_Xavier);
    view_xavier_cpu = new google.visualization.DataView(table_models_cpu_xavier);
    function drawMaterialChart() {
        var materialChart_xavier_cpu = new google.charts.Bar(chartDiv_Cpu_Xavier);
        materialChart_xavier_cpu.draw(table_models_cpu_xavier, google.charts.Bar.convertOptions(xavier_materialOptions_cpu));
        init_charts();
    }
    function init_charts(){
        view_xavier_cpu.setColumns([0,1,2,3,4,5]);
        materialChart_xavier_cpu.draw(view_xavier_cpu, xavier_materialOptions_cpu);
    }
    drawMaterialChart();
}
</script>
</html>
Revision as of 20:02, 17 July 2020
Make sure you also check GstInference's companion project: R2Inference
GstInference Benchmarks
The following benchmarks were run with a 1920x1080@60 source video, using the base GStreamer pipeline and environment variables below:

VIDEO_FILE='video.mp4'
MODEL_LOCATION='graph_inceptionv1_tensorflow.pb'
INPUT_LAYER='input'
OUTPUT_LAYER='InceptionV1/Logits/Predictions/Reshape_1'

The environment variables were changed according to the model used (Inception V1, V2, V3, or V4).

GST_DEBUG=inception1:1 gst-launch-1.0 filesrc location=$VIDEO_FILE ! decodebin ! videoconvert ! videoscale ! queue ! net.sink_model inceptionv1 name=net model-location=$MODEL_LOCATION backend=tensorflow backend::input-layer=$INPUT_LAYER backend::output-layer=$OUTPUT_LAYER net.src_model ! perf ! fakesink -v
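Since only the model-specific variables change between runs, the four Inception benchmarks can be driven from one loop. A minimal sketch: only the V1 file name follows from the example above; the input/output layer names differ per model and must be taken from each graph, so the actual gst-launch call is left commented out.

```shell
#!/bin/sh
# Sketch: iterate the benchmark over the four Inception models.
# The graph file naming pattern is taken from the V1 example above;
# INPUT_LAYER/OUTPUT_LAYER are model-specific and not filled in here.
VIDEO_FILE='video.mp4'

for v in 1 2 3 4; do
    MODEL_LOCATION="graph_inceptionv${v}_tensorflow.pb"
    echo "benchmarking inceptionv${v}: ${MODEL_LOCATION}"
    # GST_DEBUG=inception${v}:1 gst-launch-1.0 filesrc location=$VIDEO_FILE ! decodebin ! \
    #     videoconvert ! videoscale ! queue ! net.sink_model inceptionv${v} name=net \
    #     model-location=$MODEL_LOCATION backend=tensorflow net.src_model ! perf ! fakesink -v
done
```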
The Desktop PC had the following specifications:
- Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz
- 8 GB RAM
- Cedar [Radeon HD 5000/6000/7350/8350 Series]
- Linux 4.15.0-54-generic x86_64 (Ubuntu 16.04)
The Jetson Xavier power modes used were 2 and 6 (more information: Supported Modes and Power Efficiency).

- View the current power mode:

$ sudo /usr/sbin/nvpmodel -q

- Change the current power mode:

$ sudo /usr/sbin/nvpmodel -m x

Where x is the power mode ID (e.g. 0, 1, 2, 3, 4, 5, 6).
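When scripting a benchmark sweep it helps to reject bad mode IDs before invoking nvpmodel. A small sketch; set_power_mode is a hypothetical helper (not part of nvpmodel), and NVPMODEL can be overridden for dry runs:

```shell
#!/bin/sh
# Hypothetical wrapper: only call nvpmodel when the mode ID is in 0-6.
# NVPMODEL defaults to the real binary but can be overridden for testing.
NVPMODEL="${NVPMODEL:-sudo /usr/sbin/nvpmodel}"

set_power_mode() {
    case "$1" in
        [0-6]) $NVPMODEL -m "$1" ;;
        *) echo "invalid power mode id: $1" >&2; return 1 ;;
    esac
}
```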
Summary

Desktop PC (CPU Library)

| Model | Framerate (FPS) | CPU Usage (%) |
|---|---|---|
| Inception V1 | 11.89 | 48 |
| Inception V2 | 10.33 | 65 |
| Inception V3 | 5.41 | 90 |
| Inception V4 | 3.81 | 94 |
Jetson Xavier (15W)

| Model | CPU Library: Framerate (FPS) | CPU Library: CPU Usage (%) | GPU Library: Framerate (FPS) | GPU Library: CPU Usage (%) |
|---|---|---|---|---|
| Inception V1 | 8.24 | 86 | 52.3 | 43 |
| Inception V2 | 6.58 | 88 | 39.6 | 42 |
| Inception V3 | 2.54 | 92 | 17.8 | 25 |
| Inception V4 | 1.22 | 94 | 9.4 | 20 |
Jetson Xavier (30W)

| Model | CPU Library: Framerate (FPS) | CPU Library: CPU Usage (%) | GPU Library: Framerate (FPS) | GPU Library: CPU Usage (%) |
|---|---|---|---|---|
| Inception V1 | 6.41 | 93 | 66.27 | 72 |
| Inception V2 | 5.11 | 95 | 50.59 | 62 |
| Inception V3 | 1.96 | 98 | 22.95 | 44 |
| Inception V4 | 0.98 | 99 | 12.14 | 32 |
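One way to read the summary tables: dividing the GPU-library framerate by the CPU-library framerate gives the per-model speedup of the GPU backend. A quick sketch over the Jetson Xavier (30W) numbers above:

```shell
#!/bin/sh
# Per-model GPU-over-CPU framerate speedup, computed from the
# Jetson Xavier (30W) summary table above (model, CPU FPS, GPU FPS).
awk -F',' '{ printf "%s: %.1fx\n", $1, $3 / $2 }' <<'EOF'
Inception V1,6.41,66.27
Inception V2,5.11,50.59
Inception V3,1.96,22.95
Inception V4,0.98,12.14
EOF
```

The deeper models gain the most from the GPU backend, since their CPU framerates drop faster than their GPU framerates.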
Framerate
CPU Usage
TensorFlow Lite Benchmarks
FPS Measurements
CPU Usage Measurements
Test benchmark video
The following video was used to perform the benchmark tests.
To download the video, right-click on it, select 'Save video as', and save it to your computer.
ONNXRT Benchmarks
The Desktop PC had the following specifications:
- Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
- 12 GB RAM
- Linux 4.15.0-106-generic x86_64 (Ubuntu 16.04)
- GStreamer 1.8.3
The following was the GStreamer pipeline used to obtain the results:
# MODELS_PATH has the following structure
# /path/to/models/
# ├── InceptionV1_onnxrt
# │   ├── graph_inceptionv1_info.txt
# │   ├── graph_inceptionv1.onnx
# │   └── labels.txt
# ├── InceptionV2_onnxrt
# │   ├── graph_inceptionv2_info.txt
# │   ├── graph_inceptionv2.onnx
# │   └── labels.txt
# ├── InceptionV3_onnxrt
# │   ├── graph_inceptionv3_info.txt
# │   ├── graph_inceptionv3.onnx
# │   └── labels.txt
# ├── InceptionV4_onnxrt
# │   ├── graph_inceptionv4_info.txt
# │   ├── graph_inceptionv4.onnx
# │   └── labels.txt
# ├── TinyYoloV2_onnxrt
# │   ├── graph_tinyyolov2_info.txt
# │   ├── graph_tinyyolov2.onnx
# │   └── labels.txt
# └── TinyYoloV3_onnxrt
#     ├── graph_tinyyolov3_info.txt
#     ├── graph_tinyyolov3.onnx
#     └── labels.txt

model_array=(inceptionv1 inceptionv2 inceptionv3 inceptionv4 tinyyolov2 tinyyolov3)
model_upper_array=(InceptionV1 InceptionV2 InceptionV3 InceptionV4 TinyYoloV2 TinyYoloV3)

MODELS_PATH=/path/to/models/
INTERNAL_PATH=onnxrt
EXTENSION=".onnx"

gst-launch-1.0 \
filesrc location=$VIDEO_PATH num-buffers=600 ! decodebin ! videoconvert ! \
perf print-arm-load=true name=inputperf ! tee name=t t. ! videoscale ! queue ! net.sink_model t. ! queue ! net.sink_bypass \
${model_array[i]} backend=onnxrt name=net \
model-location="${MODELS_PATH}${model_upper_array[i]}_${INTERNAL_PATH}/graph_${model_array[i]}${EXTENSION}" \
net.src_bypass ! perf print-arm-load=true name=outputperf ! videoconvert ! fakesink sync=false
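The pipeline above indexes model_array with $i, but the surrounding loop is not shown; presumably the command was run once per model. A sketch of such a wrapper (the loop itself is an assumption, and the gst-launch call is left commented out so only the model-location construction is exercised):

```shell
#!/bin/bash
# Assumed outer loop for the ONNXRT benchmark pipeline above:
# run the command once per model, building model-location for each index.
model_array=(inceptionv1 inceptionv2 inceptionv3 inceptionv4 tinyyolov2 tinyyolov3)
model_upper_array=(InceptionV1 InceptionV2 InceptionV3 InceptionV4 TinyYoloV2 TinyYoloV3)
MODELS_PATH=/path/to/models/
INTERNAL_PATH=onnxrt
EXTENSION=".onnx"

for i in "${!model_array[@]}"; do
    location="${MODELS_PATH}${model_upper_array[i]}_${INTERNAL_PATH}/graph_${model_array[i]}${EXTENSION}"
    echo "running ${model_array[i]} -> ${location}"
    # gst-launch-1.0 filesrc location=$VIDEO_PATH num-buffers=600 ! decodebin ! videoconvert ! \
    #     perf print-arm-load=true name=inputperf ! tee name=t t. ! videoscale ! queue ! net.sink_model \
    #     t. ! queue ! net.sink_bypass ${model_array[i]} backend=onnxrt name=net \
    #     model-location="$location" net.src_bypass ! perf print-arm-load=true name=outputperf ! \
    #     videoconvert ! fakesink sync=false
done
```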
FPS Measurements
CPU Load Measurements
Test benchmark video
The following video was used to perform the benchmark tests.
To download the video, right-click on it, select 'Save video as', and save it to your computer.
TensorRT Benchmarks
FPS Measurements
CPU Load Measurements
Jetson AGX Xavier
FPS Measurements
CPU Load Measurements