GStreamer Performance Review for the Dragonwing EVK Board

From RidgeRun Developer Wiki


Problems running the pipelines shown on this page? Please see our GStreamer Debugging guide for help.


Performance

This section reviews performance measurements for several scenarios covering encoding and decoding elements as well as camera capture. The goal is to highlight processing speed and the differences between GPU-, VPU- and CPU-based elements.

How to Measure Performance

The performance analysis was done using three different benchmarks: element latency, average behavior and limit performance of each pipeline.

Element Latency

Element latency was measured with the script tool described in the Pipeline Latency section of RidgeRun's Developer Manual. The script parses and summarizes latency measurements obtained from a GStreamer pipeline and produces a detailed table with statistics such as average, minimum and maximum latency and percentiles.
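
To illustrate the statistics reported in those tables, the sketch below computes the count, average, minimum, maximum and one nearest-rank percentile from a list of latency samples. It is not RidgeRun's script: the function name latency_stats and the input format (one latency value in milliseconds per line) are assumptions made for this example, and the percentile definition used by the real tool may differ.

```shell
#!/bin/sh
# latency_stats PCT
# Hypothetical helper: reads one latency sample (ms) per line on stdin and
# prints "count avg min max percentile". PCT is an integer from 1 to 100.
latency_stats() {
  pct="${1:-50}"
  sort -n | awk -v pct="$pct" '
    { v[NR] = $1; sum += $1 }
    END {
      if (NR == 0) exit 1
      # Nearest-rank percentile: rank = ceil(pct * N / 100).
      rank = int((pct * NR + 99) / 100)
      if (rank < 1) rank = 1
      printf "%d %.3f %.3f %.3f %.3f\n", NR, sum / NR, v[1], v[NR], v[rank]
    }'
}
```

For example, `printf '4\n1\n3\n2\n' | latency_stats 50` prints `4 2.500 1.000 4.000 2.000`. The samples themselves have to come from a tracing tool; GStreamer's built-in latency tracer is one possible source of per-element measurements.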

Average Behavior

The average behavior of each pipeline was measured by reading memory and GPU utilization with the following commands while running realistic test cases, using the example pipelines listed for each tested element:

watch ps -o pid,cmd,%mem,rss -C gst-launch-1.0
watch -n 1 cat /sys/class/kgsl/kgsl-3d0/gpubusy

These commands display the raw and relative memory usage and the raw and relative GPU usage. In addition, the RidgeRun team developed a GStreamer element named perf, which measures CPU load and output FPS. See GstPerf for installation and usage instructions.
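
The raw gpubusy reading can be turned into a percentage with a small helper. This is a minimal sketch that assumes the file reports two integers, busy time followed by total time for the last sampling window; verify that layout on your kernel before relying on it. The function name gpu_busy_percent is hypothetical.

```shell
#!/bin/sh
# gpu_busy_percent
# Hypothetical helper: reads "busy total" on stdin (assumed gpubusy layout)
# and prints busy/total as a percentage with one decimal place.
gpu_busy_percent() {
  awk '{
    if ($2 > 0) printf "%.1f\n", 100 * $1 / $2
    else        print "0.0"   # no time sampled: report an idle GPU
  }'
}

# Usage on the target board:
#   gpu_busy_percent < /sys/class/kgsl/kgsl-3d0/gpubusy
```

Wrapping the usage line in `watch -n 1` gives the same periodic view as the command above, with the relative value already computed.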

Limit Performance

Limit performance is measured by pushing the FPS output to its maximum while avoiding unnecessary pipeline overhead and any bottleneck coming from elements outside the testing scope. For this reason the pipelines are structured as source, element under test and fakesink, as in the following pipeline:

gst-launch-1.0 videotestsrc num-buffers=1 pattern=ball ! "video/x-raw,format=${FORMAT},height=${HEIGHT},width=${WIDTH}" ! imagefreeze ! queue ! testelement ! queue ! perf print-cpu-load=true ! fakesink

This pipeline structure removes any overhead added by unrelated elements and allows the tested element to operate at its performance limit. The perf element is added with print-cpu-load=true to report CPU usage and the maximum FPS achieved.
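
The template can be filled in with a small helper that substitutes the element under test and the caps variables. This is only a sketch: the function name limit_pipeline and the default values (NV12, 1920x1080) are assumptions for illustration, not values taken from the measurements on this page.

```shell
#!/bin/sh
# limit_pipeline ELEMENT [FORMAT] [WIDTH] [HEIGHT]
# Hypothetical helper: prints the limit-performance command for ELEMENT.
# The defaults below are illustrative assumptions.
limit_pipeline() {
  element="$1"
  format="${2:-NV12}"
  width="${3:-1920}"
  height="${4:-1080}"
  printf 'gst-launch-1.0 videotestsrc num-buffers=1 pattern=ball ! "video/x-raw,format=%s,height=%s,width=%s" ! imagefreeze ! queue ! %s ! queue ! perf print-cpu-load=true ! fakesink\n' \
    "$format" "$height" "$width" "$element"
}
```

Printing the command instead of running it makes it easy to review first; piping the output to `sh` executes it, for example `limit_pipeline videoconvert RGBA 640 480 | sh`.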

copy element

The copy element was needed in the limit performance measurements for the transformation elements in the Video Transformation section, because of how imagefreeze interacts with those elements and the EGL textures package. Its source code and the commands needed to install it follow.

Source code for the copy element (gstcopy.c):

#include <string.h>             /* memcpy */

#include <gst/gst.h>
#include <gst/base/gstbasetransform.h>
#include <gst/video/video.h>


G_BEGIN_DECLS

#define GST_TYPE_COPY (gst_copy_get_type())
G_DECLARE_FINAL_TYPE(GstCopy, gst_copy, GST, COPY, GstBaseTransform)

G_END_DECLS


GST_DEBUG_CATEGORY_STATIC (gst_copy_debug_category);

#define GST_CAT_DEFAULT gst_copy_debug_category

struct _GstCopy
{
  GstBaseTransform base_copy;
};


static GstFlowReturn gst_copy_transform_frame (GstBaseTransform * filter,
    GstBuffer *, GstBuffer *);
static gboolean gst_copy_start (GstBaseTransform * trans);

enum
{
  PROP_0,
};

#define GST_BAYER_CAPS_MAKE(format) \
  "video/x-bayer,"                  \
  "format=" format                  \
  ","                               \
  "width=" GST_VIDEO_SIZE_RANGE     \
  ","                               \
  "height=" GST_VIDEO_SIZE_RANGE    \
  ","                               \
  "framerate=" GST_VIDEO_FPS_RANGE

#define VIDEO_SRC_CAPS GST_VIDEO_CAPS_MAKE(GST_VIDEO_FORMATS_ALL)
//    GST_BAYER_CAPS_MAKE("{ rggb, bggr, gbrg, grbg }")

#define VIDEO_SINK_CAPS GST_VIDEO_CAPS_MAKE(GST_VIDEO_FORMATS_ALL)
//   GST_BAYER_CAPS_MAKE("{ rggb, bggr, gbrg, grbg }")

G_DEFINE_TYPE_WITH_CODE (GstCopy, gst_copy, GST_TYPE_BASE_TRANSFORM,
    GST_DEBUG_CATEGORY_INIT (gst_copy_debug_category, "copy", 0,
        "debug category for copy element"));


static void
gst_copy_class_init (GstCopyClass * klass)
{
  GstBaseTransformClass *base_transform_class =
      GST_BASE_TRANSFORM_CLASS (klass);

  /* Setting up pads and setting metadata should be moved to
     base_class_init if you intend to subclass this class. */
  gst_element_class_add_pad_template (GST_ELEMENT_CLASS (klass),
      gst_pad_template_new ("src", GST_PAD_SRC, GST_PAD_ALWAYS,
          gst_caps_from_string (VIDEO_SRC_CAPS)));
  gst_element_class_add_pad_template (GST_ELEMENT_CLASS (klass),
      gst_pad_template_new ("sink", GST_PAD_SINK, GST_PAD_ALWAYS,
          gst_caps_from_string (VIDEO_SINK_CAPS)));

  gst_element_class_set_static_metadata (GST_ELEMENT_CLASS (klass),
      "Copy element", "Generic",
      "Copy element over a video stream",
      "Luis G. Leon-Vega <luis.leon@ridgerun.com>");

  base_transform_class->transform =
      GST_DEBUG_FUNCPTR (gst_copy_transform_frame);
  base_transform_class->start = GST_DEBUG_FUNCPTR (gst_copy_start);
}

static void
gst_copy_init (GstCopy * self)
{
}

static gboolean
gst_copy_start (GstBaseTransform * trans)
{
  gst_base_transform_set_in_place (trans, FALSE);
  return TRUE;
}

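static GstFlowReturn
gst_copy_transform_frame (GstBaseTransform * filter,
    GstBuffer * inframe, GstBuffer * outframe)
{
  GstMapInfo inmap;
  GstMapInfo outmap;

  /* Map both buffers and bail out if either mapping fails. */
  if (!gst_buffer_map (inframe, &inmap, GST_MAP_READ)) {
    GST_ERROR_OBJECT (filter, "Failed to map input buffer");
    return GST_FLOW_ERROR;
  }
  if (!gst_buffer_map (outframe, &outmap, GST_MAP_WRITE)) {
    GST_ERROR_OBJECT (filter, "Failed to map output buffer");
    gst_buffer_unmap (inframe, &inmap);
    return GST_FLOW_ERROR;
  }

  /* Copy the frame; never write past the end of the output buffer. */
  memcpy (outmap.data, inmap.data, MIN (inmap.size, outmap.size));

  gst_buffer_unmap (outframe, &outmap);
  gst_buffer_unmap (inframe, &inmap);

  return GST_FLOW_OK;
}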

static gboolean
plugin_init (GstPlugin * plugin)
{
  if (gst_element_register (plugin, "copy", GST_RANK_PRIMARY,
          GST_TYPE_COPY) == FALSE) {
    return FALSE;
  }

  return TRUE;
}

#ifndef PACKAGE
#define PACKAGE "copy"
#endif


GST_PLUGIN_DEFINE (GST_VERSION_MAJOR,
    GST_VERSION_MINOR,
    copy,
    "Image processing library plugin",
    plugin_init, "0.1.0", "Proprietary", "RidgeRun", "https://www.ridgerun.com")


  • Build and install the copy element:
gcc -fPIC -c gstcopy.c -o gstcopy.o `pkg-config --cflags gstreamer-1.0 gstreamer-base-1.0 gstreamer-video-1.0` && gcc -shared gstcopy.o -o libgstcopy.so `pkg-config --libs gstreamer-1.0 gstreamer-base-1.0 gstreamer-video-1.0`

sudo cp libgstcopy.so /usr/lib/aarch64-linux-gnu/gstreamer-1.0/

Simple Pipeline

gst-launch-1.0 videotestsrc is-live=true num-buffers=300 ! "video/x-raw,width=1920,height=1080,framerate=30/1" ! videoconvert ! fakesink
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
videoconvert0 300 0.090 0.055 0.159 0.087 0.107 0.122 0.150
capsfilter0 300 0.051 0.026 0.089 0.049 0.062 0.069 0.078
Total Avg (ms) 0.140
Total Avg (ns) 140331
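
In the tables on this page, the Total Avg row matches the sum of the per-element averages up to rounding of the displayed values (here 0.090 + 0.051 against a reported 0.140 ms). The sketch below recomputes that sum from a table pasted as plain text. It assumes the row layout shown above, with the average in the third whitespace-separated field; the function name sum_element_avg is hypothetical.

```shell
#!/bin/sh
# sum_element_avg
# Hypothetical helper: sums the "Average (ms)" column (3rd field) of the
# table rows read from stdin, skipping the header and the "Total Avg" rows.
sum_element_avg() {
  awk '$1 != "ELEMENT" && $1 != "Total" { sum += $3 }
       END { printf "%.3f\n", sum }'
}
```

Feeding it the two element rows of the Simple Pipeline table prints 0.141, one thousandth above the reported total, which is consistent with the total being computed from unrounded nanosecond values.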

Software MP4 Encoding Pipeline

gst-launch-1.0 -v videotestsrc is-live=true num-buffers=300 ! "video/x-raw,width=1920,height=1080,framerate=30/1" ! videoconvert ! x264enc ! mp4mux ! filesink location=test-video.mp4
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
x264enc0 291 4105.206 1977.190 6905.197 3936.793 5775.008 6340.635 6796.034
mp4mux0 300 0.549 0.098 4.186 0.226 0.873 3.201 3.625
videoconvert0 300 0.093 0.067 3.127 0.077 0.103 0.109 0.139
capsfilter0 300 0.049 0.033 0.086 0.048 0.056 0.063 0.075
Total Avg (ms) 4105.898
Total Avg (ns) 4105897678

Software MP4 Decoding Pipeline

gst-launch-1.0 -v filesrc location=test-video.mp4 ! qtdemux ! h264parse ! avdec_h264 ! fakesink
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
avdec_h264-0 151 7.589 2.435 11.529 8.184 10.059 10.468 10.990
h264parse0 300 0.096 0.038 0.317 0.088 0.133 0.144 0.176
qtdemux0 300 0.044 0.018 0.109 0.044 0.057 0.060 0.087
Total Avg (ms) 7.730
Total Avg (ns) 7729636

VPU Accelerated MP4 Encoding Pipeline

gst-launch-1.0 -e -v videotestsrc is-live=true num-buffers=300 ! "video/x-raw,width=1920,height=1080,framerate=30/1" ! videoconvert ! v4l2h264enc ! h264parse ! mp4mux ! filesink location=test-video-v4l2.mp4
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
v4l2h264enc0 300 3.920 3.119 10.432 3.923 4.311 4.331 4.598
mp4mux0 300 0.389 0.044 0.684 0.450 0.607 0.629 0.672
h264parse0 300 0.232 0.105 2.049 0.223 0.326 0.349 0.400
videoconvert0 300 0.111 0.056 0.208 0.109 0.120 0.143 0.175
capsfilter0 300 0.077 0.031 0.138 0.077 0.082 0.083 0.122
Total Avg (ms) 4.729
Total Avg (ns) 4728737

VPU Accelerated MP4 Decoding Pipeline

gst-launch-1.0 -v filesrc location=test-video-v4l2.mp4 ! qtdemux ! h264parse ! v4l2h264dec ! fakesink
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
v4l2h264dec0 300 1.595 0.654 8.088 1.564 2.082 2.210 2.545
h264parse0 300 0.085 0.049 0.448 0.073 0.123 0.165 0.237
qtdemux0 300 0.044 0.026 0.114 0.038 0.061 0.075 0.110
Total Avg (ms) 1.723
Total Avg (ns) 1723325

Hardware Accelerated Source and Sink Pipeline

gst-launch-1.0 -v gltestsrc is-live=true num-buffers=300 ! "video/x-raw(memory:GLMemory),format=RGBA,width=1920,height=1080,framerate=30/1" ! glimagesink
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
gluploadelement0 300 0.103 0.083 0.223 0.099 0.125 0.135 0.177
capsfilter0 300 0.071 0.033 0.131 0.070 0.075 0.079 0.108
glcolorbalance0 300 0.046 0.022 0.108 0.036 0.083 0.088 0.100
glcolorconvertelement0 300 0.041 0.021 0.109 0.037 0.041 0.067 0.097
Total Avg (ms) 0.261
Total Avg (ns) 260705

Hardware Accelerated MP4 Encoding Pipeline

gst-launch-1.0 -v gltestsrc is-live=true num-buffers=300 ! "video/x-raw(memory:GLMemory),format=RGBA,width=1920,height=1080,framerate=30/1" ! glcolorconvert ! gldownload ! "video/x-raw,format=NV12,width=1920,height=1080,framerate=30/1" ! v4l2h264enc ! h264parse ! mp4mux ! filesink location=gltestsrc-hw.mp4
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
v4l2h264enc0 300 12.649 6.691 17.707 12.730 14.558 15.204 16.562
h264parse0 300 4.305 0.422 11.472 4.177 7.386 9.345 10.540
gldownloadelement0 300 2.513 0.335 13.888 2.767 3.043 3.148 3.330
glcolorconvertelement0 300 1.602 0.725 14.020 1.616 1.738 1.767 1.811
mp4mux0 300 0.469 0.057 0.682 0.565 0.595 0.607 0.629
capsfilter1 300 0.117 0.043 7.097 0.092 0.116 0.126 0.139
capsfilter0 300 0.080 0.030 0.141 0.081 0.096 0.098 0.111
Total Avg (ms) 21.734
Total Avg (ns) 21734135

Hardware Accelerated MP4 Decoding Pipeline

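Note that this pipeline decodes in software with avdec_h264; only the upload and rendering stages (glupload and glimagesink) run on the GPU.

gst-launch-1.0 -v filesrc location=gltestsrc-hw.mp4 ! qtdemux ! h264parse ! avdec_h264 ! glupload ! glimagesink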
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
avdec_h264-0 300 240.334 8.167 259.325 243.318 248.689 250.206 254.286
gluploadelement0 300 1.622 0.136 3.513 1.336 3.287 3.326 3.396
glcolorconvertelement0 300 1.424 0.544 24.528 1.270 1.842 1.901 1.989
h264parse0 300 0.125 0.042 0.272 0.116 0.156 0.161 0.201
gluploadelement1 300 0.114 0.037 0.189 0.131 0.159 0.171 0.179
glcolorbalance0 300 0.102 0.044 0.183 0.120 0.150 0.155 0.166
qtdemux0 300 0.065 0.027 0.148 0.063 0.075 0.090 0.108
Total Avg (ms) 243.787
Total Avg (ns) 243786789

Camera recording at 1280x720@30fps

gst-launch-1.0 -e qtiqmmfsrc name=camsrc camera=0 ! "video/x-raw,format=NV12,width=1280,height=720,framerate=30/1,interlace-mode=progressive,colorimetry=bt601" ! v4l2h264enc capture-io-mode=4 output-io-mode=5 extra-controls="controls,video_bitrate=6000000,video_bitrate_mode=0;" ! h264parse ! mp4mux ! filesink location=camera-test.mp4
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
v4l2h264enc0 323 3.195 1.970 7.934 3.174 3.537 3.582 3.643
mp4mux0 323 0.261 0.062 1.508 0.171 0.588 0.614 0.715
h264parse0 323 0.195 0.124 0.689 0.198 0.258 0.271 0.311
capsfilter0 323 0.071 0.043 0.125 0.070 0.076 0.103 0.114
Total Avg (ms) 3.722
Total Avg (ns) 3721899

Camera recording at 1920x1080@30fps

gst-launch-1.0 -e qtiqmmfsrc name=camsrc camera=0 ! "video/x-raw,format=NV12,width=1920,height=1080,framerate=30/1,interlace-mode=progressive,colorimetry=bt601" ! v4l2h264enc capture-io-mode=4 output-io-mode=5 extra-controls="controls,video_bitrate=6000000,video_bitrate_mode=0;" ! h264parse ! mp4mux ! filesink location=camera-test.mp4
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
v4l2h264enc0 300 4.691 3.028 11.355 4.718 5.031 5.105 5.174
mp4mux0 300 0.333 0.062 0.753 0.270 0.616 0.631 0.693
h264parse0 300 0.224 0.118 0.758 0.223 0.269 0.300 0.389
capsfilter0 300 0.069 0.043 0.181 0.068 0.077 0.100 0.120
Total Avg (ms) 5.318
Total Avg (ns) 5317801

Camera recording at 4K@30fps

gst-launch-1.0 -e qtiqmmfsrc name=camsrc camera=0 ! "video/x-raw,format=NV12,width=3840,height=2160,framerate=30/1,interlace-mode=progressive,colorimetry=bt601" ! v4l2h264enc capture-io-mode=4 output-io-mode=5 extra-controls="controls,video_bitrate=6000000,video_bitrate_mode=0;" ! h264parse ! mp4mux ! filesink location=camera-test.mp4
ELEMENT COUNT Average (ms) Minimum (ms) Maximum (ms) P50 (ms) P90 (ms) P95 (ms) P99 (ms)
v4l2h264enc0 297 10.137 7.969 28.000 10.043 10.616 10.747 12.042
mp4mux0 297 0.362 0.058 1.257 0.362 0.633 0.688 0.847
h264parse0 297 0.224 0.119 0.994 0.208 0.283 0.313 0.631
capsfilter0 297 0.072 0.033 0.226 0.070 0.077 0.103 0.114
Total Avg (ms) 10.796
Total Avg (ns) 10796452


Info
More performance information is covered in the following sections.

