BIPS/Performance

From RidgeRun Developer Wiki
Revision as of 20:57, 8 January 2023



Introduction

To evaluate BIPS' performance, we measure Frame Rates, Bandwidths and Latencies under different conditions. We use the provided example binaries to obtain these measurements; please refer to BIPS/Examples/C++ for more information.

The following diagram illustrates the basic timing when running a producer process in parallel with a consumer process.

Frame Rate

We measure the Frame Rate in Frames Per Second (FPS) as the number of production or consumption cycles a client completes in a second. Notice that each cycle includes a Pull and a Push duration, which correspond directly to BIPS' overhead, but also a Processing duration (labeled Production or Consumption in the diagram) and an Inter Iteration duration. The latter two vary with the use case. We provide performance results for cases that minimize the Processing and Inter Iteration durations, as well as for cases that simulate a realistic exchange of image buffers between clients.
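
The relationship between the cycle durations and the Frame Rate can be sketched as follows. This is only an illustration, not part of BIPS: the `CycleTiming` class, the `frame_rate` function and the timing values are hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class CycleTiming:
    """Durations of one production or consumption cycle, in seconds."""
    pull: float             # time spent in Pull (BIPS overhead)
    processing: float       # Production or Consumption work
    push: float             # time spent in Push (BIPS overhead)
    inter_iteration: float  # time between the end of one cycle and the next Pull

    def period(self) -> float:
        # A full cycle is the sum of the four durations.
        return self.pull + self.processing + self.push + self.inter_iteration


def frame_rate(cycles: list[CycleTiming]) -> float:
    """Average number of cycles completed per second."""
    total = sum(c.period() for c in cycles)
    if total <= 0:
        raise ValueError("cycles must cover a positive amount of time")
    return len(cycles) / total


# Hypothetical timings: 100 identical cycles of 10 ms each.
cycles = [CycleTiming(pull=0.001, processing=0.006, push=0.001, inter_iteration=0.002)] * 100
print(f"{frame_rate(cycles):.1f} FPS")
```

With these made-up numbers only 2 ms of each 10 ms cycle is Pull and Push, which is why the results separate a minimal-processing case from a more realistic one.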

Bandwidth

We measure the Bandwidth in Bytes Per Second (BPS) as the number of bytes exchanged between clients in a second. To obtain the Bandwidth, we multiply the Buffer Size by the Frame Rate and by a factor of 2, which accounts for both the Pull-Push exchange from Producer to Consumer and the Pull-Push exchange from Consumer to Producer.
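
As a small worked example of this formula, the sketch below applies it to the 4-byte Buffer case reported in the Results section. The function name `bandwidth_bps` is hypothetical and used only for illustration.

```python
def bandwidth_bps(buffer_size_bytes: int, frame_rate_fps: float) -> float:
    """Bytes per second exchanged between a Producer and a Consumer.

    The factor of 2 counts both directions of the exchange:
    Producer to Consumer and Consumer to Producer.
    """
    return 2 * buffer_size_bytes * frame_rate_fps


# 4-byte Buffers at 7000 exchanges per second (figures quoted in the Results).
print(bandwidth_bps(4, 7000))  # 56000 bytes per second, i.e. 56 KB/s
```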

Latency

We measure the Inter-Process Latency as the delay between the Production Cycle and the Consumption Cycle, as shown in the diagram. To capture this metric, the Producer attaches a timestamp to the Buffer at the time of Production, and the Consumer reads that timestamp at the time of Consumption. The Consumer then computes the latency as the difference between the time of Consumption and the Production time recorded in the Buffer timestamp.
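
The timestamp mechanism can be sketched as below. This is a minimal illustration that does not use the BIPS API: a plain `bytearray` stands in for a shared Buffer, and storing the timestamp as a 64-bit nanosecond value at the start of the buffer is an assumption made for the example. It also assumes that both processes read a clock with a common reference, so that the two readings are comparable.

```python
import struct
import time

# Assumed layout: a signed 64-bit nanosecond timestamp at offset 0.
TIMESTAMP = struct.Struct("<q")


def produce(buffer: bytearray) -> None:
    """Producer side: stamp the buffer at the time of Production."""
    TIMESTAMP.pack_into(buffer, 0, time.monotonic_ns())


def consume(buffer: bytearray) -> int:
    """Consumer side: return the latency in nanoseconds."""
    (produced_ns,) = TIMESTAMP.unpack_from(buffer, 0)
    return time.monotonic_ns() - produced_ns


buffer = bytearray(TIMESTAMP.size)  # stand-in for a shared Buffer
produce(buffer)
latency_ns = consume(buffer)
print(f"latency: {latency_ns / 1e3:.1f} us")
```

Here both functions run in the same process, so the printed value only reflects the cost of the two calls; in a real measurement `produce` and `consume` would run in different processes.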

Results

Here we plot the Frame Rate, Bandwidth and Latency as a function of the number of buffers we use in each test. Each test runs for at least 100 iterations. The metrics are then averaged across the different iterations.

Using 4-byte Buffers

To reduce the Processing and Inter Iteration overhead, we exchange small 4-byte Buffers.

BIPS completes over 7000 exchanges per second, so a full cycle, including its Pull and Push, takes less than about 0.15 ms.

File:Fps minsize v2.png

In this particular case, the small size of the Buffers results in a relatively low bandwidth, in the kilobytes-per-second (KBPS) range.

The latency between a Production cycle and a Consumption cycle is generally under a millisecond. Optimizations already on the way can reduce these latencies even further, down to the microsecond scale.

File:Latency minsize v2.png

Using 4K RGBA Buffers

To simulate a relatively demanding use case, we exchange 4K RGBA Buffers.

BIPS can consistently maintain a Frame Rate well above the standard 60 FPS for 4K RGBA Buffer exchange. The main overhead in this case comes from the production and consumption of the Buffers, which varies greatly with the use case.

Given the speed and size of the Buffers being exchanged, BIPS has proven capable of exchanging more than 6 GB per second between different processes.

File:Bw 4k rgbav2.png
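
As a rough consistency check, the bandwidth definition (Buffer Size times Frame Rate times 2) can be inverted to estimate the Frame Rate that 6 GB per second implies. The sketch below assumes a 3840x2160 resolution, 4 bytes per RGBA pixel and 1 GB = 10^9 bytes; none of these values are stated on this page, so the result is only indicative.

```python
WIDTH, HEIGHT = 3840, 2160   # assumed 4K UHD resolution
BYTES_PER_PIXEL = 4          # assumed RGBA with 8 bits per channel

buffer_size = WIDTH * HEIGHT * BYTES_PER_PIXEL  # 33,177,600 bytes


def implied_frame_rate(bandwidth_bps: float, buffer_size_bytes: int) -> float:
    """Invert bandwidth = 2 * buffer size * frame rate."""
    return bandwidth_bps / (2 * buffer_size_bytes)


fps = implied_frame_rate(6e9, buffer_size)  # 6 GB/s, taking 1 GB = 10^9 bytes
print(f"{buffer_size} bytes per Buffer, about {fps:.0f} FPS")
```

Under these assumptions the estimate comes out at roughly 90 FPS, which is in line with a Frame Rate above 60 FPS.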

As the overhead in each Processing cycle increases, the latency between a Production cycle and a Consumption cycle increases as well. This is an important detail to consider when estimating the latency of a given use case.


