High performance SD card tuning using the EXT4 file system: Difference between revisions

From RidgeRun Developer Wiki
No edit summary
Line 101: Line 101:


  echo 100 > /proc/sys/vm/dirty_writeback_centisecs  
  echo 100 > /proc/sys/vm/dirty_writeback_centisecs  
  echo 100 > /proc/sys/vm/dirty_expire_centisecs  
  echo 100 > /proc/sys/vm/dirty_expire_centisecs
 
After setting these parameters, CPU load peaks were no longer detected.
 
<pre>
perf0: frames: 1341 current: 29.85 average: 29.99 arm-load: 15
perf1: frames: 1371 current: 30.13 average: 29.99 arm-load: 20      <--- No peaks to 100% anymore
perf0: frames: 1372 current: 29.91 average: 29.99 arm-load: 20
perf1: frames: 1401 current: 29.86 average: 29.99 arm-load: 20
perf0: frames: 1403 current: 30.12 average: 29.99 arm-load: 20
perf1: frames: 1432 current: 30.13 average: 29.99 arm-load: 12
perf0: frames: 1433 current: 29.86 average: 29.99 arm-load: 14
perf1: frames: 1462 current: 29.86 average: 29.99 arm-load: 15
perf0: frames: 1464 current: 30.13 average: 29.99 arm-load: 13
</pre>


= References =
= References =

Revision as of 19:55, 14 February 2014

While trying to dual-record from two 1080P@30fps input channels at the same time, we faced the problem of saving all these huge amount of incoming data to the filesystem that eventually drowned the system. The setup was an ext3 filesystem in a Class 10 SD card, but that was not enough and just some seconds after starting writing the files the system would stall.

Solution

The solution involves using an ext4 filesystem instead of ext3, and then optimizing this filesystem for a better write performance. Additionally, depending on how intense your gstreamer pipeline is, you may be required to tune the kernel's virtual memory (vm) subsystem, to avoid peaks on the CPU load when the kernel's flusher threads wakes up to write, minimizing any chance to stall the system.

1. Enable ext4 support in the kernel

Kernel Configuration:

File Systems ->
  <*> The Extended 4 (ext4) filesystem
    [*] Ext4 extended attributes
    [*]  Ext4 POSIX Access Control Lists
    [*]  Ext4 Security Labels
Enable the block layer
  [*] Support for large (2TB+) block devices and files

2. Prepare your SD card with an optimized ext4 filesystem

2.1 Create the ext4 partition

There could be many tools to prepare an ext4 partition in your SD card. Probably, the simplest way is to use gparted.

sudo gparted

See Figure 1 for an example setup of a new ext4 partition in an 8GB SD card using gparted. In this example, the SD card was assigned the device node /dev/sdb, and the new partition will be mapped to /dev/sdb1. This could be different in your computer.

Figure 1. Creating an ext4 partition with gparted

Note: Please be careful, you don't want to format the wrong device or lose information.

2.2 Optimize the ext4 partition

The optimization consists on selecting the data mode for our fileystem as data=writeback. Over the three existing possible data modes (the other two being data=ordered and data=journal), writeback is the one that gets the best write performance out of ext4 by disabling journaling. For more information, read the "Data Mode" section from the Ext4 Filesystem kernel documentation.

sudo tune2fs -o journal_data_writeback /dev/sdb1
sudo tune2fs -O ^has_journal /dev/sdb1
sudo e2fsck -f /dev/sdb1

You can check your filesystem setup using:

dumpe2fs /dev/sdb1 | less

3. Mount the ext4 partition in your system

Mount the ext4 partition specifying the following flags for optimization:

  • noatime,nodiratime - avoid writing access times
  • data=writeback - tell mount we are mounting a filesystem with no journaling

To mount the ext4 fileystem at /mnt/sd in your board:

mkdir -p /mnt/sd
mount -t ext4 -O noatime,nodirame,data=writeback /dev/mmcblk0p1 /mnt/sd

Note: These same flags can be specified in the /etc/fstab file.

4. Optimizing the virtual memory subsystem

The kernel's virtual memory subsystem allows to tune the writeout of dirty data to disk. In a write-intense scenario like dual-recording 1080P@30fps, this tuning can be fundamental to avoid crashing the system. Why? In just a fractions of a second, the amount of input data we've captured is huge and we need to write it eventually to disk; if the system waits too long to wake up its kernel's flusher threads, there will be too many data to write, and then the write process would consume completely the CPU for even seconds; consequences involve undesired behaviour (i.e. dropping frames while recording), and eventually these CPU load peaks can stall the system. The default setup of the vm subsystem is not optimal for these cases.

Here's the output of the gstperf element running the dual-record pipeline without vm optimizations. The CPU load peaks to 100% when the kernel's flusher threads wake up.

perf0: frames: 1096 	current: 29.86	 average: 30.00	arm-load: 17
perf1: frames: 1127 	current: 30.13	 average: 30.00	arm-load: 19
perf0: frames: 1127 	current: 29.89	 average: 29.99	arm-load: 19
perf1: frames: 1157 	current: 29.77	 average: 29.99	arm-load: 29
perf0: frames: 1158 	current: 30.13	 average: 30.00	arm-load: 33
perf1: frames: 1188 	current: 29.98	 average: 29.99	arm-load: 100     <----- CPU Load peak
perf0: frames: 1188 	current: 29.86	 average: 29.99	arm-load: 100
perf1: frames: 1219 	current: 30.13	 average: 30.00	arm-load: 100
perf0: frames: 1219 	current: 29.90	 average: 29.99	arm-load: 100
perf1: frames: 1249 	current: 29.85	 average: 29.99	arm-load: 88
perf0: frames: 1250 	current: 30.11	 average: 29.99	arm-load: 83
perf1: frames: 1280 	current: 30.14	 average: 30.00	arm-load: 26
perf0: frames: 1280 	current: 29.87	 average: 29.99	arm-load: 25
perf1: frames: 1310 	current: 29.86	 average: 29.99	arm-load: 19
perf0: frames: 1311 	current: 30.14	 average: 29.99	arm-load: 20

There are three parameters of the vm subsystem that are particularly useful for our problem:

  • dirty_writeback_centisecs - the kernel flusher threads will periodically wake up and write 'old' data out to disk
  • dirty_expire_centisecs - when dirty data is old enough to be eligible for writeout by the kernel flusher thread
  • dirty_ratio - a percentage of total available memory that contains free pages and reclaimable pages, the number of pages at which a process which is generating disk writes will itself start writing out dirty data

Using RidgeRun's SDK for the DM8168 Z3 board, these are the default values for these parameters in the system:

/ # cat /proc/sys/vm/dirty_writeback_centisecs 
500
/ # cat /proc/sys/vm/dirty_expire_centisecs 
3000
/ # cat /proc/sys/vm/dirty_ratio 
20

There is not a recipe to tweak these parameters, it all depends on your system and your application requirements. Since these parameters can affect all the processes, you may want to prioritize one parameter or another. For the case presented in this wiki page (dual-recording@30fps), the best tuning we came across was to change dirty_writeback_centisecs and dirty_expire_centisecs both to the 100 value, meaning that every second the kernel's flusher threads will wake up, and every second all the incoming data would expire and be written by those threads. These frequent writes (every second) allowed to avoid peaks on the CPU load.

echo 100 > /proc/sys/vm/dirty_writeback_centisecs 
echo 100 > /proc/sys/vm/dirty_expire_centisecs

After setting these parameters, CPU load peaks were no longer detected.

perf0: frames: 1341 	current: 29.85	 average: 29.99	arm-load: 15
perf1: frames: 1371 	current: 30.13	 average: 29.99	arm-load: 20      <--- No peaks to 100% anymore
perf0: frames: 1372 	current: 29.91	 average: 29.99	arm-load: 20
perf1: frames: 1401 	current: 29.86	 average: 29.99	arm-load: 20
perf0: frames: 1403 	current: 30.12	 average: 29.99	arm-load: 20
perf1: frames: 1432 	current: 30.13	 average: 29.99	arm-load: 12
perf0: frames: 1433 	current: 29.86	 average: 29.99	arm-load: 14
perf1: frames: 1462 	current: 29.86	 average: 29.99	arm-load: 15
perf0: frames: 1464 	current: 30.13	 average: 29.99	arm-load: 13

References

http://blogofterje.wordpress.com/2012/01/14/optimizing-fs-on-sd-card/

https://www.kernel.org/doc/Documentation/filesystems/ext4.txt

https://www.kernel.org/doc/Documentation/sysctl/vm.txt