High performance SD card tuning using the EXT4 file system

From RidgeRun Developer Wiki


Write-intense scenarios, like recording two 1080P video streams at 30 frames per second simultaneously, place a significant load on the SD card. Without proper tuning, most systems will not be able to write the data fast enough to keep up with the incoming video stream.

Solution

The solution is to use an EXT4 file system on your high-speed SD card and to tune the filesystem parameters for better write performance. Additionally, depending on your application (or GStreamer pipeline, if using GStreamer), you may need to tune the kernel's virtual memory (vm) subsystem. The goal is to average out the CPU load, reducing spikes caused by improper resource allocation. A common issue is the kernel's flusher threads (the pdflush daemon on older kernels) waking up to flush the cached write data to the SD card, which can cause a system stall, dropped frames, or worse if the CPU load spike is significant.

Enable EXT4 support in the kernel

Kernel Configuration:

File Systems ->
  <*> The Extended 4 (ext4) filesystem
    [*] Ext4 extended attributes
    [*] Ext4 POSIX Access Control Lists
    [*] Ext4 Security Labels
Enable the block layer
  [*] Support for large (2TB+) block devices and files

These are the resulting CONFIGs that should be set:

CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_XATTR=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
...
CONFIG_LBDAF=y
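
To double-check a kernel configuration against the list above, a small helper along these lines may be useful. This is only a sketch: the function name check_ext4_config is hypothetical, and it assumes you have the kernel's .config file (or a copy of it) at hand.

```shell
# check_ext4_config FILE
# Reports, for each option listed on this page, whether FILE sets it to "y".
# Returns non-zero if at least one option is not set to "y".
# (Hypothetical helper name; not part of any SDK.)
check_ext4_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_EXT4_FS CONFIG_EXT4_FS_XATTR CONFIG_EXT4_FS_POSIX_ACL \
               CONFIG_EXT4_FS_SECURITY CONFIG_LBDAF; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "ok       $opt"
        else
            echo "missing  $opt"
            missing=1
        fi
    done
    return "$missing"
}
```

Typical usage would be check_ext4_config .config from the kernel build directory. Newer kernels may have dropped or renamed some of these options, so a "missing" line is not necessarily an error there.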

Prepare your SD card with an optimized EXT4 filesystem

Create the EXT4 partition

There are many tools to prepare an EXT4 partition on your SD card. We recommend gparted.

sudo gparted

See Figure 1 for an example setup of a new EXT4 partition on an 8 GB SD card using gparted. In this example, the SD card was assigned the device node /dev/sdb, and the new partition will be mapped to /dev/sdb1. This could be different on your computer.

Figure 1. Creating an EXT4 partition with gparted
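
If you prefer the command line to gparted, the same result can be approximated with parted and mkfs.ext4. The sketch below is an alternative to the gparted procedure, not the procedure used for Figure 1. The helper names (run, make_ext4_card), the msdos partition table, the 1MiB start offset and the "sdcard" label are all assumptions chosen for illustration. By default it only prints the commands, because they destroy all data on the device.

```shell
# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# make_ext4_card DEVICE PARTITION
# WARNING: with DRY_RUN=0 this erases everything on DEVICE.
# (Hypothetical helper; label and partition layout are example choices.)
make_ext4_card() {
    dev=$1
    part=$2
    # New MBR partition table with a single partition spanning the card
    run parted -s "$dev" mklabel msdos
    run parted -s -a optimal "$dev" mkpart primary ext4 1MiB 100%
    # Create the EXT4 filesystem on the new partition
    run mkfs.ext4 -L sdcard "$part"
}
```

For the example device on this page you would review the output of make_ext4_card /dev/sdb /dev/sdb1 first, and only then repeat it as root with DRY_RUN=0. Double-check the device node before doing so.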

Optimize the EXT4 partition

The optimization consists of selecting data=writeback as the data mode for the filesystem and then removing the journal. The other two possible data modes (data=ordered and data=journal) do not offer as much write bandwidth as writeback. In writeback mode only metadata is journaled; the second tune2fs command below goes further and removes the journal altogether, which trades crash recovery for throughput. For details on the attributes of the various EXT4 data modes, read the "Data Mode" section from the Ext4 Filesystem kernel documentation.

sudo umount /dev/sdb1
sudo tune2fs -o journal_data_writeback /dev/sdb1
sudo tune2fs -O ^has_journal /dev/sdb1
sudo e2fsck -f /dev/sdb1

You can check your filesystem setup using:

sudo dumpe2fs /dev/sdb1 | less
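
The full dumpe2fs listing is long, so it can help to pull out just the two superblock fields affected by the tune2fs commands. The following is a minimal sketch: the function name check_ext4_tuning and its output format are hypothetical, and it assumes the "Filesystem features:" and "Default mount options:" lines printed by dumpe2fs -h.

```shell
# check_ext4_tuning
# Reads `dumpe2fs -h` output on stdin and reports whether the journal is
# still present and whether writeback is the default data mode.
# (Hypothetical helper; the output format is made up for this example.)
check_ext4_tuning() {
    awk -F: '
        /^Filesystem features:/   { features = $2 }
        /^Default mount options:/ { defaults = $2 }
        END {
            journal = (features ~ /(^|[ \t])has_journal([ \t]|$)/) ? "present" : "removed"
            wb = (defaults ~ /journal_data_writeback/) ? "yes" : "no"
            print "journal: " journal
            print "writeback default: " wb
        }'
}
```

Usage: sudo dumpe2fs -h /dev/sdb1 2>/dev/null | check_ext4_tuning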

Mount the EXT4 partition in your system

Mount the EXT4 partition specifying the following flags for optimization:

  • noatime - do not update file access times, which avoids an extra write every time a file is read
  • data=writeback - mount using the writeback data mode selected with tune2fs above

To mount the EXT4 filesystem at /mnt/sd on your board:

mkdir -p /mnt/sd
mount -t ext4 -o noatime,data=writeback /dev/mmcblk0p1 /mnt/sd

You can also specify these same flags in /etc/fstab.
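
As an illustration, the /etc/fstab entry could look like the comment at the top of the sketch below, and the function underneath shows one way to confirm which options are active after mounting. The function name mount_options is hypothetical; it simply reads the options column of /proc/mounts.

```shell
# Example /etc/fstab line (device and mount point as used on this page):
#   /dev/mmcblk0p1  /mnt/sd  ext4  noatime,data=writeback  0  0

# mount_options MOUNTPOINT [MOUNTS_FILE]
# Prints the active mount options for MOUNTPOINT, read from /proc/mounts
# unless another file in the same format is given.
# (Hypothetical helper name.)
mount_options() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    # Field 2 is the mount point, field 4 the option list; the last match wins
    awk -v mp="$mountpoint" '
        $2 == mp { opts = $4 }
        END { if (opts != "") print opts }' "$mounts_file"
}
```

Running mount_options /mnt/sd on the board should list noatime. Whether data=writeback is shown explicitly may depend on the kernel version and on the filesystem's default mount options.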

Optimizing the virtual memory subsystem

The kernel's virtual memory subsystem allows you to tune the writeout of dirty data to disk. In a write-intense scenario like dual-recording 1080P@30, this tuning can be essential to avoid crashing the system. Why? In just a fraction of a second, a large amount of input data is captured, and all of it must eventually be written to disk. If the system waits too long to wake up its kernel flusher threads, too much data accumulates, and writing it out can then consume the CPU completely for several seconds. Such a CPU spike causes undesired behaviour (e.g. dropped frames while recording), and eventually these CPU load peaks can stall the system (as it runs out of available memory). The default setup of the vm subsystem is not optimal for high-bandwidth file write use cases.

Below is the output of the GStreamer gstperf element running the dual-record pipeline without vm optimizations. The CPU load peaks at 100% when the kernel's flusher threads wake up.

perf0: frames: 1096 	current: 29.86	 average: 30.00	arm-load: 17
perf1: frames: 1127 	current: 30.13	 average: 30.00	arm-load: 19
perf0: frames: 1127 	current: 29.89	 average: 29.99	arm-load: 19
perf1: frames: 1157 	current: 29.77	 average: 29.99	arm-load: 29
perf0: frames: 1158 	current: 30.13	 average: 30.00	arm-load: 33
perf1: frames: 1188 	current: 29.98	 average: 29.99	arm-load: 100     <----- CPU Load peak
perf0: frames: 1188 	current: 29.86	 average: 29.99	arm-load: 100
perf1: frames: 1219 	current: 30.13	 average: 30.00	arm-load: 100
perf0: frames: 1219 	current: 29.90	 average: 29.99	arm-load: 100
perf1: frames: 1249 	current: 29.85	 average: 29.99	arm-load: 88
perf0: frames: 1250 	current: 30.11	 average: 29.99	arm-load: 83
perf1: frames: 1280 	current: 30.14	 average: 30.00	arm-load: 26
perf0: frames: 1280 	current: 29.87	 average: 29.99	arm-load: 25
perf1: frames: 1310 	current: 29.86	 average: 29.99	arm-load: 19
perf0: frames: 1311 	current: 30.14	 average: 29.99	arm-load: 20
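
When comparing runs before and after tuning, it can be easier to summarize a log like the one above than to read it line by line. Here is a minimal sketch, assuming the log format shown above (a value following the "arm-load:" token). The function name load_summary, the default threshold of 90 and the output format are hypothetical.

```shell
# load_summary [THRESHOLD]
# Reads gstperf output on stdin and prints the number of samples, the
# highest arm-load value, and how many samples reached THRESHOLD (default 90).
# (Hypothetical helper; threshold and output format are example choices.)
load_summary() {
    threshold=${1:-90}
    awk -v limit="$threshold" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "arm-load:") {
                    load = $(i + 1) + 0
                    samples++
                    if (load > max) max = load
                    if (load >= limit) high++
                }
            }
        }
        END { printf "samples=%d max=%d high=%d\n", samples, max, high }'
}
```

Fed with the 15 lines above, load_summary prints samples=15 max=100 high=4.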

There are three parameters of the vm subsystem that are particularly useful for our problem:

  • dirty_writeback_centisecs - how often, in hundredths of a second, the kernel flusher threads wake up to write 'old' data out to disk
  • dirty_expire_centisecs - how old, in hundredths of a second, dirty data must be before it is eligible for writeout by the kernel flusher threads
  • dirty_ratio - the amount of dirty data, as a percentage of total available memory (free pages plus reclaimable pages), at which a process that is generating disk writes will itself start writing out dirty data

Using RidgeRun's SDK for the DM8168 Z3 board, these are the default values for these parameters in the system:

/ # cat /proc/sys/vm/dirty_writeback_centisecs 
500
/ # cat /proc/sys/vm/dirty_expire_centisecs 
3000
/ # cat /proc/sys/vm/dirty_ratio 
20
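
To get a feel for what these values mean, a rough back-of-the-envelope model helps: dirty data can sit in memory for up to dirty_expire_centisecs plus one dirty_writeback_centisecs interval before the flusher threads pick it up. The sketch below computes that worst-case age and the amount of data that accumulates in that time. It is a simplification that ignores dirty_ratio and other writeback triggers. The function name dirty_backlog is hypothetical, and the 3000 KiB/s rate used in the example is an assumed combined bitrate, not a measured one.

```shell
# dirty_backlog RATE_KIB_S EXPIRE_CS WRITEBACK_CS
# Simplified model: prints the worst-case age of dirty data (whole seconds,
# rounded down) and how much data is written at RATE_KIB_S in that time.
# (Hypothetical helper; ignores dirty_ratio and background writeback.)
dirty_backlog() {
    rate_kib_s=$1
    expire_cs=$2
    writeback_cs=$3
    max_age_s=$(( (expire_cs + writeback_cs) / 100 ))
    backlog_kib=$(( rate_kib_s * (expire_cs + writeback_cs) / 100 ))
    echo "max_age_s=$max_age_s backlog_kib=$backlog_kib"
}
```

With the default values above and the assumed rate, dirty_backlog 3000 3000 500 prints max_age_s=35 backlog_kib=105000. With both values set to 100, dirty_backlog 3000 100 100 prints max_age_s=2 backlog_kib=6000.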

There is no fixed recipe for tweaking these parameters; it all depends on your system and your application requirements. Since these parameters affect all processes, you may want to prioritize one parameter over another. For the case presented in this wiki page (dual-recording 1080P@30fps), the best tuning we found was to set both dirty_writeback_centisecs and dirty_expire_centisecs to 100. This means that the kernel's flusher threads wake up every second, and that dirty data expires, and becomes eligible to be written by those threads, after one second. These frequent writes (every second) avoided the peaks in the CPU load.

echo 100 > /proc/sys/vm/dirty_writeback_centisecs 
echo 100 > /proc/sys/vm/dirty_expire_centisecs
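
The two echo commands can be wrapped in a small function that also reads the values back, which is convenient in a startup script. This is only a sketch: the function name set_flush_interval is hypothetical, and the optional second argument exists only so the function can be tried against a scratch directory instead of /proc/sys/vm.

```shell
# set_flush_interval CENTISECS [VM_DIR]
# Writes CENTISECS to both flusher-related knobs and prints the values
# read back. VM_DIR defaults to /proc/sys/vm (root is needed to write there).
# (Hypothetical helper name.)
set_flush_interval() {
    centisecs=$1
    vm_dir=${2:-/proc/sys/vm}
    for knob in dirty_writeback_centisecs dirty_expire_centisecs; do
        echo "$centisecs" > "$vm_dir/$knob" || return 1
        echo "$knob = $(cat "$vm_dir/$knob")"
    done
}
```

Values written to /proc/sys/vm are lost at reboot. On systems whose init scripts apply /etc/sysctl.conf, the equivalent entries would be vm.dirty_writeback_centisecs = 100 and vm.dirty_expire_centisecs = 100; whether your board's root filesystem does this is an assumption you should verify.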

Pre-erasing SD card

If you know you will start out with an SD card that can be formatted, you may want to first erase all the NAND sectors in the SD card and then format it. That way, when you write files, you do not have to wait for each sector to be erased first. Simply formatting the SD card does not automatically erase all the NAND sectors.

You can use the erase tool developed by Arnd Bergmann, which you can clone from https://git.linaro.org/people/arnd.bergmann/flashbench.git or ssh://git@git.linaro.org/people/arnd.bergmann/flashbench.git

I use pre-erasing when I am doing performance measurements and I want to have repeatable results.

Common problems

Unable to mount the EXT4 filesystem

Problem

Unable to mount the EXT4 filesystem using the mount command.

Solution

Likely your kernel wasn't built with EXT4 support. To see what file systems are supported by your kernel, run:

cat /proc/filesystems

Follow the Enable EXT4 support in the kernel section above, and re-compile the kernel afterwards.
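
The check can also be scripted, for example in a board bring-up test. The sketch below assumes the usual /proc/filesystems layout, where the filesystem name is the last field on each line. The function name has_filesystem is hypothetical.

```shell
# has_filesystem NAME [LIST_FILE]
# Succeeds (exit status 0) if NAME is listed in /proc/filesystems, or in
# LIST_FILE if one is given in the same format.
# (Hypothetical helper name.)
has_filesystem() {
    fs_name=$1
    fs_list=${2:-/proc/filesystems}
    awk -v fs="$fs_name" '
        $NF == fs { found = 1 }
        END { exit !found }' "$fs_list"
}
```

Usage: has_filesystem ext4 || echo "ext4 is not available in this kernel". If EXT4 was built as a module rather than into the kernel, it may only appear in the list after running modprobe ext4.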
