Filesystems unraveled

Updated 2016-10-08

A filesystem is the structure imposed on a data storage medium. It allows files to be stored hierarchically (in folders) and metadata (timestamps, owners, etc.) to be recorded.

The mechanics behind filesystems are often misunderstood. As a consequence, installing an operating system is often perceived as a complex operation. A few explanations can go a long way toward alleviating this fear.

Structure

The most important thing to grasp is that every computer storage device, from a hardware point of view, is one contiguous segment of memory. The way data gets organized into partitions, folders, metadata, etc. is defined logically by tools and operating systems.

Boot sectors and partition tables

There are two common layouts: the legacy Master Boot Record (MBR) and the newer GUID Partition Table (GPT). GPT has fewer limitations with regard to the number and the size of the partitions.

The boot sector and the partition table typically reside at the very beginning of the disk. They are not on any partition: that would not make sense, since the partition table is what defines the partition layout. On Linux, hard disk drives are typically referenced by the path /dev/sdX, where X is a letter, and their partitions by the path /dev/sdXN, where N is a number.
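To make this concrete, here is a minimal sketch that works on a scratch image file instead of a real disk. It stamps the two-byte MBR boot signature (0x55 0xAA, at byte offsets 510 and 511 of the first sector) onto a blank image and reads it back. The temporary image and the variable names are illustrative assumptions, not part of any tool.

```shell
#!/bin/sh
# Scratch "disk": 2048 blank sectors of 512 bytes (1 MiB) in a temporary file.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=2048 2>/dev/null

# Write the MBR boot signature 0x55 0xAA (octal 125 252) at byte offset 510.
# conv=notrunc keeps the rest of the image intact.
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null

# Read the same two bytes back and print them in hexadecimal.
signature=$(dd if="$img" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
echo "boot signature: $signature"

rm -f "$img"
```

On a real MBR disk the same read would target the device itself (e.g. /dev/sdX) and normally requires root privileges.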

The OS and the programs running on it locate partitions by the sector addresses (logical block addresses, a.k.a. LBA) stored in the partition table. The standard starting position for the first partition is sector 2048, that is, 1 MiB into the disk with 512-byte sectors. In the past, the first partition used to start at sector 63, which may cause performance issues since it does not align with the 4096-byte physical sectors of modern drives. See the references for more explanations.
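The alignment argument is plain arithmetic on sector numbers. The sketch below assumes 512-byte logical sectors and 4096-byte physical sectors (a common combination, but not universal); the function names are made up for the illustration.

```shell
#!/bin/sh
# Assumed sizes: adjust them to what your drive reports.
LOGICAL_SECTOR=512
PHYSICAL_SECTOR=4096

# Byte offset of a given starting sector (LBA).
byte_offset() {
    echo $(( $1 * LOGICAL_SECTOR ))
}

# Succeeds when the starting sector falls on a physical sector boundary.
is_aligned() {
    [ $(( $(byte_offset "$1") % PHYSICAL_SECTOR )) -eq 0 ]
}

for start in 63 2048; do
    if is_aligned "$start"; then state=aligned; else state=misaligned; fi
    echo "sector $start starts at byte $(byte_offset "$start"): $state"
done
```

Sector 63 lands at byte 32256, which is not a multiple of 4096, whereas sector 2048 lands exactly at 1 MiB.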

Tools for creating and manipulating the boot sector and the partition table include dd, syslinux, grub, fdisk, gdisk and more. A tool like fdisk manipulates the partition table found at the beginning of the designated storage medium. As such, it usually only makes sense to run fdisk on a whole disk, as in fdisk /dev/sdX, and not on a partition.

The MBR has different partition types: primary, extended and logical. See this Arch Wiki article for more details.

GPT has only one partition type.

Partitions

Partitions need to be initialized before they can be used by the OS, that is, the filesystem header must be created. This header has different names depending on the filesystem type (e.g. table of contents, superblock). The header usually occupies the first sectors of the partition.

Tools such as mkfs can be used to initialize partitions.
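As a hedged illustration, the sketch below formats a scratch image file rather than a real partition, then checks that a header now sits near its start. It relies on the ext2/3/4 layout, where the superblock begins at byte 1024 and its magic number 0xEF53 is stored little-endian at byte 1080. It assumes mkfs.ext4 (e2fsprogs) and GNU truncate are installed, and skips the check otherwise.

```shell
#!/bin/sh
# Scratch "partition": a 16 MiB sparse file.
img=$(mktemp)
truncate -s 16M "$img"

magic=""
if command -v mkfs.ext4 >/dev/null 2>&1 && mkfs.ext4 -q -F "$img" >/dev/null 2>&1; then
    # The two magic bytes appear on disk as 53 ef (little-endian 0xEF53).
    magic=$(dd if="$img" bs=1 skip=1080 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    echo "superblock magic at byte 1080: $magic"
else
    echo "mkfs.ext4 unavailable or failed; skipping" >&2
fi

rm -f "$img"
```

On a real system the target would be a partition such as /dev/sdXN, and formatting it destroys whatever filesystem was there before.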

As mentioned before, partitions are purely logical: it is possible to write data across partition boundaries with dd, although that would probably destroy the logical integrity of some partitions.

If you remove the partition entry N, then partition N won’t exist in the eyes of the OS. But dd can force reading data at any position on disk, and thus recover data from lost partitions. If you re-add the partition entry with the same LBA addresses, then the partition will be accessible just like before.
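Here is a small sketch of that idea on a scratch image file. It writes a short payload at sector 2048, where a partition could have started, and reads it back purely by position, without consulting any partition table. The image file and the payload are illustrative assumptions.

```shell
#!/bin/sh
# Scratch "disk": 4096 blank sectors of 512 bytes (2 MiB).
img=$(mktemp)
dd if=/dev/zero of="$img" bs=512 count=4096 2>/dev/null

# Place some data at sector 2048; conv=notrunc leaves the rest untouched.
printf 'lost but not gone' | dd of="$img" bs=512 seek=2048 conv=notrunc 2>/dev/null

# Read one sector back from the same position and drop the zero padding.
recovered=$(dd if="$img" bs=512 skip=2048 count=1 2>/dev/null | tr -d '\0')
echo "$recovered"

rm -f "$img"
```

Recovering a whole lost partition follows the same pattern, with skip set to its starting LBA and count set to its length in sectors, provided those values are still known.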

Bootloaders

The bootloader is a program that resides partly on a partition and partly on the boot sector.

When the computer starts, it boots from the designated medium. It looks for an MBR or a GPT in the first sectors and runs the executable code of the bootloader. This code can be configured to boot an OS located on a specific partition.

Disk usage and apparent size

Every file on the system has 2 “size” properties: the disk usage and the apparent size.

The apparent size is the number of bytes contained in a file. It represents the information held by the file, and as such it is the same across different filesystems. It can be queried with ls -l or du -b (GNU) / du -A (BSD).

Disk usage depends heavily on the filesystem. It can be queried with ls -s or du. It is usually higher than the apparent size because of metadata and fragmentation (space is allocated in whole blocks), but it can also be smaller if the file is sparse.

Let’s experiment:

$ dd of=sparse-file bs=1k seek=5120 count=0
0+0 records in
0+0 records out
0 bytes (0 B) copied, 5.668e-05 s, 0.0 kB/s

$ du sparse-file
0

$ du -b sparse-file
5242880

Alternatively, the same file can be created with truncate:

truncate -s 5M sparse-file

The file reads as 5242880 zero bytes, yet du reports a disk usage of 0: the zeros form a hole for which the filesystem allocates no data blocks.
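The opposite situation, where disk usage exceeds the apparent size, can be observed with a one-byte file. This is a minimal sketch: the reported usage depends on the filesystem. Many allocate a whole block (often 4 KiB), while some store tiny files inline and report 0.

```shell
#!/bin/sh
dir=$(mktemp -d)

# A file holding exactly one byte of information.
printf 'x' > "$dir/tiny"

# Apparent size in bytes (tr strips the padding some wc implementations add).
apparent=$(wc -c < "$dir/tiny" | tr -d ' ')

# Disk usage in KiB, as allocated by the filesystem.
allocated_kib=$(du -k "$dir/tiny" | cut -f1)

echo "apparent size: $apparent byte(s)"
echo "disk usage:    $allocated_kib KiB"

rm -rf "$dir"
```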

Online resizing

If the kernel and the filesystem support it, it is possible to resize a partition while it is online (mounted), e.g. the system partition. Note that while extending a partition is generally not problematic, shrinking one can cause data loss.

Let’s see how this works on an ext4 filesystem.

Warning: The whole process should not be interrupted. Back up your partition table and the data if possible. Make sure the computer is powered by a battery or a UPS.
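One possible sequence for growing a mounted ext4 partition is sketched below. It is an assumption about a typical workflow, not a definitive procedure. It presumes that free space directly follows the partition, that sfdisk, parted and partprobe are available, and that the device uses /dev/sdXN naming. The resize_plan helper is hypothetical: it only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print (do not execute) the commands for growing partition $2 of disk $1.
resize_plan() {
    disk=$1
    num=$2
    # 1. Save the current partition table so it can be restored if needed.
    echo "sfdisk --dump $disk > partition-table.backup"
    # 2. Move the end of the partition to the end of the disk.
    echo "parted $disk resizepart $num 100%"
    # 3. Ask the kernel to re-read the partition table.
    echo "partprobe $disk"
    # 4. Grow the ext4 filesystem to fill the enlarged partition.
    echo "resize2fs $disk$num"
}

# /dev/sdX and 2 are placeholders for the real disk and partition number.
resize_plan /dev/sdX 2
```

Without a size argument, resize2fs grows the filesystem to the size of the partition, which is why the partition has to be enlarged first.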

And there is no need to restart the computer!

See resize2fs(8) for more options.

References