Filesystems


Files and Filesystems

In Unix, files and filesystems are fundamental components of the operating system's structure. A file is a collection of data stored on disk, which can include anything from text documents and images to executable programs. Files are organized within directories in a hierarchical structure, allowing for efficient data management and retrieval.

A filesystem, on the other hand, is a method and data structure that the operating system uses to manage files on a disk or partition. It provides a way to store, retrieve, and organize files, supporting features like file permissions, links, and metadata. Common Unix filesystems include ext4, XFS, and Btrfs, each offering different capabilities and optimizations. Understanding these concepts is essential for managing data and system resources effectively in a Unix environment.

Types of Files in a UNIX Filesystem

Unix and Unix-like systems, including Linux, organize files in a hierarchical structure called a filesystem. Files can be classified based on their purpose, storage method, and visibility.

Classification Based on Purpose

  1. Ordinary files are the most common type, containing text, data, or program code. They cannot contain other files or directories.
  2. Directory files function as folders to organize other files, with the root directory (/) being the top-level directory of the entire filesystem. Users' files are usually stored in their respective home directories, such as /home/adam/.
  3. Device files represent hardware devices as if they were files. Block-oriented devices, like hard drives, transfer data in large blocks, whereas character-oriented devices, such as keyboards or modems, handle data one byte at a time.
  4. Link files serve as references to other files. Hard links are essentially duplicates of the original file and behave identically, while soft links (or symbolic links) act as indirect pointers to a file or directory, similar to shortcuts in Windows.

The file command in Linux is a powerful tool that helps identify and classify these file types by analyzing their content and structure. Below are insights on how the file command interprets and reports on these various classifications:

Classificationfile Command Examplefile Command Output ExampleExplanation
Text Filesfile document.txtdocument.txt: ASCII textThe file command detects text encoding (e.g., ASCII, UTF-8) or identifies binary format for executables.
Binary Filesfile programprogram: ELF 64-bit LSB executable, x86-64, dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2Provides detailed information about binary files, including architecture and format.
Directory Filesfile /home/user/home/user: directoryIdentified as a "directory". Contains pointers to other files and directories.
Block Devicesfile /dev/sda/dev/sda: block specialThe file command distinguishes between block devices (e.g., "block special") and character devices (e.g., "character special").
Character Devicesfile /dev/tty/dev/tty: character specialUseful for identifying the type of device a file represents.
Symbolic Linksfile /usr/bin/python/usr/bin/python: symbolic link to /usr/bin/python3.8Symbolic links are reported with their target file or directory.
Hard Linksfile hardlinkfile(Output identical to the original file)Hard links are identical to the original file in the file command output.

Classification Based on Storage

  1. Regular files contain text, data, or program code and are stored directly in the file system.
  2. Virtual files provide an interface to other programs or the kernel. They do not contain traditional data but rather information about processes and system parameters, typically found in directories like /proc and /sys.
  3. Remote files are stored on a remote Network File System (NFS) server. They can be accessed and manipulated as if they were stored locally.

The table below provides the most effective commands for identifying each type of file based on their specific attributes and locations:

ClassificationCommand ExampleCommand Output ExampleExplanation
Regular Filesfile document.txtdocument.txt: ASCII textThe file command detects the type of content in regular files, such as text, binary, or executable.
Virtual Filesstat /proc/cpuinfoFile: /proc/cpuinfo\nSize: 0\tBlocks: 0\tIO Block: 4096 regular file\nDevice: 0,5\tInode: 4026532255The stat command shows a typical file size of 0, located in /proc or /sys, indicating it's a virtual file.
Remote Filesdf -T /mnt/nfs/remote_fileFilesystem Type 1K-blocks Used Available Use% Mounted on\nnfsserver:/export nfs 1024000 102400 924000 10% /mnt/nfsThe df -T command displays the filesystem type as nfs, identifying it as a remote file on an NFS server.

Classification Based on Visibility

  1. Visible files are displayed when you list the contents of a directory using commands like ls.
  2. Hidden files are not displayed in a standard directory listing. They start with a period (.) and typically store configuration data or system files. They can be revealed using the ls -a command.

Note About File Names

Filenames are case-sensitive. This means the operating system treats "Test," "TEST," and "test" as different files. Also, most file types in Linux are determined by file content and not by the file extension, unlike systems like Windows.

Special Directory Names

In a filesystem, certain directory names have special meanings that simplify navigation and file management:

  1. The ./ notation refers to the current directory, meaning the directory where you are presently located. It is commonly used when executing a script or a program located in the current directory. For instance, if you have a script named script.sh in the current directory, you can run it using ./script.sh. This tells the system to look for script.sh in the current directory.
  2. The ../ notation refers to the parent directory, which is the directory one level up from the current directory in the filesystem hierarchy. It is useful for navigating upwards in the directory structure. For example, if you are in /home/user/Documents and you use cd ../, you will move up to /home/user.
  3. The ~/ notation is a shorthand for the current user's home directory. The home directory is a personal space allocated to a user where personal files and settings are stored. For example, ~/Documents would refer to the Documents directory within the home directory of the current user. This is especially useful for referencing files and directories in a user's home space without needing to specify the full path.

These notations provide a convenient way to navigate and manage files and directories efficiently in a command-line environment. They are particularly useful in scripting and automation tasks, where paths need to be specified dynamically or relative to the current context.

Directory Structure

Linux organizes everything within a single directory hierarchy that starts with the root directory (/).

/ ├── bin ├── boot ├── dev ├── etc │ ├── network │ └── ssh ├── home │ └── [user] │ ├── Documents │ ├── Downloads │ ├── Music │ ├── Pictures │ ├── Videos │ └── Desktop ├── lib ├── media ├── mnt ├── opt ├── proc ├── root ├── run ├── sbin ├── srv ├── sys ├── tmp ├── usr │ ├── bin │ ├── include │ ├── lib │ ├── local │ └── share └── var ├── cache ├── lib ├── local ├── lock ├── log └── tmp

Key directories within the Linux file system include:

DirectoryDescription
/The root directory: the starting point of the file system hierarchy, all other directories branch off from here.
/binEssential low-level system utilities, executable by all users.
/usr/binA place for user commands and applications typically used after the system boot process.
/sbinContains system binaries crucial for booting, restoring, recovering, and/or repairing the system, supplementing the binaries in /bin.
/libContains shared libraries essential for the binaries found in /bin and /sbin.
/usr/libLibraries necessary for /usr/bin binaries and applications.
/tmpA directory for storing temporary files, which are usually cleared at each reboot.
/homeThe home ground for personal directories of each user, where they can store their personal files.
/etcA repository for system-wide configuration files and scripts utilized during the boot process.
/devA space hosting device nodes that correspond to hardware devices connected to the system.
/varManages variable data such as system logs, mail and printer spool directories, alongside transient and temporary files.
/rootServes as the home directory for the root (superuser) account, distinct from the root directory (/).
/bootStores files vital for the boot process, including the Linux kernel and the boot loader.
/media and /mntServe as mount points for file systems and removable devices like CDs, USB drives, etc.

File System Types

A file system is a method of organizing, storing, and retrieving data on a storage device, like a hard drive, SSD, or USB drive. It manages the available space on the device, keeping track of which sectors belong to which files and directories.

Several types of file systems can be used on Linux systems, each designed with specific use-cases and features:

  • The ext2 (Second Extended Filesystem) is among the first file systems crafted specifically for Linux, and while it is efficient, it lacks advanced features like journaling or encryption.
  • An improved version of ext2, ext3 (Third Extended Filesystem), includes support for journaling, which helps protect against data loss by maintaining a log of changes that are yet to be committed to the file system.
  • Currently the default Linux file system, ext4 (Fourth Extended Filesystem) supports larger file sizes and file systems, provides improved performance and reliability, and includes features such as delayed allocation and journal checksumming.
  • Originally developed by IBM, the JFS (Journaled File System) is optimized to handle large file systems efficiently and includes journaling capabilities.
  • The NFS (Network File System) is not a traditional file system but rather a protocol that allows a system to access files over a network as though they were on its local hard drive.
  • Serving as a software layer in the kernel, the VFS (Virtual File System) provides a common interface to various file systems, enabling the operating system to uniformly access and manage different types of file systems.
  • The FAT (File Allocation Table) system is simple and widely used, particularly on removable storage devices, making it a suitable choice for interoperability across various operating systems.
  • A standard for Windows NT and its later versions, NTFS (New Technology File System) can be accessed on Linux but is not native, which may result in a lack of full functionality.
  • Known for its strong performance and reliability, ReiserFS (Reiser File System) is a general-purpose, journaled file system that efficiently handles large numbers of small files, making it a popular choice on servers.
  • The Btrfs (B-tree File System) is a copy-on-write (CoW) file system for Linux, designed to provide advanced features focused on fault tolerance, repair, and simplified administration, with capabilities like snapshots, subvolumes, and built-in RAID.
  • Developed by Silicon Graphics, Inc., XFS is a high-performance, journaling file system that excels at parallel I/O, making it particularly effective for applications involving large files that require high-performance I/O operations.
Categoryext2ext3ext4JFSNFSVFSFATNTFSReiserFSBtrfsXFS
Design PurposeSimple Linux file systemImproved ext2 with journalingDefault Linux file systemLarge file systemsNetwork file accessCommon interface for FSRemovable storageWindows file systemSmall files, serversAdvanced Linux featuresHigh-performance, large files
JournalingNoYesYesYesN/A (protocol)N/ANoYesYesYesYes
PerformanceEfficient for small FSBetter than ext2Better than ext3High efficiencyDepends on networkN/ASimple, lower performanceGood, Windows optimizedGood for small filesGood, advanced featuresExcellent, parallel I/O
Maximum File/FS SizeSmaller than ext3/4Larger than ext2Very largeVery largeN/AN/ALimited by designVery largeLargeVery largeVery large
SuitabilityBasic, older Linux systemsGeneral Linux useModern Linux systemsEnterprise, large dataNetwork environmentsKernel-level operationsWide compatibilityWindows environmentsServer useLinux, advanced useLarge file handling
Encryption SupportNoNoYes (since 4.1)NoN/AN/ANoYesNoYesNo
Data RecoveryHarderEasier than ext2Easier than ext3GoodDepends on implementationN/ASimpler but riskierGoodGoodVery goodGood
Use in Large ServersLess commonCommonVery commonYesYes, for shared storageN/ALess commonLess commonYesYesYes
Use in Personal DevicesLess commonLess commonCommonLess commonLess commonN/AVery commonCommon in WindowsLess commonGrowingLess common

Managing File Systems

Managing file systems is a fundamental skill that involves various operations such as checking existing file systems, installing necessary tools, creating new file systems on fresh partitions or drives, and modifying existing ones. This comprehensive guide provides detailed steps for each of these tasks, complete with commands, expected outputs, and practical considerations.

Checking Existing File Systems

Before performing any file system operations, it's essential to understand the current state of your system.

1. List Mounted File Systems

Use the df -T command to display all mounted file systems along with their types.

df -T

Expected Output:

Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/sda1 ext4 492G 215G 253G 46% / udev devtmpfs 16G 0 16G 0% /dev tmpfs tmpfs 3.2G 1.3M 3.2G 1% /run

2. List Block Devices with File System Information

Use lsblk with the -f option to display file system types for all block devices.

lsblk -f

Expected Output:

NAME FSTYPE LABEL UUID MOUNTPOINT sda ├─sda1 ext4 rootfs a1b2c3d4-e5f6-7890-abcd-ef1234567890 / ├─sda2 ext4 home 12345678-90ab-cdef-1234-567890abcdef /home └─sda3 swap 1a2b3c4d-5e6f-7890-abcd-ef1234567890 [SWAP]

3. Check Supported File Systems

List all file systems currently supported by the kernel:

cat /proc/filesystems

Expected Output:

nodev sysfs nodev tmpfs nodev bdev ext3 ext4 vfat xfs

Ensuring File System Support

Before creating a new file system, ensure that your system has the necessary support for it. This involves checking for kernel module support, installing the required utilities, and verifying that your system is compatible with the file system you intend to use.

1. Check Kernel Support

To confirm that your kernel supports the desired file system, check if the module is loaded:

lsmod | grep xfs

If the output includes lines referencing xfs, it indicates that the XFS module is loaded. If no output appears, the module isn’t loaded and may need to be installed or enabled.

2. Install File System Utilities

Once kernel support is confirmed, install any necessary tools. For example, to manage the XFS file system, you need the xfsprogs package:

sudo apt update sudo apt install xfsprogs

Expected Output:

Reading package lists... Done Building dependency tree Reading state information... Done The following NEW packages will be installed: xfsprogs ... Setting up xfsprogs (5.4.0-1ubuntu2) ...

This output indicates that xfsprogs has been successfully installed, and you can now proceed with XFS-specific file system tasks.

3. Verify Compatibility

Ensure that your system’s kernel and hardware support the file system by checking the module information:

modinfo xfs

Expected Output:

This command provides details on the XFS module, including its dependencies, supported versions, and any required firmware. Reviewing this information confirms compatibility with your current setup.

Identifying the Device

Accurate identification of the target device is crucial to avoid modifying the wrong disk, which could result in data loss. These commands will help you list and inspect block devices.

1. List All Block Devices

To get an overview of all attached storage devices, use:

lsblk

Expected Output:

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT sda 8:0 0 100G 0 disk └─sda1 8:1 0 50G 0 part / sdb 8:16 0 200G 0 disk └─sdb1 8:17 0 50G 0 part /mnt/data

The lsblk output lists all block devices and their partitions, along with their sizes and mount points. This lets you identify the device you intend to work with (e.g., /dev/sdb).

2. Detailed Device Information

To gather further details about each device and its partitions, use fdisk:

sudo fdisk -l

Expected Output:

Disk /dev/sdb: 100 GiB, 107374182400 bytes, 209715200 sectors Units: sectors of 1 * 512 = 512 bytes ...

This output provides detailed information on each disk, including size, sector count, and partitioning scheme. Use this to confirm the correct device before making any modifications.

3. Identify Unpartitioned Space

If your target device is new and unpartitioned, you’ll need to partition it first. Use fdisk, gdisk, or parted to create new partitions.

Unmounting the Device (if applicable)

Before modifying a device, it’s essential to ensure that it is not in use. Unmounting prevents accidental data corruption during the process.

1. Check if the Device is Mounted

Determine if the target device is currently mounted:

mount | grep sdb1

Expected Output:

/dev/sdb1 on /mnt/data type xfs (rw)

If the output lists the device, it means it’s mounted. Note the mount point (e.g., /mnt/data) so you can unmount it in the next step.

2. Unmount the Device

Unmount the device before making any changes:

sudo umount /dev/sdb1

Expected Output:

No output indicates the device was unmounted successfully.

3. Handle Busy Devices

If you encounter a “device is busy” error, identify processes using the device:

sudo lsof /dev/sdb1

Expected Output:

The output lists any processes using the device. For example:

COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME bash 1234 user cwd DIR 8,17 4096 2 /mnt/data

Terminate these processes or stop the services to release the device, allowing it to be unmounted.

Creating a File System

Once the target device is unmounted, you can proceed with creating the new file system. The file system type you choose will depend on factors like performance, reliability, and feature support.

1. Choose the File System Type

Select a file system type that meets your needs. For example, ext4 is commonly used for general-purpose storage due to its balance of performance and features. Alternatively, you may choose xfs for large filesystems or high-performance needs.

2. Create the File System

To format the device with a specific file system type, use the appropriate mkfs command. For ext4, for instance:

sudo mkfs.ext4 /dev/sdb1

Expected Output:

mke2fs 1.45.5 (07-Jan-2020) /dev/sdb1 contains a ext4 file system Proceed anyway? (y,N) y Creating filesystem with 26214400 4k blocks and 6553600 inodes Filesystem UUID: 123e4567-e89b-12d3-a456-426614174000 Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, ... Allocating group tables: done Writing inode tables: done Creating journal (131072 blocks): done Writing superblocks and filesystem accounting information: done

This output shows the mkfs process. It provides details on block allocation, journal creation, and the UUID for the new file system. If asked to proceed, type y to continue. Successful completion indicates the file system is now ready for use.

3. Label the File System (Optional)

Assigning a label to your file system makes it easier to identify, especially when managing multiple disks. To label an ext4 file system:

sudo e2label /dev/sdb1 mydata

No output means the label was applied successfully. You can verify the label by running sudo e2label /dev/sdb1.

Changing Existing File Systems

Modifying existing file systems requires extra caution. Backup any essential data before proceeding to avoid data loss. Here are common operations:

1. Resizing a File System

To resize a file system, unmount it first:

sudo umount /dev/sdb1

Then, adjust its size. For example, to shrink an ext4 file system to 50 GB:

sudo resize2fs /dev/sdb1 50G

Expected Output:

resize2fs 1.45.5 (07-Jan-2020) Resizing the filesystem on /dev/sdb1 to 13107200 (4k) blocks. The filesystem on /dev/sdb1 is now 13107200 (4k) blocks long.

This output confirms the new size of the file system. Note that shrinking can cause data loss if the specified size is smaller than the amount of data stored on the partition.

2. Converting File Systems

Some file systems support in-place conversions. For example, converting ext2 to ext3 to enable journaling can be done as follows:

sudo tune2fs -O has_journal /dev/sdb1

Expected Output:

tune2fs 1.45.5 (07-Jan-2020) Setting filesystem feature 'has_journal' Creating journal inode: done This filesystem will be automatically checked every 29 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override.

This output shows that journaling has been enabled on the file system. This feature enhances data integrity but may slightly reduce performance.

3. Backup Before Changes

Backing up important data is always recommended before altering file systems. Tools like rsync, tar, or dd can be used for this purpose. For example:

sudo rsync -av /mnt/mydata/ /mnt/backup/

Mounting the New File System

To make the new file system accessible to your system, you’ll need to mount it. Mounting allows you to interact with the file system and its contents.

1. Create a Mount Point

Choose a directory to serve as the mount point. If the directory does not exist, create it with:

sudo mkdir -p /mnt/mydata

Expected Output:

No output indicates successful directory creation.

2. Mount the File System

Once the mount point exists, mount the file system:

sudo mount /dev/sdb1 /mnt/mydata

Expected Output:

There’s no output if the mount is successful. You can verify by listing mounted file systems or checking the mount point with df.

3. Verify the Mount

To confirm that the file system is mounted correctly, use:

df -h /mnt/mydata

Expected Output:

Filesystem Size Used Avail Use% Mounted on /dev/sdb1 99G 60M 94G 1% /mnt/mydata

The output shows disk usage information for /dev/sdb1, confirming it is mounted on /mnt/mydata. You’re now ready to use the new file system for storage or data operations.

Additional Considerations

When working with file systems, automating mounting, ensuring file system health, and managing permissions are essential practices. Here’s how to handle these additional considerations.

1. Automate Mounting at Boot

To ensure your new file system is automatically mounted at system startup, add it to the /etc/fstab file.

First, edit the file:

sudo nano /etc/fstab

Add the following line to the end of the file (replace the UUID and mount point as needed):

UUID=123e4567-e89b-12d3-a456-426614174000 /mnt/mydata ext4 defaults 0 2

Entry Components:

  • The UUID is a unique identifier for the partition, which can be obtained using the blkid command, ensuring precise identification of the partition across reboots.
  • /mnt/mydata is the mount point, specifying the directory where the partition will be accessible once mounted.
  • The ext4 refers to the file system type, indicating how data is organized on the partition, with ext4 being a common choice for Linux systems.
  • defaults are the mount options, which typically include settings like read-write access and automatic mounting during startup.
  • A value of 0 means the partition will skip the dump backup process, as backups are not required for this entry.
  • The 2 indicates the order in which fsck performs checks on the partition, with 2 meaning it will be checked after the root partition, providing structure for file system checks.

Find the UUID for the device by running:

sudo blkid /dev/sdb1

Expected Output:

/dev/sdb1: UUID="123e4567-e89b-12d3-a456-426614174000" TYPE="ext4"

This output provides the UUID for /dev/sdb1. Use this identifier in the /etc/fstab entry to avoid issues if device names change on reboot. After adding this to /etc/fstab, the system will automatically mount the file system at /mnt/mydata on startup.

2. File System Maintenance

Keeping your file system healthy and monitored is crucial for data integrity and performance.

Check for Errors:

Use fsck to check the file system for errors and repair any issues.

sudo fsck /dev/sdb1

Expected Output:

fsck from util-linux 2.34 e2fsck 1.45.5 (07-Jan-2020) /dev/sdb1: clean, 10/6553600 files, 262144/26214400 blocks

This output confirms that the file system check is complete. If fsck finds no errors, it will indicate that the file system is clean. If it detects issues, fsck will attempt to repair them based on the options you specify.

Monitor Disk Usage:

Regularly monitor available space and usage with:

df -h

Expected Output:

Filesystem Size Used Avail Use% Mounted on /dev/sdb1 99G 60M 94G 1% /mnt/mydata

This output shows disk usage for all mounted file systems, including the newly created one. Use this information to ensure the file system has enough space for your needs.

3. Permissions and Ownership

To control who can access your mounted file system, set ownership and permissions.

Change ownership to a specific user and group:

sudo chown user:group /mnt/mydata

Expected Output:

No output means the command succeeded.

Set permissions:

sudo chmod 755 /mnt/mydata

Expected Output:

Again, no output indicates success.

These commands set the ownership of /mnt/mydata to a specified user and group. The chmod command assigns permissions, where 755 means the owner has read, write, and execute permissions, while others have read and execute only.

4. Security Considerations

When storing sensitive data, consider additional security measures, such as encryption and access control.

Encryption:

Use LUKS (Linux Unified Key Setup) to encrypt the partition:

sudo cryptsetup luksFormat /dev/sdb1

Expected Output:

WARNING! ======== This will overwrite data on /dev/sdb1 irrevocably. Are you sure? (Type uppercase yes): YES Enter passphrase for /dev/sdb1: Verify passphrase:

LUKS prompts for confirmation before encrypting the partition. Follow the prompts to set a passphrase, which will be required for future access.

Access Control (ACL):

To set fine-grained permissions, enable and configure Access Control Lists (ACLs). For example, to grant a user read access:

sudo setfacl -m u:username:r /mnt/mydata

Expected Output:

No output, indicating success.

This command grants username read-only access to /mnt/mydata. Use getfacl to verify ACLs or to modify permissions for other users as needed.

© Krishna Neupane Since @ 1995. All rights reserved.