Hardware 3 Flashcards
Cluster
A cluster is the smallest unit of disk space that the file system can allocate to a file. Each cluster on a disk has an address.
So if you have a file system with a cluster size of 32 KB and save a file that is 64 KB in size, that file will be spread across two clusters.
Likewise, if you save a 1 KB file to the disk, it will take up one whole cluster and actually use 32 KB of space on the disk. That’s because two files can’t share a cluster, so the minimum space a non-empty file occupies on a file system with a 32 KB cluster size is 32 KB.
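The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this card, and the 32 KB cluster size is taken from the example.

```python
CLUSTER_SIZE = 32 * 1024  # 32 KB, the cluster size used in the example


def clusters_needed(file_size, cluster_size=CLUSTER_SIZE):
    """Return how many whole clusters are needed for file_size bytes."""
    # Ceiling division: any partly filled cluster still counts as a whole one.
    return (file_size + cluster_size - 1) // cluster_size


def size_on_disk(file_size, cluster_size=CLUSTER_SIZE):
    """Return the space actually consumed on disk, in bytes."""
    return clusters_needed(file_size, cluster_size) * cluster_size


print(clusters_needed(64 * 1024))  # 2
print(size_on_disk(1 * 1024))      # 32768
```

A 64 KB file needs exactly two clusters, while a 1 KB file is still charged a full 32 KB.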
Slack Space
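When a file doesn’t completely fill its last cluster, the unused remainder of that cluster is wasted. This wasted space is known as ‘slack space’. For example, a file smaller than the file system’s cluster size still takes up an entire cluster, and everything beyond the file’s contents is slack.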
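As a rough sketch, slack space can be computed from the file size and the cluster size. The function name and the 32 KB default are assumptions chosen to match the Cluster card.

```python
def slack_space(file_size, cluster_size=32 * 1024):
    """Return the unused bytes in the last cluster occupied by a file."""
    remainder = file_size % cluster_size
    # A file that ends exactly on a cluster boundary wastes nothing.
    return 0 if remainder == 0 else cluster_size - remainder


print(slack_space(1 * 1024))   # 31744 bytes wasted by a 1 KB file
print(slack_space(64 * 1024))  # 0, the file fills two clusters exactly
print(slack_space(40 * 1024))  # 24576, only the second cluster has slack
```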
File System
The file system determines how files are stored on the device and which features (such as permissions or encryption) are available. Not all file systems are supported by every OS.
Every file system stores at least two pieces of information per file. The first is the data in the file, in other words, the contents of the file. The second is metadata, such as the file’s name and location.
The metadata is stored in an index which provides a list of files and the locations where they can be found on the disk. The exact method for doing this varies from file system to file system, but the concept remains the same.
Deleting a File
If a file is deleted, then the index entry is removed, but the content of the file isn’t removed from the disk.
Instead, that cluster is marked as overwritable, meaning the contents of a new file could overwrite the data there.
The reason for this is simply efficiency; it is pointless to overwrite the data of a deleted file with 0s and then allow the contents of a new file to overwrite those 0s.
Instead, we just mark that cluster as overwritable and allow a new file to overwrite the contents of the previous file. That’s one overwrite procedure instead of two, and the result is the same.
(This is the reason why you can sometimes recover deleted files from a hard drive; the contents of the file remain even if the metadata has been removed.)
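The idea can be illustrated with a toy model. Everything here (the ToyDisk class, its fields, the file names) is hypothetical and far simpler than a real file system; it only shows that deletion removes the index entry and frees the cluster, while the bytes stay put until something overwrites them.

```python
class ToyDisk:
    """A toy disk: a list of clusters, an index, and a set of free clusters."""

    def __init__(self, cluster_count):
        self.clusters = [b""] * cluster_count
        self.free = set(range(cluster_count))
        self.index = {}  # file name -> list of cluster addresses

    def write(self, name, chunks):
        addresses = []
        for chunk in chunks:
            address = min(self.free)  # lowest free cluster (raises if disk is full)
            self.free.remove(address)
            self.clusters[address] = chunk
            addresses.append(address)
        self.index[name] = addresses
        return addresses

    def delete(self, name):
        # Remove the index entry and mark the clusters as overwritable.
        # The cluster contents are deliberately left untouched.
        for address in self.index.pop(name):
            self.free.add(address)


disk = ToyDisk(4)
addresses = disk.write("note.txt", [b"hello"])
disk.delete("note.txt")
print("note.txt" in disk.index)      # False
print(disk.clusters[addresses[0]])   # b'hello' (still recoverable)
disk.write("new.txt", [b"world"])
print(disk.clusters[addresses[0]])   # b'world' (old data overwritten)
```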
Multiple Clusters
So the real question is, how are files tracked over multiple clusters? If you have a file of size 64 KB on a file system with a 32 KB cluster size, then that file will take up two clusters on the disk.
If the first cluster the file is stored in has a cluster immediately after it that is free, then the rest of the file will be placed there.
If there is no cluster free immediately after the first cluster, then the rest of the file will be put into a different cluster, and the address of the next cluster will be added to the end of the first cluster.
Some file systems use a file allocation table, which has one entry for each cluster. The table entry for the file’s first cluster contains the address of the next cluster.
That cluster also has an entry in the table, which points to the next cluster, and so on until an end-of-file marker is reached and the whole file has been read.
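Following a file through an allocation table can be sketched as below. The table layout, the END_OF_CHAIN marker and the cluster numbers are all invented for illustration; real file allocation tables use reserved values and a fixed on-disk format.

```python
END_OF_CHAIN = -1  # hypothetical marker meaning "no further clusters"


def cluster_chain(fat, first_cluster):
    """Return the cluster addresses of a file, in order, by following the table."""
    chain = []
    current = first_cluster
    while current != END_OF_CHAIN:
        chain.append(current)
        current = fat[current]  # the table entry holds the next cluster's address
    return chain


def read_file(fat, clusters, first_cluster):
    """Join the contents of every cluster in the chain."""
    return b"".join(clusters[a] for a in cluster_chain(fat, first_cluster))


# A file stored in clusters 2, 5 and 3 (deliberately not contiguous).
fat = {2: 5, 5: 3, 3: END_OF_CHAIN}
clusters = {2: b"AAA", 5: b"BBB", 3: b"CCC"}

print(cluster_chain(fat, 2))        # [2, 5, 3]
print(read_file(fat, clusters, 2))  # b'AAABBBCCC'
```

The file’s clusters don’t need to sit next to each other on the disk, because the table records the order.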
FAT32
Introduced in 1996 with a later release of Windows 95 (OSR2). It uses a File Allocation Table to map each cluster, which is where the name FAT comes from.
The FAT32 file system doesn’t support files larger than 4 GB, which seemed a huge amount back in 1996, but these days is hardly anything.
It doesn’t support file permissions because it doesn’t store ownership metadata, such as who created a file. That simplicity, together with very broad OS support, made it a common choice for USB drives, which could be connected to any computer.
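The 4 GB limit mentioned above comes from FAT32 recording each file’s size in a 32-bit field. A small sketch of the arithmetic follows; the constant and function names are made up for this card.

```python
SIZE_FIELD_BITS = 32  # FAT32 stores each file's size in a 32-bit field

# Largest value a 32-bit unsigned field can hold: one byte short of 4 GB.
MAX_FAT32_FILE_SIZE = 2 ** SIZE_FIELD_BITS - 1


def fits_on_fat32(file_size):
    """Return True if a file of file_size bytes is within the FAT32 size limit."""
    return 0 <= file_size <= MAX_FAT32_FILE_SIZE


print(MAX_FAT32_FILE_SIZE)               # 4294967295
print(fits_on_fat32(4 * 1024 ** 3))      # False, exactly 4 GB is one byte too big
print(fits_on_fat32(4 * 1024 ** 3 - 1))  # True
```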
exFAT
exFAT is designed for USB drives and other removable media, so it doesn’t support permissions.
It was introduced in 2006, but it took a few years to gain enough traction for USB drive manufacturers to start using it by default. It is based on FAT32 but has been thoroughly modernised.
The file size limit is so large that it effectively has no maximum file size. It is supported by Windows, macOS and Linux.
New Technology File System (NTFS)
Used by modern versions of Windows.
It is an advanced file system with many features, including permissions support, encryption support and shadow copies (point-in-time snapshots that effectively act as backups of files).
The NTFS file system is also more reliable than older file systems; to a limited extent, it is capable of recovering from data corruption.
The downside is that there is limited support for the NTFS file system amongst non-Windows operating systems. For example, if you connect an NTFS formatted drive to a Mac, you’ll find that by default you can read files on the drive but not write to it.
Extended File System 3 (EXT3)
An older file system often used in Linux.
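It was introduced in 2001 and supports permissions, although no shadow copy. It has no built-in encryption, so encrypting an EXT3 volume relies on a separate layer outside the file system itself.
The EXT3 file system features a maximum file size of 2 TB (with the common 4 KB block size).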
It is a ‘journaling’ file system, which means that changes to the disk are tracked in a separate part of the file system known as the ‘journal’. This can help to recover the drive in the event of a disk corruption that might result from a sudden shutdown or jolt.
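The journaling idea can be sketched with a toy model. The ToyJournaledStore class and its method names are hypothetical, and real journals are considerably more involved; the sketch only shows the order of events: log the change first, apply it second, and replay the journal after a crash.

```python
class ToyJournaledStore:
    """A toy store that records each change in a journal before applying it."""

    def __init__(self):
        self.blocks = {}   # the main area of the file system
        self.journal = []  # changes that have been logged but not yet applied

    def log(self, block, value):
        """Step 1: record the intended change in the journal."""
        self.journal.append((block, value))

    def apply_journal(self):
        """Step 2: copy journaled changes to the main area, then clear the journal."""
        for block, value in self.journal:
            self.blocks[block] = value
        self.journal.clear()


store = ToyJournaledStore()
store.log("block-7", "new contents")

# Imagine a sudden shutdown here: the change is logged but not yet applied.
print(store.blocks)   # {}

# On the next start-up, replaying the journal completes the interrupted change.
store.apply_journal()
print(store.blocks)   # {'block-7': 'new contents'}
print(store.journal)  # []
```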
Extended File System 4 (EXT4)
The modern file system that is used by default in many Linux distributions.
It was introduced in 2008 and supports permissions and encryption, although again no shadow copy.
The EXT4 file system supports files of up to 16 TB with the common 4 KB block size, so in practical terms there is rarely a maximum file size to worry about.
Other than that, the EXT4 file system allows you to optionally turn off the journal and features a faster disk check process.
Hierarchical File System Plus (HFS+)
Until the introduction of APFS in 2017, this was the file system that Apple used in Mac OS X (now macOS).
It supports files so large that there are effectively no file size limitations, and it also has a journal similar to EXT3 and EXT4.
As with most modern file systems, it supports permissions and encryption, amongst other features.
Apple File System (APFS)
APFS debuted on the Mac with macOS High Sierra and is Apple’s latest proprietary file system.
It supports permissions and encryption at least up to the level of HFS+. Duplicate files can be stored without using additional space, with changes to one copy of a file being saved as a delta (the difference between the old file and the new file) to lower space requirements.
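The space-saving copy described above can be sketched as a toy copy-on-write store. This is not how APFS is implemented internally; the ToyCloneStore class, its methods and the file names are assumptions made purely to illustrate shared data plus a stored difference.

```python
class ToyCloneStore:
    """A toy store where copies share blocks until one of them is changed."""

    def __init__(self):
        self.blocks = {}  # block id -> contents
        self.files = {}   # file name -> list of block ids
        self._next_id = 0

    def _store(self, data):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data
        return block_id

    def create(self, name, chunks):
        self.files[name] = [self._store(chunk) for chunk in chunks]

    def clone(self, source, target):
        # The copy points at the same blocks, so no extra data is stored.
        self.files[target] = list(self.files[source])

    def write_block(self, name, position, data):
        # Only the changed block is stored anew; the other file is unaffected.
        self.files[name][position] = self._store(data)


store = ToyCloneStore()
store.create("a.txt", [b"one", b"two"])
store.clone("a.txt", "b.txt")
print(len(store.blocks))     # 2, the duplicate used no additional blocks

store.write_block("b.txt", 1, b"TWO")
print(len(store.blocks))     # 3, only the difference was added
print(store.files["a.txt"])  # [0, 1]
print(store.files["b.txt"])  # [0, 2]
```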