File System Refresher
Summary
- File System Concept
- Access Rights
- Developer’s Interface
- Allocation Strategies
- Buffer Cache
- Journaling
- Direct Memory Access
File System Concept
Abstractions over where data is located on disk
Key Abstractions:
- File
- Filename
- Directory Tree
Access Rights
Unix-like filesystems assign an owner and a group to each file, then specify read, write, and execute (rwx) permissions separately for the owner, the group, and others.
Access Control Lists can be used to extend this.
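The owner/group/others rwx scheme can be sketched in C by decoding a mode value into the familiar nine-character string. The helper name `format_permissions` is made up for illustration; the `S_I*` constants come from the POSIX header `<sys/stat.h>`.

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Render the nine permission bits of `mode` as "rwxrwxrwx"-style text.
 * `out` must have room for 10 bytes (9 characters plus the terminator). */
void format_permissions(mode_t mode, char out[10])
{
    static const mode_t bits[9] = {
        S_IRUSR, S_IWUSR, S_IXUSR,   /* owner  */
        S_IRGRP, S_IWGRP, S_IXGRP,   /* group  */
        S_IROTH, S_IWOTH, S_IXOTH    /* others */
    };
    static const char letters[] = "rwxrwxrwx";

    assert(out != NULL);
    memcpy(out, "---------", 10);    /* nine dashes plus the terminator */
    for (int i = 0; i < 9; i++)
        if (mode & bits[i])
            out[i] = letters[i];
}
```

Calling it with the octal mode 0754 fills the buffer with `rwxr-xr--`: full access for the owner, read and execute for the group, read-only for others.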
Developer’s Interface
Two ways to think of a file:
- A position cursor that can move through a file
- A big buffer of data
Position Cursor
int fd = open("file.txt", ...);
read(fd, ...);
write(fd, ...); // Overwrites
lseek(fd, ...); // Moves cursor
close(fd);
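A fuller sketch of the cursor model, assuming a POSIX system. The function name `patch_and_read` and its parameters are hypothetical. It moves the cursor with lseek, overwrites bytes in place, rewinds, and reads the file back.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrite the bytes at `offset` in the file at `path` with the string
 * `patch`, then read up to `out_len` bytes from the start of the file
 * into `out`. Returns the number of bytes read back, or -1 on failure. */
ssize_t patch_and_read(const char *path, off_t offset, const char *patch,
                       char *out, size_t out_len)
{
    assert(path != NULL && patch != NULL && out != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    size_t len = strlen(patch);
    ssize_t n = -1;
    if (lseek(fd, offset, SEEK_SET) != (off_t)-1 &&  /* move the cursor */
        write(fd, patch, len) == (ssize_t)len &&     /* overwrite; cursor advances */
        lseek(fd, 0, SEEK_SET) != (off_t)-1)         /* rewind to the start */
        n = read(fd, out, out_len);

    close(fd);
    return n;
}
```

Note that write does not insert: the bytes already at the cursor position are replaced, and the file only grows if the write runs past its end.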
MMap
int fd = open("file.txt", ...);
void *buf = mmap(..., fd, ...);
// Manipulate buffer
munmap(buf, ...);
close(fd);
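The same idea with the arguments filled in, as a minimal sketch assuming a POSIX system. The function name `uppercase_in_place` is hypothetical. The file is mapped with MAP_SHARED so that edits to the buffer reach the file itself.

```c
#include <assert.h>
#include <ctype.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Upper-case every byte of the file at `path` by editing it through a
 * shared memory mapping. Returns 0 on success, -1 on failure. */
int uppercase_in_place(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {            /* nothing to do; mmap rejects length 0 */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED) {
        close(fd);
        return -1;
    }

    for (size_t i = 0; i < len; i++)  /* manipulate the buffer */
        buf[i] = (char)toupper((unsigned char)buf[i]);

    int rc = munmap(buf, len);        /* MAP_SHARED: the edits reach the file */
    close(fd);
    return rc;
}
```

There is no cursor here: any byte of the file can be touched by indexing into the buffer, and the kernel pages data in and out as needed.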
Allocation Strategies
Platter is a full disk
Track is a circle on the disk
Sector is an arc on the track
Block is one or more sectors
- A block id is the id of its starting sector
There are several ways to keep track of which blocks are free
- List of free blocks
- Bit vector showing busy or free
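The bit vector approach can be sketched as follows. The disk size `NUM_BLOCKS` and all function names are made up for illustration. Each block gets one bit, and allocation scans for the first zero bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BLOCKS 64   /* hypothetical disk size, in blocks */

/* One bit per block: 1 = busy, 0 = free. */
static uint8_t block_bitmap[NUM_BLOCKS / 8];

int block_is_busy(size_t block)
{
    assert(block < NUM_BLOCKS);
    return (block_bitmap[block / 8] >> (block % 8)) & 1;
}

void mark_busy(size_t block)
{
    assert(block < NUM_BLOCKS);
    block_bitmap[block / 8] |= (uint8_t)(1u << (block % 8));
}

void mark_free(size_t block)
{
    assert(block < NUM_BLOCKS);
    block_bitmap[block / 8] &= (uint8_t)~(1u << (block % 8));
}

/* Claim the lowest-numbered free block; returns -1 if the disk is full. */
long allocate_block(void)
{
    for (size_t b = 0; b < NUM_BLOCKS; b++) {
        if (!block_is_busy(b)) {
            mark_busy(b);
            return (long)b;
        }
    }
    return -1;
}
```

The bitmap costs one bit per block regardless of how full the disk is, whereas a free list shrinks as blocks are used but needs a pointer per free block.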
Ideally FS will allow:
- Simple and Fast file creation
- Flexible size
- Efficient use of space
- Fast sequential access
- Fast random access
Examples:
- FAT
- Ext{2,3,4}
File Allocation Table
Each file is a linked list of blocks
- The table records, for each block, whether it is busy and which block comes next in its file
Block Number | Busy | Next |
---|---|---|
0 | 0 | |
1 | 1 | -1 |
2 | 1 | 6 |
The table stores the links for every file, but not where each file starts
- That is the job of the directory table
The directory table holds the filename, starting block, and metadata (such as permissions) of each file
- The root directory / has a fixed location, so we know where to start
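Reading a file under this scheme means taking the starting block from the directory table and following Next links until the end marker. A minimal sketch, assuming -1 marks the end of a file as in the table above; the struct layout and the name `fat_chain` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define FAT_END (-1)   /* "Next" value marking the last block of a file */

struct fat_entry {
    int busy;   /* 1 if the block is in use */
    int next;   /* next block of the same file, or FAT_END */
};

/* Follow the chain starting at `start`, storing block numbers in `out`.
 * Returns the number of blocks in the file, or -1 if the chain is corrupt
 * (points outside the table, reaches a free block, or is longer than
 * `max`, which also catches cycles). */
int fat_chain(const struct fat_entry *fat, int nblocks,
              int start, int *out, int max)
{
    assert(fat != NULL && out != NULL);

    int count = 0;
    for (int b = start; b != FAT_END; b = fat[b].next) {
        if (b < 0 || b >= nblocks || !fat[b].busy || count == max)
            return -1;
        out[count++] = b;
    }
    return count;
}
```

For example, if the table is extended with an assumed entry saying block 6 is followed by block 1, a file starting at block 2 occupies blocks 2, 6, 1 in that order. Random access is slow in this layout because reaching the nth block requires walking n links.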
Inode Structure
Used in Ext{2,3,4} file formats
Inodes are fixed length and contain metadata and 15 pointers.
- First 12 pointers point directly to data blocks
- 13th pointer points to a table of more pointers to more data blocks
- Adds one layer of indirection
- 14th pointer points to a table pointing to more tables pointing to data blocks
- Adds two layers of indirection
- 15th pointer uses triple indirection
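To see how the 15 pointers divide up a file, the sketch below maps a logical block index to the number of indirection layers needed to reach it. The function names are hypothetical, and the number of pointers per table block is a parameter: for instance, 4 KiB blocks with 4-byte pointers would give 1024, which is an assumption, not something fixed by the notes above.

```c
#include <assert.h>
#include <stdint.h>

#define DIRECT_PTRS 12   /* direct pointers in the inode */

/* Return how many layers of indirection are needed to reach logical block
 * `index` of a file: 0 (direct), 1, 2, or 3. Returns -1 if the index is
 * beyond what the 15 pointers can address.
 * `ptrs_per_block` is how many block pointers fit in one table block. */
int indirection_level(uint64_t index, uint64_t ptrs_per_block)
{
    assert(ptrs_per_block > 0);

    if (index < DIRECT_PTRS)
        return 0;
    index -= DIRECT_PTRS;

    uint64_t span = ptrs_per_block;   /* data blocks reachable at this level */
    for (int level = 1; level <= 3; level++) {
        if (index < span)
            return level;
        index -= span;
        span *= ptrs_per_block;
    }
    return -1;
}

/* Total number of data blocks addressable by one inode. */
uint64_t max_file_blocks(uint64_t ptrs_per_block)
{
    uint64_t p = ptrs_per_block;
    return DIRECT_PTRS + p + p * p + p * p * p;
}
```

Small files are served entirely by the 12 direct pointers, so they pay no indirection cost, while the triple-indirect pointer accounts for almost all of the maximum file size.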
Directories are themselves inodes whose contents map filenames to other inodes
Buffer Cache
Contents of disk can be cached in main system memory
- Called the Unified Buffer Cache
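The basic idea of caching disk blocks in memory can be sketched with a toy direct-mapped cache over a simulated disk. This is a generic illustration, not how any real kernel implements its buffer cache. All sizes and names are made up, and the "disk" is just an array.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE  16   /* hypothetical, tiny for illustration */
#define DISK_BLOCKS 32
#define CACHE_SLOTS 4

static char disk[DISK_BLOCKS][BLOCK_SIZE];   /* stand-in for the slow device */
static int  disk_reads;                      /* how often the "disk" was read */

struct cache_slot {
    int  valid;
    int  block;
    char data[BLOCK_SIZE];
};
static struct cache_slot cache[CACHE_SLOTS];

/* Return a pointer to the cached contents of `block`, reading from the
 * simulated disk only on a miss. Each block maps to exactly one slot. */
const char *cached_read(int block)
{
    assert(block >= 0 && block < DISK_BLOCKS);

    struct cache_slot *slot = &cache[block % CACHE_SLOTS];
    if (!slot->valid || slot->block != block) {   /* miss: go to disk */
        memcpy(slot->data, disk[block], BLOCK_SIZE);
        slot->block = block;
        slot->valid = 1;
        disk_reads++;
    }
    return slot->data;
}
```

Reading the same block twice touches the simulated disk once. Reading two blocks that share a slot (such as 3 and 7 with four slots) evicts the first, so a third read of block 3 goes back to disk.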
Journaling
Journaling reserves contiguous blocks on disk as a log. Updates are first appended to the log in order and applied to their final locations later, which transforms random IO into sequential IO.
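A toy model of the journaling idea, with a simulated disk and made-up names and sizes. Writes destined for scattered blocks are appended to a contiguous journal array and only later copied to their targets. Real journals also handle crash recovery and commit records, which this sketch leaves out.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE    16   /* hypothetical sizes */
#define DISK_BLOCKS   32
#define JOURNAL_SLOTS 8

static char disk[DISK_BLOCKS][BLOCK_SIZE];   /* stand-in for the data area */

struct journal_record {
    int  target;             /* block the data is destined for */
    char data[BLOCK_SIZE];
};
static struct journal_record journal[JOURNAL_SLOTS];   /* contiguous log */
static int journal_len;

/* Append a pending write of BLOCK_SIZE bytes to the journal.
 * Returns 0, or -1 if the journal is full. */
int journal_write(int target, const char *data)
{
    assert(target >= 0 && target < DISK_BLOCKS && data != NULL);
    if (journal_len == JOURNAL_SLOTS)
        return -1;
    journal[journal_len].target = target;
    memcpy(journal[journal_len].data, data, BLOCK_SIZE);
    journal_len++;
    return 0;
}

/* Apply every journaled write to its final location, then empty the
 * journal. Returns the number of records applied. */
int journal_flush(void)
{
    int applied = journal_len;
    for (int i = 0; i < journal_len; i++)
        memcpy(disk[journal[i].target], journal[i].data, BLOCK_SIZE);
    journal_len = 0;
    return applied;
}
```

Two writes aimed at far-apart blocks, say 29 and 4, land next to each other in the journal, so the caller only pays for sequential appends. The scattered writes happen later in `journal_flush`.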
Direct Memory Access
Hard drives have a DMA controller, which transfers data between the disk and main memory without the CPU copying each byte