Distributed File Systems
Summary
- NFS
- DFS
- Dynamic Management
- Log Based Striping and Stripe Groups
- Unix File System
- Client Reading a File
- Client Writing a File
NFS
Built by Sun Microsystems in 1985
DFS
No central server
- Each file distributed across several nodes
Client/Server roles are interchangeable
- Especially when clients have their own cache
Cooperative Cache
Preliminaries (Striping a file to multiple disks)
Increase I/O Bandwidth by using parallel disks
Failure protection by ECC
Drawbacks:
- Costs
- Small writes are inefficient
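The striping idea above can be sketched in a few lines. This is a minimal illustration, not a description of any particular RAID level: it assumes a single XOR parity unit per stripe as the error-correcting code, and the names `stripe_with_parity` and `recover_unit` are hypothetical.

```python
from functools import reduce


def _xor_units(units):
    """XOR equally sized byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*units))


def stripe_with_parity(data: bytes, num_data_disks: int, unit: int):
    """Round-robin fixed-size units across data disks; the last disk holds parity.

    Assumption: one XOR parity unit per stripe stands in for the ECC.
    """
    disks = [[] for _ in range(num_data_disks + 1)]
    stripe_bytes = unit * num_data_disks
    for off in range(0, len(data), stripe_bytes):
        # Pad the final stripe with zero bytes so every unit has the same size.
        chunk = data[off:off + stripe_bytes].ljust(stripe_bytes, b"\0")
        units = [chunk[i * unit:(i + 1) * unit] for i in range(num_data_disks)]
        for disk_no, u in enumerate(units):
            disks[disk_no].append(u)
        disks[-1].append(_xor_units(units))
    return disks


def recover_unit(disks, stripe_index: int, lost_disk: int) -> bytes:
    """Rebuild the unit of a failed disk by XOR-ing the surviving units."""
    survivors = [d[stripe_index] for i, d in enumerate(disks) if i != lost_disk]
    return _xor_units(survivors)
```

Under this scheme a write that touches only one unit still forces the parity unit of that stripe to be recomputed, which is one way to see why small writes are inefficient.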
Preliminaries (Log structured file system)
Buffer changes to multiple files in one contiguous log segment data structure
Push log segment to disk once it fills up or periodically
Only logs are written to disk
- Creates holes where the same block was overwritten
- Needs to be cleaned periodically
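A rough sketch of the buffering behaviour described above, assuming a segment is simply a list of block writes and "disk" is an in-memory list. The class `LogSegmentBuffer` and its methods are hypothetical names chosen for illustration.

```python
class LogSegmentBuffer:
    """Buffers block writes from many files into one contiguous log segment."""

    def __init__(self, capacity_blocks: int):
        self.capacity = capacity_blocks
        self.current = []    # (file_id, block_no, data) in arrival order
        self.disk_log = []   # flushed segments, oldest first (stand-in for disk)

    def write(self, file_id, block_no, data):
        """Append a change to the in-memory segment; flush when it fills up."""
        self.current.append((file_id, block_no, data))
        if len(self.current) >= self.capacity:
            self.flush()

    def flush(self):
        """Push the current segment to disk (also callable from a periodic timer)."""
        if self.current:
            self.disk_log.append(tuple(self.current))
            self.current = []

    def dead_entries(self) -> int:
        """Count on-disk entries superseded by a later write to the same block."""
        latest = {}
        total = 0
        for seg in self.disk_log:
            for file_id, block_no, _ in seg:
                latest[(file_id, block_no)] = True
                total += 1
        return total - len(latest)
```

The value returned by `dead_entries` corresponds to the holes left behind by overwrites, which is the space a cleaner would later reclaim.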
Preliminaries (Software RAID)
Zebra file system (UC Berkeley)
- Combines LFS and RAID
- Use commodity hardware
- Stripe by segment across multiple nodes' disks, in software
Putting them all together
xFS
- A DFS
- log based striping (from Zebra)
- Cooperative caching (from prior UCB work)
- Dynamic Management of data and metadata
- Subsetting storage servers
- Distributed log cleaning
Dynamic Management
Traditional NFS with centralized servers
- memory contents
- metadata
- file cache
- client caching directory
xFS
- metadata dynamically distributed
- cooperative file caching
Log Based Striping and Stripe Groups
Changes that a client makes to a file are written to an in-memory, append-only log
Periodically flush the log to a subset of the storage servers.
Stripe Group
Subset servers into stripe groups
Parallel client activities
Increased availability
Efficient log cleaning
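As an illustration of subsetting, the sketch below partitions the storage servers into fixed-size stripe groups and splits one log segment across a single group. It is a simplification: the parity fragment is omitted, and the function names and the fixed group size are assumptions made for the example.

```python
def make_stripe_groups(servers, group_size: int):
    """Partition the storage servers into consecutive stripe groups."""
    return [servers[i:i + group_size] for i in range(0, len(servers), group_size)]


def place_segment(segment: bytes, group):
    """Split one log segment into fragments, one per server in the group.

    Assumption: no parity fragment is produced here, to keep the sketch short.
    """
    n = len(group)
    frag_len = -(-len(segment) // n)  # ceiling division
    return {
        server: segment[i * frag_len:(i + 1) * frag_len]
        for i, server in enumerate(group)
    }


servers = ["s0", "s1", "s2", "s3", "s4", "s5"]
groups = make_stripe_groups(servers, 3)
placement = place_segment(b"abcdef", groups[1])
```

Because each segment touches only the servers of one group, two clients flushing to different groups do not contend for the same disks, and a failure in one group leaves the others usable.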
Cooperative Caching
Cache coherence
- Single writer, multiple readers
- file block unit of coherence
On a write request, invalidations are sent to the other cached copies of the block.
- Manager provides (and can revoke) a token
Read requests revoke write tokens
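The single-writer, multiple-reader protocol can be sketched as a small manager object that tracks, per file block, the set of readers and the current write-token holder. The class `BlockManager`, its message tuples, and the choice to let a former writer keep a read-only copy are assumptions for illustration only.

```python
class BlockManager:
    """Tracks readers and the write-token holder for each file block."""

    def __init__(self):
        self.readers = {}    # block -> set of clients caching it for reading
        self.writer = {}     # block -> client currently holding the write token
        self.messages = []   # (kind, client, block) messages the manager would send

    def request_read(self, client, block):
        holder = self.writer.get(block)
        if holder is not None and holder != client:
            # A read request revokes the outstanding write token.
            del self.writer[block]
            self.messages.append(("revoke_write", holder, block))
            # Assumption: the former writer keeps its copy as a reader.
            self.readers.setdefault(block, set()).add(holder)
        self.readers.setdefault(block, set()).add(client)

    def request_write(self, client, block):
        # Invalidate every other cached copy before granting the token.
        for reader in sorted(self.readers.pop(block, set()) - {client}):
            self.messages.append(("invalidate", reader, block))
        old = self.writer.get(block)
        if old is not None and old != client:
            self.messages.append(("revoke_write", old, block))
        self.writer[block] = client
```

For example, if `c1` and `c2` have read a block and `c3` then requests to write it, the manager records two `invalidate` messages and grants the token to `c3`; a later read by `c1` revokes that token.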
Log Cleaning
Remove old blocks
Data | Segment | Description |
---|---|---|
1’ 2’ 5’ | segment 1 | Writes to file blocks 1, 2, 5 |
1’’ 3’ 4’ | segment 2 | Block 1 overwritten (kill old block), writes to block 3 & 4 |
2’’ | segment 3 | Block 2 overwritten (kill old block) |
Coalesce live blocks: 5’ 1’’ 3’ 4’ 2’’
- New segment
- GC old segments
- All clients/servers responsible for GC
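The coalescing step in the table above can be reproduced with a short sketch. It assumes each segment is a list of `(block_no, version)` pairs, oldest segment first, where version 1 stands for one prime and version 2 for two primes. The function name `coalesce_live_blocks` is hypothetical.

```python
def coalesce_live_blocks(segments):
    """Return the live blocks of all segments, in log order, as one new segment.

    A block entry is live only if no later entry overwrites the same block.
    """
    latest = {}
    for seg_idx, seg in enumerate(segments):
        for block_no, version in seg:
            latest[block_no] = (seg_idx, version)  # later entries win

    live = []
    for seg_idx, seg in enumerate(segments):
        for block_no, version in seg:
            if latest[block_no] == (seg_idx, version):
                live.append((block_no, version))
    return live


segments = [
    [(1, 1), (2, 1), (5, 1)],  # segment 1: writes to blocks 1, 2, 5
    [(1, 2), (3, 1), (4, 1)],  # segment 2: block 1 overwritten, writes to 3 and 4
    [(2, 2)],                  # segment 3: block 2 overwritten
]
new_segment = coalesce_live_blocks(segments)
```

Once the live blocks are written into the new segment, the three old segments contain only dead data and can be garbage collected.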
Unix File System
xFS Data Structures
The manager map (mmap) is replicated to all nodes
Client Reading a File
Client Writing a File
Client notifies the manager on write
Periodically flush log segment to stripe group