Distributed File Systems

Stephen M. Reaves, 2024-04-04

Notes about Lecture 7c for CS-6210

Summary

NFS

Built by Sun Microsystems in 1985

[Figure: client1, client2, and client3 connect over a LAN to server1 and server2. Each server keeps a file cache in memory (m1, m2) in front of its disk (d1, d2).]

DFS

No central server

Client/Server roles are interchangeable

Preliminaries (Striping a file to multiple disks)

Increase I/O Bandwidth by using parallel disks

Failure protection by ECC
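
To make the striping idea concrete, here is a minimal sketch in Python. It assumes a single stripe and uses plain XOR parity as a simple stand-in for the ECC mentioned above; the names `stripe`, `recover`, and `xor_blocks` are hypothetical and not taken from the lecture.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe(data, n_data_disks):
    """Split data into one fragment per data disk and compute a parity fragment."""
    frag_len = -(-len(data) // n_data_disks)  # ceiling division
    padded = data.ljust(frag_len * n_data_disks, b"\0")  # pad so fragments are equal length
    fragments = [
        padded[i * frag_len:(i + 1) * frag_len] for i in range(n_data_disks)
    ]
    return fragments, xor_blocks(fragments)


def recover(fragments, parity, lost_index):
    """Rebuild the fragment of one failed disk from the survivors plus parity."""
    survivors = [f for i, f in enumerate(fragments) if i != lost_index]
    return xor_blocks(survivors + [parity])


fragments, parity = stripe(b"hello world!", 3)
rebuilt = recover(fragments, parity, lost_index=1)
```

Each fragment could be written to a different disk in parallel, which is where the bandwidth gain comes from. The parity fragment lets any single lost fragment be rebuilt.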

Drawbacks:

Preliminaries (Log structured file system)

Buffer changes to multiple files in one contiguous log segment data structure

Push the log segment to disk once it fills up, or periodically

Only logs are written to disk
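
The buffering behaviour described above can be sketched as follows. This is an illustration only: the class `LogSegmentBuffer`, its byte-count capacity, and the list standing in for the disk are all assumptions made for the example.

```python
class LogSegmentBuffer:
    """In-memory log segment that collects writes to many files."""

    def __init__(self, capacity_bytes, disk):
        self.capacity = capacity_bytes
        self.disk = disk      # a list standing in for the disk; one entry per flushed segment
        self.entries = []     # (filename, offset, data) records in arrival order
        self.used = 0

    def write(self, filename, offset, data):
        """Append a change to the segment; flush when the segment is full."""
        self.entries.append((filename, offset, data))
        self.used += len(data)
        if self.used >= self.capacity:
            self.flush()

    def flush(self):
        """Write the whole segment to disk as one contiguous unit."""
        if self.entries:
            self.disk.append(list(self.entries))
            self.entries.clear()
            self.used = 0


disk = []
log = LogSegmentBuffer(capacity_bytes=8, disk=disk)
log.write("a.txt", 0, b"abcd")
log.write("b.txt", 0, b"efgh")  # segment is now full, so it is flushed
log.write("a.txt", 4, b"x")     # stays buffered until the next flush
log.flush()                     # stands in for the periodic flush
```

Changes to `a.txt` and `b.txt` end up in the same segment, so the disk sees one large sequential write instead of several small random ones.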

Preliminaries (Software RAID)

Zebra file system (UC Berkeley)

Putting them all together

xFS

Dynamic Management

Traditional NFS with centralized servers

xFS

Log Based Striping and Stripe Groups

Changes that a client makes to a file are written to an append-only log in-memory

Periodically flush the log to a subset of the storage servers.

Stripe Group

Partition the servers into subsets called stripe groups

Parallel client activities

Increased availability

Efficient log cleaning
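
A rough sketch of stripe groups, under several assumptions: groups are fixed-size slices of the server list, a segment picks its group round-robin by segment id, and parity is left out for brevity. The function names are hypothetical.

```python
def make_stripe_groups(servers, group_size):
    """Partition the server list into consecutive groups of group_size."""
    return [servers[i:i + group_size] for i in range(0, len(servers), group_size)]


def pick_group(groups, segment_id):
    """Choose a stripe group for a log segment (round-robin is an assumption)."""
    return groups[segment_id % len(groups)]


def place_segment(segment, group):
    """Split a log segment into one fragment per server in the group."""
    frag_len = -(-len(segment) // len(group))  # ceiling division
    return {
        server: segment[i * frag_len:(i + 1) * frag_len]
        for i, server in enumerate(group)
    }


servers = ["s0", "s1", "s2", "s3", "s4", "s5"]
groups = make_stripe_groups(servers, group_size=3)
placement = place_segment(b"abcdef", pick_group(groups, segment_id=1))
```

Because each segment touches only one group, two clients writing segments to different groups do not contend for the same servers, and a failure in one group leaves the other groups usable.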

Cooperative Caching

Cache coherence

On write-request, an invalidation is sent.

Read requests revoke write tokens

cooperative caching
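
The token rules above (a write invalidates other copies, a read revokes the write token) can be sketched with a small per-file manager. This is a simplified model, not the actual xFS protocol: the class `FileManager`, the event tuples, and the rule "serve a read from any peer that has a copy" are assumptions for illustration.

```python
class FileManager:
    """Tracks which clients cache a file and who holds its write token."""

    def __init__(self):
        self.copies = {}   # filename -> set of client ids holding a cached copy
        self.writer = {}   # filename -> client id holding the write token

    def request_write(self, client, filename):
        """Invalidate every other cached copy, then grant the write token."""
        others = self.copies.get(filename, set()) - {client}
        events = [("invalidate", peer) for peer in sorted(others)]
        self.copies[filename] = {client}
        self.writer[filename] = client
        events.append(("grant_write_token", client))
        return events

    def request_read(self, client, filename):
        """Revoke any write token, then serve from a peer cache if possible."""
        events = []
        holder = self.writer.get(filename)
        if holder is not None and holder != client:
            del self.writer[filename]
            events.append(("revoke_write_token", holder))
        copies = self.copies.setdefault(filename, set())
        peers = sorted(copies - {client})
        if peers:
            events.append(("fetch_from_peer", peers[0]))  # cooperative caching
        else:
            events.append(("fetch_from_storage", None))
        copies.add(client)
        return events


mgr = FileManager()
mgr.request_read("c1", "f")
mgr.request_read("c2", "f")
mgr.request_write("c3", "f")
events = mgr.request_read("c1", "f")
```

In the last call, `c3` loses its write token but keeps its copy, and `c1` is served from `c3`'s cache rather than from a storage server.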

Log Cleaning

Remove old blocks

data         segment     description
1' 2' 5'     segment 1   Writes to file blocks 1, 2, 5
1'' 3' 4'    segment 2   Block 1 overwritten (kill old block), writes to blocks 3 & 4
2''          segment 3   Block 2 overwritten (kill old block)

Coalesce to 5' 1'' 3' 4' 2''
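
The coalescing step can be sketched as a small function. It assumes segments are listed oldest first and that a block appears at most once per segment; the function name `clean` and the tuple layout are hypothetical.

```python
def clean(segments):
    """Return the live blocks of all segments, coalesced into one new segment.

    segments is a list (oldest first) of lists of (block_no, label) pairs.
    A block is live only in the newest segment that wrote it.
    """
    newest = {}
    for seg_index, segment in enumerate(segments):
        for block_no, _label in segment:
            newest[block_no] = seg_index  # later segments overwrite earlier ones

    return [
        label
        for seg_index, segment in enumerate(segments)
        for block_no, label in segment
        if newest[block_no] == seg_index  # skip killed (overwritten) blocks
    ]


segments = [
    [(1, "1'"), (2, "2'"), (5, "5'")],    # segment 1
    [(1, "1''"), (3, "3'"), (4, "4'")],   # segment 2
    [(2, "2''")],                         # segment 3
]
coalesced = clean(segments)  # ["5'", "1''", "3'", "4'", "2''"]
```

Once the live blocks are rewritten into a fresh segment, the three old segments can be reclaimed in full.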

Unix File System

$\{\text{filename, offset}\} \rightarrow \text{i-node} \rightarrow \text{data blocks on disk}$

xFS Data Structures

mmap is replicated to all nodes

Client Reading a File

[Figure: read path for a client.
filename+offset -> directory -> first access?
  no: unix cache -> data block
  yes: mmap (replicated) -> mgr -> in memory?
    yes: peer unix cache -> data block
    no: imap -> stripe group map (replicated) -> storage server -> index node of log (seqid of requested data block) -> stripe group map (replicated) -> storage server -> data block
On the second access of an inode, imap has a potential short-cut straight to the second stripe group map lookup. The diagram groups some of these lookups under the label "mgr memory".]
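
The read path can be approximated in code. This is a heavily simplified sketch: the nested dictionaries, the `(group, member, slot)` address format, and the names `fetch` and `read_block` are all assumptions for illustration, and the "first access?" check is modelled as a plain local cache miss.

```python
def fetch(addr, cluster):
    """Resolve a log address through the stripe group map to a storage server."""
    group_id, member, slot = addr
    server = cluster["stripe_group_map"][group_id][member]
    return cluster["storage"][server][slot]


def read_block(filename, offset, client, cluster):
    """Return (data, source) for one block, trying the cheapest source first."""
    key = (filename, offset)
    cache = cluster["caches"][client]
    if key in cache:
        return cache[key], "local cache"

    # The replicated manager map tells the client which manager owns the file.
    manager = cluster["managers"][cluster["mmap"][filename]]
    peer = manager["cached_at"].get(key)
    if peer is not None:
        data, source = cluster["caches"][peer][key], "peer cache"
    else:
        inode = fetch(manager["imap"][filename], cluster)  # index node of the log
        data, source = fetch(inode[offset], cluster), "storage server"

    cache[key] = data                     # cache locally for later reads
    manager["cached_at"][key] = client    # manager remembers who has a copy
    return data, source


cluster = {
    "caches": {"c1": {}, "c2": {}},
    "mmap": {"notes.txt": "m1"},
    "managers": {
        "m1": {"cached_at": {}, "imap": {"notes.txt": ("g0", 0, "inode-7")}}
    },
    "stripe_group_map": {"g0": ["s1", "s2"]},
    "storage": {
        "s1": {"inode-7": {0: ("g0", 1, "blk-3")}},
        "s2": {"blk-3": b"hello"},
    },
}
first = read_block("notes.txt", 0, "c1", cluster)   # goes all the way to storage
second = read_block("notes.txt", 0, "c2", cluster)  # served from c1's cache
third = read_block("notes.txt", 0, "c2", cluster)   # served locally
```

The three calls exercise the three outcomes in the diagram: the long path through the storage servers, the peer cache, and the local cache.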

Client Writing a File

Client notifies mgr on write

Periodically flush log segment to stripe group