Reaves.dev

Global Memory Systems

Stephen M. Reaves

2024-03-29

Notes about Lecture 7a for CS-6210

Summary

Context for Global Memory System

[Figure: three nodes (P1, P2, P3), each with its own memory and disk, connected to one another over a LAN.]

Memory Pressure:

Memory Manager:

Global Memory System:

GMS Basics

Cache refers to physical memory (DRAM), not processor cache

Nodes share a sense of community: a page fault at one node can be served from another node's memory

Each node's physical memory is split into a local part (its working memory) and a global part (spare memory)

Handling Page Faults Case 1

Common case:

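[Figure: Host P holds a Local part and a Global part containing page Y; Host Q holds a Local part and a Global part containing page X. P gets X from Q's global part and puts Y into Q's global part.]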

Adding X to P's local set means Y has to be kicked out of P's global section
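The exchange above can be sketched in a few lines. This is only an illustration, assuming each host keeps its pages in two lists and that the first entry of the global list is the oldest; the `Host` class and `fault_case1` are hypothetical names, not part of GMS.

```python
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    local: list = field(default_factory=list)    # working memory
    global_: list = field(default_factory=list)  # spare memory, oldest page first

def fault_case1(p: Host, q: Host, page: str) -> str:
    """P faults on `page`, which currently sits in Q's global part."""
    q.global_.remove(page)      # get: Q hands the page over to P
    p.local.append(page)        # P's local part grows by one page
    victim = p.global_.pop(0)   # so P's global part shrinks: its oldest page leaves
    q.global_.append(victim)    # put: Q stores the victim in the slot it just freed
    return victim

p = Host("P", local=["a"], global_=["y"])
q = Host("Q", local=["b"], global_=["x"])
evicted = fault_case1(p, q, "x")
print(evicted, p, q)
```

Q's split between local and global stays the same: it simply trades X for Y.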

Handling Page Faults Case 2

Common case with memory pressure at P

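[Figure: Host P holds only a Local part; Host Q holds a Local part and a Global part containing page X. P gets X from Q's global part and puts a page from its own local part into Q's global part.]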
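A minimal sketch of this case, assuming P's global part is already empty so the victim has to come from P's own working set. The dictionary layout, the "first entry is oldest" convention, and the name `fault_case2` are assumptions made for illustration.

```python
def fault_case2(p, q, page):
    """P is under memory pressure: it has no global pages left to give up."""
    if p["global"]:
        raise ValueError("case 2 assumes P has no global pages left")
    q["global"].remove(page)      # get: Q hands over the faulting page
    victim = p["local"].pop(0)    # P gives up the oldest page of its working set
    p["local"].append(page)       # local size is unchanged: one out, one in
    q["global"].append(victim)    # put: the victim lands in Q's global part
    return victim

p = {"local": ["a", "b"], "global": []}
q = {"local": ["c"], "global": ["x"]}
victim = fault_case2(p, q, "x")
print(victim, p, q)
```

The local/global split does not change on either host in this sketch.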

Handling Page Faults Case 3

Faulting Page on disk:

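[Figure: Host P holds a Local part and a Global part containing page Y; Host Q holds a Local part and a Global part (labelled [x]). P gets the faulting page from disk and puts Y into Q's global part; Q discards a page to disk from either its local or its global part.]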

Send page being swapped out to node that has the globally oldest page
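As a rough sketch of this case: the faulting page comes from disk, P's oldest global page is sent to whichever host holds the globally oldest page, and that host makes room by discarding its oldest page. The age values (larger means older), the dictionary layout, and the helper names are hypothetical.

```python
def pages(host):
    """All (age, part, page) triples held by a host; a larger age means older."""
    return [(age, part, page)
            for part in ("local", "global")
            for page, age in host[part].items()]

def fault_case3(p, others, page, disk):
    """P faults on `page`, which is only available on disk."""
    if page not in disk:
        raise KeyError(page)
    p["local"][page] = 0                           # read from disk; age 0 = just used
    y = max(p["global"], key=p["global"].get)      # P's oldest global page
    y_age = p["global"].pop(y)
    q = max(others, key=lambda h: max(pages(h)))   # host with the globally oldest page
    _, part, z = max(pages(q))                     # that page: in local OR global
    del q[part][z]
    disk.add(z)                                    # discarded page goes to disk
    q["global"][y] = y_age                         # put: Y takes the freed slot
    return q["name"], z

p = {"name": "P", "local": {"a": 1}, "global": {"y": 5}}
q = {"name": "Q", "local": {"b": 2}, "global": {"z": 9}}
disk = {"x"}
result = fault_case3(p, [q], "x", disk)
print(result, p, q, disk)
```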

Handling Page Faults Case 4

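[Figure: Host P holds a Local part and a Global part containing page Y; Host Q holds a Local part and a Global part (labelled [x]); Host R holds a Local part and a Global part containing page Z. P gets the faulting page from Q's local part and puts Y into R's global part; R discards a page to disk from either its local or its global part.]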

Faulting page actively shared:
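When the faulting page is actively shared, it lives in another host's local part, so it is copied rather than moved. The sketch below assumes three hosts with per-page ages (larger means older); the layout and the names `pages` and `fault_case4` are hypothetical.

```python
def pages(host):
    """All (age, part, page) triples held by a host; a larger age means older."""
    return [(age, part, page)
            for part in ("local", "global")
            for page, age in host[part].items()]

def fault_case4(p, q, hosts, page):
    """P faults on `page`, which Q is actively using in its local part."""
    if page not in q["local"]:
        raise KeyError(page)
    p["local"][page] = 0                        # copy: Q keeps its own copy
    y = max(p["global"], key=p["global"].get)   # P's oldest global page
    y_age = p["global"].pop(y)                  # P's global part shrinks by one
    r = max((h for h in hosts if h is not p),
            key=lambda h: max(pages(h)))        # host with the globally oldest page
    _, part, z = max(pages(r))
    del r[part][z]                              # R discards from local OR global
    r["global"][y] = y_age                      # put: Y lands in R's global part
    return r["name"], z

p = {"name": "P", "local": {"a": 1}, "global": {"y": 5}}
q = {"name": "Q", "local": {"b": 2, "x": 1}, "global": {}}
r = {"name": "R", "local": {"c": 3}, "global": {"z": 9}}
result = fault_case4(p, q, [p, q, r], "x")
print(result, p, q, r)
```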

Behavior of Algorithm

Over time, idle nodes become memory servers

[Figure: global memory system behaviour]

Geriatrics

Epoch parameters:

Pick a manager per epoch

Each Epoch:

Each node is given a weight that reflects how many of its pages are going to be replaced. The node with the highest weight holds the highest number of pages that are going to be replaced. This means that node is relatively idle (from a memory perspective) and should be the next initiator.
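One way to picture the weight computation, as a sketch only: assume the initiator receives every node's page ages, picks the `m` oldest pages cluster-wide as the ones expected to be replaced, and sets each node's weight to its share of those pages. The parameter `m`, the age convention (larger means older), and the function names are assumptions and do not come from these notes.

```python
def epoch_weights(ages_by_node, m):
    """Weight of each node = its share of the m oldest pages in the cluster."""
    everything = sorted(((age, node)
                         for node, ages in ages_by_node.items()
                         for age in ages), reverse=True)
    oldest = everything[:m]                      # pages expected to be replaced
    counts = {node: 0 for node in ages_by_node}
    for _, node in oldest:
        counts[node] += 1
    total = len(oldest)
    return {node: (c / total if total else 0.0) for node, c in counts.items()}

def next_initiator(weights):
    """The node with the highest weight is the most idle one."""
    return max(weights, key=weights.get)

ages = {"A": [1, 2, 3], "B": [50, 60, 70], "C": [5, 55]}
weights = epoch_weights(ages, m=4)
print(weights, next_initiator(weights))
```

Here node B holds three of the four oldest pages, so it gets the highest weight and would be picked as the next initiator.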

Action at a node on page fault:

Think Globally, Act Locally

Implementation in Unix

[Figure: Page faults go to the VM Manager; page faults and file-system reads/writes go to the Unified Buffer Cache. Both the VM Manager and the Unified Buffer Cache write, read, and free pages, talking to the Free list, to Disk/NFS, and to GMS (get/put page). The Page out daemon talks to the VM Manager, the Unified Buffer Cache, and GMS. GMS allocates and frees pages on the Free list, talks to Disk/NFS, and communicates with the Remote GMS.]

GMS Integrated with DEC OSF/1

Maintaining age is tricky

Data Structures

Virtual Address -> UID (IP_Addr/disk_partition/i-node/offset)
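The UID can be pictured as a small record with the four fields listed above. The translation below is a sketch only: the `segments` table, the 8 KiB `PAGE_SIZE`, and the function name `va_to_uid` are hypothetical stand-ins for whatever the VM and file system actually provide.

```python
from typing import NamedTuple

class UID(NamedTuple):
    ip_addr: str         # node whose disk backs the page
    disk_partition: int
    inode: int
    offset: int          # page-aligned offset within the file

PAGE_SIZE = 8192  # assumed page size, for illustration only

def va_to_uid(va, segments):
    """Map a virtual address to a UID using a hypothetical table of
    (start_va, length, ip_addr, partition, inode, file_offset) entries."""
    for start, length, ip, part, inode, file_off in segments:
        if start <= va < start + length:
            page_start = (va - start) // PAGE_SIZE * PAGE_SIZE
            return UID(ip, part, inode, file_off + page_start)
    raise LookupError(f"no mapping for address {va:#x}")

segments = [(0x10000, 0x8000, "10.0.0.1", 2, 77, 0)]
uid = va_to_uid(0x12345, segments)
print(uid)
```

Because the UID is a plain tuple, it can be used directly as a dictionary key in the directories that follow.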

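3 main data structures: the Page Frame Directory (PFD), the Global Cache Directory (GCD), and the Page Ownership Directory (POD)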

Putting the Data Structures to Work

Common Case:

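[Figure: lookup path across Node A, Node B, and Node C. The virtual address (VA) is converted to a UID; the UID is looked up in the POD; the POD forwards the UID to the GCD; the GCD forwards the UID to the PFD; on a hit, the PFD replies.]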
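The lookup chain can be sketched with plain dictionaries. The layout is an assumption for illustration: `pod` maps a UID to the node holding its GCD entry, each node's `gcd` maps a UID to the node whose PFD holds the page, and each node's `pfd` maps a UID to the page itself. Returning `None` on a miss is also just a placeholder choice.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.gcd = {}   # UID -> name of the node whose PFD holds the page
        self.pfd = {}   # UID -> page contents held in this node's frames

def lookup(uid, pod, nodes):
    """Follow POD -> GCD -> PFD; return the page on a hit, None on a miss."""
    gcd_owner = nodes[pod[uid]]          # POD: which node owns the GCD entry
    holder = gcd_owner.gcd.get(uid)      # GCD: which node holds the page
    if holder is None:
        return None                      # page is not in cluster memory
    return nodes[holder].pfd.get(uid)    # PFD: the page itself (None if gone)

nodes = {name: Node(name) for name in "ABC"}
uid = ("10.0.0.1", 2, 77, 8192)
other = ("10.0.0.1", 2, 77, 16384)
pod = {uid: "B", other: "B"}
nodes["B"].gcd[uid] = "C"
nodes["C"].pfd[uid] = b"page bytes"
print(lookup(uid, pod, nodes), lookup(other, pod, nodes))
```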

What about misses?

Page Eviction

[Figure: page eviction across Node A, Node B, and Node C. From the POD, the page is put to the PFD ("Put page (UID)") and an "Update (UID, c)" is sent to the GCD.]
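The two messages in the eviction figure can be sketched as two dictionary updates. As before, the dictionary layout and the name `put_page` are hypothetical; the sketch assumes `pod` already says which node holds the GCD entry for the UID.

```python
def put_page(uid, src, dst, pod, nodes):
    """Evict the page `uid` from node `src` into the memory of node `dst`."""
    page = nodes[src]["pfd"].pop(uid)     # the frame on the evicting node is freed
    nodes[dst]["pfd"][uid] = page         # Put page (UID): dst's PFD now holds it
    nodes[pod[uid]]["gcd"][uid] = dst     # Update (UID, dst) at the GCD's node

nodes = {name: {"gcd": {}, "pfd": {}} for name in "ABC"}
uid = ("10.0.0.1", 2, 77, 8192)
pod = {uid: "B"}
nodes["A"]["pfd"][uid] = b"page bytes"
nodes["B"]["gcd"][uid] = "A"
put_page(uid, "A", "C", pod, nodes)
print(nodes)
```

After the call, a later lookup that follows the GCD entry on B is directed to C instead of A.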

Paging Daemon: