Synchronization

Stephen M. Reaves :: 2024-02-01

Notes about Lecture 4b for CS-6210

Summary

Primitives

A lock ensures that only one thread at a time accesses a shared piece of data.

Barrier synchronization halts every thread until all of them reach a particular point in execution.

R/W locks allow one writer xor multiple readers: several threads may read concurrently, but a writer needs exclusive access.

Atomic Operations

We need to be able to read-modify-write atomically (e.g., test_and_set, fetch_and_inc) to implement locks correctly.
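
As a minimal sketch of those semantics, here is what the two primitives used throughout these notes do, written as ordinary C; real hardware executes each body as one indivisible instruction (with 1 standing for locked):

// test_and_set: write "locked" and return the previous value, atomically
int test_and_set(int *L) {
  int old = *L;
  *L = 1;          // 1 == locked
  return old;
}

// fetch_and_inc: increment and return the previous value, atomically
int fetch_and_inc(int *i) {
  int old = *i;
  *i = old + 1;
  return old;
}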

Scalability Issues

Latency: how long it takes a thread to acquire the lock when it is free

Waiting Time: how long a thread waits between the lock being released and acquiring it

Contention: how much memory/interconnect traffic competing threads generate while the lock is handed off

Spinlock

Naive Spinlock

Spin on Test and Set

L can either be locked or unlocked

while (test_and_set(L) == locked);
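
A runnable version of the naive spinlock using C11 atomics, where atomic_flag_test_and_set plays the role of test_and_set (a set flag meaning locked):

#include <stdatomic.h>

atomic_flag L = ATOMIC_FLAG_INIT;     // clear == unlocked

void spin_lock(void) {
  // returns the previous value: true means it was already locked
  while (atomic_flag_test_and_set(&L))
    ;
}

void spin_unlock(void) {
  atomic_flag_clear(&L);
}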

Caching Spinlock

Spin on read

In cache-coherent systems, the hardware propagates memory writes to every cache, so waiting threads can spin on their locally cached copy of the lock instead of going to memory on every iteration.

while (L == locked);                       // spin on the cached copy
if (test_and_set(L) == locked) go_back();  // lost the race: start over
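
The same spin-on-read idea as a runnable C11 sketch, with atomic_exchange standing in for test_and_set:

#include <stdatomic.h>

atomic_int L = 0;                  // 0 == unlocked, 1 == locked

void spin_lock(atomic_int *l) {
  for (;;) {
    while (atomic_load(l) == 1)
      ;                            // spin on read (stays in local cache)
    if (atomic_exchange(l, 1) == 0)
      return;                      // lock looked free and we grabbed it
    // someone beat us to it ("go_back"): resume spinning
  }
}

void spin_unlock(atomic_int *l) {
  atomic_store(l, 0);
}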

In a write-invalidate cache-coherence system, a single lock release triggers O(n*n) bus operations: the release invalidates all n cached copies, every waiter re-reads the lock, and each of their test_and_set attempts invalidates the other caches again.

Spinlocks With Delay

while (L == locked || test_and_set(L) == locked) {
  sleep(rand() % 5); // random delay; the bound is arbitrary
}

With exponential delay, each failed attempt lengthens the wait, so traffic falls off as contention rises.
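
A sketch of the delay-with-backoff variant in C11, assuming a hypothetical busy-wait helper delay() in place of sleep() (atomic_exchange again stands in for test_and_set):

#include <stdatomic.h>
#include <stdlib.h>

static void delay(int n) {          // hypothetical helper: busy-wait ~n iterations
  for (volatile int i = 0; i < n; i++)
    ;
}

void backoff_lock(atomic_int *L) {
  int d = 1;                        // current backoff window
  while (atomic_load(L) == 1 || atomic_exchange(L, 1) == 1) {
    delay(rand() % d + 1);          // wait a random slice of the window
    if (d < 1024)
      d *= 2;                       // double the window each failure, capped
  }
}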

Ticket Lock

More concerned with fairness: requests are served in arrival order

Assign tickets like the take-a-number system at a restaurant counter

typedef struct lock {
  int next_ticket;
  int now_serving;
} lock;

void acquire_lock(lock *L) {
  int my_ticket = fetch_and_inc(&L->next_ticket);

  while (L->now_serving != my_ticket) {
    sleep(my_ticket - L->now_serving); // back off in proportion to queue position
  }
}

void release_lock(lock *L) {
  L->now_serving++;
}
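
A runnable C11 version of the same ticket lock, with atomic_fetch_add playing the role of fetch_and_inc (a sketch, not the lecture's code):

#include <stdatomic.h>

typedef struct {
  atomic_int next_ticket;
  atomic_int now_serving;
} ticket_lock;                       // zero-initialize before use

void ticket_acquire(ticket_lock *L) {
  int my_ticket = atomic_fetch_add(&L->next_ticket, 1);
  while (atomic_load(&L->now_serving) != my_ticket)
    ;                                // could also back off proportionally here
}

void ticket_release(ticket_lock *L) {
  atomic_fetch_add(&L->now_serving, 1);
}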

Spinlock Summary

Ideally we would signal exactly one waiting thread on release instead of having every thread fight for the lock at once.

Queuing Lock

Array Based Queuing Lock

Associated with each lock is an array of N flags (a circular queue / ring buffer), one slot per contending thread; each slot reads either has_lock or must_wait.

void lock(L *l, int *my_place) {
  *my_place = fetch_and_inc(&l->queue_last);   // remember our slot for unlock
  while (l->flags[*my_place % N] == must_wait)
    ;                                          // spin only on our own slot
}

void unlock(L *l, int my_place) {
  l->flags[my_place % N] = must_wait;          // reset our slot for reuse
  l->flags[(my_place + 1) % N] = has_lock;     // pass the lock to the next slot
}

Only one atomic operation per critical section

First come, first served

No contention while waiting: each thread spins on its own distinct slot (my_place)

Needs O(N) memory per lock, even when few threads actually contend
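
Putting the pieces together, a runnable C11 sketch of the array-based lock; the name array_lock, the fixed bound N, and the 0/1 flag values are illustrative choices, with atomic_fetch_add standing in for fetch_and_inc:

#include <stdatomic.h>

#define N 64                            /* max simultaneous contenders */

typedef struct {
  atomic_int flags[N];                  /* 1 == has_lock, 0 == must_wait */
  atomic_int queue_last;
} array_lock;

void array_lock_init(array_lock *l) {
  for (int i = 0; i < N; i++)
    atomic_store(&l->flags[i], 0);
  atomic_store(&l->flags[0], 1);        /* first arrival gets the lock */
  atomic_store(&l->queue_last, 0);
}

/* returns my_place; the caller hands it back to release */
int array_lock_acquire(array_lock *l) {
  int my_place = atomic_fetch_add(&l->queue_last, 1);
  while (atomic_load(&l->flags[my_place % N]) == 0)
    ;                                   /* spin on my own slot only */
  return my_place;
}

void array_lock_release(array_lock *l, int my_place) {
  atomic_store(&l->flags[my_place % N], 0);        /* reset my slot */
  atomic_store(&l->flags[(my_place + 1) % N], 1);  /* pass to successor */
}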

Linked List Based Queuing Lock

For each lock, every requester contributes a queue node:

struct node {
  bool got_it;
  struct node *next;
};

Head is a dummy node; it points to nil when no one holds the lock.

The dummy node always points to the last requester. The currently running node points to its successor, which may or may not be the last requester.

Waiting nodes spin on their own got_it flag. When a node is done, it sets next->got_it = true.

// How to make this atomic? Two joiners can race on last_requester;
// the fix is fetch_and_store (below)
void join(L *l, node *me) {
  l->last_requester->next = me;
  l->last_requester = me;
}

/*
 * fetch_and_store: atomically store `me` into l and return what
 * was stored there before.
 *
 * node *fetch_and_store(node **l, node *me) {
 *   node *ret = *l;
 *   *l = me;
 *   return ret;
 * }
 */

void lock(L *l, node *me) {
  join(l, me);
  while (me->got_it == false)
    ;                          // spin on my own flag
}

// The empty-successor check races with a concurrent join,
// so it must be resolved with comp_and_swap
void unlock(L *l, node *curr) {
  if (curr->next == NULL) {
    // atomic: reset the queue only if nobody has joined behind us
    if (comp_and_swap(&l->last_requester, curr, NULL)) {
      free(curr);
      return;
    }
    // CAS failed: a joiner swapped the tail but hasn't set next yet
    while (curr->next == NULL)
      ;
  }
  curr->next->got_it = true;   // hand the lock to our successor
  free(curr);
}

Fair lock: first come, first served

Each waiter spins on a unique location

Mostly one atomic operation per critical section (a second, the comp_and_swap, only on the empty-queue path)

Space complexity is proportional to the number of threads actively waiting on the lock

Linked-list maintenance can be a chore
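
For reference, a runnable C11 sketch in the standard MCS formulation, which drops the dummy head and lets a NULL tail mean the lock is free (a slight variation on the picture above; all names are illustrative, and nodes are caller-provided rather than freed by unlock):

#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
  _Atomic bool got_it;
  struct node *_Atomic next;
} node;

typedef struct {
  node *_Atomic last_requester;    /* NULL when the lock is free */
} mcs_lock;

void mcs_acquire(mcs_lock *l, node *me) {
  atomic_store(&me->got_it, false);
  atomic_store(&me->next, NULL);
  /* join: atomically become the tail, remembering the old tail */
  node *prev = atomic_exchange(&l->last_requester, me);
  if (prev == NULL)
    return;                        /* queue was empty: lock is ours */
  atomic_store(&prev->next, me);   /* link the old tail to us */
  while (!atomic_load(&me->got_it))
    ;                              /* spin on our own flag */
}

void mcs_release(mcs_lock *l, node *me) {
  node *succ = atomic_load(&me->next);
  if (succ == NULL) {
    node *expected = me;
    /* if we are still the tail, reset the lock to empty */
    if (atomic_compare_exchange_strong(&l->last_requester, &expected, NULL))
      return;
    /* a joiner swapped the tail but hasn't linked yet: wait for it */
    while ((succ = atomic_load(&me->next)) == NULL)
      ;
  }
  atomic_store(&succ->got_it, true);   /* hand the lock to the successor */
}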