Byzantine Fault Tolerance

Stephen M. Reaves

2023-03-10

Notes about Lesson 16 of CS-7210

Required Readings

Practical Byzantine Fault Tolerance

Optional

Summary

Byzantine Failure and Byzantine Generals
Byzantine Fault Tolerance
Practical Byzantine Fault Tolerance
pBFT Algorithm
Byzantine Consensus vs Blockchain

Byzantine Failure and Byzantine Generals

Byzantine failures are where a node fails and continues to behave incorrectly

Could be malicious or arbitrary

Byzantine Fault Tolerance

Goal:

Reach consensus
- Safety, liveness, validity
Tolerate up to f failures
Asynchronous network

Use cryptographic messages to verify message endpoints and to verify the message has not been tampered with.

To tolerate f failures, we need 3f + 1 nodes

To guard against corrupt leader, we need additional checks among participants

To handle [FLP Theorem](/posts/consensusindistributedsystems/ have bounded delay (or eventual synchrony) for liveness

Practical Byzantine Fault Tolerance

pBFT algorithm was introduced by Miguel Castro and Barbara Liskov in 1999

High performance algorithm

Setup:

System is a replicated service
- n = 3f + 1
- Primary/Leader and backups
All operations are within a view
Each replica maintains the service state
- Must be consistent
- State includes:
  - Service State
  - Message Log
  - Current View
Communication integrity guaranteed cryptographically
- Digests
- Public keys

Quorum = n - f nodes

pBFT Algorithm

Client request -> pre-prepare -> prepare -> commit -> reply

PrePrepare

Primary node multicasts pre-prepare piggybacked with message to backups

The message is stored in the log

pre-prepare request includes

sequence number
digest of message
signature

Backups accept pre-prepare request if:

Signature and digest are correct
View is correct
Sequence number is new
Sequence number is between two watermarks

Then replica enters prepare phase

Prepare

Multicasts prepare message and adds the message to log, then waits for 2f matching prepares which have same view, sequence number, and digest. Then it moves to commit stage.

Commit

Replica multicasts a commit message and adds the message to log, waits for 2f +1 commit messages, then sends reply to client.

Byzantine Consensus vs Blockchain

Blockchain’s distributed ledger is similar to Paxos’s replicated log

Can we build Blockchain network with just pBFT?

Pros	Cons
Decentralized consensus	Tolerate `f` faults in `N=3f+1`?
Byzantine failures	Communication costs `O(n^3)`
Unreliable network

Not everyone is able to participate in maintaining the ledger

Only the miners that have Proof-of-Work

There are also incentives for good behavior