Byzantine Fault Tolerance
Required Readings
Optional
- Blockchain in the Lens of BFT
- Byzantine General’s Problem
- Mapping the Blockchain Project Ecosystem
- The Saddest Moment
Summary
- Byzantine Failure and Byzantine Generals
- Byzantine Fault Tolerance
- Practical Byzantine Fault Tolerance
- pBFT Algorithm
- Byzantine Consensus vs Blockchain
Byzantine Failure and Byzantine Generals
Byzantine failures are where a node fails and continues to behave incorrectly
Could be malicious or arbitrary
Byzantine Fault Tolerance
Goal:
- Reach consensus
- Safety, liveness, validity
- Tolerate up to
f
failures - Asynchronous network
Use cryptographic messages to verify message endpoints and to verify the message has not been tampered with.
To tolerate f
failures, we need 3f + 1
nodes
To guard against corrupt leader, we need additional checks among participants
To handle [FLP Theorem](/posts/consensusindistributedsystems/
have bounded delay
(or eventual synchrony) for liveness
Practical Byzantine Fault Tolerance
pBFT
algorithm was introduced by Miguel Castro and Barbara Liskov in 1999
High performance algorithm
Setup:
- System is a replicated service
n = 3f + 1
- Primary/Leader and backups
- All operations are within a
view
- Each replica maintains the service state
- Must be consistent
- State includes:
- Service State
- Message Log
- Current View
- Communication integrity guaranteed cryptographically
- Digests
- Public keys
Quorum = n - f
nodes
pBFT Algorithm
Client request -> pre-prepare -> prepare -> commit -> reply
PrePrepare
Primary node multicasts pre-prepare
piggybacked with message to backups
The message is stored in the log
pre-prepare
request includes
- sequence number
- digest of message
- signature
Backups accept pre-prepare request if:
- Signature and digest are correct
- View is correct
- Sequence number is new
- Sequence number is between two watermarks
Then replica enters prepare phase
Prepare
Multicasts prepare message
and adds the message to log, then waits for 2f
matching prepares which have same view, sequence number, and digest. Then it
moves to commit stage.
Commit
Replica multicasts a commit message
and adds the message to log, waits for
2f +1
commit messages, then sends reply to client.
Byzantine Consensus vs Blockchain
Blockchain’s distributed ledger is similar to Paxos’s replicated log
Can we build Blockchain network with just pBFT?
Pros | Cons |
---|---|
Decentralized consensus | Tolerate f faults in N=3f+1 ? |
Byzantine failures | Communication costs O(n^3) |
Unreliable network |
Not everyone is able to participate in maintaining the ledger
- Only the miners that have Proof-of-Work
There are also incentives for good behavior