21. Raft Consensus Algorithm

What is consensus?

Eliminating the single point of failure (e.g. master node) requires either:

Replicated State Machines

State vs. replication log: replicating a log of changes instead of the full state pays off when the changes made over a reasonable interval (e.g. a day) are at least an order of magnitude smaller than the state itself.

Low level: replicate entire database state, bit-for-bit

High level: replicate at the transaction level (the most common approach)
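A minimal sketch of the replicated state machine idea, assuming a toy key-value store and a log of hypothetical ("set" / "delete") commands. None of these names come from the lecture; they only illustrate that replicas applying the same deterministic commands in the same order reach the same state.

```python
from dataclasses import dataclass, field


@dataclass
class KeyValueStateMachine:
    """Hypothetical deterministic state machine: an in-memory key-value store."""
    state: dict = field(default_factory=dict)

    def apply(self, command):
        # A command is an (operation, key, value) tuple; this shape is an assumption.
        op, key, value = command
        if op == "set":
            self.state[key] = value
        elif op == "delete":
            self.state.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")


def replay(log):
    """Build a fresh replica by applying every log entry in order."""
    machine = KeyValueStateMachine()
    for command in log:
        machine.apply(command)
    return machine.state


# The replicated log: every replica receives the same entries in the same order.
log = [("set", "x", 1), ("set", "y", 2), ("delete", "x", None), ("set", "y", 3)]
replica_a = replay(log)
replica_b = replay(log)
```

Both replicas end up with identical state, so shipping the (small) log is enough to keep them in sync without copying the whole state.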

Raft

Designed to be easy to understand

Visualization of Algorithm

Leader Election

Select one server to act as the cluster leader.

Election process:

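If multiple nodes call for an election at the same time and none of them wins a majority of votes (a split vote), the candidates time out and the election is repeated in a new term. Randomized election timeouts make repeated split votes unlikely.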
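The vote counting behind an election can be sketched as follows. This is an illustration under assumptions, not a full Raft implementation: the function names, the node ids, and the 150-300 ms timeout range are chosen for the example.

```python
import random
from collections import Counter


def majority(cluster_size):
    """Smallest number of votes that is more than half of the cluster."""
    return cluster_size // 2 + 1


def tally(votes, cluster_size):
    """votes maps voter id -> candidate id; return the winner, or None on a split vote."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    return candidate if count >= majority(cluster_size) else None


def election_timeout_ms(rng, low=150, high=300):
    """Randomized timeout so that candidates rarely start elections simultaneously."""
    return rng.uniform(low, high)


# Five-node cluster, one node unreachable: "a" and "b" both stand and split the vote.
split_vote = {"a": "a", "b": "b", "c": "a", "d": "b"}
first_round = tally(split_vote, cluster_size=5)

# After new randomized timeouts, suppose "a" times out first and asks for votes alone.
retry_vote = {"a": "a", "b": "a", "c": "a", "d": "a"}
second_round = tally(retry_vote, cluster_size=5)
```

In the first round neither candidate reaches three votes, so no leader is chosen; in the retry only one candidate stands and it wins.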

Log Replication

Safety

Only a server whose log is up to date (it contains every committed entry) can become the leader: a node refuses to vote for a candidate whose log is behind its own.
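A sketch of how a voter might decide whether a candidate's log is sufficiently up to date, assuming each log is summarized by the pair (term of last entry, index of last entry). The function name and tuple layout are assumptions for illustration; the comparison order (last term first, then last index) follows the usual description of Raft's election restriction.

```python
def candidate_is_up_to_date(candidate_last, voter_last):
    """Each argument is a (last_log_term, last_log_index) tuple.

    The candidate qualifies if its last entry has a later term, or the same
    term and an index at least as large. Python's tuple comparison checks the
    first element (term) before the second (index), which matches that rule.
    """
    return candidate_last >= voter_last


def grant_vote(candidate_last, voter_last, already_voted):
    """A voter grants at most one vote per term, and only to an up-to-date candidate."""
    return (not already_voted) and candidate_is_up_to_date(candidate_last, voter_last)


# A later last term wins even when the log is shorter.
newer_term = candidate_is_up_to_date((3, 5), (2, 9))
# Same last term but a shorter log: the candidate is behind and is rejected.
shorter_log = candidate_is_up_to_date((2, 4), (2, 5))
```

Because a committed entry is stored on a majority of nodes and a candidate needs votes from a majority, at least one voter holds every committed entry and will reject any candidate that is missing one.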

If a system becomes partitioned, a partition holding only a minority of the nodes can never commit its transactions, because it can never collect acknowledgements from a majority of nodes.

Hence, when the partition heals, the uncommitted entries in the minority partition are overwritten by the log of the leader from the majority side.

If the partitioning leaves no partition with more than half of the nodes, nothing is committed anywhere until connectivity is restored.
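The partition behaviour can be sketched with a quorum check and a simplified log reconciliation. This is a toy model under stated assumptions: log entries are (term, command) tuples, and `has_quorum` and `reconcile` are hypothetical helper names, not part of any Raft library.

```python
def has_quorum(partition_size, cluster_size):
    """A partition can commit only if it holds strictly more than half of the nodes."""
    return partition_size > cluster_size // 2


def reconcile(follower_log, leader_log):
    """Keep the prefix where terms agree, then adopt the leader's remaining entries.

    Entries are (term, command) tuples. Anything the follower holds after the
    first mismatch was never committed and is discarded.
    """
    match = 0
    while (match < min(len(follower_log), len(leader_log))
           and follower_log[match][0] == leader_log[match][0]):
        match += 1
    return follower_log[:match] + leader_log[match:]


# Five nodes split 3 / 2: only the three-node side can commit.
majority_side = has_quorum(3, cluster_size=5)
minority_side = has_quorum(2, cluster_size=5)

# Four nodes split 2 / 2: neither side has more than half, so nothing commits.
even_split = has_quorum(2, cluster_size=4)

# The old leader, stuck in the minority, accepted an entry in term 1 that never committed.
minority_log = [(1, "set x=1"), (1, "set x=99")]
majority_log = [(1, "set x=1"), (2, "set x=2"), (2, "set y=3")]
healed_log = reconcile(minority_log, majority_log)
```

After the partition heals, the minority node's uncommitted entry is gone and its log matches the majority leader's log.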