Table of Contents
Introduction
What Is Distributed Consensus and Why Raft?
Real-World Scenario: Coordinating Drone Swarms for Emergency Response
Core Concepts of Raft (Simplified)
A Complete Python Simulation
Running the Simulation
Best Practices for Real Systems
Conclusion
Introduction
In a distributed system—whether it’s a database cluster, a blockchain, or a fleet of drones—nodes must agree on a single truth. This is distributed consensus, and it’s one of the hardest problems in computer science.
The Raft protocol, designed for understandability, solves this by electing a leader and replicating log entries safely. In this article, you’ll build a working Raft simulation in pure Python—inspired by a real-life use case: coordinating drone swarms during disaster relief.
What Is Distributed Consensus and Why Raft?
Imagine three servers managing user accounts. If two say “Alice has $100” and one says “$200,” which is correct? Consensus ensures all nodes agree on the same state.
Raft achieves this through:
Leader election: One node becomes the leader
Log replication: The leader appends commands to logs and replicates them
Safety: Only up-to-date nodes can become leaders
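To make these mechanisms concrete, here is a minimal sketch of the two RPC message types Raft nodes exchange (field names follow Figure 2 of the Raft paper; the simulation later in this article simplifies them away):

from dataclasses import dataclass, field
from typing import List

@dataclass
class RequestVote:
    # Sent by a candidate to gather votes during an election
    term: int            # candidate's term
    candidate_id: int    # candidate requesting the vote
    last_log_index: int  # index of the candidate's last log entry
    last_log_term: int   # term of the candidate's last log entry

@dataclass
class AppendEntries:
    # Sent by the leader to replicate entries; an empty entries list is a heartbeat
    term: int            # leader's term
    leader_id: int       # lets followers redirect clients to the leader
    prev_log_index: int  # index of the log entry immediately preceding the new ones
    prev_log_term: int   # term of that preceding entry
    entries: List[str] = field(default_factory=list)  # commands to replicate
    leader_commit: int = 0  # leader's commit index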
Unlike Paxos, Raft is designed to be teachable and implementable—making it perfect for learning and lightweight systems.
Real-World Scenario: Coordinating Drone Swarms for Emergency Response
During a wildfire, a rescue team deploys 5 drones to map the fire perimeter. Each drone must agree on:
The latest safe evacuation route
Which zones are fully burned
Where survivors were spotted
If drones disagree, rescuers could be sent into danger.
Using Raft:
One drone becomes the leader (e.g., the one with the best signal)
All route updates are logged and replicated
If the leader crashes (e.g., smoke interference), a new leader is elected in seconds
Consensus ensures all drones act on the same map
This isn’t purely theoretical: companies like Zipline and Wing operate autonomous fleets that face exactly this coordination problem.
Core Concepts of Raft (Simplified)
We model three node states:
Follower: Waits for heartbeats from the leader
Candidate: Requests votes to become leader
Leader: Accepts client commands and replicates logs
Key rules:
Each node has a term number (like an election round)
Leaders send heartbeats to prevent new elections
A node grants its vote only if the candidate’s log is at least as up-to-date as its own (sketched below)
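That vote-granting rule has a precise definition in the Raft paper (section 5.4.1), which our simplified simulation skips. A minimal sketch of the check a follower would run:

def log_is_up_to_date(candidate_last_term: int, candidate_last_index: int,
                      my_last_term: int, my_last_index: int) -> bool:
    # Compare the terms of the last log entries; if they are equal,
    # the longer log is the more up-to-date one
    if candidate_last_term != my_last_term:
        return candidate_last_term > my_last_term
    return candidate_last_index >= my_last_index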
Our simulation focuses on leader election and log replication—the heart of Raft.
A Complete Python Simulation
import random
import time
from enum import Enum
from typing import List, Optional
class State(Enum):
FOLLOWER = 1
CANDIDATE = 2
LEADER = 3
class RaftNode:
def __init__(self, node_id: int, all_nodes: List[int]):
self.id = node_id
self.nodes = all_nodes
self.state = State.FOLLOWER
self.current_term = 0
self.voted_for: Optional[int] = None
self.log: List[str] = []
self.commit_index = 0
self.last_heartbeat = time.time()
self.election_timeout = self._random_timeout()
def _random_timeout(self) -> float:
return time.time() + random.uniform(1.0, 2.0)
    def on_heartbeat(self, term: int):
        # Ignore heartbeats from stale leaders (lower term)
        if term >= self.current_term:
            if term > self.current_term:
                # A new term begins: forget any vote cast in the old term
                self.voted_for = None
            self.current_term = term
            self.state = State.FOLLOWER
            self.last_heartbeat = time.time()
            self.election_timeout = self._random_timeout()
    def start_election(self):
        self.current_term += 1
        self.state = State.CANDIDATE
        self.voted_for = self.id
        self.election_timeout = self._random_timeout()  # restart the election timer
        votes = 1  # vote for self
        # Simulate requesting votes from the other nodes
        for node_id in self.nodes:
            if node_id == self.id:
                continue
            # In real Raft we'd send a RequestVote RPC, and each peer would
            # grant its single vote per term only if our log is at least as
            # up-to-date as its own. Simplified here: every peer grants the
            # vote, so simultaneous candidates can all win the same term;
            # the next heartbeat round demotes the extras.
            votes += 1
        if votes > len(self.nodes) // 2:  # strict majority
            self.state = State.LEADER
            print(f"Node {self.id} elected leader in term {self.current_term}")
    def append_entry(self, entry: str):
        if self.state == State.LEADER:
            self.log.append(entry)
            print(f"Leader {self.id} appended: {entry}")
            # In a real system the leader would replicate the entry and only
            # advance commit_index after a majority of followers acknowledge it
            self.commit_index = len(self.log) - 1
    def tick(self):
        now = time.time()
        if self.state == State.LEADER:
            # Heartbeats are driven by the simulation loop in simulate_raft()
            pass
        elif now > self.election_timeout:
            # No heartbeat arrived before the randomized deadline: stand for election
            self.start_election()
def simulate_raft():
node_ids = [1, 2, 3]
nodes = [RaftNode(i, node_ids) for i in node_ids]
# Simulate time steps
for step in range(20):
time.sleep(0.5)
print(f"\n--- Step {step + 1} ---")
        # The current leader broadcasts a heartbeat to every other node. If
        # the simplified election produced several leaders, one is picked at
        # random and its heartbeat demotes the rivals back to follower.
        leaders = [n for n in nodes if n.state == State.LEADER]
        if leaders:
            leader = random.choice(leaders)
            for node in nodes:
                if node.id != leader.id:
                    node.on_heartbeat(leader.current_term)
            # The leader appends a new command every five steps
            if step % 5 == 0:
                leader.append_entry(f"command-{step}")
# Each node processes its state
for node in nodes:
node.tick()
# Print status
for node in nodes:
print(f"Node {node.id}: {node.state.name} | Term {node.current_term} | Log len {len(node.log)}")
if __name__ == "__main__":
print(" Simulating Raft Consensus for Drone Swarm Coordination\n")
simulate_raft()
Running the Simulation
Run the script directly (e.g., save it as raft.py and run python raft.py). Election timeouts are randomized, so your output will vary; a typical run looks like this:
Simulating Raft Consensus for Drone Swarm Coordination
--- Step 1 ---
Node 1: FOLLOWER | Term 0 | Log len 0
Node 2: FOLLOWER | Term 0 | Log len 0
Node 3: FOLLOWER | Term 0 | Log len 0
--- Step 2 ---
Node 1: FOLLOWER | Term 0 | Log len 0
Node 2: FOLLOWER | Term 0 | Log len 0
Node 3: FOLLOWER | Term 0 | Log len 0
--- Step 3 ---
Node 1: FOLLOWER | Term 0 | Log len 0
Node 2: FOLLOWER | Term 0 | Log len 0
Node 3: FOLLOWER | Term 0 | Log len 0
--- Step 4 ---
Node 1 elected leader in term 1
Node 2 elected leader in term 1
Node 3 elected leader in term 1
Node 1: LEADER | Term 1 | Log len 0
Node 2: LEADER | Term 1 | Log len 0
Node 3: LEADER | Term 1 | Log len 0
--- Step 5 ---
Node 1: LEADER | Term 1 | Log len 0
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 6 ---
Leader 1 appended: command-5
Node 1: LEADER | Term 1 | Log len 1
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 7 ---
Node 1: LEADER | Term 1 | Log len 1
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 8 ---
Node 1: LEADER | Term 1 | Log len 1
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 9 ---
Node 1: LEADER | Term 1 | Log len 1
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 10 ---
Node 1: LEADER | Term 1 | Log len 1
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 11 ---
Leader 1 appended: command-10
Node 1: LEADER | Term 1 | Log len 2
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 12 ---
Node 1: LEADER | Term 1 | Log len 2
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 13 ---
Node 1: LEADER | Term 1 | Log len 2
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 14 ---
Node 1: LEADER | Term 1 | Log len 2
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 15 ---
Node 1: LEADER | Term 1 | Log len 2
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 16 ---
Leader 1 appended: command-15
Node 1: LEADER | Term 1 | Log len 3
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 17 ---
Node 1: LEADER | Term 1 | Log len 3
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 18 ---
Node 1: LEADER | Term 1 | Log len 3
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 19 ---
Node 1: LEADER | Term 1 | Log len 3
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
--- Step 20 ---
Node 1: LEADER | Term 1 | Log len 3
Node 2: FOLLOWER | Term 1 | Log len 0
Node 3: FOLLOWER | Term 1 | Log len 0
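One artifact worth noticing: at step 4 all three nodes win the election, because our simplified start_election grants every vote request. Real Raft’s one-vote-per-term rule makes this impossible; in the simulation, the next heartbeat round (step 5) demotes the extra leaders back to follower.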
Best Practices for Real Systems
Use RPCs: Replace simulation with gRPC or HTTP for real communication
Persist logs: Write logs to disk to survive crashes (see the sketch after this list)
Handle network partitions: Use quorum writes (majority must ack)
Add safety checks: Ensure log consistency before granting votes
Monitor leadership: Alert if elections happen too often (sign of instability)
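The persistence point deserves emphasis: a node that forgets its log, term, or vote after a restart can break Raft’s safety guarantees. Here is a minimal sketch of a durable append, assuming a simple JSON-lines file (real implementations use checksummed, segmented log files):

import json
import os

def append_durable(path: str, term: int, entry: str) -> None:
    # Append one log record and fsync so it survives a crash or power loss
    record = json.dumps({"term": term, "entry": entry})
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")
        f.flush()             # flush Python's buffer to the OS
        os.fsync(f.fileno())  # force the OS to write to stable storage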
For production, consider proven implementations such as etcd (which uses Raft) or hashicorp/raft.
Conclusion
Distributed consensus sounds complex, but Raft makes it understandable and implementable. Whether you’re building a database, a blockchain, or a drone swarm, the principles remain the same: elect a leader, replicate safely, and recover gracefully.
This simulation gives you the foundation. From here you can explore real Raft implementations, contribute to open-source projects, or design your own fault-tolerant system.