Replica Sets: The Heart of Availability

In a production environment, running a standalone MongoDB instance is a recipe for disaster. If the server crashes, your application goes down. To solve this, MongoDB uses Replica Sets.

1. What is a Replica Set?

A Replica Set is a group of mongod processes that maintain the same data set. It provides redundancy (data copies) and high availability (automatic failover).

The Architecture

A standard Replica Set consists of:

  • One Primary: The only node that accepts Write operations. It records all changes in its Oplog.
  • Multiple Secondaries: These nodes replicate the Primary’s oplog and apply the operations to their own data set. They can serve Read requests if configured.
  • Arbiters (Optional): Nodes that do not hold data. They participate in elections to ensure a quorum but cannot become Primary.
[Diagram: the application sends reads and writes to the Primary; two Secondaries replicate the oplog from it.]

2. The Oplog: How Replication Works

The core of replication is the Oplog (operations log, local.oplog.rs). It is a capped collection (fixed size) that records every operation that modifies data.

  1. Client sends a write (e.g., insert, update, delete) to the Primary.
  2. Primary applies the write to its data and records the operation in the oplog.
  3. Secondaries tail the Primary’s oplog and apply the operations to their own datasets.

[!NOTE] Oplog entries are idempotent: applying the same entry multiple times produces the same state. For example, an $inc (increment) operation is not stored as a relative change; it is recorded as a $set of the resulting value, so it can be replayed safely.
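The idempotency guarantee above can be illustrated with a toy in-memory sketch (the entry struct and field names are hypothetical, not the real oplog format):

```go
package main

import "fmt"

// OplogEntry is a toy stand-in: instead of recording the relative
// change ($inc by 1), replication records the absolute result ($set to 5).
type OplogEntry struct {
	Field string
	Value int // absolute value to set, never a delta
}

// apply replays an entry onto a document. Setting an absolute value is
// idempotent; incrementing would not be.
func apply(doc map[string]int, e OplogEntry) {
	doc[e.Field] = e.Value
}

func main() {
	doc := map[string]int{"n": 4}
	// The Primary computed $inc(n) and logged the result: n = 5.
	entry := OplogEntry{Field: "n", Value: 5}

	apply(doc, entry)
	apply(doc, entry) // replaying the same entry changes nothing

	fmt.Println(doc["n"]) // 5, not 6
}
```

If the oplog stored the raw increment instead, a replayed entry would bump the counter twice, which is exactly what idempotency rules out.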

Oplog Window

The Oplog Window is the time difference between the oldest and newest entry in the oplog.

  • If your oplog is 50GB and you write 1GB/hour, your window is 50 hours.
  • Danger: If a Secondary goes offline for longer than the oplog window, it falls off the oplog and must perform a full Initial Sync (copying all data from scratch), which is expensive.
[Diagram: oplog timeline from the oldest entry (T-50h) to the newest entry (now); entries older than the active window are deleted.]
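The back-of-the-envelope calculation above can be sketched as a small helper (a hypothetical utility, not part of any driver; real windows vary with workload, since the oplog stores operations, not raw gigabytes at a constant rate):

```go
package main

import "fmt"

// oplogWindowHours estimates how long a Secondary can stay offline before
// it falls off the oplog: capacity divided by the average write rate.
func oplogWindowHours(oplogSizeGB, writeRateGBPerHour float64) float64 {
	return oplogSizeGB / writeRateGBPerHour
}

func main() {
	// 50 GB oplog at 1 GB/hour of writes → roughly 50 hours of headroom.
	fmt.Printf("window: %.0f hours\n", oplogWindowHours(50, 1))
}
```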

3. Automatic Failover & Elections

Replica sets use an election protocol (based on Raft) to ensure a Primary is always available.

  • Heartbeats: Members ping each other every 2 seconds.
  • Timeout: If a Primary doesn’t respond for 10 seconds, Secondaries mark it as unreachable.
  • Election: Eligible Secondaries nominate themselves. The node with the most up-to-date oplog usually wins.
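An election only succeeds with a strict majority of voting members, which is why arbiters matter in even-sized sets. A minimal sketch of the quorum arithmetic (helper names are my own):

```go
package main

import "fmt"

// votesNeeded returns the strict majority required to win an election.
func votesNeeded(votingMembers int) int {
	return votingMembers/2 + 1
}

func main() {
	for _, n := range []int{3, 4, 5} {
		need := votesNeeded(n)
		fmt.Printf("%d voting members: need %d votes, tolerate %d failures\n",
			n, need, n-need)
	}
}
```

Note that a 4-member set needs 3 votes and so tolerates only 1 failure, the same as a 3-member set. This is why an odd number of voters, or a 2-data-node set plus an arbiter, is the usual recommendation.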


4. Connecting to a Replica Set

When connecting, list several seed nodes in the connection string (along with the replicaSet option). The driver uses them to discover the topology and automatically route operations to the current Primary.

Read Preferences

You can offload read traffic to Secondaries using readPreference.

  • primary: (Default) Reads only from Primary. Strong consistency.
  • primaryPreferred: Reads from Primary, but fails over to Secondary if Primary is down.
  • secondary: Reads only from Secondaries. Eventual consistency.
  • secondaryPreferred: Reads from Secondary, but fails over to Primary if no Secondaries are available.
  • nearest: Reads from the node with the lowest network latency.

In Java, using the synchronous driver:

import com.mongodb.ReadPreference;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoDatabase;
import com.mongodb.MongoClientSettings;
import com.mongodb.ServerAddress;
import java.util.Arrays;

public class ReplicaSetConnect {
    public static void main(String[] args) {
        // Connection String with multiple seeds
        String uri = "mongodb://node1:27017,node2:27017,node3:27017/?replicaSet=myReplSet";

        // Option 1: Using URI
        try (MongoClient mongoClient = MongoClients.create(uri)) {
             MongoDatabase db = mongoClient.getDatabase("mydb");
             System.out.println("Connected to: " + db.getName());
        }

        // Option 2: Advanced Settings for Read Preference
        MongoClientSettings settings = MongoClientSettings.builder()
            .applyToClusterSettings(builder ->
                builder.hosts(Arrays.asList(
                    new ServerAddress("node1", 27017),
                    new ServerAddress("node2", 27017),
                    new ServerAddress("node3", 27017))))
            // Prefer reading from Secondaries for analytics
            .readPreference(ReadPreference.secondaryPreferred())
            .build();

        try (MongoClient client = MongoClients.create(settings)) {
            // This read is routed to a secondary when one is available
            long count = client.getDatabase("analytics").getCollection("logs").countDocuments();
            System.out.println("Log entries: " + count);
        }
    }
}
The same connection in Go, using the official driver:

package main

import (
    "context"
    "fmt"
    "log"
    "time"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
    "go.mongodb.org/mongo-driver/mongo/readpref"
)

func main() {
    // 1. Basic Connection String
    uri := "mongodb://node1:27017,node2:27017,node3:27017/?replicaSet=myReplSet"

    // 2. Configure Client with Read Preference
    // ModeSecondaryPreferred: Read from secondary if available, else primary
    rp, err := readpref.New(readpref.SecondaryPreferred)
    if err != nil {
        log.Fatal(err)
    }

    clientOpts := options.Client().
        ApplyURI(uri).
        SetReadPreference(rp)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    client, err := mongo.Connect(ctx, clientOpts)
    if err != nil {
        log.Fatal(err)
    }
    defer client.Disconnect(ctx)

    // Verify connection
    err = client.Ping(ctx, rp)
    if err != nil {
        log.Fatal("Could not ping cluster:", err)
    }

    fmt.Println("Connected to Replica Set with SecondaryPreferred!")
}

5. Summary

  • Replica Sets provide redundancy and automatic failover.
  • The Primary handles all writes; Secondaries replicate via the Oplog.
  • Elections happen automatically when the Primary fails.
  • Use Read Preferences to scale reads, but beware of stale data (Eventual Consistency).