Consistency Models
Imagine you are building a high-traffic ticketing system. A user sees 1 ticket remaining and clicks “Buy”. A millisecond later, another user sees the same 1 ticket remaining and also clicks “Buy”. If your database isn’t perfectly synchronized across all its servers, you’ve just double-sold a seat.
In a single-server database (like SQLite or a standard PostgreSQL instance), consistency is straightforward. If you write Tickets=0, the immediate next read sees Tickets=0.
However, at DynamoDB’s scale, data isn’t stored on just one machine. To ensure high availability and durability, DynamoDB automatically replicates your data across multiple Availability Zones (AZs) within a Region, typically as three replicas. Each AZ consists of one or more physically separate data centers. Copying a write between replicas takes time: signals cannot travel faster than the speed of light, and each replica must also persist the change. The resulting delay is known as Replication Lag.
Because of this lag, every read operation involves a choice between speed (Eventual Consistency, the default) and freshness (Strong Consistency).
1. The CAP Theorem in Practice
The CAP Theorem states that a distributed data store cannot guarantee all three of Consistency, Availability, and Partition Tolerance at once: when a network partition occurs, it must give up either Consistency or Availability. Since network partitions (P) are a given in the cloud, DynamoDB reads behave like an AP system by default (Available & Partition Tolerant). However, you can shift to CP behavior (Consistent & Partition Tolerant) on a per-read basis.
🧠 Analogy: The Restaurant Kitchen
Think of DynamoDB as a massive restaurant with three kitchens (Availability Zones). When an order comes in, the Head Chef (Leader node) writes it down and immediately starts cooking. The Head Chef then yells the order to the two Sous Chefs (Replica nodes). If a waiter asks a Sous Chef what’s cooking before that Sous Chef has heard the order, the waiter might get stale information.
Eventual Consistency (The Default)
- Behavior: When you write, DynamoDB acknowledges success once the Leader node and at least one other replica (a quorum of two out of three) have durably persisted the write. The remaining replica catches up asynchronously; AWS documents that all copies usually converge within a second or less.
- Risk: If you read from a replica node immediately after writing, you might see old data. This is called a “stale read”.
- Benefit: Fastest read latency (often single-digit milliseconds), highest availability (can read from any healthy node), and lowest cost (0.5 Read Capacity Units (RCU) per 4KB).
- Use Case: Social media feeds, comment sections, video view counts, or any system where a slight delay in data synchronization is acceptable.
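The stale-read risk described above can be illustrated with a small simulation. This is a minimal sketch using a logical clock rather than real networking; the `replica` and `write` types, the node names, and the 15 ms lag are assumptions chosen for illustration and do not reflect DynamoDB internals.

```go
package main

import "fmt"

// replica is a hypothetical stand-in for one storage node. applyLagMs is how
// long after a write is accepted this node takes to apply it.
type replica struct {
	name       string
	applyLagMs int
}

// write records a single versioned update accepted at logical time atMs.
type write struct {
	version int
	atMs    int
}

// visibleVersion returns the version a read at time nowMs would see on r:
// the newest write whose replication delay has already elapsed.
func visibleVersion(r replica, writes []write, nowMs int) int {
	seen := 0 // version 0 means "no write applied yet"
	for _, w := range writes {
		if w.atMs+r.applyLagMs <= nowMs && w.version > seen {
			seen = w.version
		}
	}
	return seen
}

func main() {
	leader := replica{name: "leader", applyLagMs: 0}
	lagging := replica{name: "replica-b", applyLagMs: 15} // assumed lag, illustration only

	writes := []write{{version: 1, atMs: 0}, {version: 2, atMs: 100}}

	// Read 5 ms after version 2 was written: the lagging replica is stale.
	fmt.Println(visibleVersion(leader, writes, 105))  // 2
	fmt.Println(visibleVersion(lagging, writes, 105)) // 1
	// Read 20 ms after the write: replication has caught up.
	fmt.Println(visibleVersion(lagging, writes, 120)) // 2
}
```

The same read issued at the same moment returns different versions depending on which node serves it, which is exactly what an eventually consistent read permits.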
Strong Consistency
- Behavior: DynamoDB ensures the read reflects all successful prior writes. It typically routes the read request directly to the Leader node to guarantee freshness.
- Risk: Slightly higher latency. Additionally, if the Leader node is momentarily unavailable or there is a network delay or outage, the read may fail with a server error (HTTP 500) instead of falling back to a replica, which reduces Availability.
- Cost: Twice as expensive as eventual consistency (1.0 RCU per 4KB).
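- Use Case: Financial transactions, inventory counts (like the ticketing system), or uniqueness checks (“Check if username taken”). Note that a strongly consistent read only guarantees a fresh view of the data; to stop two buyers from claiming the same last ticket, the write itself still needs a condition (for example, a ConditionExpression).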
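The cost figures in the two lists above can be turned into a quick estimate. The sketch below assumes the per-read pricing stated there (0.5 RCU eventually consistent, 1.0 RCU strongly consistent, per 4 KB) and rounds item sizes up to the next 4 KB boundary; `readCapacityUnits` is a hypothetical helper, not part of any AWS SDK.

```go
package main

import "fmt"

const readUnitBytes = 4 * 1024 // one read unit covers up to 4 KB of item data

// readCapacityUnits estimates the RCUs consumed by reading one item of
// itemBytes bytes. Sizes are rounded up to the next 4 KB boundary, and an
// eventually consistent read costs half of a strongly consistent one.
func readCapacityUnits(itemBytes int, stronglyConsistent bool) float64 {
	blocks := (itemBytes + readUnitBytes - 1) / readUnitBytes // ceiling division
	if blocks < 1 {
		blocks = 1 // a read is charged at least one 4 KB block
	}
	if stronglyConsistent {
		return float64(blocks)
	}
	return float64(blocks) * 0.5
}

func main() {
	fmt.Println(readCapacityUnits(3*1024, false)) // 0.5
	fmt.Println(readCapacityUnits(3*1024, true))  // 1
	fmt.Println(readCapacityUnits(9*1024, false)) // 1.5
	fmt.Println(readCapacityUnits(9*1024, true))  // 3
}
```

For a 9 KB item, the read spans three 4 KB blocks, so switching from eventual to strong consistency doubles the cost from 1.5 to 3 RCUs.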
⚔️ War Story: The Stale Profile Update
A social networking app used Eventual Consistency for user profiles. A user updated their profile picture and immediately refreshed the page. The read request hit a replica node that hadn’t received the update yet. The user saw their old picture, assumed the upload failed, and re-uploaded it five times in a row, flooding the write pipeline.
The Fix: The engineers modified the client to use Strong Consistency for reads occurring within 5 seconds of a write, and Eventual Consistency otherwise.
2. Code Implementation: Consistent Reads
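By default, DynamoDB reads are eventually consistent. To request Strong Consistency, explicitly set the ConsistentRead parameter to true on your read operations (GetItem, BatchGetItem, Query, or Scan). Strongly consistent reads are supported on tables and local secondary indexes, but not on global secondary indexes.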
Java Implementation
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.*;
import java.util.Map;

public class GetItemStronglyConsistent {
    public static void main(String[] args) {
        // Region and credentials come from the default provider chain.
        // try-with-resources closes the client when the read is done.
        try (DynamoDbClient ddb = DynamoDbClient.create()) {
            GetItemRequest request = GetItemRequest.builder()
                    .tableName("Orders")
                    .key(Map.of("OrderId", AttributeValue.builder().s("101").build()))
                    .consistentRead(true) // <--- The Consistency Toggle
                    .build();

            GetItemResponse response = ddb.getItem(request);
            System.out.println("Item: " + response.item());
        }
    }
}
Go Implementation
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

func main() {
	ctx := context.Background()

	// Load region and credentials from the default provider chain.
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatalf("Unable to load AWS config: %v", err)
	}
	svc := dynamodb.NewFromConfig(cfg)

	input := &dynamodb.GetItemInput{
		TableName: aws.String("Orders"),
		Key: map[string]types.AttributeValue{
			"OrderId": &types.AttributeValueMemberS{Value: "101"},
		},
		ConsistentRead: aws.Bool(true), // <--- The Consistency Toggle
	}

	result, err := svc.GetItem(ctx, input)
	if err != nil {
		log.Fatalf("Error: %v", err)
	}
	fmt.Println("Item:", result.Item)
}