DynamoDB Streams

DynamoDB is not just a storage engine; it is an event source. DynamoDB Streams powers the “reactive” nature of modern serverless applications by providing a time-ordered sequence of item-level changes in your table.

[!IMPORTANT] Streams are the nervous system of DynamoDB. They allow you to decouple your write path (database) from your processing path (analytics, search, notifications), enabling true Change Data Capture (CDC) without dual-writes.


1. How Streams Work

When you enable a Stream on a table, DynamoDB begins capturing information about every modification to data items in that table.

The Architecture of a Stream

  • Retention: Stream records are stored for exactly 24 hours. After that, they are automatically removed.
  • Ordering: DynamoDB guarantees that for any given Item (Primary Key), records appear in the exact order they occurred.
  • Sharding: Just like the table itself, the stream is partitioned into Shards. As your table scales (partitions split), the stream shards split automatically.
  • Deduplication: Each change appears exactly once in the stream itself, but consumer failures and retries mean processing is effectively at-least-once. Your consumers must be idempotent.

Stream Lifecycle

Records flow from the table into a stream shard immediately, and consumers read from the shard using an iterator:

Table Write → Stream Shard (24h retention) → Consumer (Lambda)


2. Stream View Types

When you enable a stream, you must choose what data is written to it. This is the Stream View Type. This choice impacts cost (throughput) and utility.

| View Type | Description | Use Case |
| --- | --- | --- |
| KEYS_ONLY | Only the key attributes of the modified item. | Cache invalidation; “kick” notifications where the consumer fetches the data itself. |
| NEW_IMAGE | The entire item, as it appears after it was modified. | Replication, search indexing (current state). |
| OLD_IMAGE | The entire item, as it appeared before it was modified. | Audit logs, undo functionality. |
| NEW_AND_OLD_IMAGES | Both the new and the old images of the item. | Computing deltas (e.g., “Balance changed from 50 to 100”), complex validation. |

Example: Stream Record

Consider an UPDATE operation where status changes from PENDING to ACTIVE:

UPDATE Orders SET status = 'ACTIVE'

With NEW_AND_OLD_IMAGES, the resulting stream record looks like this:

{
  "eventID": "c4ca4238...",
  "eventName": "MODIFY",
  "dynamodb": {
    "Keys": { "PK": { "S": "101" } },
    "OldImage": { "PK": { "S": "101" }, "status": { "S": "PENDING" } },
    "NewImage": { "PK": { "S": "101" }, "status": { "S": "ACTIVE" } },
    "StreamViewType": "NEW_AND_OLD_IMAGES"
  }
}

3. Code Implementation

Enabling streams is usually done via Infrastructure as Code (Terraform/CDK), but here is how you might interact with the configuration using the SDK. First, in Java (SDK v2):

import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.model.*;

public class EnableStreams {
    public static void main(String[] args) {
        DynamoDbClient ddb = DynamoDbClient.create();

        UpdateTableRequest request = UpdateTableRequest.builder()
            .tableName("Orders")
            .streamSpecification(StreamSpecification.builder()
                .streamEnabled(true)
                .streamViewType(StreamViewType.NEW_AND_OLD_IMAGES)
                .build())
            .build();

        UpdateTableResponse response = ddb.updateTable(request);
        System.out.println("Stream ARN: " +
            response.tableDescription().latestStreamArn());
    }
}
And the equivalent in Go (SDK v2):

package main

import (
	"context"
	"fmt"
	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

func main() {
	cfg, err := config.LoadDefaultConfig(context.TODO())
	if err != nil {
		panic(err)
	}
	client := dynamodb.NewFromConfig(cfg)

	input := &dynamodb.UpdateTableInput{
		TableName: aws.String("Orders"),
		StreamSpecification: &types.StreamSpecification{
			StreamEnabled:  aws.Bool(true),
			StreamViewType: types.StreamViewTypeNewAndOldImages,
		},
	}

	resp, err := client.UpdateTable(context.TODO(), input)
	if err != nil {
		panic(err)
	}

	fmt.Println("Stream ARN:", *resp.TableDescription.LatestStreamArn)
}

[!TIP] Production Note: When in doubt, enable NEW_AND_OLD_IMAGES. It provides the most flexibility (you can compute the delta), at the cost of larger stream records: since an item can be up to 400KB, a record carrying both images can approach twice that.


4. Key Metric: IteratorAge

When monitoring Streams, the single most important metric is IteratorAge.

  • Definition: How long the most recently read record sat in the stream before your consumer received it (the gap between write time and read time).
  • Ideal Value: 0 (or milliseconds). This means you are processing events instantly as they happen.
  • High Value: If this climbs to minutes or hours, your Lambda is falling behind the write rate of the table, or it is failing and retrying the same batch repeatedly.

[!CAUTION] If IteratorAge approaches 24 hours, you are about to suffer data loss. Records older than 24 hours drop off the stream and are gone forever before your consumer sees them.


5. Summary

  • Streams allow you to build event-driven systems off your database.
  • Strict Ordering is guaranteed per Item Key.
  • View Types control the data payload (Keys vs. Full Images).
  • IteratorAge is your “Health Check” metric.

Next, we will look at the most common consumer for DynamoDB Streams: AWS Lambda.