LSI vs GSI

The biggest limitation of DynamoDB’s core architecture is that you can only query efficiently using your Primary Key. If you need to access your data by any other attribute—like finding a User by Email instead of UserID, or listing Orders by Status—you need Secondary Indexes.

Secondary Indexes are essentially “shadow copies” of your data that allow you to query it using a different key structure. There are two types:

  1. Global Secondary Index (GSI)
  2. Local Secondary Index (LSI)

Choosing the right one is critical for performance, cost, and scalability.


1. Global Secondary Index (GSI)

A GSI allows you to query data across all partitions using a completely different Partition Key (and optional Sort Key). It is “Global” because it can span the entire dataset, regardless of where the original item lives.

Key Characteristics

  • Scope: Uses a different Partition Key from the base table.
  • Scalability: Unlimited size. It partitions and scales independently of the base table.
  • Consistency: Supports Eventual Consistency only. Replication from the base table to the GSI is asynchronous (usually sub-second latency, but not guaranteed).
  • Throughput: In provisioned mode, has its own RCU/WCU settings, separate from the base table. If the GSI runs short of write capacity, writes to the base table can be throttled as well.
  • Flexibility: Can be created or deleted at any time.

The GSI “Flip”

A GSI “flips” your access pattern: the attribute you want to search by becomes the key, and the original key is carried along as an ordinary attribute.

Base Table (PK: UserID)
PK: USR#1 Email: a@b.com
GSI (PK: Email)
PK: a@b.com UserID: USR#1
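
The flip can be modeled with two in-memory maps. This is only a sketch of the idea, not how DynamoDB stores data: the Table type, its field names, and the synchronous index update are all assumptions made for illustration (a real GSI is updated asynchronously).

```go
package main

import "fmt"

// User is a toy stand-in for a DynamoDB item (hypothetical shape).
type User struct {
	UserID string
	Email  string
}

// Table models a base table keyed by UserID plus one GSI keyed by Email.
type Table struct {
	base       map[string]User            // PK = UserID
	emailIndex map[string]map[string]User // PK = Email, then UserID
}

func NewTable() *Table {
	return &Table{base: map[string]User{}, emailIndex: map[string]map[string]User{}}
}

// Put writes the base item and then maintains the index copy.
// Here both happen in one step; in DynamoDB the index update lags the write.
func (tb *Table) Put(u User) {
	if old, ok := tb.base[u.UserID]; ok {
		delete(tb.emailIndex[old.Email], old.UserID) // drop the stale index entry
	}
	tb.base[u.UserID] = u
	if tb.emailIndex[u.Email] == nil {
		tb.emailIndex[u.Email] = map[string]User{}
	}
	tb.emailIndex[u.Email][u.UserID] = u
}

// QueryByEmail reads through the index. Index keys are not unique,
// so the result is a slice rather than a single item.
func (tb *Table) QueryByEmail(email string) []User {
	var out []User
	for _, u := range tb.emailIndex[email] {
		out = append(out, u)
	}
	return out
}

func main() {
	tbl := NewTable()
	tbl.Put(User{UserID: "USR#1", Email: "a@b.com"})
	for _, u := range tbl.QueryByEmail("a@b.com") {
		fmt.Println(u.UserID) // prints USR#1
	}
}
```

Returning a slice from QueryByEmail reflects that nothing forces a GSI key to be unique: two users could share an email unless the application prevents it.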

2. Local Secondary Index (LSI)

An LSI allows you to query data in the same partition using a different Sort Key. It is “Local” because it is scoped to the base table’s Partition Key.

Key Characteristics

  • Scope: Must share the same Partition Key as the base table.
  • Consistency: Supports Strong Consistency. If you write to the table and immediately read from the LSI with ConsistentRead: true, you get the latest data.
  • Creation: Must be defined at table creation time. You cannot add LSIs later.
  • Throughput: Consumes the base table’s WCU and RCU.
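
As a rough mental model, an LSI is the same partition read back in a different sort order. The sketch below assumes a hypothetical Order item with CustomerID as the partition key, OrderID as the table's sort key, and CreatedAt as the LSI's sort key; none of these names come from a real schema.

```go
package main

import (
	"fmt"
	"sort"
)

// Order is a toy item. CustomerID is the partition key, OrderID is the
// table's sort key, and CreatedAt is the alternate sort key for the LSI.
type Order struct {
	CustomerID string
	OrderID    string
	CreatedAt  string // ISO-8601 date, so string order is chronological
}

// QueryByCreatedAt returns one customer's orders sorted by CreatedAt.
// The partition key filter never changes; only the ordering does.
func QueryByCreatedAt(orders []Order, customerID string) []Order {
	var out []Order
	for _, o := range orders {
		if o.CustomerID == customerID {
			out = append(out, o)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].CreatedAt < out[j].CreatedAt })
	return out
}

func main() {
	orders := []Order{
		{CustomerID: "C1", OrderID: "O-2", CreatedAt: "2024-03-01"},
		{CustomerID: "C1", OrderID: "O-1", CreatedAt: "2024-05-10"},
		{CustomerID: "C2", OrderID: "O-9", CreatedAt: "2024-01-01"},
		{CustomerID: "C1", OrderID: "O-3", CreatedAt: "2024-01-15"},
	}
	for _, o := range QueryByCreatedAt(orders, "C1") {
		fmt.Println(o.OrderID) // prints O-3, O-2, O-1 on separate lines
	}
}
```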

[!WARNING] The 10 GB Trap: The most dangerous limitation of an LSI is that the total size of all items (Base + LSIs) for a single Partition Key cannot exceed 10 GB. The limit applies only to tables that have at least one LSI. Once a partition key reaches it, writes that would grow that item collection fail with ItemCollectionSizeLimitExceededException. This is why LSIs are rarely used in modern designs unless Strong Consistency on a non-primary attribute is an absolute hard requirement.
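
The arithmetic behind the limit can be sketched as a running total per partition key. This is a toy model under stated assumptions: the Collections type and its Add method are hypothetical, sizes are supplied by the caller, and 10 GB is treated as 10 GiB for simplicity.

```go
package main

import (
	"errors"
	"fmt"
)

// itemCollectionLimit is the per-partition-key cap, taken here as 10 GiB.
const itemCollectionLimit int64 = 10 << 30

// ErrCollectionFull stands in for the error DynamoDB returns at the limit.
var ErrCollectionFull = errors.New("item collection size limit exceeded")

// Collections tracks bytes stored per partition key (base items + LSI copies).
type Collections struct {
	bytes map[string]int64
}

func NewCollections() *Collections {
	return &Collections{bytes: map[string]int64{}}
}

// Add records a write of baseBytes plus lsiBytes under pk, rejecting it
// if the item collection would grow past the limit.
func (c *Collections) Add(pk string, baseBytes, lsiBytes int64) error {
	next := c.bytes[pk] + baseBytes + lsiBytes
	if next > itemCollectionLimit {
		return ErrCollectionFull
	}
	c.bytes[pk] = next
	return nil
}

func main() {
	cols := NewCollections()
	fmt.Println(cols.Add("CUST#1", 9<<30, 0))     // fits under the cap
	fmt.Println(cols.Add("CUST#1", 1<<30, 1<<30)) // base + LSI copy pushes it over
	fmt.Println(cols.Add("CUST#2", 1<<30, 0))     // other partition keys are unaffected
}
```

The point of counting lsiBytes separately is that projected index copies count toward the same 10 GB, so a wide projection reaches the limit sooner.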


3. Comparison: LSI vs GSI

| Feature       | Local Secondary Index (LSI) | Global Secondary Index (GSI) |
|---------------|-----------------------------|------------------------------|
| Partition Key | Same as Base Table          | Can be different             |
| Sort Key      | Different                   | Can be different (optional)  |
| Size Limit    | 10 GB per Partition Key     | Unlimited                    |
| Consistency   | Strong or Eventual          | Eventual Only                |
| Throughput    | Shares Base Table’s Units   | Independent Units            |
| Creation      | At Table Creation Only      | Any Time                     |

Strategy Selector

Three questions narrow the choice:

1. Do you need Strong Consistency on the index query? If so, only an LSI can provide it.

2. Is the table already created? If so, an LSI can no longer be added; only a GSI can.

3. Will any partition exceed 10 GB? If so, an LSI is off the table.

If nothing forces you toward an LSI, a GSI is the safer default.
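
The three questions above can be folded into a small decision function. The mapping below is one reading of the constraints described in this section, not the logic of any official tool; the Choice type and the Recommend name are made up for the sketch.

```go
package main

import "fmt"

// Choice is the outcome of the selector.
type Choice string

const (
	UseGSI Choice = "GSI"
	UseLSI Choice = "LSI"
	NoFit  Choice = "no index fits; revisit the requirement"
)

// Recommend applies the three questions in order.
func Recommend(needStrong, tableExists, partitionOver10GB bool) Choice {
	if !needStrong {
		return UseGSI // eventual consistency is acceptable, so take the flexible option
	}
	if tableExists || partitionOver10GB {
		// Strong consistency needs an LSI, but an LSI cannot be added
		// to an existing table or hold a partition larger than 10 GB.
		return NoFit
	}
	return UseLSI
}

func main() {
	fmt.Println(Recommend(false, true, false)) // prints GSI
	fmt.Println(Recommend(true, false, false)) // prints LSI
	fmt.Println(Recommend(true, true, false))  // prints no index fits; revisit the requirement
}
```

When the function returns NoFit, the usual ways out are relaxing the consistency requirement or reading the item from the base table by its primary key, which does support strongly consistent reads.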


4. Projection Strategies

When creating an index, you must decide how much data to copy from the base table. This is called Projection.

  1. KEYS_ONLY:
    • What: Copies only the index keys and base table keys.
    • Use Case: Checking existence or fetching IDs to lookup later.
    • Cost: Lowest storage and write cost.
  2. INCLUDE:
    • What: Copies keys + specific attributes you choose.
    • Use Case: Covering specific queries (e.g., “Find user by email and return Name and Role”).
    • Cost: Moderate.
  3. ALL:
    • What: Copies the entire item.
    • Use Case: Avoids fetching from the base table entirely.
    • Cost: Highest. Roughly doubles your storage, and each write to the base table also pays for a full-size write to the index.
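
To see why projection drives write cost, here is a simplified estimate for inserting one new item. It assumes standard writes billed at one WCU per 1 KB (rounded up), charged once for the base item and once for the index entry sized by its projected attributes. It ignores any per-entry overhead and the case where an update changes an index key, which costs more. The function names and byte sizes are illustrative.

```go
package main

import "fmt"

// wcuFor returns the write units for one standard write of the given size:
// one unit per 1 KB, rounded up.
func wcuFor(sizeBytes int) int {
	if sizeBytes <= 0 {
		return 0
	}
	return (sizeBytes + 1023) / 1024
}

// EstimatePutWCU adds the base table write to one index write whose size
// is the total of the attributes projected into the index.
func EstimatePutWCU(baseBytes, projectedBytes int) int {
	return wcuFor(baseBytes) + wcuFor(projectedBytes)
}

func main() {
	const item = 3000 // bytes in the base item (assumed)
	fmt.Println("KEYS_ONLY:", EstimatePutWCU(item, 100))  // prints KEYS_ONLY: 4
	fmt.Println("INCLUDE:  ", EstimatePutWCU(item, 1200)) // prints INCLUDE:   5
	fmt.Println("ALL:      ", EstimatePutWCU(item, item)) // prints ALL:       6
}
```

With a 3,000-byte item the base write alone is 3 WCU, so ALL doubles the cost while KEYS_ONLY adds a single unit.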

5. Code Implementation

Java: Creating a Table with GSI

Using the AWS SDK for Java v2.

import software.amazon.awssdk.services.dynamodb.model.*;

public class CreateTableGSI {
    public static CreateTableRequest buildRequest() {
        // 1. Define Attributes
        AttributeDefinition pk = AttributeDefinition.builder()
            .attributeName("PK").attributeType(ScalarAttributeType.S).build();
        AttributeDefinition sk = AttributeDefinition.builder()
            .attributeName("SK").attributeType(ScalarAttributeType.S).build();
        AttributeDefinition email = AttributeDefinition.builder()
            .attributeName("Email").attributeType(ScalarAttributeType.S).build();

        // 2. Define GSI
        GlobalSecondaryIndex emailIndex = GlobalSecondaryIndex.builder()
            .indexName("EmailIndex")
            .keySchema(
                KeySchemaElement.builder().attributeName("Email").keyType(KeyType.HASH).build()
            )
            .projection(Projection.builder().projectionType(ProjectionType.ALL).build())
            .build();

        // 3. Build Request
        return CreateTableRequest.builder()
            .tableName("Users")
            .attributeDefinitions(pk, sk, email)
            .keySchema(
                KeySchemaElement.builder().attributeName("PK").keyType(KeyType.HASH).build(),
                KeySchemaElement.builder().attributeName("SK").keyType(KeyType.RANGE).build()
            )
            .globalSecondaryIndexes(emailIndex)
            .billingMode(BillingMode.PAY_PER_REQUEST)
            .build();
    }
}

Go: Querying a GSI

Using the AWS SDK for Go v2.


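package main

import (
	"context"
	"fmt"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb"
	"github.com/aws/aws-sdk-go-v2/service/dynamodb/types"
)

// FindUserByEmail queries the EmailIndex GSI for items whose Email matches.
// GSI keys need not be unique, so the result may hold more than one item.
// A single Query call returns at most 1 MB of data; follow LastEvaluatedKey
// if larger result sets are possible.
func FindUserByEmail(ctx context.Context, client *dynamodb.Client, email string) ([]map[string]types.AttributeValue, error) {
	out, err := client.Query(ctx, &dynamodb.QueryInput{
		TableName:              aws.String("Users"),
		IndexName:              aws.String("EmailIndex"), // target the GSI, not the base table
		KeyConditionExpression: aws.String("Email = :e"),
		ExpressionAttributeValues: map[string]types.AttributeValue{
			":e": &types.AttributeValueMemberS{Value: email},
		},
	})
	if err != nil {
		// Return the error instead of exiting so the caller decides how to handle it.
		return nil, fmt.Errorf("query EmailIndex: %w", err)
	}

	return out.Items, nil
}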