The Search Platform: Beyond the Cluster

[!NOTE] This module covers the platform that surrounds an Elasticsearch cluster: the gateway, schema management, zero-downtime mapping changes, multi-region failover, and what that redundancy costs.

1. The Staff Engineer’s View

A Senior Engineer manages a Cluster. A Staff Engineer manages a Platform.

The Platform Components:

  1. Gateway Service: A proxy (written in Go or Java, for example) between the application and Elasticsearch (ES).
  2. Schema Registry: Git-backed source of truth for mappings.
  3. Cross-Cluster Replication (CCR): Syncing data between us-east-1 and eu-west-1.
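
One job a schema registry can take on is flagging mapping changes that cannot be applied in place. The following is a minimal sketch, assuming mappings are stored as plain `properties` dictionaries. The function name `breaking_changes` and the sample fields are hypothetical, and nested objects are not traversed:

```python
def breaking_changes(old_props: dict, new_props: dict) -> list[str]:
    """Return the fields whose type differs between two mapping 'properties' dicts."""
    changed = []
    for field, old_def in old_props.items():
        new_def = new_props.get(field)
        if new_def is None:
            continue  # removed fields are ignored in this sketch
        # A field without an explicit "type" is treated as an object field.
        if old_def.get("type", "object") != new_def.get("type", "object"):
            changed.append(field)
    return sorted(changed)


# Hypothetical mappings: 'email' changes type, 'city' is only added.
old = {"email": {"type": "text"}, "age": {"type": "integer"}}
new = {"email": {"type": "keyword"}, "age": {"type": "integer"}, "city": {"type": "keyword"}}

print(breaking_changes(old, new))  # ['email']
```

A check like this could run in CI on every change to the registry: new fields pass through, while a type change is routed to a reindex workflow.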

Why a Gateway?

  • Protection: Reject “Kill Queries” (e.g., an unbounded .* regexp or a leading-wildcard query) before they reach the cluster.
  • Abstraction: Let apps query /search/users instead of knowing index names (users-v1).
  • Circuit Breaking: Fail fast if ES is overloaded.
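
The three gateway duties above can be sketched in a few dozen lines. This is an illustration under stated assumptions, not a production gateway: the `ROUTES` table, the `is_kill_query` heuristics, the thresholds, and the status codes are all hypothetical choices, and the actual HTTP call to ES is left out.

```python
import re
import time

# Hypothetical route table: logical resource -> Elasticsearch alias.
ROUTES = {"/search/users": "users"}

_LEADING_WILDCARD = re.compile(r"^[*?]")


def is_kill_query(query: object) -> bool:
    """Walk the query DSL; flag leading-wildcard patterns and regexps starting with '.*'."""
    if isinstance(query, dict):
        for key, value in query.items():
            if key in ("wildcard", "regexp") and isinstance(value, dict):
                for spec in value.values():
                    # Accept both {"field": {"value": "..."}} and the {"field": "..."} shorthand.
                    pattern = spec.get("value", "") if isinstance(spec, dict) else str(spec)
                    if key == "wildcard" and _LEADING_WILDCARD.match(pattern):
                        return True
                    if key == "regexp" and pattern.startswith(".*"):
                        return True
            if is_kill_query(value):
                return True
    elif isinstance(query, list):
        return any(is_kill_query(item) for item in query)
    return False


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures; closes again after a cool-down."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        # Cool-down elapsed: let requests probe the cluster again.
        return self.clock() - self.opened_at >= self.reset_after

    def record(self, success: bool) -> None:
        if success:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()


def handle(path: str, query: dict, breaker: CircuitBreaker) -> tuple[int, str]:
    """Return (HTTP status, detail) for a request; forwarding to ES is omitted."""
    alias = ROUTES.get(path)
    if alias is None:
        return 404, "unknown resource"
    if is_kill_query(query):
        return 400, "query rejected: unbounded wildcard/regexp"
    if not breaker.allow():
        return 503, "search backend overloaded, failing fast"
    return 200, f"forward to /{alias}/_search"


breaker = CircuitBreaker()
print(handle("/search/users", {"match": {"name": "ada"}}, breaker))
# (200, 'forward to /users/_search')
print(handle("/search/users", {"regexp": {"name": {"value": ".*"}}}, breaker))
# (400, 'query rejected: unbounded wildcard/regexp')
```

Because callers only ever see /search/users, the alias behind it can be repointed without any application change.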

2. Blue/Green Deployments (Zero Downtime)

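You need to change a field's mapping (e.g., text to keyword). Elasticsearch does not allow changing the type of an existing field in place, so the data has to be reindexed into a new index. The Dance: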

  1. Green: Current Index (users-v1). Alias users points here.
  2. Blue: Create new Index (users-v2) with new mapping.
  3. Reindex: Copy data V1 → V2.
  4. Swap: Atomic Alias switch. users now points to V2.
  5. Delete: Drop V1 once V2 has been verified.
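
The steps above map onto four Elasticsearch REST calls: create index, `_reindex`, `_aliases`, delete index. The sketch below only builds the request plan and does not talk to a cluster. The helper name `blue_green_plan` and the `email` field are hypothetical:

```python
import json


def blue_green_plan(alias: str, old_index: str, new_index: str, new_mappings: dict) -> list[tuple]:
    """Return the ordered (method, path, body) requests for the index swap."""
    return [
        # Step 2: create the new index with the new mapping.
        ("PUT", f"/{new_index}", {"mappings": new_mappings}),
        # Step 3: copy documents from the old index into the new one.
        ("POST", "/_reindex", {"source": {"index": old_index}, "dest": {"index": new_index}}),
        # Step 4: both actions are applied atomically within one _aliases request.
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]}),
        # Step 5: delete the old index.
        ("DELETE", f"/{old_index}", None),
    ]


plan = blue_green_plan(
    alias="users",
    old_index="users-v1",
    new_index="users-v2",
    new_mappings={"properties": {"email": {"type": "keyword"}}},
)
for method, path, body in plan:
    print(method, path, json.dumps(body) if body is not None else "")
```

Putting the remove and add actions in a single `_aliases` request is what makes the swap atomic: readers of users never see a moment with no index behind the alias.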

3. Interactive: Multi-Region Routing

Simulate a regional outage and the resulting failover.

[Interactive diagram: User → Gateway → US-East (Primary, ONLINE) or EU-West (Failover, ONLINE). Initial routing: US-East, latency 20 ms.]
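
The routing decision the simulation illustrates can be sketched as a small function. This assumes the gateway already knows each region's health. The `Region` type, the `pick_region` name, and the 90 ms EU-West latency are hypothetical; only the 20 ms US-East figure comes from the diagram.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    role: str        # "primary" or "failover"
    online: bool
    latency_ms: int


def pick_region(regions: list[Region]) -> Region:
    """Prefer the primary while it is online; otherwise the lowest-latency online region."""
    online = [r for r in regions if r.online]
    if not online:
        raise RuntimeError("no search region available")
    for region in online:
        if region.role == "primary":
            return region
    return min(online, key=lambda r: r.latency_ms)


us_east = Region("US-East", "primary", online=True, latency_ms=20)
eu_west = Region("EU-West", "failover", online=True, latency_ms=90)  # latency is an assumption

print(pick_region([us_east, eu_west]).name)  # US-East

us_east.online = False  # simulate the outage
print(pick_region([us_east, eu_west]).name)  # EU-West
```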

4. Hardware Reality: Cost of Redundancy

  • Cross-Cluster Replication: Every replicated byte crosses regions, so inter-region bandwidth charges (about $0.02/GB) apply.
  • Storage: 2 Regions = 2x Storage Cost.
  • Staff Decision: Is 99.99% uptime worth 2x the bill? It depends on the tier:
      • Tier 1 (User Search): Yes.
      • Tier 3 (Internal Logs): No. A disaster-recovery (DR) backup to S3 is enough.
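
To put numbers on that decision, here is a back-of-the-envelope cost model. Only the $0.02/GB transfer rate comes from the text above. The data volume, the monthly replication volume, and the $0.10 per GB-month storage price are made-up inputs to replace with your own:

```python
def monthly_cost(data_gb: float, replicated_gb_per_month: float, regions: int,
                 storage_price_gb_month: float, transfer_price_gb: float = 0.02) -> float:
    """Storage in every region plus replication traffic to each additional region."""
    storage = data_gb * storage_price_gb_month * regions
    transfer = replicated_gb_per_month * transfer_price_gb * (regions - 1)
    return storage + transfer


# Hypothetical workload: 1,000 GB indexed, 300 GB of changes replicated per month.
single = monthly_cost(1000, 300, regions=1, storage_price_gb_month=0.10)
dual = monthly_cost(1000, 300, regions=2, storage_price_gb_month=0.10)

print(round(single, 2))  # 100.0
print(round(dual, 2))    # 206.0
```

With these inputs storage dominates: the second region roughly doubles the bill, and the replication bandwidth adds a few percent on top.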