The Search Platform: Beyond the Cluster
[!NOTE] This module explores the core principles of The Search Platform: Beyond the Cluster, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.
1. The Staff Engineer’s View
A Senior Engineer manages a Cluster. A Staff Engineer manages a Platform.
The Platform Components:
- Gateway Service: A proxy (Go/Java) between the App and ES.
- Schema Registry: Git-backed source of truth for mappings.
- Cross-Cluster Replication (CCR): Syncing data between `us-east-1` and `eu-west-1`.
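As a rough illustration of the CCR component, the sketch below builds the request that asks a follower index in one region to replicate a leader index in the other. It uses the Elasticsearch `PUT /<follower_index>/_ccr/follow` endpoint; the remote cluster alias `us-east-1` and the index name `users-v1` are assumptions for the example, and the alias is assumed to be registered already as a remote cluster on the follower side.

```python
def ccr_follow_request(follower_index, remote_cluster, leader_index):
    """Build the (method, path, body) for creating a CCR follower index.

    remote_cluster is the alias under which the leader cluster is registered
    on the follower cluster (assumed to be configured beforehand).
    """
    path = f"/{follower_index}/_ccr/follow"
    body = {"remote_cluster": remote_cluster, "leader_index": leader_index}
    return "PUT", path, body


# Hypothetical usage: eu-west-1 follows the users-v1 index that lives in us-east-1.
method, path, body = ccr_follow_request("users-v1", "us-east-1", "users-v1")
```

The function only builds the request; sending it to the follower cluster is left to whatever HTTP client the platform already uses.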
Why a Gateway?
- Protection: Prevent “Kill Queries” (e.g., a `*.*` regex).
- Abstraction: Let apps query `/search/users` instead of knowing index names (`users-v1`).
- Circuit Breaking: Fail fast if ES is overloaded.
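The three gateway duties above can be sketched in a few lines. This is a minimal illustration, not a production gateway: the route table, the deny-list of query types, and the breaker thresholds are all assumptions chosen for the example.

```python
import time

# Hypothetical route table: public endpoint -> alias the gateway queries.
ROUTES = {"/search/users": "users"}

# Assumed deny-list of expensive query types; tune it for your own workload.
BLOCKED_QUERY_TYPES = {"regexp", "wildcard", "script"}


def resolve_index(path):
    """Abstraction: map a public endpoint to an alias, hiding names like users-v1."""
    if path not in ROUTES:
        raise ValueError(f"unknown search endpoint: {path}")
    return ROUTES[path]


def find_blocked(query):
    """Protection: return the first blocked key found in a query DSL dict, or None.

    Coarse by design: a field that happens to be named "regexp" also matches.
    """
    if isinstance(query, dict):
        for key, value in query.items():
            if key in BLOCKED_QUERY_TYPES:
                return key
            found = find_blocked(value)
            if found:
                return found
    elif isinstance(query, list):
        for item in query:
            found = find_blocked(item)
            if found:
                return found
    return None


class CircuitBreaker:
    """Circuit breaking: open after max_failures consecutive failures."""

    def __init__(self, max_failures=3, cooldown_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def allow(self):
        """Return False while the breaker is open and the cooldown has not elapsed."""
        if self.opened_at is None:
            return True
        return self.clock() - self.opened_at >= self.cooldown_s

    def record(self, ok):
        """Record the outcome of a call to ES; a success closes the breaker."""
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
```

A request handler would call `resolve_index`, reject the request if `find_blocked` returns anything, and skip the ES call entirely while `allow()` is false.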
2. Blue/Green Deployments (Zero Downtime)
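Suppose you need to change a field's mapping (e.g., from text to keyword).
Elasticsearch does not let you change the type of an existing field in place, so the data has to be reindexed into a new index.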
The Dance:
- Green: Current Index (`users-v1`). Alias `users` points here.
- Blue: Create new Index (`users-v2`) with new mapping.
- Reindex: Copy data V1 → V2.
- Swap: Atomic Alias switch. `users` now points to V2.
- Delete: V1.
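The dance maps onto four Elasticsearch REST calls. The sketch below only builds the requests in order, so it runs without a cluster; the `email` field in the usage example is a hypothetical mapping, and sending the requests is left to your HTTP client.

```python
import json


def blue_green_plan(alias, old_index, new_index, new_mappings):
    """Return the ordered (method, path, body) requests for the migration."""
    return [
        # 1. Blue: create the new index with the new mapping.
        ("PUT", f"/{new_index}", {"mappings": new_mappings}),
        # 2. Reindex: copy documents from the old index into the new one.
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        # 3. Swap: both actions are applied atomically by a single _aliases call.
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]}),
        # 4. Delete the old index once nothing reads from it.
        ("DELETE", f"/{old_index}", None),
    ]


if __name__ == "__main__":
    plan = blue_green_plan(
        "users", "users-v1", "users-v2",
        {"properties": {"email": {"type": "keyword"}}},  # hypothetical field
    )
    for method, path, body in plan:
        print(method, path, json.dumps(body) if body is not None else "")
```

Note that writes landing on `users-v1` after the reindex starts are not copied by this sequence; pausing writers or replaying the missed writes is outside the sketch.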
3. Interactive: Multi-Region Routing
Simulate a global outage and failover.
[Interactive diagram: User → Gateway → US-East (Primary, ONLINE) or EU-West (Failover, ONLINE). Current routing: US-East (Latency: 20ms).]
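The failover decision the gateway makes in this simulation can be sketched as a priority-ordered health check. This is a simplified model: the `Region` type is hypothetical, the 90 ms EU-West latency is an assumed figure, and real health state would come from probes rather than a boolean.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    online: bool
    latency_ms: int


def pick_region(regions):
    """Return the first online region in priority order, or None if all are down."""
    for region in regions:
        if region.online:
            return region
    return None


# Priority order: primary first, failover second.
regions = [
    Region("US-East", online=True, latency_ms=20),
    Region("EU-West", online=True, latency_ms=90),  # assumed latency
]

# Simulate an outage of the primary: traffic should move to EU-West.
regions[0].online = False
target = pick_region(regions)
```

Ordering by priority rather than by latency keeps traffic on the primary whenever it is healthy, which matches the primary/failover labels in the simulation.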
4. Hardware Reality: Cost of Redundancy
- Cross-Cluster Replication: Bandwidth costs ($0.02/GB) apply.
- Storage: 2 Regions = 2x Storage Cost.
- Staff Decision: Is 99.99% uptime worth 2x the bill?
  - Tier 1 (User Search): Yes.
  - Tier 3 (Internal Logs): No. Only DR backup to S3.
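To make the trade-off concrete, the arithmetic can be written down as two small functions. Only the $0.02/GB transfer rate and the 2x storage multiplier come from the list above; the data volume, replication volume, and storage price in the usage example are hypothetical inputs.

```python
def redundancy_cost(data_gb, replicated_gb_per_month, storage_usd_per_gb_month,
                    regions=2, transfer_usd_per_gb=0.02):
    """Monthly cost in USD of storing the data in every region plus replicating it."""
    # Storage is paid once per region.
    storage = data_gb * storage_usd_per_gb_month * regions
    # Replicated data crosses the wire once per additional region.
    transfer = replicated_gb_per_month * transfer_usd_per_gb * (regions - 1)
    return storage + transfer


def allowed_downtime_minutes_per_year(availability):
    """Downtime budget implied by an availability target, using a 365-day year."""
    return (1 - availability) * 365 * 24 * 60


# Hypothetical inputs: 1000 GB stored, 500 GB replicated per month, $0.10 per GB-month.
two_regions = redundancy_cost(1000, 500, 0.10)
one_region = redundancy_cost(1000, 500, 0.10, regions=1)
budget = allowed_downtime_minutes_per_year(0.9999)
```

With these inputs the second region roughly doubles the bill, and a 99.99% target leaves a downtime budget of a little under an hour per year, which is the comparison the tiering decision rests on.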