Module Review: Data Modeling
In this module, we shifted from relational thinking to NoSQL access patterns. We covered the Single-Table Design philosophy, how to use Partition and Sort Keys for complex queries, and how to model relationships without joins.
Key Takeaways
- Access Patterns First: You cannot design a table without knowing your queries.
- Single-Table Design: Store related items (User + Orders) in the same partition (PK) to fetch them in one request.
- Partition Key (PK): Determines physical storage location (Hashing). Must have high cardinality.
- Sort Key (SK): Determines order within that partition. Enables range queries.
- Relationships:
  - 1:N: Use Item Collections (Same PK).
  - M:N: Use Adjacency Lists (Base Table + Inverted GSI).
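To make the Item Collection idea concrete, here is a minimal in-memory sketch rather than a real database client. The `table` dictionary, the `put_item` and `query` helpers, and the `USER#`/`ORDER#` key prefixes are all hypothetical names chosen for illustration. The point is that a user's profile and orders share one PK, so a single lookup returns all of them, already ordered by SK.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for a single table: PK -> items sorted by SK.
table = defaultdict(list)

def put_item(pk, sk, **attrs):
    """Store an item under its partition key and keep the partition sorted by SK."""
    table[pk].append({"PK": pk, "SK": sk, **attrs})
    table[pk].sort(key=lambda item: item["SK"])

def query(pk, sk_prefix=""):
    """Return the items in one partition whose sort key starts with sk_prefix."""
    return [item for item in table[pk] if item["SK"].startswith(sk_prefix)]

# One item collection: the user and their orders share the same PK.
put_item("USER#42", "PROFILE", name="Ada")
put_item("USER#42", "ORDER#2024-02-10", total=35)
put_item("USER#42", "ORDER#2024-01-05", total=20)
put_item("USER#77", "PROFILE", name="Grace")

everything = query("USER#42")                 # profile + orders in one request
orders = query("USER#42", sk_prefix="ORDER#")  # only the 1:N children

print([item["SK"] for item in everything])
print([item["total"] for item in orders])
```

Because the order sort keys embed an ISO date, the orders come back in chronological order without any extra sorting step in the application.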
1. Flashcards
What is an Item Collection?
A group of items that share the same Partition Key (PK) but have different Sort Keys (SK).
Why use Single-Table Design?
To reduce network calls (1 request vs N requests) and eliminate CPU-intensive joins at read time.
What is the Adjacency List Pattern?
A pattern for M:N relationships where you store the "link" in the Base Table and "invert" it using a GSI.
What is a Hot Partition?
A partition that is overloaded because a single Partition Key value receives a disproportionate share of read or write traffic, causing throttling on that specific node.
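The link between cardinality and hot partitions can be illustrated with a small simulation. This is only a sketch under stated assumptions: the database's real hash function and partition count are internal, so `hashlib.md5`, `NUM_PARTITIONS`, and the helper names below are stand-ins chosen for illustration.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical; real partition counts are managed by the service

def partition_for(pk: str) -> int:
    """Map a partition key to a partition by hashing it (md5 is an illustrative choice)."""
    digest = hashlib.md5(pk.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

def load_per_partition(requests):
    """Count how many requests land on each partition."""
    return Counter(partition_for(pk) for pk in requests)

# Low cardinality: every request targets the same key value.
low_cardinality = ["STATUS#ACTIVE"] * 1000
# High cardinality: each request targets a different key value.
high_cardinality = [f"USER#{i}" for i in range(1000)]

skewed = load_per_partition(low_cardinality)
spread = load_per_partition(high_cardinality)

print("low cardinality :", dict(skewed))
print("high cardinality:", dict(spread))
```

With the low-cardinality key, all 1000 requests hash to one partition. With distinct user IDs, the same number of requests is distributed across several partitions.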
2. Cheat Sheet: Modeling Patterns
| Pattern | Problem Solved | Key Structure |
|---|---|---|
| Simple Primary Key | Key-Value Lookup | PK only. No SK. |
| Composite Primary Key | 1:N Relationship | PK = Parent ID, SK = Child ID. |
| Adjacency List | M:N Relationship | PK = Entity A, SK = Entity B. |
| Inverted Index (GSI) | Reverse Lookup | GSI_PK = SK, GSI_SK = PK. |
| Time Series | Chronological Data | PK = Source ID, SK = Timestamp. |
| Hierarchical Data | Tree Structures | PK = Root ID, SK = Path (e.g. Level1#Level2). |
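Several rows of the cheat sheet can be sketched with one sorted list of `(PK, SK)` pairs. This is an in-memory approximation, not a database client: the sample keys and the `query_between` and `query_begins_with` helpers are hypothetical, and the `inverted` list only imitates what an inverted GSI would provide by swapping PK and SK.

```python
from bisect import bisect_left, bisect_right

# Hypothetical items as (PK, SK) pairs, kept sorted the way a sort key orders a partition.
items = sorted([
    ("STUDENT#1", "COURSE#MATH"),       # adjacency list: student -> course
    ("STUDENT#1", "COURSE#PHYS"),
    ("STUDENT#2", "COURSE#MATH"),
    ("SENSOR#A", "2024-01-01T00:00"),   # time series: source -> timestamp
    ("SENSOR#A", "2024-01-02T00:00"),
    ("SENSOR#A", "2024-01-03T00:00"),
    ("ORG#1", "EU#DE#BERLIN"),          # hierarchy: root -> path
    ("ORG#1", "EU#FR#PARIS"),
    ("ORG#1", "US#NY#NYC"),
])

# Inverted index sketch: GSI_PK = SK, GSI_SK = PK.
inverted = sorted((sk, pk) for pk, sk in items)

def query_between(index, pk, low, high):
    """Return sort keys in partition pk with low <= SK <= high."""
    start = bisect_left(index, (pk, low))
    end = bisect_right(index, (pk, high))
    return [sk for _, sk in index[start:end]]

def query_begins_with(index, pk, prefix):
    """Return sort keys in partition pk that start with prefix."""
    result = []
    for key, sk in index[bisect_left(index, (pk, prefix)):]:
        if key != pk or not sk.startswith(prefix):
            break
        result.append(sk)
    return result

courses = query_begins_with(items, "STUDENT#1", "COURSE#")          # M:N, forward
students = query_begins_with(inverted, "COURSE#MATH", "STUDENT#")   # M:N, reverse lookup
readings = query_between(items, "SENSOR#A", "2024-01-02", "2024-01-03T23:59")
eu_sites = query_begins_with(items, "ORG#1", "EU#")                 # one subtree

print(courses, students, readings, eu_sites)
```

The same two sort-key conditions, a range and a prefix match, cover the Time Series and Hierarchical rows, while swapping the key order is what makes the reverse lookup possible.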