Back of the Envelope Estimation
Don’t Guess, Estimate.
You are in an interview. The interviewer asks: “Design Instagram. By the way, how much storage do we need for 5 years?” Do you panic? Do you guess “100 Terabytes”? No. You pull out a napkin (or the whiteboard) and do Back of the Envelope Estimation.
Google’s Jeff Dean (the architect behind MapReduce, BigTable, Spanner) famously said that every engineer should know the “latency numbers” by heart. Why? Because designing a system without knowing the numbers is like building a bridge without knowing how much a truck weighs.
1. The Estimation Workflow
A successful estimate isn’t about being 100% accurate; it’s about being within the right Order of Magnitude.
2. Latency Numbers Every Programmer Should Know
You don’t need to memorize exact nanoseconds, but you MUST know the orders of magnitude.
| Operation | Time (Approx) | Human Equivalent (if 1 ns = 1 sec) |
|---|---|---|
| L1 Cache | 0.5 ns | Heartbeat (0.5 s) |
| Main Memory (RAM) | 100 ns | Brushing Teeth (1.7 min) |
| SSD Random Read | 150 μs | Weekend Trip (1.7 days) |
| Round Trip (Same Data Center) | 500 μs | Week-long Vacation (6 days) |
| Disk Seek | 10 ms | Semester in College (4 months) |
| Packet CA → NL | 150 ms | 5 Years |
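The "human equivalent" column can be reproduced with a few lines of Python. This is a minimal sketch, assuming the scale the table's figures imply (1 ns of computer time becomes 1 s of human time); the function name `human_scale` and the dictionary `LATENCY_NS` are illustrative, not part of any library.

```python
# Latency figures copied from the table above, expressed in nanoseconds.
LATENCY_NS = {
    "L1 Cache": 0.5,
    "Main Memory (RAM)": 100,
    "SSD Random Read": 150_000,
    "Round Trip (Same Data Center)": 500_000,
    "Disk Seek": 10_000_000,
    "Packet CA -> NL": 150_000_000,
}


def human_scale(latency_ns: float) -> str:
    """Stretch a latency so that 1 ns becomes 1 s, then pick a readable unit."""
    seconds = latency_ns  # assumed scale: 1 ns of computer time = 1 s of human time
    units = [("years", 365 * 86_400), ("days", 86_400), ("minutes", 60)]
    for name, size in units:
        if seconds >= size:
            return f"{seconds / size:.1f} {name}"
    return f"{seconds:.1f} seconds"


if __name__ == "__main__":
    for operation, ns in LATENCY_NS.items():
        print(f"{operation:32s} {human_scale(ns)}")
```

On this scale a disk seek comes out at roughly 116 days and a transatlantic packet at just under 5 years, which matches the rounded values in the table.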
If you can’t avoid the disk, you are already “months” late.
3. Capacity & Cost Planning
In an interview, you need to calculate QPS and Storage fast. Bonus points if you can estimate the AWS Bill.
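The three quantities mentioned here (QPS, storage, and a rough bill) can be sketched as one small calculator. Everything below is an illustrative assumption: the `Workload` fields, the example traffic numbers, and especially the price per GB-month, which is a placeholder and not a real AWS rate.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    """Hypothetical inputs for a back-of-the-envelope estimate."""
    daily_active_users: int
    requests_per_user_per_day: int
    writes_per_user_per_day: int
    bytes_per_write: int


def average_qps(w: Workload) -> float:
    """Average requests per second over a full day."""
    return w.daily_active_users * w.requests_per_user_per_day / SECONDS_PER_DAY


def storage_bytes(w: Workload, years: int) -> int:
    """Total bytes written over the given number of years (no compression, no replication)."""
    per_day = w.daily_active_users * w.writes_per_user_per_day * w.bytes_per_write
    return per_day * 365 * years


def monthly_storage_cost(total_bytes: int, usd_per_gb_month: float) -> float:
    """Monthly cost of holding total_bytes at an assumed flat price per GB-month."""
    return total_bytes / 10**9 * usd_per_gb_month


if __name__ == "__main__":
    # Example numbers are made up for illustration.
    w = Workload(
        daily_active_users=10_000_000,
        requests_per_user_per_day=20,
        writes_per_user_per_day=2,
        bytes_per_write=1_000,
    )
    total = storage_bytes(w, years=5)
    print(f"Average QPS: {average_qps(w):,.0f}")
    print(f"5-year storage: {total / 10**12:.1f} TB")
    # 0.02 USD per GB-month is a placeholder price, not a quoted rate.
    print(f"Monthly cost: ${monthly_storage_cost(total, 0.02):,.0f}")
```

Keeping the inputs in one structure makes it easy to state assumptions out loud and then change a single number when the interviewer pushes back.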
4. The Power of 2 (The Magic Numbers)
In System Design, we approximate everything to powers of 2. It simplifies math significantly.
[!TIP] Pro Tip: Memorize that $2^{10} \approx 10^3$ (1000). This is the key to converting between binary and decimal.
| Power | Approximation | Unit |
|---|---|---|
| $2^{10}$ | 1 Thousand ($10^3$) | 1 KB |
| $2^{20}$ | 1 Million ($10^6$) | 1 MB |
| $2^{30}$ | 1 Billion ($10^9$) | 1 GB |
| $2^{40}$ | 1 Trillion ($10^{12}$) | 1 TB |
| $2^{50}$ | 1 Quadrillion ($10^{15}$) | 1 PB |
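How much accuracy does the approximation cost? A short sketch, using a hypothetical helper `approximation_error`, compares each power of 2 in the table with its decimal stand-in.

```python
def approximation_error(k: int) -> float:
    """Relative error of treating 2**(10*k) as 10**(3*k)."""
    return 2 ** (10 * k) / 10 ** (3 * k) - 1


if __name__ == "__main__":
    units = ["KB", "MB", "GB", "TB", "PB"]
    for k, unit in enumerate(units, start=1):
        print(f"1 {unit}: 2^{10 * k} vs 10^{3 * k}, error {approximation_error(k):.1%}")
```

The error grows from about 2.4% at the kilobyte level to about 12.6% at the petabyte level, which is still comfortably within the same order of magnitude.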
5. Common Mistakes to Avoid
Even senior engineers make these errors under pressure.
5.1 Bits (b) vs Bytes (B)
- Network bandwidth is usually in Bits (Gbps).
- Storage is in Bytes (GB).
- Example: On a 1 Gbps connection, downloading 1 GB takes about 8 seconds, not 1 second, because 1 Gigabit = 125 Megabytes.
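The conversion is a single multiplication by 8. The sketch below assumes an ideal link with no protocol overhead; `download_seconds` is an illustrative name.

```python
def download_seconds(size_bytes: int, link_bits_per_second: int) -> float:
    """Ideal transfer time: convert bytes to bits, then divide by link speed."""
    return size_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    one_gb = 1_000_000_000        # 1 GB, in bytes
    one_gbps = 1_000_000_000      # 1 Gbps, in bits per second
    print(download_seconds(one_gb, one_gbps))  # 8.0
```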
5.2 QPS vs Concurrent Users
- Concurrent Users: Number of people on the site right now.
- QPS: Number of requests hitting the server per second.
- The Trap: 1 million concurrent users ≠ 1 million QPS. If a user clicks once every 10 seconds, that’s only 100k QPS.
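The trap above reduces to one division. This is a minimal sketch that assumes every user sends requests at the same steady rate, which real traffic rarely does; the function name is illustrative.

```python
def qps_from_concurrent(concurrent_users: int, seconds_between_requests: float) -> float:
    """Requests per second if each concurrent user sends one request every N seconds."""
    return concurrent_users / seconds_between_requests


if __name__ == "__main__":
    # 1 million concurrent users, one click every 10 seconds.
    print(qps_from_concurrent(1_000_000, 10))  # 100000.0
```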
6. Walkthrough: Estimating Twitter Storage
Let’s apply this to a real interview question: “Estimate the storage for Twitter for 5 years.”
Step 1: Traffic Assumptions
- DAU: 300 Million.
- Tweets/Day: 2 per user.
- Total Tweets: 300M × 2 = 600M tweets/day.
Step 2: Size Assumptions
- Tweet ID: 8 Bytes.
- User ID: 8 Bytes.
- Text (140 chars): ~300 Bytes (incl. encoding).
- Media: 10% of tweets have photos (500 KB each).
Step 3: The Calculation
- Text Storage: 600M × 300 Bytes = 180 GB / Day.
- Media Storage: 60M × 500 KB = 30 TB / Day.
- 5-Year Total (Text): 180 GB × 365 × 5 ≈ 330 TB.
- 5-Year Total (Media): 30 TB × 365 × 5 ≈ 55 PB.
[!IMPORTANT] Conclusion: roughly 55 Petabytes! This immediately tells you that you cannot use a single SQL database. You need a Distributed File System (like HDFS) for media and Sharded Databases for the text.
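The whole walkthrough fits in one short function. The sketch below uses the assumptions from Steps 1 and 2 as defaults and decimal units (1 KB = 1000 bytes); the function and key names are illustrative. Integer arithmetic is used throughout so the results are exact.

```python
def twitter_storage(
    dau: int = 300_000_000,
    tweets_per_user: int = 2,
    text_bytes: int = 300,
    media_one_in: int = 10,       # 10% of tweets carry a photo
    media_bytes: int = 500_000,   # 500 KB per photo
    years: int = 5,
) -> dict:
    """Return daily and multi-year storage estimates, in bytes."""
    days = 365 * years
    tweets_per_day = dau * tweets_per_user
    text_per_day = tweets_per_day * text_bytes
    media_per_day = (tweets_per_day // media_one_in) * media_bytes
    return {
        "tweets_per_day": tweets_per_day,
        "text_per_day": text_per_day,
        "media_per_day": media_per_day,
        "text_total": text_per_day * days,
        "media_total": media_per_day * days,
    }


if __name__ == "__main__":
    r = twitter_storage()
    print(f"Text:  {r['text_per_day'] / 10**9:.0f} GB/day, {r['text_total'] / 10**12:.1f} TB over 5 years")
    print(f"Media: {r['media_per_day'] / 10**12:.0f} TB/day, {r['media_total'] / 10**15:.2f} PB over 5 years")
```

With the default assumptions the media total comes to 54.75 PB and the text total to 328.5 TB, so media dominates by more than two orders of magnitude.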
7. Common Formulas & “Magic Rules”
The QPS Shortcut
- 1 Million requests per day ≈ 12 QPS.
- 10 Million requests per day ≈ 120 QPS.
- 100 Million requests per day ≈ 1200 QPS.
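The shortcut works because a day has 86,400 seconds, so one million requests per day is about 11.6 per second, rounded up to 12. A small sketch, with illustrative names, compares the exact figure to the rule of thumb and applies the ×5 peak multiplier from the summary checklist as an optional factor.

```python
SECONDS_PER_DAY = 86_400


def exact_qps(requests_per_day: int, peak_factor: float = 1.0) -> float:
    """Exact average QPS, optionally scaled by a peak multiplier."""
    return requests_per_day / SECONDS_PER_DAY * peak_factor


def shortcut_qps(requests_per_day: int) -> float:
    """Rule of thumb: about 12 QPS per million requests per day."""
    return requests_per_day / 1_000_000 * 12


if __name__ == "__main__":
    for per_day in (1_000_000, 10_000_000, 100_000_000):
        print(
            f"{per_day:>11,} req/day: exact {exact_qps(per_day):7.1f} QPS, "
            f"shortcut {shortcut_qps(per_day):6.0f} QPS, "
            f"peak (x5) {exact_qps(per_day, peak_factor=5):7.1f} QPS"
        )
```

The shortcut overestimates by under 4%, which errs on the safe side for capacity planning.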
Memory vs. Disk
- If your active dataset fits in RAM, reads are roughly 1000× faster than from SSD (100 ns vs. 150 μs) and roughly 100,000× faster than a disk seek (10 ms).
- Standard rule: 20% of the data is “hot” and serves about 80% of the requests. Cache that 20%.
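The 80/20 rule gives a quick way to size a cache and check whether it fits in memory. This sketch assumes the 20% hot fraction from the rule above; the 256 GB server size in the example is a made-up figure, and the function names are illustrative.

```python
def hot_set_bytes(dataset_bytes: int, hot_percent: int = 20) -> int:
    """Size of the hot subset, assuming hot_percent of the data gets most of the traffic."""
    return dataset_bytes * hot_percent // 100


def fits_in_ram(dataset_bytes: int, ram_bytes: int, hot_percent: int = 20) -> bool:
    """True if the hot subset fits in the given amount of RAM."""
    return hot_set_bytes(dataset_bytes, hot_percent) <= ram_bytes


if __name__ == "__main__":
    ram = 256 * 10**9  # hypothetical 256 GB cache server
    for dataset in (10**12, 10**13):  # 1 TB and 10 TB datasets
        hot = hot_set_bytes(dataset)
        print(f"{dataset / 10**12:.0f} TB dataset: hot set {hot / 10**9:.0f} GB, fits: {fits_in_ram(dataset, ram)}")
```

When the hot set does not fit on one machine, the same arithmetic tells you roughly how many cache nodes to shard across.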
Summary Checklist
- Know the orders of magnitude for L1, RAM, SSD, and Network.
- Use Powers of 2 for easy mental math.
- Always design for Peak QPS (Average × 5).
- Verify if the data fits in RAM first.