Social Networks & Graph Partitioning

[!NOTE] This module explores the core principles of Social Networks & Graph Partitioning, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

1. The Hook: The Billion User Graph

A graph with 3 billion nodes (users) and 500 billion edges (friendships) does not fit in RAM. It doesn’t even fit in one Database. How do you query “Friends of Friends” when the data is split across 1,000 servers?


2. Feed Generation: Pull vs. Push

Problem: User A posts. Users B, C, D (friends) need to see it.

Pull Model (Fan-out-on-Load):

  • A writes to A’s DB.
  • B opens feed → Query DB for all friends (A, E, F…) → Sort by time.
  • Pro: Fast write (O(1)).
  • Con: Slow read (O(Friends)). Bad for Justin Bieber (millions of reads).

Push Model (Fan-out-on-Write):

  • A writes. System pushes ID of post to B’s pre-computed “Feed List”, C’s list, D’s list.
  • B opens feed → Read “Feed List” (O(1)).
  • Pro: Blazing fast read.
  • Con: Slow write (O(Friends)). “Thundering Herd” for celebrities.

Hybrid: Push for normal users, Pull for celebrities.


3. Sharding the Graph

How to split the graph?

  1. By User ID: ID % 1000.
    • Pro: Easy.
    • Con: Friends are scattered. “Get Friends” hits 1,000 servers.
  2. Locality Aware: Put friends on the same server.
    • Pro: Fast queries.
    • Con: Hard to maintain. What if you befriend someone in a different “cluster”?

4. Deep Dive Strategy Lab: Social Graphs

Intuition Through Analogy

The Party Room.

  • Sharding by ID: You assign people to rooms randomly. To talk to your friends, you have to run between 10 rooms.
  • Locality Sharding: You put high-school friends in Room A, College friends in Room B. Most conversations stay inside one room.

Mathematical Anchor: Power Law Distribution

Social networks follow a Power Law: P(k) \sim k-\gamma. A few nodes have massive degree (Celebrities). Most have very few. This skew breaks naive sharding and simple algorithms. You must design for the “Whales”, not the average case.