Recursion Tree Analysis
[!NOTE] The Master Theorem is great, but it has limits. It can’t handle cases like
T(n) = T(n/3) + T(2n/3) + n(uneven splits). For these, we need to draw the Recursion Tree.
The Real-World Hook: The Asymmetric Task Queue
Imagine you are building a distributed data processing system. The main coordinator receives a batch of n records. To optimize processing, it routes 1/3 of the records to a fast, in-memory processing cluster, and the remaining 2/3 of the records to a slower, disk-backed cluster. Before routing, the coordinator does O(n) work to partition the data.
How long does this entire process take as it recursively subdivides? The Master Theorem fails here because the subproblems are uneven. To analyze the true cost of this asymmetry, we must map out the execution visually using a Recursion Tree.
1. The Method
The goal is to visualize the recursive calls as a tree and sum the work done at each level.
- Draw the Tree: Start with the root problem size
n. Break it down into children based on the recurrence. - Calculate Work per Node: Determine the non-recursive work done at each node (e.g.,
n,n<sup>2</sup>,1). - Sum Work per Level: Add up the work for all nodes at the same depth.
- Sum All Levels: Sum the work across all levels (often a Geometric Series).
2. Interactive Tree Builder
Visualize how work propagates through the recursion tree.
Let’s analyze T(n) = 2T(n/2) + n (Merge Sort).
Recursion Tree Builder
3. Example Analysis: T(n) = 2T(n/2) + n<sup>2</sup>
Let’s apply the method step-by-step.
- Level 0 (Root):
- Cost:
n<sup>2</sup> - Nodes: 1
- Cost:
- Level 1:
- Nodes: 2 subproblems of size
n/2. - Cost per node:
(n/2)<sup>2</sup> = n<sup>2</sup> / 4. - Total Level Cost:
2 * (n<sup>2</sup> / 4) = n<sup>2</sup> / 2.
- Nodes: 2 subproblems of size
- Level 2:
- Nodes: 4 subproblems of size
n/4. - Cost per node:
(n/4)<sup>2</sup> = n<sup>2</sup> / 16. - Total Level Cost:
4 * (n<sup>2</sup> / 16) = n<sup>2</sup> / 4.
- Nodes: 4 subproblems of size
The Pattern
The series is: n<sup>2</sup> + n<sup>2</sup>/2 + n<sup>2</sup>/4 + ...
This is a Geometric Series with ratio r = 1/2.
Since r < 1, the sum is dominated by the first term (the root).
This matches Master Theorem Case 3 (Root Heavy).
4. Uneven Splits
What about T(n) = T(n/3) + T(2n/3) + n? Let’s break down our Asymmetric Task Queue example.
- The tree is lopsided. The right branch (
2n/3) is “heavier” and goes deeper. - Shortest Path (Left): Recursion bottoms out when
n / 3<sup>k</sup> = 1. This gives a minimum depth oflog<sub>3</sub> n. - Longest Path (Right): Recursion bottoms out when
n * (2/3)<sup>k</sup> = 1. This gives a maximum depth oflog<sub>3/2</sub> n. - Cost per Level:
- Level 0 (Root):
nwork. - Level 1:
(n/3) + (2n/3) = nwork. - Level 2:
(n/9) + (2n/9) + (2n/9) + (4n/9) = nwork.
- Level 0 (Root):
- The cost is exactly
nfor every level until the shortest branch hits a leaf (at depthlog<sub>3</sub> n). - After the shortest branch ends, the remaining levels have less than
nwork (because some branches have already terminated). However, we can upper-bound the work at these deeper levels ton. - Total Complexity: Bounding the cost by the maximum depth, we have at most
log<sub>3/2</sub> nlevels, each doing at mostnwork. Thus,O(n * log<sub>3/2</sub> n) = O(n log n).
[!TIP] Even with uneven splits, if the sum of fractions (
1/3 + 2/3) equals 1, the work per level is bounded byO(n), leading toO(n log n). If the fractions sum to strictly less than 1 (e.g.,1/3 + 1/4), the tree is root-heavy and complexity is dominated by the root:O(n).
5. Deep Dive Strategy Lab: Recursion Tree Analysis
Intuition Through Analogy: The Corporate Budget
Think of a Recursion Tree like tracking a corporate project budget as it delegates down the organizational chart:
- The Root Node: The CEO receives an initial budget (the input
n) and does some administrative work (thef(n)cost). - The Branches: The CEO delegates chunks of the project to VP branches (
T(n/2),T(n/3), etc.). - The Levels: To find out the total administrative cost of the company, you can sum the overhead row-by-row (level-by-level).
- The Bottleneck: Is the company Top-Heavy (the CEO does most of the work,
O(f(n))), Bottom-Heavy (the base-level employees do most of the work,O(n<sup>log_b(a)</sup>)), or is the work Equally Distributed across the hierarchy (O(f(n) log n))? The recursion tree visually answers this bottleneck question.