K Closest Points to Origin
[!NOTE] This problem is the geometric equivalent of “Top K Elements”. Instead of raw values, the priority is defined by the Euclidean Distance from $(0,0)$.
1. Problem Definition
Given an array of points where points[i] = [xi, yi] represents a point on the X-Y plane and an integer k, return the k closest points to the origin $(0, 0)$.
The distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ is defined as $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. Since we only need to compare distances, we can use the squared distance: $x^2 + y^2$.
Example:
- Input: `points = [[1,3],[-2,2]]`, `k = 1`
- Output: `[[-2,2]]`
- Explanation:
  - Distance to (1,3): $\sqrt{1^2 + 3^2} = \sqrt{10}$
  - Distance to (-2,2): $\sqrt{(-2)^2 + 2^2} = \sqrt{8}$
  - $\sqrt{8} < \sqrt{10}$, so `[-2,2]` is closer.
2. Interactive Analysis
Visualize points on a coordinate plane and watch a Max-Heap of size $K$ maintain only the closest candidates, with the farthest point at the top.
3. Intuition
Pattern Recognition
This is a Top K Elements problem. Whenever you need the “K closest”, “K largest”, or “K most frequent” items, your mind should immediately go to Sorting or Heaps.
The Core Math
To compare distances from $(0,0)$, we use the Euclidean formula: $\sqrt{x^2 + y^2}$.
[!TIP] Optimization: Calculating the square root is expensive and unnecessary for comparison. $a^2 + b^2 < c^2 + d^2$ implies $\sqrt{a^2 + b^2} < \sqrt{c^2 + d^2}$. We use squared distance to stay in the integer domain.
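The tip above can be checked with a minimal sketch; the helper name `distSq` is an illustrative choice, not from the problem statement:

```java
public class DistanceDemo {
    // Squared Euclidean distance from the origin: no sqrt, no floating point
    static int distSq(int[] p) {
        return p[0] * p[0] + p[1] * p[1];
    }

    public static void main(String[] args) {
        int[] a = {1, 3};   // distSq = 10
        int[] b = {-2, 2};  // distSq = 8
        // Squared distances order points exactly like real distances do
        System.out.println(distSq(b) < distSq(a)); // prints "true"
    }
}
```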
Approach 1: Brute Force (Global Sorting)
The simplest way is to calculate distances for all $N$ points, sort them, and return the first $K$.
- Algorithm: `Arrays.sort(points, (a, b) -> Integer.compare(dist(a), dist(b)))`
- Time Complexity: $O(N \log N)$
- Space Complexity: $O(N)$ or $O(\log N)$, depending on the sort implementation.
- Why optimize?: We don’t need to perfectly sort the farthest $N-K$ points; we only care about the closest $K$.
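The brute-force approach fits in a few lines; a sketch (the class name `BruteForce` is ours, and `Integer.compare` is used instead of raw subtraction to sidestep overflow):

```java
import java.util.Arrays;
import java.util.Comparator;

public class BruteForce {
    static int distSq(int[] p) {
        return p[0] * p[0] + p[1] * p[1];
    }

    // Sort all N points by squared distance, then keep the first K
    public static int[][] kClosest(int[][] points, int k) {
        Arrays.sort(points, Comparator.comparingInt(BruteForce::distSq));
        return Arrays.copyOfRange(points, 0, k);
    }

    public static void main(String[] args) {
        int[][] res = kClosest(new int[][]{{1, 3}, {-2, 2}}, 1);
        System.out.println(Arrays.deepToString(res)); // prints "[[-2, 2]]"
    }
}
```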
Approach 2: Optimized (Max-Heap)
We use a Max-Heap of size $K$ to maintain the $K$ closest points found so far.
- Why a Max-Heap?: In a Max-Heap, the root is always the largest (farthest) point. If we find a new point that is closer than our root, we know our root is no longer in the global “top K closest”.
- Algorithm:
- Iterate through points.
- Add point to Max-Heap.
- If heap size > $K$, pop the root (farthest point).
- Time Complexity: $O(N \log K)$
- Space Complexity: $O(K)$ extra space for the heap.
Approach 3: Quickselect (Advanced)
A variation of the Quicksort partitioning algorithm can find the first $K$ elements in $O(N)$ average time.
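The text gives no code for this approach; a minimal Quickselect sketch, partitioning on squared distance (Lomuto partition with a random pivot, which makes the $O(N^2)$ worst case unlikely in practice):

```java
import java.util.Arrays;
import java.util.Random;

public class QuickSelect {
    private static final Random RAND = new Random();

    static int distSq(int[] p) {
        return p[0] * p[0] + p[1] * p[1];
    }

    // Rearrange points so the k closest occupy indices [0, k); order within them is arbitrary
    public static int[][] kClosest(int[][] points, int k) {
        int lo = 0, hi = points.length - 1;
        while (lo < hi) {
            int p = partition(points, lo, hi);
            if (p == k) break;
            if (p < k) lo = p + 1; else hi = p - 1;
        }
        return Arrays.copyOfRange(points, 0, k);
    }

    // Lomuto partition around a random pivot; returns the pivot's final index
    private static int partition(int[][] pts, int lo, int hi) {
        swap(pts, lo + RAND.nextInt(hi - lo + 1), hi);
        int pivot = distSq(pts[hi]);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (distSq(pts[i]) < pivot) {
                swap(pts, i, store++);
            }
        }
        swap(pts, store, hi);
        return store;
    }

    private static void swap(int[][] pts, int i, int j) {
        int[] t = pts[i]; pts[i] = pts[j]; pts[j] = t;
    }
}
```

Unlike the heap, Quickselect mutates the input array and needs all points in memory up front, so it is not suitable for streaming input.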
Common Pitfalls
- Min-Heap vs. Max-Heap: Learners often reach for a Min-Heap for “closest” problems. However, to keep only the $K$ smallest values in a size-$K$ heap, you need a Max-Heap so you can evict the largest (farthest) element whenever the heap overflows.
- Performance Overhead: $O(N \log K)$ is only better than $O(N \log N)$ when $K \ll N$. If $K \approx N$, sorting may be faster thanks to the low constant factors of standard library sorts.
- Comparator Logic: Writing the Max-Heap comparator backwards ($f(a) - f(b)$ instead of $f(b) - f(a)$) is a frequent source of “Farthest K” instead of “Closest K” bugs.
4. Solutions
```java
import java.util.PriorityQueue;

class Solution {
    /**
     * Max-Heap Approach
     * Time: O(N log K) | Space: O(K)
     */
    public int[][] kClosest(int[][] points, int k) {
        // Max-heap: store points, prioritized by squared distance (descending)
        PriorityQueue<int[]> maxHeap = new PriorityQueue<>(
            (a, b) -> Integer.compare(distSq(b), distSq(a))
        );

        for (int[] point : points) {
            maxHeap.add(point);
            // If we exceed K, throw away the farthest point
            if (maxHeap.size() > k) {
                maxHeap.poll();
            }
        }

        // Convert to result array
        int[][] res = new int[k][2];
        while (k > 0) {
            res[--k] = maxHeap.poll();
        }
        return res;
    }

    private int distSq(int[] p) {
        return p[0] * p[0] + p[1] * p[1];
    }
}
```
```go
package main

import (
	"container/heap"
)

type Point struct {
	x, y int
	dist int
}

// PointHeap is a max-heap of points, ordered by squared distance.
type PointHeap []Point

func (h PointHeap) Len() int            { return len(h) }
func (h PointHeap) Less(i, j int) bool  { return h[i].dist > h[j].dist } // Max-Heap
func (h PointHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *PointHeap) Push(x interface{}) { *h = append(*h, x.(Point)) }
func (h *PointHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[0 : n-1]
	return x
}

func kClosest(points [][]int, k int) [][]int {
	h := &PointHeap{}
	heap.Init(h)
	for _, p := range points {
		d := p[0]*p[0] + p[1]*p[1]
		heap.Push(h, Point{p[0], p[1], d})
		// If we exceed K, evict the farthest point (the root)
		if h.Len() > k {
			heap.Pop(h)
		}
	}
	res := make([][]int, k)
	for i := 0; i < k; i++ {
		p := heap.Pop(h).(Point)
		res[i] = []int{p.x, p.y}
	}
	return res
}
```
5. Complexity Analysis
| Strategy | Time | Space | Practical Notes |
|---|---|---|---|
| Sort | $O(N \log N)$ | $O(\log N)$ | High overhead for large $N$ if we only need $K \ll N$. |
| Max-Heap | $O(N \log K)$ | $O(K)$ | Efficient for streams or when memory is tight. |
| Quickselect | $O(N)$ avg | $O(1)$ | Best time complexity, but worst-case is $O(N^2)$. |