Trie (Prefix Tree)
A Trie (pronounced “try”) or Prefix Tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. It is widely used in applications like:
- Autocomplete: Suggesting words based on a prefix.
- Spell Checkers: Verifying if a word exists in a dictionary.
- IP Routing: Longest prefix matching.
1. Anatomy of a Trie
Unlike a Binary Search Tree, nodes in a Trie do not store the keys associated with them. Instead, a node’s position in the tree defines the key.
- Root: Empty node.
- Children: Each node has up to 26 children (for lowercase English letters).
- IsWord: A boolean flag indicating if the path to this node represents a complete word.
Visualization: Trie Construction
Insert words into the Trie and watch how nodes are created and shared.
Words: []
2. Implementation
We typically use a TrieNode class containing an array of children and a boolean flag.
Java
class Trie {
private class TrieNode {
private TrieNode[] children;
private boolean isEnd;
public TrieNode() {
children = new TrieNode[26];
isEnd = false;
}
}
private TrieNode root;
public Trie() {
root = new TrieNode();
}
// Inserts a word into the trie.
public void insert(String word) {
TrieNode node = root;
for (char c : word.toCharArray()) {
int index = c - 'a';
if (node.children[index] == null) {
node.children[index] = new TrieNode();
}
node = node.children[index];
}
node.isEnd = true;
}
// Returns true if the word is in the trie.
public boolean search(String word) {
TrieNode node = SearchNode(word);
return node != null && node.isEnd;
}
// Returns true if there is any word in the trie that starts with the given prefix.
public boolean startsWith(String prefix) {
return SearchNode(prefix) != null;
}
// Helper to find the node for a given string
private TrieNode SearchNode(String s) {
TrieNode node = root;
for (char c : s.toCharArray()) {
int index = c - 'a';
if (node.children[index] == null) {
return null;
}
node = node.children[index];
}
return node;
}
}
Go
type Trie struct {
children [26]*Trie
isEnd bool
}
func Constructor() Trie {
return Trie{}
}
func (this *Trie) Insert(word string) {
curr := this
for _, ch := range word {
idx := ch - 'a'
if curr.children[idx] == nil {
curr.children[idx] = &Trie{}
}
curr = curr.children[idx]
}
curr.isEnd = true
}
func (this *Trie) Search(word string) bool {
node := this.searchNode(word)
return node != nil && node.isEnd
}
func (this *Trie) StartsWith(prefix string) bool {
return this.searchNode(prefix) != nil
}
func (this *Trie) searchNode(s string) *Trie {
curr := this
for _, ch := range s {
idx := ch - 'a'
if curr.children[idx] == nil {
return nil
}
curr = curr.children[idx]
}
return curr
}
Complexity Analysis
- Time Complexity:
O(L)for insert/search, whereLis length of the word. - Space Complexity:
O(N * L), whereNis total words andLis average length (in the worst case of no prefix sharing).
Practice Problems
- Design Add and Search Words Data Structure (Medium)
- Word Search II (Hard) - Backtracking on a Trie.
- Maximum XOR of Two Numbers in an Array (Medium) - Bitwise Trie.
3. Deep Dive Strategy Lab: Trie Prefix Tree
Intuition Through Analogy
Think of a Trie like organizing a massive physical library by spelling. Instead of throwing all books into a single room (an Array) or sorting them alphabetically on shelves (a Binary Search Tree), imagine a hallway that branches based on the first letter.
- You walk into the Root room. There are 26 doors, labeled ‘A’ through ‘Z’.
- To find “CAT”, you open the ‘C’ door, which leads to another room with 26 doors.
- You open the ‘A’ door, then the ‘T’ door.
- Inside the ‘T’ door, there’s a stamp that says “IsWord: True”, meaning “CAT” is a valid book in our library.
- If you then want to look for “CAR”, you don’t start from the beginning. You trace back to the ‘A’ room and open the ‘R’ door instead.
This is the Prefix Tree in action. It aggressively shares paths for words with the same prefix, saving both space (sometimes) and time (always) when doing prefix-based searches.
The “Genesis” Problem: Why not just use a Hash Map?
A common junior developer mistake is to use a Hash Map (or HashSet) for everything related to fast lookups.
Brute Force Approach (Hash Set):
If we just want to know if a word exists, a HashSet provides O(1) lookup. But what if we want to build an Autocomplete system?
If a user types c, a, how does a Hash Set help us find “cat”, “car”, and “camel”?
It doesn’t. We’d have to iterate through the entire Hash Set, checking if every single string startsWith("ca").
- Time Complexity:
O(N * L)whereNis the total number of words in the dictionary, andLis the length of the prefix. This is far too slow for real-time keystroke suggestions.
Optimized Approach (The Trie):
By structuring the data as a tree of characters, finding all words starting with “ca” takes exactly O(L) time to traverse down to the ‘a’ node. From there, we simply run a quick DFS to collect the branches.
The bottleneck (searching the entire dictionary) is eliminated. The time constraint (latency for every keystroke) dominates, making the Trie’s O(L) prefix lookup the perfect representation.
Edge Cases & Common Pitfalls
- Memory Bloat: While a Trie shares prefixes, every node has an array of 26 pointers (in standard implementations). If the dataset is extremely sparse (few shared prefixes), a Trie can use significantly more memory than a simple array of strings.
- Character Sets: Always clarify the character set in an interview. Is it just lowercase English (
a-z)? If so, an array of size 26 is perfect. If it includes Unicode or full ASCII, you might need to use a Hash Map inside eachTrieNodeinstead of an array to save space (Map<Character, TrieNode>). - Deletion: Deleting from a Trie is tricky. You can’t just set
isEnd = false. You must traverse back up and delete nodes that are now empty (leaf nodes withisEnd = falseand no children) to prevent memory leaks.