HashMap HashSet
Imagine you are looking for a specific book in a massive library with millions of volumes. If the books were scattered randomly, you’d have to check every single one (an $O(N)$ operation). But what if the librarian had a magical index card catalog? You give them a book title, and they instantly point you to the exact shelf and position ($O(1)$).
That “magical catalog” is a Hash Table.
If Arrays are the “bread” of programming, HashMaps are the “butter”. They are the default choice for storing data when you need incredibly fast lookups.
The Two Flavors
- HashMap (Dictionary in Python): Stores
Key → Value pairs.
- Keys must be unique.
- Values can be duplicated.
- Think: A phone book (Name → Number).
- HashSet (Set in Python): Stores
Key only.
- Keys must be unique.
- Think: A guest list (Names only, no duplicates allowed).
Interactive Playground
Experiment with how a Map handles data. Notice what happens when you add a duplicate key!
Implementation Details
Under the hood, both Python’s dict and Java’s HashMap use a Hash Table.
Map Example
Maps are highly optimized hash tables.
Java
Go
```java
// 1. Create
Map<String, Integer> grades = new HashMap<>();
// 2. Add (Put) - O(1)
grades.put("Alice", 95);
grades.put("Bob", 82);
// 3. Access (Get) - O(1)
System.out.println(grades.get("Alice")); // 95
// 4. Check Key - O(1)
if (!grades.containsKey("Charlie")) {
System.out.println("Charlie is missing");
}
// 5. Remove - O(1)
grades.remove("Bob");
```
```go
// 1. Create
grades := make(map[string]int)
// 2. Add (Put) - O(1)
grades["Alice"] = 95
grades["Bob"] = 82
// 3. Access (Get) - O(1)
fmt.Println(grades["Alice"]) // 95
// 4. Check Key - O(1)
if _, exists := grades["Charlie"]; !exists {
fmt.Println("Charlie is missing")
}
// 5. Remove - O(1)
delete(grades, "Bob")
```
Set Example
Use a set when you only care about uniqueness, not the associated value.
Java
Go
```java
// 1. Create
Set
uniqueIds = new HashSet<>();
// 2. Add - O(1)
uniqueIds.add(101);
uniqueIds.add(102);
uniqueIds.add(101); // Duplicate ignored!
System.out.println(uniqueIds.size()); // Output: 2
// 3. Check Membership - O(1)
if (uniqueIds.contains(101)) {
System.out.println("ID 101 exists");
}
```
</div>
```go
// 1. Create (Go uses map with empty struct for sets)
uniqueIds := make(map[int]struct{})
// 2. Add - O(1)
uniqueIds[101] = struct{}{}
uniqueIds[102] = struct{}{}
uniqueIds[101] = struct{}{} // Duplicate ignored!
fmt.Println(len(uniqueIds)) // Output: 2
// 3. Check Membership - O(1)
if _, exists := uniqueIds[101]; exists {
fmt.Println("ID 101 exists")
}
```
</div>
## Amortized Complexity & Hash Collisions
Under the hood, a Hash Table works by passing a key through a **Hash Function**, which spits out an integer index.
### Hash Collisions
What happens if two different keys (like "Alice" and "Bob") hash to the exact same index? This is called a **Hash Collision**. There are two main ways to handle this:
1. **Chaining (Most Common)**: Instead of storing just one value at an index, we store a Linked List of values. If a collision occurs, we just append the new value to the list. When looking up, we hash the key, go to the index, and traverse the short list to find the exact key.
2. **Open Addressing**: If the target index is full, we simply look for the next available empty slot (e.g., index + 1, index + 2).
### Amortized $O(1)$
Why do we say Maps have **Amortized $O(1)$** lookup?
Occasionally, the hash table fills up. When the **Load Factor** (items / capacity) exceeds a threshold (usually 0.75), the table must **resize** (usually double in size).
* **Resizing**: Takes **$O(N)$** time because every item must be re-hashed to a new index.
* **Amortized**: Since resizing happens rarely (exponentially less often), the average cost per operation remains **$O(1)$**.
### Common Pitfall: Mutable Keys
> [!WARNING]
> Never use **mutable objects** (like Arrays or Lists) as keys in a HashMap or HashSet. If you change the object after putting it in the Map, its hash code changes. The Map will try to look for it in the wrong bucket, and the object will be "lost" forever inside the data structure! Always use immutable types (Strings, Integers, Tuples) as keys.
> [!IMPORTANT]
> Always prefer `HashMap`/`HashSet` over `List` for lookups. Searching a list is **$O(N)$**. Searching a map is **$O(1)$**. This is the single most common optimization pattern in algorithmic interviews.
---