Sliding Window on Strings
The Sliding Window technique is a powerful optimization pattern used primarily on arrays and strings. It converts nested loops (O(N<sup>2</sup>)) into a single loop (O(N)) by maintaining a “window” of state that slides across the data.
1. The Dynamic Window
A dynamic sliding window grows and shrinks based on certain conditions. It’s often used to find the longest or shortest substring that satisfies a criteria.
Problem: Longest Substring Without Repeating Characters
Given a string s, find the length of the longest substring without repeating characters.
Example:
- Input:
abcabcbb - Output:
3(The answer is “abc”, with the length of 3.)
Visualization
Watch how the window [left, right] expands to include new characters and contracts when a duplicate is found.
Window:
Output (Max Length): 0
Set: {}
Implementation
// Java Implementation
public int lengthOfLongestSubstring(String s) {
Set<Character> set = new HashSet<>();
int left = 0, right = 0, maxLen = 0;
while (right < s.length()) {
char c = s.charAt(right);
// If duplicate found, shrink window from left
while (set.contains(c)) {
set.remove(s.charAt(left));
left++;
}
set.add(c);
maxLen = Math.max(maxLen, right - left + 1);
right++;
}
return maxLen;
}
Complexity Analysis
- Time Complexity:
O(n). Each character is added and removed at most once. - Space Complexity:
O(min(m, n)), wheremis the size of the charset (e.g., 26 for lowercase English letters).
Problem: Longest Substring with At Most Two Distinct Characters
Given a string s, return the length of the longest substring that contains at most two distinct characters.
Example 1:
- Input:
s = "eceba" - Output:
3(“ece”)
Example 2:
- Input:
s = "ccaabbb" - Output:
5(“aabbb”)
Implementation
public int lengthOfLongestSubstringTwoDistinct(String s) {
int left = 0;
int right = 0;
Map<Character, Integer> charCount = new HashMap<>();
int maxLen = 0;
while(right < s.length()) {
char cr = s.charAt(right);
charCount.merge(cr, 1, Integer::sum);
while(charCount.size() > 2) {
char cl = s.charAt(left);
charCount.merge(cl, -1, Integer::sum);
if(charCount.get(cl) == 0) {
charCount.remove(cl);
}
left++;
}
maxLen = Math.max(right - left + 1, maxLen);
right++;
}
return maxLen;
}
Complexity Analysis
- Time Complexity:
O(N). Each character is processed at most twice (once added, once removed). - Space Complexity:
O(1). The hash map stores at most 3 distinct characters at any time.
2. The Fixed Window
Sometimes the problem specifies a window size. In this case, we don’t need a while loop to shrink. We just slide left and right together.
Problem: Find All Anagrams in a String
Given two strings s and p, return an array of all the start indices of p’s anagrams in s.
Example:
- Input:
s = "cbaebabacd",p = "abc" - Output:
[0, 6](The substrings “cba” and “bac” are anagrams of “abc”).
Implementation
public List<Integer> findAnagrams(String s, String p) {
List<Integer> result = new ArrayList<>();
if (s.length() < p.length()) return result;
int[] pCount = new int[26];
int[] sCount = new int[26];
// Initialize the first window
for (int i = 0; i < p.length(); i++) {
pCount[p.charAt(i) - 'a']++;
sCount[s.charAt(i) - 'a']++;
}
if (Arrays.equals(pCount, sCount)) result.add(0);
// Slide the window
int left = 0;
for (int right = p.length(); right < s.length(); right++) {
// Add new char
sCount[s.charAt(right) - 'a']++;
// Remove old char
sCount[s.charAt(left) - 'a']--;
left++;
if (Arrays.equals(pCount, sCount)) {
result.add(left);
}
}
return result;
}
[!TIP] Notice we use an
int[26]array instead of aHashMapfor frequency counting. This is a common optimization for string problems involving lowercase English letters. It is significantly faster and uses constant space.
Practice Problems
- Longest Repeating Character Replacement (Medium)
- Permutation in String (Medium)
- Minimum Window Substring (Hard)
3. Deep Dive Strategy Lab: Sliding Window On Strings
Intuition Through Analogy
Think of this chapter like searching text in a large log stream. The goal is not to memorize a fixed trick, but to repeatedly answer:
- What is the bottleneck?
- Which constraint dominates (time, memory, latency, correctness)?
- Which representation makes the bottleneck easier to eliminate?