Query DSL: Speaking Elasticsearch
[!NOTE] This module covers the core of the Elasticsearch Query DSL: the two execution contexts, the compound bool query, and the hardware-level reasons filters are fast.
1. The Two Contexts: Score vs No-Score
Every clause in an Elasticsearch query runs in one of two contexts. Confusing them is one of the most common causes of slow clusters.
| Feature | Query Context (`"query": ...`) | Filter Context (`"filter": ...`) |
|---|---|---|
| Question | “How well does this match?” | “Does this match? (Yes/No)” |
| Output | `_score` (float) | Boolean (true/false) |
| Performance | Slower (calculates relevance) | Fast (cached as a BitSet) |
| Use case | Full-text search (“best pizza”) | Exact filtering (“status=active”) |
Golden Rule: If you don’t care about ranking (e.g., filtering by Date, Status, ID), ALWAYS use Filter Context.
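A quick sketch of the rule in practice. The index and field names below (`description`, `status`, `created`) are made up for illustration; the point is where each clause lives:

```python
# Anti-pattern: exact-match clauses in "must" force a relevance
# calculation for conditions that are only ever true or false.
slow_query = {
    "query": {
        "bool": {
            "must": [
                {"match": {"description": "wireless headphones"}},
                {"term": {"status": "active"}},                # no ranking value
                {"range": {"created": {"gte": "2024-01-01"}}}  # no ranking value
            ]
        }
    }
}

# Golden-rule version: only the full-text clause is scored; the exact
# conditions move to "filter", where they skip scoring and get cached.
fast_query = {
    "query": {
        "bool": {
            "must": [{"match": {"description": "wireless headphones"}}],
            "filter": [
                {"term": {"status": "active"}},
                {"range": {"created": {"gte": "2024-01-01"}}}
            ]
        }
    }
}
```

Both queries return the same documents; only the second one lets Elasticsearch cache the status and date checks.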
2. The Compound bool Query
The bool query is the wrapper for combining logic. It has 4 clauses:
- `must` (AND): must match. Contributes to score.
- `filter` (AND): must match. Ignores score. Cached.
- `should` (OR): nice to have. Boosts score if present.
- `must_not` (NOT): must NOT match. Ignores score. Cached.
Pattern:
```json
{
  "query": {
    "bool": {
      "must":   [ { "match": { "title": "pizza" } } ],  // calculates score
      "filter": [ { "term":  { "city": "NYC" } } ]      // cached BitSet
    }
  }
}
```
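For completeness, here is a sketch using all four clauses at once (field values like "wood-fired" are illustrative, not from a real index):

```python
# All four bool clauses in one query body.
full_bool = {
    "query": {
        "bool": {
            "must":     [{"match": {"title": "pizza"}}],             # scored AND
            "filter":   [{"term": {"city": "NYC"}}],                 # cached AND
            "should":   [{"match": {"description": "wood-fired"}}],  # optional; boosts score
            "must_not": [{"term": {"status": "closed"}}]             # cached NOT
        }
    }
}
```

Note that when a `must` or `filter` clause is present, `should` clauses are optional by default: matching them raises the score but is not required.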
3. The BitSet Cache
Elasticsearch caches filters as BitSets (arrays of 0s and 1s, one bit per document). Intersecting two cached filters is just a bitwise AND of their BitSets.
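A toy model of that intersection. The doc IDs and filter values are invented; each list position is one document in a segment, and the bit says whether it matches the cached filter:

```python
# One bit per document (doc IDs 0..9) for two cached filters.
city_nyc      = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # e.g. {"term": {"city": "NYC"}}
status_active = [1, 1, 0, 1, 0, 1, 1, 0, 0, 0]  # e.g. {"term": {"status": "active"}}

# Intersecting the filters is a per-position AND; no scoring involved.
both = [a & b for a, b in zip(city_nyc, status_active)]

matching_doc_ids = [doc_id for doc_id, bit in enumerate(both) if bit]
print(matching_doc_ids)  # -> [0, 3, 5]
```

Once both BitSets are cached, answering the combined filter never touches the original documents at all.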
4. Hardware Reality: CPU Instructions
Why are Filter Contexts so fast? Elasticsearch stores cached filters as Roaring Bitmaps. It doesn’t loop over documents one at a time: the CPU ANDs the bitmaps’ 64-bit words together one word per instruction, and SIMD (Single Instruction, Multiple Data) instructions widen that to hundreds of bits per cycle. A cached filter check is a BitSet lookup; Query Context is floating-point relevance math per document, which is far slower.