Common Stages
While there are dozens of aggregation stages, you will spend 90% of your time using just five of them. Mastering these “Big 5” is the key to becoming proficient.
1. $match (Filter)
The $match stage filters documents to pass only those that match the specified condition(s). It is the Aggregation equivalent of the find() command or SQL WHERE clause.
// Filter for active users over 21
{
  $match: {
    status: "active",
    age: { $gt: 21 }
  }
}
[!IMPORTANT] Performance Rule #1: Always place $match as early as possible (ideally first).
- As the first stage, it can use indexes to find documents efficiently.
- It reduces the number of documents subsequent stages have to process.
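Index Scan vs. Collection Scan
- Index Scan (Good): MongoDB reads only the index entries that match the filter.
- Collection Scan (Bad): MongoDB reads every document in the collection to test the filter.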
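To make the second benefit concrete, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB: the `users` array, the `city` field, and the `countByCity` helper are hypothetical stand-ins, and `Array.prototype.filter` plays the role of `$match`. It only illustrates that filtering first leaves less work for whatever stage comes next.

```javascript
// Hypothetical sample data; status and age follow the $match example above.
const users = [
  { name: "Ana", status: "active", age: 34, city: "Lima" },
  { name: "Ben", status: "inactive", age: 45, city: "Oslo" },
  { name: "Cy", status: "active", age: 19, city: "Lima" },
  { name: "Dee", status: "active", age: 28, city: "Oslo" },
];

// In-memory stand-in for { $match: { status: "active", age: { $gt: 21 } } }.
const isActiveAdult = (doc) => doc.status === "active" && doc.age > 21;

// Stand-in for a later stage; it also counts how many documents it touches.
function countByCity(docs) {
  let processed = 0;
  const counts = {};
  for (const doc of docs) {
    processed += 1;
    counts[doc.city] = (counts[doc.city] || 0) + 1;
  }
  return { counts, processed };
}

const filteredFirst = countByCity(users.filter(isActiveAdult));
const unfiltered = countByCity(users);

console.log(filteredFirst.processed); // 2
console.log(unfiltered.processed); // 4
```

On a real collection the gap is usually far larger than 2 versus 4, and an index on the filtered fields can avoid reading the excluded documents at all.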
2. $group (Aggregate)
The $group stage groups input documents by a specified _id expression and applies accumulators to each group. This is your SQL GROUP BY.
The _id Field
The _id field is mandatory. It determines the “bucket” that documents fall into.
- _id: "$category": Group by the category field.
- _id: { region: "$region", year: "$year" }: Group by region AND year.
- _id: null: Group all documents into one single bucket (useful for global totals).
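The three shapes of `_id` can be modelled in plain JavaScript as three different key functions. This is a sketch only, assuming a hypothetical `sales` array and a hypothetical `totalBy` helper; serializing the key with `JSON.stringify` is a convenience of this sketch, not how MongoDB compares group keys.

```javascript
// Hypothetical sample data.
const sales = [
  { region: "EU", year: 2023, amount: 10 },
  { region: "EU", year: 2024, amount: 5 },
  { region: "US", year: 2023, amount: 7 },
  { region: "EU", year: 2023, amount: 3 },
];

// Sums `amount` per bucket; keyOf plays the role of the _id expression.
function totalBy(docs, keyOf) {
  const totals = new Map();
  for (const doc of docs) {
    const key = JSON.stringify(keyOf(doc)); // serialized so object keys compare by value
    totals.set(key, (totals.get(key) || 0) + doc.amount);
  }
  return Object.fromEntries(totals);
}

// Like _id: "$region"
const byRegion = totalBy(sales, (d) => d.region);
// Like _id: { region: "$region", year: "$year" }
const byRegionYear = totalBy(sales, (d) => ({ region: d.region, year: d.year }));
// Like _id: null (one bucket for everything)
const overall = totalBy(sales, () => null);

console.log(byRegion);
console.log(byRegionYear);
console.log(overall);
```

The single-field key yields two buckets, the compound key yields three, and the `null` key collapses every document into one global total.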
Accumulators
You can calculate values for each group using accumulators:
- $sum: Adds numeric values (or counts documents if you use $sum: 1).
- $avg: Calculates the average.
- $min / $max: Finds extreme values.
- $push: Creates an array of values from the group.
- $addToSet: Creates an array of unique values.
{
  $group: {
    _id: "$department",                // Group by department
    totalBudget: { $sum: "$budget" },  // Sum budget
    avgSalary: { $avg: "$salary" },    // Average salary
    employees: { $push: "$name" }      // List of employee names
  }
}
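For readers who think in loops, the stage above can be approximated by the following plain JavaScript sketch. It runs in memory on a hypothetical `employees` array and mirrors the `$sum`, `$avg`, and `$push` accumulators by hand; `groupByDepartment` is an illustrative helper, not a MongoDB API.

```javascript
// Hypothetical sample documents.
const employees = [
  { name: "Ana", department: "eng", budget: 100, salary: 90 },
  { name: "Ben", department: "eng", budget: 50, salary: 70 },
  { name: "Cy", department: "ops", budget: 30, salary: 60 },
];

function groupByDepartment(docs) {
  const buckets = new Map();
  for (const doc of docs) {
    if (!buckets.has(doc.department)) {
      buckets.set(doc.department, {
        _id: doc.department,
        totalBudget: 0,
        salarySum: 0,
        count: 0,
        employees: [],
      });
    }
    const bucket = buckets.get(doc.department);
    bucket.totalBudget += doc.budget; // mirrors $sum: "$budget"
    bucket.salarySum += doc.salary;   // running total needed for the average
    bucket.count += 1;
    bucket.employees.push(doc.name);  // mirrors $push: "$name"
  }
  // Turn the running totals into the final avgSalary (mirrors $avg: "$salary").
  return [...buckets.values()].map(({ salarySum, count, ...rest }) => ({
    ...rest,
    avgSalary: salarySum / count,
  }));
}

const grouped = groupByDepartment(employees);
console.log(grouped);
```

The sketch returns buckets in first-seen order. MongoDB does not guarantee any order for `$group` output, so add a `$sort` stage afterwards if order matters.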
3. Interactive: $group Bucket Visualizer
[Interactive visualizer: raw items from the input stream are sorted into buckets based on the grouping key.]
4. $project (Reshape)
The $project stage passes along the documents with the requested fields to the next stage. It can:
- Select fields (like SQL SELECT).
- Rename fields.
- Compute new fields using expressions.
- Hide sensitive fields (e.g., exclude password).
{
  $project: {
    _id: 0,                          // Exclude _id
    fullName: "$name",               // Rename 'name' to 'fullName'
    status: 1,                       // Include 'status'
    isAdult: { $gte: ["$age", 18] }  // Compute boolean field
  }
}
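The effect of that projection on a single document can be sketched in plain JavaScript. The `user` document and the `projectUser` helper are hypothetical; the point is that fields not listed in the projection (here `_id`, `age`, and `password`) do not reach the next stage.

```javascript
// Hypothetical input document.
const user = {
  _id: 7,
  name: "Ana Ruiz",
  status: "active",
  age: 17,
  password: "secret",
};

// In-memory stand-in for the $project stage above.
function projectUser(doc) {
  return {
    fullName: doc.name,     // rename 'name' to 'fullName'
    status: doc.status,     // include 'status'
    isAdult: doc.age >= 18, // computed boolean field
  };
}

const projected = projectUser(user);
console.log(projected); // { fullName: 'Ana Ruiz', status: 'active', isAdult: false }
```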
5. $unwind (Expand)
$unwind is unique to document databases. It deals with arrays. It “deconstructs” an array field from the input documents to output a document for each element.
Example:
Input: { id: 1, tags: ["A", "B"] }
Output:
{ id: 1, tags: "A" }
{ id: 1, tags: "B" }
This is crucial when you want to group or filter by individual array elements.
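As a rough in-memory analogy, `$unwind` behaves like `flatMap` over the array field. The sketch below uses a hypothetical `posts` array and assumes every document has a `tags` array; it then counts documents per tag, which is the kind of grouping that unwinding makes possible.

```javascript
// Hypothetical sample documents; each one is assumed to have a tags array.
const posts = [
  { id: 1, tags: ["A", "B"] },
  { id: 2, tags: ["B"] },
];

// Stand-in for { $unwind: "$tags" }: one output document per array element.
const unwound = posts.flatMap((doc) =>
  doc.tags.map((tag) => ({ ...doc, tags: tag }))
);

// With one tag per document, counting per tag is a simple loop.
const tagCounts = {};
for (const doc of unwound) {
  tagCounts[doc.tags] = (tagCounts[doc.tags] || 0) + 1;
}

console.log(unwound.length); // 3
console.log(tagCounts); // { A: 1, B: 2 }
```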
6. $sort (Order)
The $sort stage reorders the document stream.
- 1: Ascending (A-Z, 0-9)
- -1: Descending (Z-A, 9-0)
{ $sort: { age: -1, name: 1 } } // Sort by age desc, then name asc
[!WARNING] Memory Limit Alert: $sort is a blocking stage: it must receive every input document before it can emit the first result. If the sort needs more than 100 MB of memory, the query will fail unless you:
- Pass { allowDiskUse: true } as an option to aggregate() (slower, writes to temporary files; MongoDB 6.0 and later do this by default).
- Ensure the sort can use an index and is placed early in the pipeline (preferred).
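The multi-key ordering in `{ $sort: { age: -1, name: 1 } }` can be pictured as a comparator that only consults the second key when the first one ties. The sketch below is plain JavaScript over a hypothetical `people` array; it models the ordering only and says nothing about MongoDB's memory use or string collation.

```javascript
// Hypothetical sample documents.
const people = [
  { name: "Cy", age: 30 },
  { name: "Ana", age: 41 },
  { name: "Ben", age: 30 },
];

// Compare two strings in ascending order.
const byNameAsc = (a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0);

// Stand-in for { $sort: { age: -1, name: 1 } }:
// age descending first, then name ascending to break ties.
const sorted = [...people].sort((a, b) => b.age - a.age || byNameAsc(a, b));

console.log(sorted.map((p) => p.name)); // [ 'Ana', 'Ben', 'Cy' ]
```

Note that the whole array must be available before the first result is known, which is why a sort without index support has to buffer its input.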