Advanced Patterns
Once you’ve mastered the basics, the Aggregation Framework opens up a world of complex data analysis capabilities, including joins, histograms, and multi-faceted search.
1. $lookup (Left Outer Join)
MongoDB is a document database, so we usually encourage embedding data. However, there are times when you need to reference data across collections. $lookup performs a Left Outer Join to another collection in the same database.
{
  $lookup: {
    from: "orders",           // Target collection
    localField: "_id",        // Field in THIS collection (users)
    foreignField: "userId",   // Field in TARGET collection (orders)
    as: "orderHistory"        // Output array field name
  }
}
Note: The result of $lookup is always an array, even if only one document matches. You often need to $unwind it if you want to merge the fields.
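To make the join semantics concrete, here is a minimal in-memory sketch in plain JavaScript. The `lookup` helper and the sample data are hypothetical and only mimic a simple equality-match $lookup; the real stage has additional rules (for example around missing fields and array values) that this sketch ignores. It shows that every user is kept and that a user with no orders gets an empty array.

```javascript
// Hypothetical sample data standing in for the users and orders collections.
const users = [
  { _id: 1, name: "Ada" },
  { _id: 2, name: "Grace" }
];
const orders = [
  { _id: 101, userId: 1, total: 25 },
  { _id: 102, userId: 1, total: 40 }
];

// Illustrative helper (not a MongoDB API): mimics an equality-match $lookup.
// Every local document is kept; matching foreign documents are collected
// into an array stored under the `as` field.
function lookup(localDocs, foreignDocs, { localField, foreignField, as }) {
  return localDocs.map((doc) => ({
    ...doc,
    [as]: foreignDocs.filter((f) => f[foreignField] === doc[localField])
  }));
}

const joined = lookup(users, orders, {
  localField: "_id",
  foreignField: "userId",
  as: "orderHistory"
});

// The corresponding pipeline for a real server, followed by $unwind.
// preserveNullAndEmptyArrays keeps users that have no orders at all.
const pipeline = [
  {
    $lookup: {
      from: "orders",
      localField: "_id",
      foreignField: "userId",
      as: "orderHistory"
    }
  },
  { $unwind: { path: "$orderHistory", preserveNullAndEmptyArrays: true } }
];

console.log(JSON.stringify(joined, null, 2));
```

Without `preserveNullAndEmptyArrays`, $unwind drops documents whose array is empty, which would quietly turn the left outer join into an inner join.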
2. $bucket (Histograms)
Grouping by exact values is great, but sometimes you want to group by ranges. $bucket automatically categorizes data into ranges, perfect for histograms.
{
  $bucket: {
    groupBy: "$age",                    // Field to group by
    boundaries: [0, 18, 30, 50, 80],    // Define ranges: 0-17, 18-29, 30-49, 50-79
    default: "Other",                   // Catch-all bucket for values outside the boundaries (e.g. 80+)
    output: {
      count: { $sum: 1 },
      names: { $push: "$name" }
    }
  }
}
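The boundary rules are easy to get wrong, so here is a small in-memory sketch of how values are assigned to buckets. The `bucketIdFor` helper and the sample documents are hypothetical, and the sketch only handles numeric values. Each bucket includes its lower bound and excludes its upper bound, and the bucket's `_id` is the lower bound.

```javascript
// Illustrative helper (not a MongoDB API): finds the bucket _id for a value.
// Each bucket covers [lower, upper): lower bound inclusive, upper exclusive.
function bucketIdFor(value, boundaries, defaultId) {
  for (let i = 0; i < boundaries.length - 1; i++) {
    if (value >= boundaries[i] && value < boundaries[i + 1]) {
      return boundaries[i];
    }
  }
  return defaultId; // Outside every range, e.g. 80 or above here
}

// Hypothetical sample documents.
const people = [
  { name: "Ann", age: 17 },
  { name: "Bob", age: 18 },
  { name: "Cy", age: 45 },
  { name: "Dee", age: 80 }
];

const boundaries = [0, 18, 30, 50, 80];
const buckets = new Map();
for (const person of people) {
  const id = bucketIdFor(person.age, boundaries, "Other");
  const bucket = buckets.get(id) ?? { _id: id, count: 0, names: [] };
  bucket.count += 1;              // mirrors count: { $sum: 1 }
  bucket.names.push(person.name); // mirrors names: { $push: "$name" }
  buckets.set(id, bucket);
}

const histogram = [...buckets.values()];
console.log(histogram);
```

Note that an age of exactly 80 lands in "Other" because the last boundary is exclusive. If `default` is omitted, the real $bucket stage raises an error when it meets a value outside the boundaries.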
3. $facet (Multi-Pipeline)
$facet is a game-changer for dashboards. It lets you run several independent sub-pipelines over the same set of input documents within a single aggregation stage, so one query returns all the results.
Imagine loading a product search page. You need:
- The list of products (paginated).
- The total count of products.
- A breakdown of products by category (for sidebar filters).
With $facet, you do this in one go:
{
  $facet: {
    // Pipeline 1: Get actual data
    "products": [
      { $match: { price: { $lt: 100 } } },
      { $skip: 0 },
      { $limit: 10 }
    ],
    // Pipeline 2: Get stats
    "stats": [
      { $match: { price: { $lt: 100 } } },
      { $group: { _id: null, avgPrice: { $avg: "$price" } } }
    ]
  }
}
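As one possible way to cover all three needs from the search page scenario (page of products, total count, category breakdown), the sketch below builds the pipeline with a small helper. The `buildSearchPipeline` function, its parameter names, and the `category` field are assumptions made for illustration. The shared filter is moved in front of $facet so it runs once and can use an index on `price` if one exists.

```javascript
// Hypothetical helper: builds a search pipeline for one page of results.
// `maxPrice`, `page`, and `pageSize` are illustrative parameter names.
function buildSearchPipeline({ maxPrice, page, pageSize }) {
  return [
    // Filter once, before $facet, so every sub-pipeline sees the same input.
    { $match: { price: { $lt: maxPrice } } },
    {
      $facet: {
        // The current page of products, with a stable sort so that
        // consecutive pages do not overlap.
        products: [
          { $sort: { _id: 1 } },
          { $skip: (page - 1) * pageSize },
          { $limit: pageSize }
        ],
        // Total number of matching products
        total: [{ $count: "count" }],
        // Product counts per category, for sidebar filters
        // (assumes each product has a `category` field)
        categories: [{ $sortByCount: "$category" }]
      }
    }
  ];
}

const searchPipeline = buildSearchPipeline({ maxPrice: 100, page: 3, pageSize: 10 });
console.log(JSON.stringify(searchPipeline, null, 2));
```

In mongosh this could be run as `db.products.aggregate(searchPipeline)`, assuming a `products` collection. The result is a single document with one array per facet: `products`, `total`, and `categories`.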
4. Conditional Logic ($cond)
You can use conditional logic inside $project to create dynamic fields. It works like a ternary operator (if ? then : else).
{
  $project: {
    status: {
      $cond: {
        if: { $gte: ["$quantity", 10] },
        then: "In Stock",
        else: "Low Stock"
      }
    }
  }
}
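For comparison, here is the same logic as a plain JavaScript ternary next to the equivalent $cond written in its array shorthand. The names `stockStatus` and `addStatusStage` are illustrative. The sketch uses $addFields rather than $project, which is a design choice on my part: $addFields keeps the document's other fields, whereas a $project that lists only `status` returns just `_id` and `status`.

```javascript
// Plain JavaScript mirror of the $cond expression above.
function stockStatus(quantity) {
  return quantity >= 10 ? "In Stock" : "Low Stock";
}

// Same expression in $cond's array shorthand: [if, then, else].
// $addFields keeps the existing fields and adds `status` next to them.
const addStatusStage = {
  $addFields: {
    status: { $cond: [{ $gte: ["$quantity", 10] }, "In Stock", "Low Stock"] }
  }
};

console.log(stockStatus(10), stockStatus(9)); // prints "In Stock Low Stock"
```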
5. Performance Pitfalls
With great power comes great responsibility. Watch out for these common issues:
- The Cartesian Product: If you $unwind a large array, you multiply the number of documents in your pipeline. 100 documents with an array of 100 items each becomes 10,000 documents!
- $lookup on Unindexed Fields: Always ensure the foreignField in your $lookup is indexed. Otherwise, MongoDB has to scan the entire target collection for every input document.
- Memory Limits: Remember the 100 MB limit for blocking stages. Use indexes to avoid sorting in memory whenever possible.
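The sketch below shows one way these precautions might look together in mongosh. It assumes `db` is the shell's database handle and reuses the `users` and `orders` collections from the $lookup example; the `runOrderReport` name and the `active` and `total` fields are hypothetical.

```javascript
// Sketch for mongosh; `db` is assumed to be the shell's database handle.
function runOrderReport(db) {
  // Index the foreignField so each $lookup probe can use the index
  // instead of scanning the whole orders collection.
  db.orders.createIndex({ userId: 1 });

  const pipeline = [
    // Filter early so fewer documents reach $lookup and $unwind.
    // (`active` is a hypothetical field.)
    { $match: { active: true } },
    {
      $lookup: {
        from: "orders",
        localField: "_id",
        foreignField: "userId",
        as: "orderHistory"
      }
    },
    // Each user document becomes one document per order from here on.
    { $unwind: "$orderHistory" },
    // Sorting on a joined field cannot use an index, so it runs in memory.
    { $sort: { "orderHistory.total": -1 } }
  ];

  // allowDiskUse lets memory-hungry stages such as $sort write temporary
  // files to disk instead of failing at the memory limit.
  return db.users.aggregate(pipeline, { allowDiskUse: true });
}
```

Placing $match first is the cheapest fix of the three: it shrinks the input before the fan-out from $unwind multiplies it.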