Create and Read Operations
In the relational world, you insert rows into tables. In MongoDB, you insert documents into collections. This shift from a rigid schema to a flexible document model is not just about syntax; it’s about data locality and type fidelity.
1. First Principles: Why Documents?
Relational databases shred objects into multiple tables (normalization) to reduce redundancy. To reconstruct the object, you must JOIN these tables, which is expensive (O(N*M) or O(N log M)).
MongoDB stores data as BSON (Binary JSON). By keeping related data together in a single document, we can fetch the entire object with a single read and no joins: one lookup by _id instead of one lookup per table.
The BSON Advantage
BSON is a binary-encoded serialization of JSON-like documents. It extends the JSON model to provide additional data types and to be efficient for encoding and decoding across different languages.
- Traversability: BSON documents contain length prefixes, allowing the database to skip over fields without scanning them.
- Type Fidelity: Unlike JSON, whose only value types are Number, String, Boolean, Null, Array, and Object, BSON distinguishes between Int32, Int64, Double, Date, and Binary data.
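To make the type fidelity point concrete, here is a minimal Go sketch using only the standard library. It shows what happens without typed numbers: decoding JSON into an untyped value turns every number into a float64, so a large 64-bit integer loses precision. The helper name `decodeNumber` and the field name `n` are illustrative, not part of any MongoDB API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeNumber parses a JSON document with a single "n" field into an
// untyped map, the way a generic JSON consumer would.
func decodeNumber(doc string) (interface{}, error) {
	var out map[string]interface{}
	if err := json.Unmarshal([]byte(doc), &out); err != nil {
		return nil, err
	}
	return out["n"], nil
}

func main() {
	// 2^53 + 1 fits in an int64 but cannot be represented exactly as a float64.
	v, err := decodeNumber(`{"n": 9007199254740993}`)
	if err != nil {
		panic(err)
	}
	// The dynamic type is float64, and the last digit is no longer 3.
	fmt.Printf("%T %.0f\n", v, v)
}
```

A BSON Int64 field would carry the same value as an 8-byte integer, so the driver can hand it back without going through a floating-point representation.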
BSON vs JSON
Compare how the same data is represented: BSON packs it with explicit types and length prefixes, while JSON is plain text.
JSON (Text)
{
  "price": 99.5,
  "qty": 10
}
~30 bytes (ASCII text). Parsing requires scanning every character for delimiters `"{}:,"`.
BSON (Binary)
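Binary encoding. Types are explicit. Length prefixes allow skipping fields. BSON is not always smaller than the equivalent JSON; it is designed to be fast to traverse and decode.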
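The layout can be sketched by hand-encoding the document above in Go. This is a simplified illustration that follows the BSON element layout (type byte, NUL-terminated key, value), not how a driver is implemented. The function name `encodePriceQty` is hypothetical, and `qty` is assumed to be stored as an Int32.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodePriceQty hand-encodes {"price": <double>, "qty": <int32>} as BSON.
func encodePriceQty(price float64, qty int32) []byte {
	var tmp [8]byte
	body := []byte{}

	// Element 1: type 0x01 (double), key as a C string, 8-byte little-endian value.
	body = append(body, 0x01)
	body = append(body, "price\x00"...)
	binary.LittleEndian.PutUint64(tmp[:8], math.Float64bits(price))
	body = append(body, tmp[:8]...)

	// Element 2: type 0x10 (int32), key as a C string, 4-byte little-endian value.
	body = append(body, 0x10)
	body = append(body, "qty\x00"...)
	binary.LittleEndian.PutUint32(tmp[:4], uint32(qty))
	body = append(body, tmp[:4]...)

	// Document = int32 total length + elements + 0x00 terminator.
	total := 4 + len(body) + 1
	doc := make([]byte, 4, total)
	binary.LittleEndian.PutUint32(doc, uint32(total))
	doc = append(doc, body...)
	doc = append(doc, 0x00)
	return doc
}

func main() {
	doc := encodePriceQty(99.5, 10)
	fmt.Printf("%d bytes: % x\n", len(doc), doc)
}
```

For this small document the encoding comes to 29 bytes, about the same as the JSON text. The gain is that a reader knows each value's type and size up front, so it can jump past `price` to reach `qty` without parsing the number.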
Under the Hood: The Insert Path
When you call insertOne(), the following happens:
- Driver Serialization: The client driver converts your language-native object (POJO in Java, Struct in Go) into raw BSON bytes.
- Wire Protocol: The driver wraps this BSON in an OP_MSG command and sends it over a TCP socket to the mongos or mongod process.
- Parsing & Locking: The database parses the BSON, acquires intent exclusive (IX) locks at the global, database, and collection levels (WiredTiger handles concurrency at the document level), and validates the document (if schema validation is enabled).
- Storage Engine (WiredTiger): The document is written to the in-memory cache and the Journal (Write-Ahead Log) for durability.
sequenceDiagram
    participant App as Application
    participant Driver as MongoDB Driver
    participant Net as Network (TCP)
    participant DB as Mongod (WiredTiger)
    App->>Driver: insertOne(doc)
    Driver->>Driver: Serialize to BSON
    Driver->>Net: OP_MSG (insert)
    Net->>DB: Receive Command
    DB->>DB: Parse BSON & Validate
    DB->>DB: Acquire Lock (IX)
    DB->>DB: Write to Journal (WAL)
    DB->>DB: Write to Memory (Cache)
    DB-->>Net: Acknowledge (OK)
    Net-->>Driver: Response
    Driver-->>App: InsertOneResult
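The wire protocol step can be sketched as follows. This is a minimal illustration of OP_MSG framing, assuming a single body section with no flags or checksum; it only builds the bytes and does not talk to a server. The function name `frameOpMsg` is hypothetical, and the empty BSON document in `main` is a placeholder rather than a valid insert command.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const opMsg = 2013 // opCode that identifies an OP_MSG message

// frameOpMsg wraps an already-serialized BSON command document in a minimal
// OP_MSG: 16-byte header, 4-byte flag bits, and one kind-0 (body) section.
func frameOpMsg(requestID int32, bsonBody []byte) []byte {
	total := 16 + 4 + 1 + len(bsonBody)
	msg := make([]byte, total)
	binary.LittleEndian.PutUint32(msg[0:], uint32(total))     // messageLength
	binary.LittleEndian.PutUint32(msg[4:], uint32(requestID)) // requestID
	binary.LittleEndian.PutUint32(msg[8:], 0)                 // responseTo (0 for requests)
	binary.LittleEndian.PutUint32(msg[12:], opMsg)            // opCode
	binary.LittleEndian.PutUint32(msg[16:], 0)                // flagBits
	msg[20] = 0                                               // section kind 0: one BSON document
	copy(msg[21:], bsonBody)
	return msg
}

func main() {
	// Placeholder body: an empty BSON document (length 5, no elements, terminator).
	empty := []byte{5, 0, 0, 0, 0}
	msg := frameOpMsg(1, empty)
	fmt.Printf("%d bytes, opCode %d\n", len(msg), binary.LittleEndian.Uint32(msg[12:]))
}
```

In a real insert, the body would be the BSON command document that the driver builds from your call, and the driver handles this framing for you.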
2. Creating Documents
insertOne vs insertMany
insertOne sends a single-document command. insertMany batches multiple documents into a single network request (OP_MSG payload), significantly reducing the number of network round trips.
Code Examples
Java (Sync Driver)
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
import java.util.Arrays;

public class CreateExample {
    public void createDocs(MongoDatabase db) {
        MongoCollection<Document> collection = db.getCollection("products");

        // 1. Single Insert
        Document canvas = new Document("name", "Canvas")
                .append("price", 25.99)
                .append("stock", 150)
                .append("tags", Arrays.asList("art", "hobby"));
        collection.insertOne(canvas);

        // 2. Bulk Insert (one request for the whole list).
        // Each document is written atomically, but the batch as a whole is not:
        // with the default ordered mode, inserts stop at the first failure.
        Document paint = new Document("name", "Paint").append("price", 12.50);
        Document brush = new Document("name", "Brush").append("price", 4.00);
        collection.insertMany(Arrays.asList(paint, brush));
    }
}
Go (mongo-go-driver)
package main

import (
	"context"
	"log"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
)

type Product struct {
	Name  string   `bson:"name"`
	Price float64  `bson:"price"`
	Stock int      `bson:"stock,omitempty"`
	Tags  []string `bson:"tags,omitempty"`
}

func CreateDocs(ctx context.Context, coll *mongo.Collection) {
	// 1. Single Insert
	canvas := Product{
		Name:  "Canvas",
		Price: 25.99,
		Stock: 150,
		Tags:  []string{"art", "hobby"},
	}
	_, err := coll.InsertOne(ctx, canvas)
	if err != nil {
		log.Fatal(err)
	}

	// 2. Bulk Insert (interface{} slice required)
	products := []interface{}{
		Product{Name: "Paint", Price: 12.50},
		Product{Name: "Brush", Price: 4.00},
	}
	_, err = coll.InsertMany(ctx, products)
	if err != nil {
		log.Fatal(err)
	}
}
[!TIP] Use insertMany for bulk loading. Inserting 1,000 documents one by one requires 1,000 network round trips. insertMany can do it in 1 (depending on document size and batch limits).
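The arithmetic behind this tip can be sketched in a few lines of Go. The 2 ms round-trip time is a made-up figure, and the 100,000-document batch limit is an assumption about the server's per-request limit; the sketch also ignores the message size limit, which can force additional batches for large documents.

```go
package main

import (
	"fmt"
	"time"
)

// roundTrips returns how many requests are needed to send nDocs documents
// when each request carries at most batchSize documents.
func roundTrips(nDocs, batchSize int) int {
	if nDocs <= 0 || batchSize <= 0 {
		return 0
	}
	return (nDocs + batchSize - 1) / batchSize // ceiling division
}

func main() {
	const rtt = 2 * time.Millisecond // hypothetical network round-trip time
	const maxBatch = 100000          // assumed per-request document limit

	// Compare one-by-one inserts (batch size 1) with a single batched insert.
	for _, batch := range []int{1, maxBatch} {
		n := roundTrips(1000, batch)
		fmt.Printf("batch size %d: %d round trips, about %v waiting on the network\n",
			batch, n, time.Duration(n)*rtt)
	}
}
```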
3. Reading Documents
Reading is done via find(), which returns a Cursor.
How Cursors Work
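The database doesn’t send all 1,000,000 matching documents at once. It sends them in batches: by default the first batch holds up to 101 documents, and every batch is also capped by size (the exact byte limit depends on the server version).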
- The driver requests the first batch.
- The application iterates through the documents.
- When the batch is exhausted, the driver silently sends a getMore command to fetch the next batch.
- This continues until the cursor is exhausted or closed.
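The steps above can be simulated without a database. In this Go sketch, `fakeServer` stands in for mongod and `cursor` for the driver's cursor; both types are hypothetical. The later batch size of 100 documents is an arbitrary choice for the demo, since a real server sizes later batches by bytes rather than by a fixed count.

```go
package main

import "fmt"

// fakeServer stands in for mongod: it holds the full result set and hands
// out one batch per request.
type fakeServer struct {
	docs  []int
	pos   int
	calls int // requests served: the initial find plus each getMore
}

// nextBatch returns up to size documents and reports whether more remain.
func (s *fakeServer) nextBatch(size int) (batch []int, more bool) {
	s.calls++
	end := s.pos + size
	if end > len(s.docs) {
		end = len(s.docs)
	}
	batch = s.docs[s.pos:end]
	s.pos = end
	return batch, s.pos < len(s.docs)
}

// cursor buffers one batch at a time and refills it transparently.
type cursor struct {
	srv        *fakeServer
	firstBatch int
	laterBatch int
	buf        []int
	started    bool
	more       bool
	Current    int
}

// Next advances to the next document, fetching a new batch when the
// buffer is empty and the server still has results.
func (c *cursor) Next() bool {
	if len(c.buf) == 0 {
		if c.started && !c.more {
			return false // cursor exhausted: no further requests
		}
		size := c.laterBatch
		if !c.started {
			size = c.firstBatch
			c.started = true
		}
		c.buf, c.more = c.srv.nextBatch(size)
		if len(c.buf) == 0 {
			return false
		}
	}
	c.Current = c.buf[0]
	c.buf = c.buf[1:]
	return true
}

func main() {
	srv := &fakeServer{docs: make([]int, 250)}
	cur := &cursor{srv: srv, firstBatch: 101, laterBatch: 100}
	n := 0
	for cur.Next() {
		n++
	}
	fmt.Printf("iterated %d documents using %d requests\n", n, srv.calls)
}
```

The application loop only ever calls `Next`, which is why the extra requests are invisible to it.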
Complexity Analysis
- Without Index: O(N) - The database must scan every document in the collection (Collection Scan).
- With Index: O(log N) - The database traverses a B-Tree to find the matching index entries, then fetches the documents they point to (Index Scan). Returning K matches adds O(K) on top.
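The gap between the two cases can be illustrated by counting how many entries each strategy examines. In this Go sketch, a binary search over a sorted slice stands in for the B-Tree traversal; that is an assumption made for simplicity, not how the storage engine is built. The function names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// collectionScan checks every value, as a query without an index must.
func collectionScan(prices []int, target int) (matches, examined int) {
	for _, p := range prices {
		examined++
		if p == target {
			matches++
		}
	}
	return matches, examined
}

// indexLookup searches sorted keys, counting how many keys it inspects.
func indexLookup(sortedKeys []int, target int) (found bool, probes int) {
	i := sort.Search(len(sortedKeys), func(k int) bool {
		probes++
		return sortedKeys[k] >= target
	})
	return i < len(sortedKeys) && sortedKeys[i] == target, probes
}

func main() {
	const n = 1000000
	keys := make([]int, n)
	for i := range keys {
		keys[i] = i * 2 // already sorted, like keys in an index
	}
	target := keys[n-1]

	_, examined := collectionScan(keys, target)
	_, probes := indexLookup(keys, target)
	fmt.Printf("scan examined %d entries, index lookup probed %d\n", examined, probes)
}
```

With a million entries the scan touches all of them, while the sorted lookup needs at most about twenty probes.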
Code Examples
Java
import com.mongodb.client.FindIterable;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import static com.mongodb.client.model.Filters.*;

public class ReadExample {
    public void readDocs(MongoCollection<Document> collection) {
        // 1. Find All (FindIterable is lazy: no query is sent until you iterate it)
        FindIterable<Document> allDocs = collection.find();

        // 2. Filter: Price < 10
        // The filter is evaluated on the server, so only matching documents are returned
        FindIterable<Document> cheapDocs = collection.find(lt("price", 10.0));

        // 3. Filter with AND: Price > 20 AND Tag = "art"
        collection.find(and(gt("price", 20.0), eq("tags", "art")))
                .forEach(doc -> System.out.println(doc.toJson()));
    }
}
Go
func ReadDocs(ctx context.Context, coll *mongo.Collection) {
	// 1. Filter: Price < 10
	filter := bson.D{{Key: "price", Value: bson.D{{Key: "$lt", Value: 10.0}}}}
	cursor, err := coll.Find(ctx, filter)
	if err != nil {
		log.Fatal(err)
	}
	defer cursor.Close(ctx)

	// 2. Iterate the cursor
	for cursor.Next(ctx) {
		var p Product
		if err := cursor.Decode(&p); err != nil {
			log.Fatal(err)
		}
		// Process p
	}

	// 3. Next returns false on errors too, so check why the loop ended
	if err := cursor.Err(); err != nil {
		log.Fatal(err)
	}
}
4. The Query Optimizer: Collection Scan vs Index Scan
Without an index, the engine must look at every document to check whether it matches the filter. With an index on the filtered field, it examines only the matching index entries and the documents they point to.
5. Projection: Data Diet
Transferring large documents over the network is expensive. Projections allow you to return only the necessary fields.
// Return only name and price, exclude _id
db.products.find({}, { name: 1, price: 1, _id: 0 })
[!WARNING] Excluding the _id field prevents the client from identifying the document for future updates. Only exclude _id for read-only display logic.
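The effect of an inclusion projection can be sketched on an in-memory document. This Go function is a simplified model of the semantics (listed fields are kept, and _id is kept unless explicitly excluded); the server applies the projection before sending results, and it supports more than this sketch covers, such as nested fields. The name `project` and its parameters are hypothetical.

```go
package main

import "fmt"

// project applies a simplified inclusion projection to a document:
// listed fields are kept, and _id is kept unless includeID is false.
func project(doc map[string]interface{}, fields []string, includeID bool) map[string]interface{} {
	out := make(map[string]interface{}, len(fields)+1)
	if id, ok := doc["_id"]; ok && includeID {
		out["_id"] = id
	}
	for _, f := range fields {
		if f == "_id" {
			continue // _id is controlled only by includeID
		}
		if v, ok := doc[f]; ok {
			out[f] = v
		}
	}
	return out
}

func main() {
	doc := map[string]interface{}{
		"_id":   1,
		"name":  "Canvas",
		"price": 25.99,
		"stock": 150,
		"tags":  []string{"art", "hobby"},
	}
	// Mirrors { name: 1, price: 1, _id: 0 }.
	fmt.Println(project(doc, []string{"name", "price"}, false))
}
```

Passing `true` for `includeID` models the default behavior, where _id comes back even though the projection does not list it.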