Create and Read Operations

In the relational world, you insert rows into tables. In MongoDB, you insert documents into collections. This shift from a rigid schema to a flexible document model is not just about syntax; it’s about data locality and type fidelity.

1. First Principles: Why Documents?

Relational databases normalize objects, splitting them across multiple tables to reduce redundancy. To reconstruct an object you must JOIN those tables, which costs roughly O(N*M) for a nested-loop join or O(N log M) when the joined column is indexed.

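MongoDB stores data as BSON (Binary JSON). Because related data lives together in a single document, the whole object comes back from one lookup and one read, with no join step. The lookup itself is still an index traversal (O(log N) on _id), so the gain comes from eliminated joins and better data locality rather than literal O(1) access.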

The BSON Advantage

BSON is a binary-encoded serialization of JSON-like documents. It extends the JSON model to provide additional data types and to be efficient for encoding and decoding across different languages.

  • Traversability: BSON documents, embedded documents, arrays, and strings carry length prefixes, allowing the database to skip over values it does not need without parsing their contents.
  • Type Fidelity: JSON has only Number, String, Boolean, Array, Object, and null. BSON distinguishes between Int32, Int64, Double, Decimal128, Date, ObjectId, and Binary data, among others.
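
The type-fidelity point can be seen without a database. The sketch below uses only Go's standard `encoding/json` package; the helper name `decodeQty` is made up for illustration. It shows that a generic JSON decode has a single numeric type to offer, so a value that would be an exact Int64 in BSON is silently rounded.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeQty parses a JSON document into a generic map, the way a schemaless
// consumer would, and returns "qty" with whatever Go type encoding/json chose.
// (Hypothetical helper, for illustration only.)
func decodeQty(raw string) (interface{}, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &doc); err != nil {
		return nil, err
	}
	return doc["qty"], nil
}

func main() {
	// 2^53 + 1 fits in an int64 but cannot be represented exactly as a float64.
	v, err := decodeQty(`{"qty": 9007199254740993}`)
	if err != nil {
		panic(err)
	}
	// JSON has one Number type, so the generic decoder yields a float64
	// and the last digit is lost.
	fmt.Printf("%T %.0f\n", v, v) // float64 9007199254740992
}
```

A BSON Int64 field would round-trip this value unchanged, because the type tag travels with the data.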

BSON vs JSON

The comparison below shows how BSON packs the same data with explicit types and a length prefix, unlike text-based JSON.

JSON (Text)

{
  "price": 99.5,
  "qty": 10
}

~30 bytes as formatted ASCII text (23 bytes minified). Parsing requires scanning every character for delimiters `"{}:,"`.

BSON (Binary)

Size (Int32)
\x01 (Double)
"price"\x00
99.5 (8 bytes)
\x10 (Int32)
"qty"\x00
10 (4 bytes)
\x00

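29 bytes of binary. Types are explicit, and the length prefix lets a reader skip the whole document without parsing it. BSON is not always smaller than the equivalent JSON; it trades some size for fast traversal and exact types.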
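
To make the layout above concrete, here is a minimal sketch that hand-encodes the same two-field document using only Go's standard library. The function name `encodePriceQty` is made up for illustration; in practice the driver's BSON package does this for you.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"math"
)

// encodePriceQty hand-encodes {price: <double>, qty: <int32>} as BSON.
// (Hypothetical helper; real applications rely on the driver's encoder.)
func encodePriceQty(price float64, qty int32) []byte {
	var body bytes.Buffer

	// Element 1: type byte 0x01 (double), NUL-terminated key, 8-byte little-endian value.
	body.WriteByte(0x01)
	body.WriteString("price")
	body.WriteByte(0x00)
	binary.Write(&body, binary.LittleEndian, math.Float64bits(price))

	// Element 2: type byte 0x10 (int32), NUL-terminated key, 4-byte little-endian value.
	body.WriteByte(0x10)
	body.WriteString("qty")
	body.WriteByte(0x00)
	binary.Write(&body, binary.LittleEndian, qty)

	// Document terminator.
	body.WriteByte(0x00)

	// The int32 length prefix counts itself (4 bytes) plus everything after it.
	out := make([]byte, 4, 4+body.Len())
	binary.LittleEndian.PutUint32(out, uint32(4+body.Len()))
	return append(out, body.Bytes()...)
}

func main() {
	doc := encodePriceQty(99.5, 10)
	fmt.Printf("%d bytes: % x\n", len(doc), doc)
}
```

The total is 4 (size) + 1 + 6 + 8 (price element) + 1 + 4 + 4 (qty element) + 1 (terminator) = 29 bytes. A reader that only wants `qty` can step over the `price` value by its fixed width instead of parsing it.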

Under the Hood: The Insert Path

When you call insertOne(), the following happens:

  1. Driver Serialization: The client driver converts your language-native object (POJO in Java, Struct in Go) into raw BSON bytes.
  2. Wire Protocol: The driver wraps this BSON in an OP_MSG command and sends it over a TCP socket to the mongos or mongod process.
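  3. Parsing & Locking: The database parses the command, validates the document (if schema validation is enabled), and takes intent-exclusive (IX) locks at the global, database, and collection levels. Concurrency on individual documents is handled inside WiredTiger rather than by a document lock held in the server layer.
  4. Storage Engine (WiredTiger): The document is written to the in-memory cache and recorded in the journal (a write-ahead log). The journal is what makes the write recoverable after a crash; it is flushed to disk periodically, or before the acknowledgement when the write concern requests j: true.

The same path as a sequence diagram: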
sequenceDiagram
  participant App as Application
  participant Driver as MongoDB Driver
  participant Net as Network (TCP)
  participant DB as Mongod (WiredTiger)

  App->>Driver: insertOne(doc)
  Driver->>Driver: Serialize to BSON
  Driver->>Net: OP_MSG (insert)
  Net->>DB: Receive Command
  DB->>DB: Parse BSON & Validate
  DB->>DB: Acquire Lock (IX)
  DB->>DB: Write to Journal (WAL)
  DB->>DB: Write to Memory (Cache)
  DB-->>Net: Acknowledge (OK)
  Net-->>Driver: Response
  Driver-->>App: InsertOneResult

2. Creating Documents

insertOne vs insertMany

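insertOne sends a single document per command. insertMany batches multiple documents into a single network request (one OP_MSG payload), which cuts the number of round trips rather than the round-trip time (RTT) itself. Drivers split very large batches automatically to stay within the server's limits (100,000 operations or 48 MB per message).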
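
The effect of batching is simple arithmetic. The sketch below is a back-of-the-envelope model, not a benchmark: the function names and the 2 ms RTT are assumptions chosen for illustration, and it counts only time spent waiting on the network.

```go
package main

import (
	"fmt"
	"time"
)

// roundTrips returns how many requests are needed to send docs documents
// when each request carries at most batchSize of them.
func roundTrips(docs, batchSize int) int {
	if docs <= 0 || batchSize <= 0 {
		return 0
	}
	return (docs + batchSize - 1) / batchSize // ceiling division
}

// networkTime estimates the time spent waiting on the network only.
// Server-side work and serialization are deliberately ignored.
func networkTime(docs, batchSize int, rtt time.Duration) time.Duration {
	return time.Duration(roundTrips(docs, batchSize)) * rtt
}

func main() {
	rtt := 2 * time.Millisecond // assumed round-trip time

	fmt.Println(networkTime(1000, 1, rtt))    // one insertOne per document: 2s
	fmt.Println(networkTime(1000, 1000, rtt)) // a single insertMany: 2ms
}
```

Under these assumptions, 1,000 individual inserts spend two full seconds waiting on the network, while a single batch spends two milliseconds.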

Code Examples

Java (Sync Driver)

import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;
import java.util.Arrays;

public class CreateExample {
    public void createDocs(MongoDatabase db) {
        MongoCollection<Document> collection = db.getCollection("products");

        // 1. Single Insert
        Document canvas = new Document("name", "Canvas")
                .append("price", 25.99)
                .append("stock", 150)
                .append("tags", Arrays.asList("art", "hobby"));

        collection.insertOne(canvas);

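        // 2. Bulk insert: one round trip. Each document is inserted atomically on its own;
        //    the batch as a whole is NOT atomic. Inserts are ordered by default and stop
        //    at the first error.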
        Document paint = new Document("name", "Paint").append("price", 12.50);
        Document brush = new Document("name", "Brush").append("price", 4.00);

        collection.insertMany(Arrays.asList(paint, brush));
    }
}

Go (mongo-go-driver)

package main

import (
    "context"
    "log"
    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
)

type Product struct {
    Name  string   `bson:"name"`
    Price float64  `bson:"price"`
    Stock int      `bson:"stock,omitempty"`
    Tags  []string `bson:"tags,omitempty"`
}

func CreateDocs(ctx context.Context, coll *mongo.Collection) {
    // 1. Single Insert
    canvas := Product{
        Name:  "Canvas",
        Price: 25.99,
        Stock: 150,
        Tags:  []string{"art", "hobby"},
    }

    _, err := coll.InsertOne(ctx, canvas)
    if err != nil {
        log.Fatal(err)
    }

    // 2. Bulk Insert (interface{} slice required)
    products := []interface{}{
        Product{Name: "Paint", Price: 12.50},
        Product{Name: "Brush", Price: 4.00},
    }

    _, err = coll.InsertMany(ctx, products)
    if err != nil {
        log.Fatal(err)
    }
}

[!TIP] Use insertMany for bulk loading. Inserting 1,000 documents one by one requires 1,000 network round trips. insertMany can do it in 1 (depending on document size and batch limits).


3. Reading Documents

Reading is done via find(), which returns a Cursor.

How Cursors Work

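The database doesn’t send all 1,000,000 matching documents at once. It sends them in batches: the first batch holds up to 101 documents by default, and later batches are bounded by size (up to the 16 MB message limit) unless you set batchSize. The exact size caps have varied across server versions.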

  1. The driver requests the first batch.
  2. The application iterates through the documents.
  3. When the batch is exhausted, the driver silently sends a getMore command to fetch the next batch.
  4. This continues until the cursor is exhausted or closed.
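
The steps above can be sketched as a small in-memory simulation. Everything here is hypothetical (`fakeServer`, `cursor`, `countAll`), and it uses one batch size throughout, whereas a real server uses a smaller first batch. The point is only to show that iteration is local until the buffer runs dry.

```go
package main

import "fmt"

// fakeServer stands in for mongod: it holds the full result set and hands
// out one batch per request. All names in this sketch are hypothetical.
type fakeServer struct {
	results  []int
	pos      int
	requests int // number of find/getMore round trips served
}

// nextBatch plays the role of the initial find reply and of later getMore
// replies. It also reports whether the result set is now exhausted.
func (s *fakeServer) nextBatch(size int) (batch []int, exhausted bool) {
	s.requests++
	end := s.pos + size
	if end > len(s.results) {
		end = len(s.results)
	}
	batch = s.results[s.pos:end]
	s.pos = end
	return batch, s.pos == len(s.results)
}

// cursor buffers one batch on the client and refills it on demand.
type cursor struct {
	srv       *fakeServer
	batchSize int
	buf       []int
	exhausted bool
	current   int // the document most recently returned by Next
}

// Next advances to the next document. It contacts the server only when the
// local buffer is empty and the server has not reported exhaustion.
func (c *cursor) Next() bool {
	if len(c.buf) == 0 {
		if c.exhausted {
			return false
		}
		c.buf, c.exhausted = c.srv.nextBatch(c.batchSize)
		if len(c.buf) == 0 {
			return false
		}
	}
	c.current, c.buf = c.buf[0], c.buf[1:]
	return true
}

// countAll drains a cursor over total documents and reports how many
// documents were seen and how many round trips that took.
func countAll(total, batchSize int) (docs, requests int) {
	srv := &fakeServer{results: make([]int, total)}
	cur := &cursor{srv: srv, batchSize: batchSize}
	for cur.Next() {
		docs++
	}
	return docs, srv.requests
}

func main() {
	docs, requests := countAll(250, 101)
	fmt.Println(docs, requests) // 250 3
}
```

Draining 250 documents in batches of 101 takes three round trips (101 + 101 + 48), even though the application called Next 251 times.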

Complexity Analysis

  • Without Index: O(N). The database must examine every document in the collection (a collection scan, shown as COLLSCAN in explain output).
  • With Index: O(log N + K). The database traverses a B-Tree to locate the first match, then reads the K matching entries (an index scan, shown as IXSCAN).

Code Examples

Java

import com.mongodb.client.FindIterable;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import static com.mongodb.client.model.Filters.*;

public class ReadExample {
    public void readDocs(MongoCollection<Document> collection) {
        // 1. Find all. FindIterable is lazy: no query is sent until you iterate it.
        FindIterable<Document> allDocs = collection.find();

        // 2. Filter: price < 10, evaluated on the server
        FindIterable<Document> cheapDocs = collection.find(lt("price", 10.0));

        // 3. Filter with AND: price > 20 AND tags contains "art".
        //    forEach iterates the cursor, which is what actually runs the query.
        collection.find(and(gt("price", 20.0), eq("tags", "art")))
                  .forEach(doc -> System.out.println(doc.toJson()));
    }
}

Go

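func ReadDocs(ctx context.Context, coll *mongo.Collection) {
    // 1. Filter: price < 10 (keyed fields keep go vet quiet)
    filter := bson.D{{Key: "price", Value: bson.D{{Key: "$lt", Value: 10.0}}}}

    cursor, err := coll.Find(ctx, filter)
    if err != nil {
        log.Fatal(err)
    }
    defer cursor.Close(ctx)

    // 2. Iterate the cursor; the driver issues getMore as batches run out
    for cursor.Next(ctx) {
        var p Product
        if err := cursor.Decode(&p); err != nil {
            log.Fatal(err)
        }
        // Process p
    }

    // 3. Next returns false on errors too, so check why iteration stopped
    if err := cursor.Err(); err != nil {
        log.Fatal(err)
    }
}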

4. The Query Optimizer

Without an index, the engine must look at every document to verify whether it matches the filter. With an index, it visits only the relevant index entries and the documents they point to.
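
The difference in work can be approximated in plain Go. In this sketch a sorted slice stands in for a B-Tree index and `sort.Search` stands in for the tree traversal; the function names are made up, and a real index stores keys in pages rather than a flat array, so treat the counts as orders of magnitude only.

```go
package main

import (
	"fmt"
	"sort"
)

// collectionScan checks every price, as an unindexed query must, and
// reports whether the target exists and how many values were examined.
func collectionScan(prices []float64, target float64) (found bool, examined int) {
	for _, p := range prices {
		examined++
		if p == target {
			// Keep going: without an index the engine cannot know
			// that no further matches exist.
			found = true
		}
	}
	return found, examined
}

// indexScan binary-searches a sorted slice, which stands in for a B-Tree
// index on price, and reports how many keys were compared on the way.
func indexScan(sorted []float64, target float64) (found bool, examined int) {
	i := sort.Search(len(sorted), func(i int) bool {
		examined++
		return sorted[i] >= target
	})
	return i < len(sorted) && sorted[i] == target, examined
}

func main() {
	prices := make([]float64, 1000000)
	for i := range prices {
		prices[i] = float64(i) // already in sorted order
	}

	_, scanned := collectionScan(prices, 999999)
	_, probed := indexScan(prices, 999999)
	fmt.Println("collection scan examined:", scanned)
	fmt.Println("index scan examined:", probed)
}
```

The scan touches all one million values, while the binary search needs at most about twenty comparisons, which is the O(N) versus O(log N) gap in practice.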

5. Projection: Data Diet

Transferring large documents over the network is expensive. Projections allow you to return only the necessary fields.

// Return only name and price, exclude _id
db.products.find({}, { name: 1, price: 1, _id: 0 })

[!WARNING] Excluding the _id field prevents the client from identifying the document for future updates. Only exclude _id for read-only display logic.