Buffering and Caching: The Art of Flow
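[!NOTE] Why Buffer? A producer (CPU) generates data at 10 GB/s. A consumer (Network) accepts data at 1 GB/s. Without a buffer, the CPU must slow down to 1 GB/s. With a buffer, the CPU can dump a burst of data and move on while the consumer drains it. A buffer only absorbs bursts: if the producer outpaces the consumer indefinitely, any fixed-size buffer eventually fills.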
1. Buffering Strategies
A. Single Buffer
The OS reads a block from disk into a single kernel buffer, then copies it to user space.
- Problem: The disk cannot refill the buffer until the copy finishes, so it sits idle during the copy.
B. Double Buffering (Ping-Pong)
Two buffers. While the CPU consumes Buffer A, the Disk fills Buffer B.
- Result: CPU and Disk work in parallel.
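The ping-pong handoff can be sketched in Java with `java.util.concurrent.Exchanger`, which swaps two objects between two threads. This is a minimal illustration, not a real I/O path: the class name `DoubleBufferDemo`, the method `sumBlocks`, and the use of `Arrays.fill` as a stand-in for a disk read are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.concurrent.Exchanger;

final class DoubleBufferDemo {
    /** A producer thread fills buffers while the caller sums them; returns the total. */
    static long sumBlocks(int blocks, int blockSize) {
        Exchanger<int[]> exchanger = new Exchanger<>();

        Thread producer = new Thread(() -> {
            int[] filling = new int[blockSize];            // buffer currently being filled
            try {
                for (int block = 0; block < blocks; block++) {
                    Arrays.fill(filling, block + 1);       // stand-in for a disk read
                    filling = exchanger.exchange(filling); // hand over full, take back drained
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int[] draining = new int[blockSize];               // buffer currently being consumed
        long sum = 0;
        try {
            for (int block = 0; block < blocks; block++) {
                draining = exchanger.exchange(draining);   // swap: receive a full buffer
                for (int value : draining) {
                    sum += value;                          // consume while the producer refills
                }
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum;
    }
}
```

Only two arrays are ever allocated; each `exchange` call swaps their roles, so neither side waits for the other except at the handoff.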
C. Circular Buffer (Ring Buffer)
Used for data streams (Keyboard, Network Cards).
- Head Pointer: Where data is written.
- Tail Pointer: Where data is read.
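- Full: Head wraps around and catches up to Tail.
- Empty: Tail catches up to Head. Because Head equals Tail in both cases, implementations either track an item count or leave one slot unused to tell full from empty.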
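The head and tail mechanics can be sketched as a small Java class. This is a single-threaded illustration under stated assumptions: the name `ByteRingBuffer` and its methods are hypothetical, there is no synchronization, and a separate `count` field is used to distinguish a full buffer from an empty one.

```java
import java.util.NoSuchElementException;

/** Fixed-capacity byte ring buffer; single-threaded sketch (no synchronization). */
final class ByteRingBuffer {
    private final byte[] data;
    private int head;   // next slot to write
    private int tail;   // next slot to read
    private int count;  // items currently stored; tells full apart from empty

    ByteRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        data = new byte[capacity];
    }

    boolean isEmpty() { return count == 0; }
    boolean isFull()  { return count == data.length; }
    int size()        { return count; }

    /** Producer side: returns false instead of overwriting when the buffer is full. */
    boolean offer(byte value) {
        if (isFull()) {
            return false;
        }
        data[head] = value;
        head = (head + 1) % data.length;  // wrap around to slot 0
        count++;
        return true;
    }

    /** Consumer side: removes and returns the oldest byte. */
    byte poll() {
        if (isEmpty()) {
            throw new NoSuchElementException("buffer empty");
        }
        byte value = data[tail];
        tail = (tail + 1) % data.length;  // wrap around to slot 0
        count--;
        return value;
    }
}
```

Rejecting writes when full is one policy among several; a keyboard or NIC driver might instead drop the oldest entry or signal backpressure.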
2. Interactive: The Ring Buffer
Visualize how producers and consumers chase each other in a fixed-size buffer.
3. The Page Cache
The OS uses otherwise idle RAM to cache recently accessed disk blocks.
- Read: Check the cache first. On a hit, the data is served from memory, orders of magnitude faster than a disk access.
- Write: Write to cache (mark page as Dirty). Return success.
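- Writeback: Background kernel flusher threads (historically pdflush in Linux) write dirty pages to disk once they have been dirty for about 30 seconds by default.
- fsync(): Forces writeback of a file's dirty pages immediately and blocks until it completes.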
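From Java, the write-then-fsync pattern looks roughly like the sketch below. `FileChannel.write` returns once the bytes are in the page cache, and `FileChannel.force(true)` asks the OS to flush the file's content and metadata to the storage device. The class name `DurableWrite` and method `appendDurably` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class DurableWrite {
    /** Appends a record and returns only after the OS has been asked to flush it to the device. */
    static void appendDurably(Path file, String record) {
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));
            while (buf.hasRemaining()) {
                channel.write(buf);  // lands in the page cache; the pages are now dirty
            }
            channel.force(true);     // flush data and metadata, similar to fsync()
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Skipping the `force` call makes the write much cheaper, at the cost of losing the record if the machine crashes before writeback runs. Databases typically make this trade-off explicit per transaction or per batch.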
4. Zero Copy I/O
Standard File Serving (read + write):
- Disk → Kernel Buffer
- Kernel Buffer → User Buffer (CPU Copy)
- User Buffer → Socket Buffer (CPU Copy)
- Socket Buffer → NIC
Zero Copy (sendfile):
- Disk → Kernel Buffer
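- Kernel Buffer → NIC (the socket buffer holds only pointers to the data, which the NIC fetches via DMA)
Result: No CPU copies of the payload, which can yield a 2x-3x throughput boost.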
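In Java, the closest equivalent is `FileChannel.transferTo`, which the JDK can implement with `sendfile` on operating systems that support it (otherwise it falls back to an ordinary copy). The sketch below is illustrative: `ZeroCopySend` and `sendFile` are hypothetical names, and the code assumes a blocking target channel, such as a blocking `SocketChannel`.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ZeroCopySend {
    /** Sends the whole file to the target channel; returns the number of bytes transferred. */
    static long sendFile(Path source, WritableByteChannel target) {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ)) {
            long size = in.size();
            long sent = 0;
            while (sent < size) {
                // transferTo may move fewer bytes than requested, so loop until done.
                // Assumes a blocking target; a non-blocking one could return 0 repeatedly.
                sent += in.transferTo(sent, size - sent, target);
            }
            return sent;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that the file's bytes never appear in a Java `byte[]` or `ByteBuffer` here, which is what lets the kernel skip the user-space copies listed above.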
5. Code Example: Memory Mapped Files (Zero Copy in Java)
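Java’s MappedByteBuffer uses the OS mmap syscall to map a file directly into the process's address space. The mapped region lives outside the Java heap and is backed by the page cache, so the OS manages caching and reads and writes avoid an extra copy into a user-space buffer.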
```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class ZeroCopy {
    public static void main(String[] args) throws Exception {
        // try-with-resources closes the channel and file even if an exception is thrown
        try (RandomAccessFile file = new RandomAccessFile("large_db.dat", "rw");
             FileChannel channel = file.getChannel()) {
            // Map 100 MB of the file into memory.
            // READ_WRITE mode means changes are eventually written back to disk.
            // On typical JDKs, a shorter file is grown to cover the mapped range.
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, 100L * 1024 * 1024);

            // We can treat the file like a byte array:
            // no read()/write() system call per access.

            // Write a value at byte 0
            map.put(0, (byte) 123);

            // Read a value at byte 50
            byte b = map.get(50);
            System.out.println("Read: " + b);

            // Force changes to disk (similar to fsync)
            map.force();
        }
    }
}
```