Network Monitoring
[!NOTE] This module explores the core principles of Network Monitoring, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.
1. Why Monitor?
Network monitoring lets you identify bottlenecks, hardware failures, and security breaches before they affect users. It answers two kinds of questions:
- Availability: “Is the link up or down?”
- Performance: “How much traffic is on the link, and is it too slow?”
2. SNMP (Simple Network Management Protocol)
SNMP is the industry-standard protocol for collecting information from network devices (switches, routers, servers).
- Management Information Base (MIB): A hierarchical database of objects on the device, each identified by an Object Identifier (OID), that defines what data can be collected (e.g., CPU temperature, interface traffic).
- Polling (Pull): The NMS (Network Management System) requests data from the device at a fixed interval, for example every 5 minutes.
- Traps (Push): The device sends an alert to the NMS as soon as something goes wrong (e.g., “Interface 2 has died!”), without waiting for the next poll.
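Polled interface counters are cumulative, so the NMS has to turn two consecutive samples into a rate. The sketch below shows one way to do that arithmetic, assuming the polled object is a 32-bit octet counter such as `ifInOctets` from the IF-MIB. The function names and sample values are hypothetical, and no real SNMP library is involved.

```python
# Assumption: the counter is 32 bits wide (like ifInOctets) and wraps at 2**32.
COUNTER32_MAX = 2**32


def bits_per_second(prev_octets, curr_octets, interval_s, counter_max=COUNTER32_MAX):
    """Average throughput between two polls of a cumulative octet counter."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = curr_octets - prev_octets
    if delta < 0:
        # The counter wrapped past its maximum between the two polls.
        # (A device reboot also resets the counter; this sketch does not detect that.)
        delta += counter_max
    return delta * 8 / interval_s


def utilization_percent(bps, link_speed_bps):
    """Throughput as a percentage of the link's nominal speed."""
    return 100.0 * bps / link_speed_bps


# Two hypothetical samples taken 300 s (5 minutes) apart on a 1 Gbps link.
rate = bits_per_second(1_000_000_000, 2_500_000_000, 300)
print(f"{rate / 1e6:.1f} Mbps, {utilization_percent(rate, 1e9):.1f}% of link")
```

On fast links a 32-bit counter can wrap more than once within a long polling interval, which this calculation cannot detect. That is one reason to poll more often or to use the 64-bit counters (for example `ifHCInOctets`) where the device supports them.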
3. NetFlow (Traffic Visibility)
Developed by Cisco, NetFlow provides data on who is using the network and what they are doing.
- Instead of capturing every packet, NetFlow summarizes traffic into “flows”: sequences of packets that share the same source and destination IP addresses, source and destination ports, and protocol.
- Benefit: Excellent for security (detecting data exfiltration) and capacity planning.
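To make the idea of a flow concrete, here is a minimal sketch that groups packets by a 5-tuple key and totals packets and bytes per flow. The `Packet` type, the `aggregate_flows` function, and the sample addresses are all hypothetical. A real exporter does this on the device and adds timestamps, interfaces, and other fields.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical, simplified packet record for illustration."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    size: int  # bytes


def aggregate_flows(packets):
    """Group packets by 5-tuple and total packet and byte counts per flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.size
    return dict(flows)


packets = [
    Packet("10.0.0.5", "203.0.113.9", 50000, 443, "tcp", 1500),
    Packet("10.0.0.5", "203.0.113.9", 50000, 443, "tcp", 1500),
    Packet("10.0.0.7", "198.51.100.2", 53001, 53, "udp", 80),
]
flows = aggregate_flows(packets)

# List the heaviest flows first, as a capacity or exfiltration review might.
for key, stats in sorted(flows.items(), key=lambda kv: kv[1]["bytes"], reverse=True):
    print(key, stats)
```

Sorting by byte count is what makes flow data useful for the two benefits above: a single internal host sending an unusually large volume to one external address stands out immediately.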
4. Syslog (Logging)
Syslog is a standard for message logging: devices send text log messages to a centralized syslog server.
- Levels: Severity runs from 0 (Emergency, most severe) to 7 (Debug, least severe).
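On the wire, a syslog message begins with a priority value in angle brackets that encodes the severity together with a facility code as `facility * 8 + severity`. The sketch below decodes that prefix and filters by severity. The function names and the sample messages are hypothetical, and only the leading `<PRI>` field is parsed.

```python
import re

# Severity names indexed by level, 0 (most severe) to 7 (least severe).
SEVERITIES = [
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
]


def decode_pri(message):
    """Return (facility, severity, severity_name) from a leading <PRI> field."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        raise ValueError("message has no <PRI> prefix")
    pri = int(match.group(1))
    if pri > 191:  # 23 * 8 + 7 is the largest valid priority
        raise ValueError(f"priority {pri} out of range")
    facility, severity = divmod(pri, 8)
    return facility, severity, SEVERITIES[severity]


def at_least_as_severe(message, threshold):
    """True if the message severity is numerically <= threshold (lower is worse)."""
    return decode_pri(message)[1] <= threshold


# Hypothetical messages for illustration only.
print(decode_pri("<34>su: authentication failure"))        # (4, 2, 'Critical')
print(decode_pri("<189>Interface 2 changed state to up"))  # (23, 5, 'Notice')
```

Because lower numbers are more severe, a filter such as "alert on Error and worse" becomes `at_least_as_severe(msg, 3)`.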
5. Interactive: Monitoring Dashboard
[Interactive dashboard: a bandwidth graph for a 1 Gbps link, updated by SNMP polling, showing a traffic spike.]
6. The Importance of a Baseline
A baseline is a measurement of the “Normal” state of the network.
- If you don’t know what normal looks like, you won’t know when something is wrong.
- Example: If your CPU usually runs at 20% and it’s suddenly at 60%, that deviation is worth investigating. If it always runs at 60%, then 60% is your baseline and not a problem in itself.
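The CPU example can be sketched as a simple statistical check: summarize historical samples as a mean and standard deviation, then flag readings that fall far outside that range. The function names, the sample data, and the 3-standard-deviation threshold are assumptions chosen for illustration, not a recommended production rule.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical readings as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a reading more than `threshold` standard deviations from the mean.

    The default of 3.0 is an arbitrary illustrative choice.
    """
    avg, sd = baseline
    if sd == 0:
        # Perfectly flat history: any different value is a deviation.
        return value != avg
    return abs(value - avg) / sd > threshold


# Hypothetical CPU utilization histories (percent) for two devices.
usually_20 = [18, 22, 20, 19, 21, 20, 23, 17]
usually_60 = [58, 62, 60, 59, 61, 60, 63, 57]

print(is_anomalous(60, build_baseline(usually_20)))  # True: far above this device's normal
print(is_anomalous(60, build_baseline(usually_60)))  # False: 60% is normal here
```

The same reading of 60% produces opposite answers for the two devices, which is the point of a baseline: the alert depends on each device's own history, not on a fixed number.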