Network Monitoring

[!NOTE] This module explores the core principles of Network Monitoring, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

1. Why Monitor?

Network monitoring lets you identify bottlenecks, hardware failures, and security breaches before they impact users. It falls into two broad categories:

  • Availability: “Is the link up or down?”
  • Performance: “How much traffic is on the link, and is it too slow?”

2. SNMP (Simple Network Management Protocol)

SNMP is the industry-standard protocol for collecting information from network devices (switches, routers, servers).

  • Management Information Base (MIB): A hierarchical schema on the device that defines what data can be collected (e.g., CPU temperature, interface traffic). Each item is addressed by an object identifier (OID).
  • Polling (Pull): The NMS (Network Management System) asks the device for data at a fixed interval, commonly every 5 minutes.
  • Traps (Push): The device sends an unsolicited alert to the NMS as soon as something goes wrong (e.g., “Interface 2 is down”).
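Polled interface counters only become useful once two samples are compared. The sketch below shows one way to turn two readings of an octet counter into link utilization. It assumes the counter is the 32-bit `ifInOctets` object from the standard interface MIB, which wraps back to zero at 2^32; the function names and sample values are illustrative, and no SNMP library is involved.

```python
COUNTER32_MODULUS = 2**32  # assumption: a 32-bit counter such as ifInOctets


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MODULUS) -> int:
    """Octets received between two polls, tolerating a single counter wrap."""
    return (current - previous) % modulus


def utilization_percent(prev_octets: int, curr_octets: int,
                        interval_s: float, link_bps: float) -> float:
    """Average inbound utilization of the link over one polling interval."""
    bits = counter_delta(prev_octets, curr_octets) * 8
    return 100.0 * bits / (interval_s * link_bps)


# Two polls taken 300 seconds apart on a 100 Mbps link (made-up values).
print(utilization_percent(1_000_000_000, 2_875_000_000, 300, 100_000_000))
```

A single wrap is all this arithmetic can detect. At 1 Gbps a 32-bit octet counter can wrap in roughly 34 seconds, so a 5-minute poll would silently undercount. On fast links the 64-bit `ifHCInOctets` counter (with `modulus=2**64`) or a shorter polling interval avoids the problem.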

3. NetFlow (Traffic Visibility)

Developed by Cisco, NetFlow provides data on who is using the network and what they are doing.

  • Instead of inspecting every packet, NetFlow aggregates traffic into “flows”: sequences of packets that share the same source and destination IP addresses, ports, and protocol.
  • Benefit: Excellent for security (detecting data exfiltration) and capacity planning.
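The idea of grouping packets into flows can be sketched in a few lines. This is an illustration of the aggregation concept only, not the NetFlow export format or a collector implementation. The `Packet` type, the 5-tuple key, and the sample addresses are assumptions made for the example.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet summary used only for this example."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    size: int  # bytes


def aggregate_flows(packets):
    """Group packets by their 5-tuple and keep packet and byte totals per flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.size
    return dict(flows)


packets = [
    Packet("10.0.0.5", "192.0.2.10", 51000, 443, "tcp", 1500),
    Packet("10.0.0.5", "192.0.2.10", 51000, 443, "tcp", 900),
    Packet("10.0.0.7", "192.0.2.53", 40000, 53, "udp", 80),
]
for key, stats in aggregate_flows(packets).items():
    print(key, stats)
```

Three packets collapse into two flow records. Sorting such records by byte count is one simple way to spot an unusually large outbound transfer, which is the kind of signal the security use case relies on.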

4. Syslog (Logging)

Syslog is a standard for message logging. Devices send text log messages to a centralized Syslog server.

  • Levels: Each message carries a severity from 0 (Emergency) to 7 (Debug); lower numbers are more severe.
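On the wire, the severity travels inside the `<PRI>` field at the start of each message, where PRI = facility × 8 + severity. The sketch below decodes that field. The function name and the sample message are illustrative, and the code reads only the priority prefix rather than parsing the full message.

```python
import re

SEVERITIES = [
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
]


def decode_pri(message: str):
    """Return (facility, severity, severity_name) from a leading <PRI> field."""
    match = re.match(r"<(\d{1,3})>", message)
    if match is None:
        raise ValueError("message does not start with a <PRI> field")
    # PRI = facility * 8 + severity, so divmod by 8 recovers both parts.
    facility, severity = divmod(int(match.group(1)), 8)
    return facility, severity, SEVERITIES[severity]


print(decode_pri("<34>Oct 11 22:14:15 mymachine su: 'su root' failed"))
```

A collector can use the decoded severity to route messages, for example paging on levels 0 to 2 and only archiving levels 6 and 7.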

5. Interactive: Monitoring Dashboard

[Interactive dashboard: bandwidth on a 1 Gbps link, updated by SNMP polling.]

6. The Importance of a Baseline

A baseline is a measurement of the network’s normal state, collected over a representative period.

  • If you don’t know what normal looks like, you won’t know when something is wrong.
  • Example: If a device’s CPU usually runs at 20% and suddenly jumps to 60%, that warrants investigation. If it always runs at 60%, then 60% is your baseline.
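The CPU example can be expressed as a small check against recorded history. This is a minimal sketch that assumes the baseline is the mean and standard deviation of past samples, and that a reading more than three standard deviations from the mean counts as abnormal. Both the threshold and the sample values are illustrative choices, not a standard.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical readings as (average, standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """True if value lies more than `threshold` deviations from the average."""
    average, spread = baseline
    if spread == 0:
        return value != average
    return abs(value - average) > threshold * spread


quiet_router = build_baseline([18, 20, 22, 19, 21, 20, 20])  # CPU % history
busy_router = build_baseline([58, 60, 62, 59, 61, 60, 60])

print(is_anomalous(60, quiet_router))  # 60% on a device that idles near 20%
print(is_anomalous(60, busy_router))   # 60% on a device that always runs near 60%
```

The same reading is flagged on the first device and accepted on the second, which is the point of a baseline: the alert threshold comes from each device's own history rather than from a fixed number.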