Network Monitoring

[!NOTE] This module explores the core principles of Network Monitoring, deriving solutions from first principles and hardware constraints to build world-class, production-ready expertise.

1. Why Monitor?

Network monitoring lets you identify bottlenecks, hardware failures, and security breaches early, ideally before they affect users. It answers two broad kinds of question:

Availability: “Is the link up or down?”
Performance: “How much traffic is on the link, and is it too slow?”

2. SNMP (Simple Network Management Protocol)

SNMP is the industry-standard protocol for collecting information from network devices such as switches, routers, and servers.

Management Information Base (MIB): A hierarchical schema that defines which data the device exposes (e.g., CPU temperature, interface traffic). Each value is addressed by an Object Identifier (OID).
Polling (Pull): The NMS (Network Management System) asks the device for data at a fixed interval, commonly every 5 minutes.
Traps (Push): The device sends an unsolicited alert to the NMS as soon as something goes wrong (e.g., “Interface 2 has gone down”).
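
Polling usually returns raw counters rather than rates, so the NMS has to turn two successive samples into a throughput figure. The sketch below shows that arithmetic for an octet counter such as ifInOctets, which is a 32-bit counter that wraps back to zero. The function names, sample values, and the 1 Gbps link speed are assumptions made for illustration; no SNMP library is involved.

```python
COUNTER32_MAX = 2**32  # a 32-bit SNMP counter wraps to zero after this many octets


def counter_delta(previous: int, current: int, modulus: int = COUNTER32_MAX) -> int:
    """Octets counted between two polls, tolerating at most one counter wrap."""
    return (current - previous) % modulus


def utilization_percent(previous: int, current: int,
                        interval_s: float, link_bps: float) -> float:
    """Average utilization over one polling interval, as a percentage."""
    bits = counter_delta(previous, current) * 8
    return 100.0 * bits / (interval_s * link_bps)


# Hypothetical samples: two polls taken 300 s apart on a 1 Gbps link.
print(utilization_percent(1_000_000, 3_751_000_000, 300, 1e9))  # 10.0
```

The modulo handles a single wrap only. A 32-bit octet counter can wrap in roughly 34 seconds at full 1 Gbps line rate, so on fast links the 64-bit ifHCInOctets counter (or a shorter polling interval) is the safer choice.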

3. NetFlow (Traffic Visibility)

Developed by Cisco, NetFlow provides data on who is using the network and what they are doing.

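Instead of inspecting every packet's contents, NetFlow summarizes traffic as “flows”: sequences of packets that share the same source and destination IP addresses, source and destination ports, and protocol (classic NetFlow also keys on type of service and input interface).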
Benefit: Excellent for security (detecting data exfiltration) and capacity planning.
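
The idea of collapsing packets into flows can be sketched in a few lines. This is a simplified illustration that groups on a 5-tuple and keeps packet and byte counts per flow; it is not how a router's NetFlow cache is implemented. The `Packet` type, the helper names, and the sample addresses are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    size_bytes: int


def aggregate_flows(packets):
    """Group packets by 5-tuple and count packets and bytes per flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p.size_bytes
    return dict(flows)


def top_talkers(flows, n=1):
    """Source IPs ranked by total bytes sent, largest first."""
    totals = defaultdict(int)
    for (src_ip, *_), stats in flows.items():
        totals[src_ip] += stats["bytes"]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Hypothetical traffic: two packets of one HTTPS flow and one DNS query.
packets = [
    Packet("10.0.0.5", "203.0.113.9", 51000, 443, "TCP", 1500),
    Packet("10.0.0.5", "203.0.113.9", 51000, 443, "TCP", 1500),
    Packet("10.0.0.7", "10.0.0.1", 53000, 53, "UDP", 80),
]
flows = aggregate_flows(packets)
print(len(flows))          # 2
print(top_talkers(flows))  # [('10.0.0.5', 3000)]
```

Ranking sources by bytes sent is one simple way such records support both use cases above: an unexpectedly large outbound flow is a candidate for exfiltration review, and the per-flow totals feed capacity planning.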

4. Syslog (Logging)

Syslog is a standard for message logging. Devices send text log messages to a centralized Syslog server.

Severity levels: 0 (Emergency) to 7 (Debug); lower numbers are more severe.
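
On the wire, the severity travels inside the priority value (PRI) at the start of each message, written as `<N>`, where N = facility × 8 + severity. The sketch below decodes that value; the function names are assumptions for illustration, and the sample lines are made up apart from their PRI prefix.

```python
SEVERITIES = [
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
]


def parse_pri(message: str) -> int:
    """Extract the numeric PRI from a message that starts with '<N>'."""
    if not message.startswith("<"):
        raise ValueError("message does not start with a PRI field")
    end = message.index(">")
    return int(message[1:end])


def decode_pri(pri: int):
    """Split a PRI value into (facility, severity, severity name)."""
    if not 0 <= pri <= 191:
        raise ValueError("PRI must be between 0 and 191")
    facility, severity = divmod(pri, 8)
    return facility, severity, SEVERITIES[severity]


# Hypothetical log lines; only the <N> prefix matters here.
print(decode_pri(parse_pri("<34>su: authentication failed")))     # (4, 2, 'Critical')
print(decode_pri(parse_pri("<165>myapp: configuration reloaded")))  # (20, 5, 'Notice')
```

A central server can use the decoded severity to route messages, for example paging on levels 0 to 2 and only archiving level 7.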

5. The Importance of a Baseline

A baseline is a measurement of the “Normal” state of the network.

If you don’t know what normal looks like, you won’t know when something is wrong.
Example: If a device's CPU usually runs at 20% and suddenly jumps to 60%, that deviation is worth investigating. If it always runs at 60%, then 60% is the baseline and no cause for alarm.
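
One way to make the CPU example concrete is to summarize historical samples as a mean and standard deviation and flag readings that fall far outside them. This is a minimal sketch under stated assumptions: the 3-standard-deviation threshold, the function names, and the sample values are illustrative choices, not a recommended alerting policy.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical readings as (mean, standard deviation)."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """True if value lies more than `threshold` standard deviations from the mean."""
    avg, sd = baseline
    if sd == 0:
        return value != avg
    return abs(value - avg) / sd > threshold


# Hypothetical CPU histories (%): one host idles near 20, the other near 60.
quiet_host = build_baseline([18, 20, 22, 19, 21, 20, 20])
busy_host = build_baseline([58, 60, 62, 59, 61, 60, 60])

print(is_anomalous(60, quiet_host))  # True: far above this host's normal
print(is_anomalous(60, busy_host))   # False: 60% is normal for this host
```

The same reading of 60% produces opposite answers for the two hosts, which is the point of keeping a per-device baseline rather than a single fixed threshold.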