Automatic Differentiation: The Magic of PyTorch
1. Introduction: Who computes the gradients?
In calculus class, you computed derivatives by hand. In early AI (the 1980s), researchers derived gradients on paper and coded them manually. In modern frameworks (PyTorch, TensorFlow), you write only the Forward Pass, and the framework computes the Backward Pass (the gradients) automatically. This is AutoDiff (automatic differentiation).
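As a first taste, here is a minimal PyTorch sketch (the values are just for illustration): you write the forward pass, call .backward(), and read off the gradient that autograd computed for you.

```python
import torch

# Forward pass: the only part you write yourself.
x = torch.tensor(3.0, requires_grad=True)
y = x ** 2

# Backward pass: autograd computes dy/dx = 2x.
y.backward()
print(x.grad)  # tensor(6.)
```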
2. The Computational Graph
Every calculation in your code builds a graph. Nodes = Operations (+, -, *, sin). Edges = Data Flow (Tensors).
Example (traced in PyTorch after this list): y = (x + 2) * 3
- Input x.
- Add 2 → a.
- Multiply by 3 → y.
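In PyTorch, this graph is recorded as you run the forward pass: every intermediate tensor keeps a grad_fn pointing at the operation node that produced it. A small sketch (the exact grad_fn class names, like AddBackward0 and MulBackward0, can vary between PyTorch versions):

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
a = x + 2          # node: Add
y = a * 3          # node: Mul

# Each intermediate tensor remembers the operation that created it.
print(a.grad_fn)                 # e.g. <AddBackward0 ...>
print(y.grad_fn)                 # e.g. <MulBackward0 ...>
print(y.grad_fn.next_functions)  # edge pointing back to the Add node

y.backward()
print(x.grad)  # tensor(3.), since dy/dx = 3
```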
3. Forward vs Backward Mode
- Forward Mode: Computes the value (y) and the derivative (dy/dx) in the same pass, propagating derivatives from inputs toward outputs. Efficient when there are few inputs and many outputs.
- Backward Mode (reverse mode, the basis of backprop): Runs the forward pass first, then traverses the graph in reverse to accumulate gradients. Efficient when there are many inputs and few outputs, which is exactly the neural-network case: millions of parameters feeding into one scalar loss (see the sketch after this list).
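To see why reverse mode fits neural networks, here is a sketch with many inputs and one scalar output: a single backward pass produces the gradient with respect to every input at once, whereas forward mode would need one pass per input.

```python
import torch

# Many inputs, one scalar output: the typical loss-function shape.
params = torch.randn(1_000_000, requires_grad=True)
loss = (params ** 2).sum()   # single scalar output

loss.backward()              # one reverse pass over the graph
print(params.grad.shape)     # torch.Size([1000000]): all gradients at once
```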
4. Interactive Visualizer: Graph Builder
Visualize the computational graph for y = (x + w) * b.
- Forward Pass (Blue): Values flow up.
- Backward Pass (Red): Gradients flow down.
Input: x=2, w=1, b=3.
- a = x+w = 3.
- y = a*b = 9.
- dy/dy = 1.
- dy/da = b = 3.
- dy/dx = dy/da · da/dx = 3 · 1 = 3.
- Similarly, dy/dw = dy/da · da/dw = 3 · 1 = 3, and dy/db = a = 3 (checked in the snippet below).
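You can verify every number in this walkthrough with a few lines of PyTorch:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(1.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)

a = x + w      # forward: a = 3
y = a * b      # forward: y = 9

y.backward()   # backward: chain rule applied node by node
print(x.grad, w.grad, b.grad)  # tensor(3.) tensor(3.) tensor(3.)
```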
5. Summary
- Computational Graph: Represents math as a graph of operations (a DAG, not necessarily a tree, since intermediate values can be reused).
- AutoDiff: Applies Chain Rule automatically on the graph.
- Backward Mode: Efficient for functions with many inputs and few outputs (like Loss functions).