Matrix Multiplication: The Engine of Neural Nets
[!NOTE] This module covers matrix multiplication, the core operation behind neural network layers, building up from the dot product to full matrix-matrix products.
1. Introduction
If you profile any deep learning library (PyTorch, TensorFlow), you will find that the vast majority of compute time is spent on one operation: Matrix Multiplication (GEMM, General Matrix Multiply).
Why? Because a neural network layer is just a matrix multiplication followed by an activation function: y = σ(Wx + b)
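As a minimal sketch, a single layer's forward pass looks like this in NumPy (the weight values, bias, and the choice of ReLU as activation are illustrative, not a specific library's API):

```python
import numpy as np

def layer_forward(x, W, b):
    # One layer: matrix multiply, add bias, then apply the activation (ReLU)
    return np.maximum(0, W @ x + b)

W = np.array([[1.0, -2.0], [0.5, 1.0]])  # weight matrix (2x2)
b = np.array([0.1, -0.2])                # bias vector
x = np.array([1.0, 2.0])                 # input vector
print(layer_forward(x, W, b))
```

The matrix multiply `W @ x` does the heavy lifting; the activation is a cheap element-wise operation by comparison.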
2. The Dot Product
The fundamental building block is the Dot Product (or Scalar Product) of two vectors. It returns a single number.
a · b = ∑ᵢ aᵢbᵢ = a₁b₁ + a₂b₂ + … + aₙbₙ
Geometric Interpretation
a · b = ||a|| ||b|| cos(θ)
- If vectors point in the same direction, dot product is Positive (High Similarity).
- If vectors are perpendicular (90°), dot product is Zero (Orthogonal/Unrelated).
- If vectors point in opposite directions, dot product is Negative.
[!TIP] ML Application: In Recommendation Systems, if User Vector u and Movie Vector m have a high dot product, the user will likely enjoy the movie. This is closely related to Cosine Similarity, which is the dot product of unit-normalized vectors.
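A quick sketch of that idea: cosine similarity is just the dot product divided by the vector norms. The user and movie vectors below are made-up illustrative values, not real embeddings:

```python
import numpy as np

def cosine_similarity(u, v):
    # Dot product of unit-normalized vectors: ranges from -1 to 1
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

user = np.array([0.9, 0.1, 0.8])   # hypothetical user taste vector
movie = np.array([0.8, 0.2, 0.9])  # hypothetical movie feature vector
print(cosine_similarity(user, movie))  # close to 1.0: likely a good match
```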
Python Implementation
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
# Dot Product
dot_prod = np.dot(a, b) # 1*4 + 2*5 + 3*6 = 32
# or
dot_prod = a @ b
print(dot_prod)
3. Matrix-Vector Multiplication (Ax)
When we multiply a matrix A by a vector x, we are transforming the vector x. Ax = b
The matrix A acts as a function f(x). It can rotate, scale, or skew the vector space.
Example:

| 1  0 |   | 1 |   | 1 |
| 0  2 | · | 1 | = | 2 |

(This matrix stretched the y-axis by 2.)
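The same stretch can be checked directly in NumPy:

```python
import numpy as np

A = np.array([[1, 0], [0, 2]])  # scales the y-axis by 2
x = np.array([1, 1])
print(A @ x)  # [1 2]
```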
4. Matrix-Matrix Multiplication (AB)
Multiplying two matrices is just applying two transformations in sequence (Composition). C = AB
Calculating Cij involves taking the dot product of Row i of A and Column j of B.
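For instance, a single entry of C can be computed by hand as a dot product (using small example matrices):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# C[0, 0] is the dot product of row 0 of A with column 0 of B
c00 = np.dot(A[0, :], B[:, 0])  # 1*5 + 2*7 = 19
print(c00)
```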
Rule: Inner dimensions must match! (m × n) · (n × p) → (m × p)
[!WARNING] Order Matters! Unlike scalar multiplication (2 × 3 = 3 × 2), Matrix Multiplication is not commutative. AB ≠ BA. Applying a Rotation then a Shear is different from a Shear then a Rotation.
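A quick sketch of the warning above, using a rotation and a shear (the specific matrices are illustrative choices):

```python
import numpy as np

R = np.array([[0, -1], [1, 0]])  # 90-degree counter-clockwise rotation
S = np.array([[1, 1], [0, 1]])   # horizontal shear

# Rotation-then-shear differs from shear-then-rotation
print(np.allclose(R @ S, S @ R))  # False: order matters
```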
Python Implementation
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Matrix Multiplication
C = np.matmul(A, B)
# or
C = A @ B
print(C)
# [[19 22]
# [43 50]]
5. Interactive Visualizer: The Linear Transformer
Modify the 2×2 Matrix M to see how it transforms the grid space. The basis vectors i (Red) and j (Green) show where the x and y axes land.
6. Summary
- Dot Product: Measures similarity between vectors.
- Matrix-Vector: Transforms a vector (Scale, Rotate, Skew).
- Matrix-Matrix: Combines multiple transformations.