Probability Glossary
A reference guide for key terms used throughout the Probability course.
B
Bernoulli Distribution
A discrete probability distribution of a random variable that takes the value 1 with probability p and the value 0 with probability q = 1 - p. It represents a single trial with two possible outcomes.
Beta Distribution
A family of continuous probability distributions defined on the interval [0, 1] parametrized by two positive shape parameters, denoted by α and β. Often used to model the distribution of probabilities or proportions.
Binomial Coefficient
The number of ways to choose k elements from a set of n distinct elements, disregarding order. Denoted as C(n, k) or nCk. Formula: C(n, k) = n! / (k!(n-k)!)
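This is available directly in Python's standard library as `math.comb`; a quick check against the factorial formula:

```python
import math

# C(5, 2): ways to choose 2 of 5 distinct elements, order ignored.
c = math.comb(5, 2)
c_manual = math.factorial(5) // (math.factorial(2) * math.factorial(3))
print(c, c_manual)  # both give 10
```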
Binomial Distribution
A discrete probability distribution of the number of successes in a sequence of n independent trials, each with the same two possible outcomes: success with probability p and failure with probability 1 - p. Formula: P(X = k) = C(n, k) p^k (1-p)^(n-k)
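The binomial PMF follows directly from the binomial coefficient; a minimal sketch using only the standard library:

```python
import math

# Binomial PMF: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
def binom_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Probability of exactly 5 heads in 10 fair coin flips.
print(binom_pmf(5, 10, 0.5))  # 252 / 1024 ≈ 0.246

# The PMF sums to 1 over all possible success counts.
total = sum(binom_pmf(k, 10, 0.5) for k in range(11))
print(total)
```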
C
Central Limit Theorem (CLT)
A theorem that states that the distribution of sample means approximates a normal distribution as the sample size becomes larger, regardless of the population’s distribution.
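The effect is easy to observe by simulation; the sketch below (illustrative sample sizes, standard library only) averages uniform draws, whose population is flat rather than bell-shaped:

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

def sample_mean(n):
    # Mean of n draws from Uniform(0, 1): population mean 0.5,
    # population standard deviation 1/sqrt(12).
    return sum(random.random() for _ in range(n)) / n

means = [sample_mean(30) for _ in range(2000)]

# The sample means cluster around the population mean 0.5,
# with spread close to (1/sqrt(12)) / sqrt(30) ≈ 0.053.
print(round(statistics.mean(means), 3))
print(round(statistics.stdev(means), 3))
```

A histogram of `means` would look approximately bell-shaped even though each individual uniform draw is not.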
Combination
A selection of items from a collection, such that the order of selection does not matter. Formula: C(n, k) = n! / (k!(n-k)!)
Cross-Entropy
A measure of the difference between two probability distributions for a given random variable or set of events. In machine learning, it is commonly used as a loss function for classification models. Formula: H(P, Q) = - Σ P(x) log Q(x)
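A direct translation of the formula, using the natural log and two hypothetical two-outcome distributions:

```python
import math

# H(P, Q) = -Σ P(x) log Q(x); terms with P(x) = 0 contribute nothing.
def cross_entropy(p, q):
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]  # true distribution
q = [0.9, 0.1]  # model's predicted distribution

print(cross_entropy(p, q))  # penalized for the mismatch between Q and P
print(cross_entropy(p, p))  # equals H(P) = log 2 ≈ 0.693 when Q = P
```

Note that H(P, Q) ≥ H(P), with equality exactly when Q = P; the gap is the KL divergence.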
Cumulative Distribution Function (CDF)
The probability that a real-valued random variable X will take a value less than or equal to x. Formula: F(x) = P(X ≤ x)
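For a standard normal variable the CDF is available in Python's standard library via `statistics.NormalDist`:

```python
from statistics import NormalDist

std = NormalDist(mu=0, sigma=1)

# F(x) = P(X ≤ x) for the standard normal.
print(std.cdf(0.0))   # 0.5: half the probability mass lies below the mean
print(std.cdf(1.96))  # ≈ 0.975: the familiar 95% two-sided cutoff
```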
E
Entropy (Shannon)
A measure of the “uncertainty” or average information content inherent in the variable’s possible outcomes. Formula: H(X) = - Σ P(x) log P(x)
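Using base-2 logarithms gives entropy in bits; a minimal sketch:

```python
import math

# H(X) = Σ -P(x) log2 P(x); zero-probability outcomes are skipped.
def entropy(probs):
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit: a fair coin is maximally uncertain
print(entropy([1.0]))       # 0.0 bits: a certain outcome carries no surprise
print(entropy([0.9, 0.1]))  # between 0 and 1: a biased coin
```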
F
Factorial
The product of a positive integer and all the positive integers below it, denoted by n!. Example: 5! = 5 × 4 × 3 × 2 × 1 = 120. By definition, 0! = 1.
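Python exposes this as `math.factorial`:

```python
import math

f5 = math.factorial(5)  # 5 × 4 × 3 × 2 × 1
f0 = math.factorial(0)  # 0! = 1 by definition
print(f5, f0)  # 120 1
```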
G
Gamma Distribution
A two-parameter family of continuous probability distributions. It is a generalization of the exponential distribution and is often used to model waiting times for k events to occur.
Gaussian Distribution
Also known as the Normal Distribution. A continuous probability distribution that is symmetric about the mean, showing that data near the mean are more frequent in occurrence than data far from the mean.
I
Inclusion-Exclusion Principle
A counting technique used to calculate the size of the union of multiple sets by including the sizes of individual sets and excluding the sizes of their intersections to avoid double counting. Formula for 2 sets: |A ∪ B| = |A| + |B| - |A ∩ B|
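The two-set case can be verified directly with Python sets (hypothetical example sets):

```python
# |A ∪ B| = |A| + |B| - |A ∩ B|
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

by_formula = len(A) + len(B) - len(A & B)  # 4 + 4 - 2
by_union = len(A | B)                      # direct count, no double counting
print(by_formula, by_union)  # 6 6
```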
Information Gain
The reduction in entropy (or uncertainty) of a random variable achieved by observing another random variable. Equivalent to Mutual Information.
K
Kullback-Leibler (KL) Divergence
A measure of how one probability distribution is different from a second, reference probability distribution. Also known as relative entropy. Formula: D_KL(P || Q) = Σ P(x) log (P(x) / Q(x))
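A direct translation of the formula, with natural logs and hypothetical distributions:

```python
import math

# D_KL(P || Q) = Σ P(x) log(P(x) / Q(x)); zero exactly when P = Q.
def kl_divergence(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]
q = [0.9, 0.1]

print(kl_divergence(p, p))  # 0.0: no divergence from itself
print(kl_divergence(p, q))  # positive: the distributions differ
print(kl_divergence(q, p))  # not equal to D_KL(P || Q)
```

The asymmetry in the last two lines is why it is called a divergence rather than a distance.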
M
Mutual Information
A quantity that measures the mutual dependence between two random variables. It measures the information that knowing either variable provides about the other. Formula: I(X; Y) = H(X) - H(X|Y)
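For discrete variables this can be computed from a joint probability table using the equivalent form I(X; Y) = Σ p(x, y) log2( p(x, y) / (p(x) p(y)) ); a sketch with a hypothetical joint distribution in which X and Y are identical bits:

```python
import math

# joint[i][j] = P(X = i, Y = j); here X and Y always agree.
joint = [[0.5, 0.0],
         [0.0, 0.5]]

px = [sum(row) for row in joint]        # marginal P(X)
py = [sum(col) for col in zip(*joint)]  # marginal P(Y)

# I(X; Y) = Σ p(x, y) log2( p(x, y) / (p(x) p(y)) ), skipping zero cells.
mi = sum(p * math.log2(p / (px[i] * py[j]))
         for i, row in enumerate(joint)
         for j, p in enumerate(row) if p > 0)
print(mi)  # 1.0 bit: observing Y removes all uncertainty about X
```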
N
Normal Distribution
See Gaussian Distribution.
P
Pascal’s Triangle
A triangular array of binomial coefficients. Each number is the sum of the two numbers directly above it. Row n contains the coefficients of the expansion of (x+y)^n.
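Since row n holds C(n, k) for k = 0..n, the triangle can be generated straight from `math.comb`:

```python
import math

def pascal_row(n):
    # Row n of Pascal's triangle: C(n, 0), C(n, 1), ..., C(n, n)
    return [math.comb(n, k) for k in range(n + 1)]

for n in range(5):
    print(pascal_row(n))

# Row 4 gives the coefficients of (x + y)^4.
row4 = pascal_row(4)  # [1, 4, 6, 4, 1]
# Each interior entry is the sum of the two entries above it.
print(row4[2] == pascal_row(3)[1] + pascal_row(3)[2])  # True
```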
Permutation
An arrangement of items in a specific order. Formula: P(n, k) = n! / (n-k)!
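`math.perm` implements this directly; comparing it with `math.comb` shows how counting changes when order matters:

```python
import math

p = math.perm(5, 2)  # ordered: 5! / 3! = 20 arrangements
c = math.comb(5, 2)  # unordered: 10 selections

# Each unordered selection of 2 items can be ordered in 2! ways.
print(p, c, p == c * math.factorial(2))
```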
Poisson Distribution
A discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event.
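Its PMF is P(X = k) = λ^k e^(-λ) / k!; a minimal sketch with an illustrative rate:

```python
import math

def poisson_pmf(k, lam):
    # P(X = k) = λ^k * e^(-λ) / k!
    return lam ** k * math.exp(-lam) / math.factorial(k)

lam = 3.0  # e.g. an average of 3 events per interval (illustrative)

print(poisson_pmf(0, lam))  # chance of no events: e^(-3) ≈ 0.050
print(poisson_pmf(3, lam))  # chance of exactly the mean count

# The PMF sums to 1 over all non-negative counts (tail truncated here).
print(sum(poisson_pmf(k, lam) for k in range(30)))
```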
Probability Density Function (PDF)
A function whose value at any given point in the sample space (the set of possible values taken by the random variable) can be interpreted as the relative likelihood that the random variable takes a value near that point. Probabilities are obtained by integrating the PDF over an interval. Used for continuous variables.
Probability Mass Function (PMF)
A function that gives the probability that a discrete random variable is exactly equal to some value.
Z
Z-Score
A standard score that indicates how many standard deviations an element is from the mean. Formula: Z = (x - μ) / σ
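A one-line translation of the formula, with hypothetical test-score numbers:

```python
def z_score(x, mu, sigma):
    # Z = (x - μ) / σ: distance from the mean in standard-deviation units
    return (x - mu) / sigma

# Hypothetical scores with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))  # 2.0: two standard deviations above the mean
print(z_score(100, 100, 15))  # 0.0: exactly at the mean
print(z_score(85, 100, 15))   # -1.0: one standard deviation below
```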