Intro to Combinatorics

Introduction Counting

Introduction

Combinatorics is a fundamental area of discrete mathematics that focuses on counting, arranging, and analyzing structures. It provides powerful techniques to solve problems related to permutations, combinations, graph enumeration, discrete probability, and more. These combinatorial methods are indispensable in fields such as computer science, cryptography, and optimization. In particular, they play a crucial role in machine learning, where they contribute to algorithms for data analysis, pattern recognition, and optimization problems. Additionally, combinatorics finds practical applications in network design and error-correcting codes. By exploring combinatorics, we will gain profound insights into mathematical structures and their inherent properties.

In this page, we review the fundamental counting principles - the rules of sum and product - and then develop the theory of combinations and permutations. Along the way, we encounter Pascal's relation and Chu's Theorem, which reveal a connection between combinatorial identities and linear algebra through the Pascal matrix.

Counting

We begin by revisiting the two most fundamental counting principles, which form the basis for all combinatorial reasoning.

Fundamental Counting Principle
  • Rule of Sum:
    If a set can be expressed as the disjoint union of two (or more) subsets, then the number of elements in the set is the sum of the numbers of elements in the subsets.
  • Rule of Product:
    Consider a (finite) sequence of decisions. Suppose the number of choices for each individual decision is independent of decisions made previously in the sequence. Then the number of ways to make the whole sequence of decisions is the product of these numbers of choices.

Remember, we have two ways to look at the combination \[ {}_n C_r = C(n, r) = \binom{n}{r}. \] One is the combinatorial definition: "\(n\) choose \(r\)" is the number of ways to choose \(r\) things from a collection of \(n\) things.

Alternatively, by the algebraic definition, \[ C(n, r)= \frac{n!}{r!(n-r)!}. \] This quantity is called a binomial coefficient because of its role in the binomial theorem. It is a special case of the more general multinomial coefficient: \[ \binom{n}{r_1, r_2, \cdots, r_k} = \frac{n!}{r_1! r_2! \cdots r_k!}. \] Here, if \(r_1 + r_2 + \cdots + r_k = n\), then \[ \binom{n}{r_1, r_2, \cdots, r_k} = \binom{n}{r_1}\binom{n-r_1}{r_2}\binom{n-r_1-r_2}{r_3} \cdots \binom{n-r_1-r_2 - \cdots - r_{k-1}}{r_k}. \]

Note: The number of ways to choose \(r\) items from \(n\) items with replacement, where order does not matter, is \(C(r+n-1, r)\); if order does matter, it is \(n^r\). For example, suppose we choose 3 letters from \(\{A, B, C, D\}\) with replacement. If order matters, we have \(4^3 = 64\) selections. If order does not matter, we have \(C(6, 3) = 20\) selections.

Let's observe a couple of combination-related theorems:

Pascal's Relation

If \(1 \leq r \leq n\), then \[ C(n+1, r) = C(n, r-1) + C(n, r). \]

Proof:

Suppose we have an \((n+1)\)-element set \(\{x_1, x_2, \ldots, x_n, y\}\). Its \(r\)-element subsets can be partitioned into two families: those that contain \(y\) and those that do not. The \(r\)-element subsets that do not contain \(y\) are exactly the \(r\)-element subsets of \(\{x_1, x_2, \ldots, x_n\}\), of which there are \(\binom{n}{r}\). On the other hand, each \(r\)-element subset that contains \(y\) is determined by choosing the remaining \(r-1\) elements from \(\{x_1, x_2, \ldots, x_n\}\), giving \(\binom{n}{r-1}\) such subsets.

Chu's Theorem

If \(n \geq r \geq 0\), then \[ \begin{align*} \sum_{k=0}^n C(k,r) &= C(r, r) + C(r+1, r) + C(r+2, r) + \cdots + C(n, r) \\\\ &= C(n+1, r+1). \end{align*} \]

Proof:

If we replace \(C(r,r)\) with \(C(r+1, r+1)\), then by Pascal's relation, \[ C(r+1, r+1) + C(r+1, r) = C(r+2, r+1) \] and again by Pascal's relation, we have \[ C(r+2, r+1) + C(r+2, r) = C(r+3, r+1). \] If we repeate this process, we obtain \[ C(n, r+1) + C(n, r) = C(n+1, r+1). \]

Chu's Theorem immediately yields familiar summation formulas. The sum of the first \(n\) positive integers, \[ 1 + 2 + \cdots + n = \frac{n(n+1)}{2}, \] is simply Chu's Theorem with \(r = 1\): \[ C(1, 1) + C(2,1) + \cdots + C(n, 1) = C(n+1, 2). \] This perspective is indeed, very useful. Consider the sum of the squares of first \(n\) positive integers: \[ 1^2 + 2^2 + \cdots + n^2 = \frac{n(n+1)(2n+1)}{6}. \] We can prove it by induction, but in the first place, how can we get the formula? Yes, we can use Chu's Theorem again. \[ k^2 = k + k(k-1) = C(k, 1) + 2C(k, 2) \] and summing both sides, we get \[ \sum_{k=1}^n k^2 = \sum_{k=1}^n C(k, 1) + 2\sum_{k=1}^n C(k,2). \] Thus, \[ \begin{align*} 1^2 + 2^2 + \cdots + n^2 &= C(n+1, 2) + 2C(n+1, 3) \\\\ &= \frac{n(n+1)}{2} + 2 \frac{(n+1)n(n-1)}{6}\\\\ &= n(n+1)\left[\frac{3+ 2(n-1)}{6}\right] \\\\ &= \frac{n(n+1)(2n+1)}{6}. \end{align*} \] In general,

Sum of Powers of Positive Integers

\[ k^m = \sum_{r=1}^m a_{r, m}\, C(k, r) \] where the coefficients \(a_{r, m}\) are independent of \(k\), for \(1 \leq r \leq m\).

Summing both sides and using Chu's Theorem, \[ \begin{align*} \sum_{k=1}^n k^m &= \sum_{k=1}^n \sum_{r=1}^m a_{r, m}\, C(k, r) \\\\ &= \sum_{r=1}^m a_{r, m} \sum_{k=1}^n C(k, r) \\\\ &= \sum_{r=1}^m a_{r, m}\, C(n+1, r+1). \end{align*} \]

To illustrate, consider the case \(m = 3\). We seek coefficients \(x_1, x_2, x_3\) such that \[ \begin{align*} k^3 &= x_1 C(k, 1) + x_2C(k,2) +x_3C(k,3) \\\\ &= x_1k + \frac{x_2 k(k-1)}{2}+ \frac{x_3k(k-1)(k-2)}{6} \\\\ \end{align*} \] and we have \[ 6k^3 = (6x_1 -3x_2 + 2x_3)k + (3x_2 -3x_3)k^2 + x_3k^3. \] Solving this system by comparing coefficients of \(k\), \(k^2\), and \(k^3\), we obtain \(x_1 = a_{1,3} = 1\), \(x_2 = a_{2,3} = 6\), and \(x_3 = a_{3,3} = 6\).

Hence, \[ k^3 = C(k,1) + 6C(k,2) + 6C(k,3) \] and \[ 1^3 + 2^3 + \cdots + n^3 = C(n+1, 2) + 6C(n+1, 3) + 6C(n+1, 4) = \frac{n^2(n+1)^2}{4}. \] Rather than solving this system of equations by hand each time, we can organize the computation using linear algebra.

Let \(C_n \in \mathbb{R}^{n \times n}\) be the Pascal matrix whose \((i,j)\)-entry is binomial coefficient \(C(i,j), \, 1 \leq i, j \leq n\).

For example, when \(n = 4\), \[ C_4 = \begin{bmatrix}1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 3 & 3 & 1 & 0 \\ 4 & 6 & 4 & 1 \end{bmatrix} \] and its inverse is \[ C_4^{-1} = \begin{bmatrix}1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ 3 & -3 & 1 & 0 \\ -4 & 6 & -4 & 1 \end{bmatrix}. \] (You can see the portion of Pascal's triangle.) Compute \[ \begin{align*} &\begin{bmatrix}1 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 \\ 3 & -3 & 1 & 0 \\ -4 & 6 & -4 & 1 \end{bmatrix} \begin{bmatrix}1^1 & 1^2 & 1^3 & 1^4 \\ 2^1 & 2^2 & 2^3 & 2^4 \\ 3^1 & 3^2 & 3^3 & 3^4 \\ 4^1 & 4^2 & 4^3 & 4^4 \end{bmatrix} \\\\ &= \begin{bmatrix}1 & 1 & 1 & 1 \\ 0 & 2 & 6 & 14 \\ 0 & 0 & 6 & 36 \\ 0 & 0 & 0 & 24 \end{bmatrix} \end{align*} \] Thus, the 4th column gives us \[ k^4 = C(k,1) + 14C(k,2) + 36C(k,3) + 24C(k,4). \]

In combinations, the order in which elements are chosen does not matter. Permutations, on the other hand, count the number of ordered arrangements. By the rule of product, the first element can be chosen in \(n\) ways, the second in \(n-1\) ways, and so on, giving:

Definition: Permutation

The number of permutations of \(r\) elements chosen from an \(n\)-element set is \[ \begin{align*} P(n, r) &= n(n-1)(n-2)\cdots(n-r+1) \\\\ &= \frac{n!}{(n-r)!} \\\\ &= r!\, C(n,r). \end{align*} \]

With the fundamentals of counting, combinations, and permutations in place, we have the essential toolkit for enumerating discrete structures. These techniques will reappear throughout the curriculum - in probability theory where counting outcomes is the first step in computing probabilities and in the analysis of algorithms where combinatorial arguments bound running times and data structures.