Skip to main content
\(\newcommand{\set}[1]{\{1,2,\dotsc,#1\,\}} \newcommand{\ints}{\mathbb{Z}} \newcommand{\posints}{\mathbb{N}} \newcommand{\rats}{\mathbb{Q}} \newcommand{\reals}{\mathbb{R}} \newcommand{\complexes}{\mathbb{C}} \newcommand{\twospace}{\mathbb{R}^2} \newcommand{\threepace}{\mathbb{R}^3} \newcommand{\dspace}{\mathbb{R}^d} \newcommand{\nni}{\mathbb{N}_0} \newcommand{\nonnegints}{\mathbb{N}_0} \newcommand{\dom}{\operatorname{dom}} \newcommand{\ran}{\operatorname{ran}} \newcommand{\prob}{\operatorname{prob}} \newcommand{\Prob}{\operatorname{Prob}} \newcommand{\height}{\operatorname{height}} \newcommand{\width}{\operatorname{width}} \newcommand{\length}{\operatorname{length}} \newcommand{\crit}{\operatorname{crit}} \newcommand{\inc}{\operatorname{inc}} \newcommand{\HP}{\mathbf{H_P}} \newcommand{\HCP}{\mathbf{H^c_P}} \newcommand{\GP}{\mathbf{G_P}} \newcommand{\GQ}{\mathbf{G_Q}} \newcommand{\AG}{\mathbf{A_G}} \newcommand{\GCP}{\mathbf{G^c_P}} \newcommand{\PXP}{\mathbf{P}=(X,P)} \newcommand{\QYQ}{\mathbf{Q}=(Y,Q)} \newcommand{\GVE}{\mathbf{G}=(V,E)} \newcommand{\HWF}{\mathbf{H}=(W,F)} \newcommand{\bfC}{\mathbf{C}} \newcommand{\bfG}{\mathbf{G}} \newcommand{\bfH}{\mathbf{H}} \newcommand{\bfF}{\mathbf{F}} \newcommand{\bfI}{\mathbf{I}} \newcommand{\bfK}{\mathbf{K}} \newcommand{\bfP}{\mathbf{P}} \newcommand{\bfQ}{\mathbf{Q}} \newcommand{\bfR}{\mathbf{R}} \newcommand{\bfS}{\mathbf{S}} \newcommand{\bfT}{\mathbf{T}} \newcommand{\bfNP}{\mathbf{NP}} \newcommand{\bftwo}{\mathbf{2}} \newcommand{\cgA}{\mathcal{A}} \newcommand{\cgB}{\mathcal{B}} \newcommand{\cgC}{\mathcal{C}} \newcommand{\cgD}{\mathcal{D}} \newcommand{\cgE}{\mathcal{E}} \newcommand{\cgF}{\mathcal{F}} \newcommand{\cgG}{\mathcal{G}} \newcommand{\cgM}{\mathcal{M}} \newcommand{\cgN}{\mathcal{N}} \newcommand{\cgP}{\mathcal{P}} \newcommand{\cgR}{\mathcal{R}} \newcommand{\cgS}{\mathcal{S}} \newcommand{\bfn}{\mathbf{n}} \newcommand{\bfm}{\mathbf{m}} \newcommand{\bfk}{\mathbf{k}} \newcommand{\bfs}{\mathbf{s}} \newcommand{\bijection}{\xrightarrow[\text{onto}]{\text{$1$--$1$}}} \newcommand{\injection}{\xrightarrow[]{\text{$1$--$1$}}} \newcommand{\surjection}{\xrightarrow[\text{onto}]{}} \newcommand{\nin}{\not\in} \newcommand{\prufer}{\mbox{prüfer}} \DeclareMathOperator{\fix}{fix} \DeclareMathOperator{\stab}{stab} \DeclareMathOperator{\var}{var} \newcommand{\inv}{^{-1}} \newcommand{\lt}{<} \newcommand{\gt}{>} \newcommand{\amp}{&} \)

AppendixBBackground Material for Combinatorics

This appendix treats background material essential to the study of combinatorial mathematics. Many students will find that most—and perhaps all—of this material has been covered somewhere in their prior course work, and we expect that very few instructors will include this appendix in the syllabus. Nevertheless, students may find it convenient to consult this appendix from time to time, and we suspect that many instructors will encourage students to read this material to refresh their memories of key concepts.


Set theory is concerned with elements, certain collections of elements called sets and a concept of membership. For each element \(x\) and each set \(X\text{,}\) exactly one of the following two statements holds:

  1. \(x\) is a member of \(X\text{.}\)

  2. \(x\) is not a member of \(X\text{.}\)

It is important to note that membership cannot be ambiguous.

When \(x\) is an element and \(X\) is a set, we write \(x\in X\) when \(x\) is a member of \(X\text{.}\) Also, the statement \(x\) belongs to \(X\) means exactly the same thing as \(x\) is a member of \(X\text{.}\) Similarly, when \(x\) is not a member of \(X\text{,}\) we write \(x\notin X\) and say \(x\) does not belong to \(X\text{.}\)

Certain sets will be defined explicitly by listing the elements. For example, let \(X=\{a,b,d,g,m\}\text{.}\) Then \(b\in X\) and \(h\notin X\text{.}\) The order of elements in such a listing is irrelevant, so we could also write \(X=\{g,d,b,m,a\}\text{.}\) In other situations, sets will be defined by giving a rule for membership. As examples, let \(\posints\) denote the set of positive integers. Then let \(X=\{n\in\posints:5\le n\le 9\}\text{.}\) Note that \(6,8\in X\) while \(4,10,238\notin X\text{.}\)

Given an element \(x\) and a set \(X\text{,}\) it may at times be tedious and perhaps very difficult to determine which of the statements \(x\in X\) and \(x\notin X\) holds. But if we are discussing sets, it must be the case that exactly one is true.


Let \(X\) be the set consisting of the following \(12\) positive integers: \begin{align*} \amp 13232112332\\ \amp 13332112332\\ \amp 13231112132\\ \amp 13331112132\\ \amp 13232112112\\ \amp 13231112212\\ \amp 13331112212\\ \amp 13232112331\\ \amp 13231112131\\ \amp 13331112131\\ \amp 13331112132\\ \amp 13332112111\\ \amp 13231112131 \end{align*} Note that one number is listed twice. Which one is it? Also, does \(13232112132\) belong to \(X\text{?}\) Note that the apparent difficulty of answering these questions stems from (1) the size of the set \(X\) and (2) the size of the integers that belong to \(X\text{.}\) Can you think of circumstances in which it is difficult to answer whether \(x\) is a member of \(X\) even when it is known that \(X\) contains exactly one element?


Let \(P\) denote the set of primes. Then \(35\notin P\) since \(35= 5\times 7\text{.}\) Also, \(19\in P\text{.}\) Now consider the number \begin{equation*} n = 77788467064627123923601532364763319082817131766346039653933 \end{equation*} Does \(n\) belong to \(P\text{?}\) Alice says yes while Bob says no. How could Alice justify her affirmative answer? How could Bob justify his negative stance? In this specific case, I know that Alice is right. Can you explain why?

SectionB.2Intersections and Unions

When \(X\) and \(Y\) are sets, the intersection of \(X\) and \(Y\text{,}\) denoted \(X\cap Y\), is defined by \begin{equation*} X\cap Y = \{x: x\in X, x\in Y\} \end{equation*} Note that this notation uses the convention followed by many programming languages. Namely, the “comma” in the definition means that both requirements for membership be satisfied. For example, if \(X=\{b,c,e,g,m\}\) and \(Y=\{a,c,d,h,m,n,p\}\text{,}\) then \(X\cap Y=\{c,m\}\text{.}\)

SubsectionB.2.1The Meaning of \(2\)-Letter Words

In the not too distant past, there was considerable discussion in the popular press on the meaning of the \(2\)-letter word is. For mathematicians and computer scientists, it would have been far more significant to have a discussion of the \(2\)-letter word or. The problem is that the English language uses or in two fundamentally different ways. Consider the following sentences:

  1. A nearby restaurant has a dinner special featuring two choices for dessert: flan de casa or tirami-su.

  2. A state university accepts all students who have graduated from in-state high schools and have SAT scores above \(1000\) or have grade point averages above \(3.0\text{.}\)

  3. A local newspaper offers customers the option of paying their for their newspaper bills on a monthly or semi-annual basis.

In the first and third statement, it is clear that there are two options but that only one of them is allowed. However, in the second statement, the interpretation is that admission will be granted to students who satisfy at least one of the two requirements. These interpretations are called respectively the exclusive and inclusive versions of or. In this class, we will assume that whenever the word “or” is used, the inclusive interpretation is intended—unless otherwise stated.

For example, when \(X\) and \(Y\) are sets, the union of \(X\) and \(Y\text{,}\) denoted \(X\cup Y\text{,}\) is defined by \begin{equation*} X\cup Y = \{x: x\in x \text{ or } x\in Y\}. \end{equation*} For example, if \(X=\{b,c,e,g,m\}\) and \(Y=\{a,c,d,h,m,n,p\}\text{,}\) then \begin{equation*} X\cup Y=\{a,b,c,d,e,g,h,m,n,p\}. \end{equation*}

Note that \(\cap\) and \(\cup\) are commutative and associative binary operations, as is the case with addition and multiplication for the set \(\posints\) of positive integers, i.e., if \(X\text{,}\) \(Y\) and \(Z\) are sets, then \begin{equation*} X\cap Y = Y\cap X \quad\text{ and } \quad X\cup Y = Y\cup X. \end{equation*} Also, \begin{equation*} X\cap(Y\cap Z)= (X\cap Y)\cap Z\quad\text{ and } \quad X\cup(Y\cup Z)= (X\cup Y)\cup Z. \end{equation*} Also, note that each of \(\cap\) and \(\cup\) distributes over the other, i.e., \begin{equation*} X\cap(Y\cup Z)= (X\cap Y)\cup (X\cap Z)\quad\text{ and } \quad X\cup(Y\cap Z)= (X\cup Y)\cap (X\cup Z) \end{equation*} On the other hand, in \(\posints\text{,}\) multiplication distributes over addition but not vice-versa.

SubsectionB.2.2The Empty Set: Much To Do About Nothing

The empty set, denoted \(\emptyset\) is the set for which \(x\notin \emptyset\) for every element \(x\text{.}\) Note that \(X\cap \emptyset =\emptyset\) and \(X\cup \emptyset=X\text{,}\) for every set \(X\text{.}\)

The empty set is unique in the sense that if \(x\notin X\) for every element \(x\text{,}\) then \(X=\emptyset\text{.}\)

SubsectionB.2.3The First So Many Positive Integers

In this text, we will use the symbols \(\posints\text{,}\) \(\ints\text{,}\) \(\rats\) and \(\reals\) to denote respectively the set of positive integers, the set of all integers (positive, negative and zero), the set of rational numbers (fractions) and the set of real numbers (rationals and irrationals). On occasion, we will discuss the set \(\nonnegints\) of non-negative integers. When \(n\) is a positive integer, we will use the abbreviation \([n]\) for the set \(\{1,2,\dots,n\}\) of the first \(n\) positive integers. For example, \([5]=\{1,2,3,4,5\}\text{.}\) For reasons that may not be clear at the moment but hopefully will be transparent at the right moment, we use the notation \(\bfn\) to denote the \(n\)-element set \(\{0,1,2,\dots,n-1\}\text{.}\) Of course, \(\bfn\) is just the set of the first \(n\) non-negative integers. For example, \(\mathbf{5}=\{0,1,2,3,4\}\text{.}\)

SubsectionB.2.4Subsets, Proper Subsets and Equal Sets

When \(X\) and \(Y\) are sets, we say \(X\) is a subset of \(Y\) and write \(X\subseteq Y\) when \(x\in Y\) for every \(x\in X\text{.}\) When \(X\) is a subset of \(Y\) and there exists at least one element \(y\in Y\) with \(y\notin X\text{,}\) we say \(X\) is a proper subset of \(Y\) and write \(X\subsetneq Y\). For example, the \(P\) of primes is a proper subset of the set \(\posints\) of positive integers.

Surprisingly often, we will encounter a situation where sets \(X\) and \(Y\) have different rules for membership yet both are in fact the same set. For example, let \(X=\{0,2\}\) and \(Y=\{z\in\ints: z+z=z\times z\}\text{.}\) Then \(X=Y\text{.}\) For this reason, it is useful to have a test when sets are equal. If \(X\) and \(Y\) are sets, then \begin{equation*} X = Y \quad\text{ if and only if } \quad X\subseteq Y \text{ and } Y\subseteq X. \end{equation*}

SectionB.3Cartesian Products

When \(X\) and \(Y\) are sets, the cartesian product of \(X\) and \(Y\text{,}\) denoted \(X\times Y\), is defined by \begin{equation*} X\times Y=\{(x,y): x\in X \text{ and } y\in Y\} \end{equation*} For example, if \(X=\{a,b\}\) and \(Y=[3]\text{,}\) then \begin{equation*} X\times Y=\{(a,1),(b,1),(a,2),(b,2),(a,3),(b,3)\}. \end{equation*} Elements of \(X\times Y\) are called ordered pairs. When \(p=(x,y)\) is an ordered pair, the element \(x\) is referred to as the first coordinate of \(p\) while \(y\) is the second coordinate of \(p\text{.}\) Note that if either \(X\) or \(Y\) is the empty set, then \(X\times Y=\emptyset\text{.}\)


Let \(X=\{\emptyset,(1,0),\{\emptyset\}\}\) and \(Y=\{(\emptyset,0)\}\text{.}\) Is \(((1,0),\emptyset)\) a member of \(X\times Y\text{?}\)

Cartesian products can be defined for more than two factors. When \(n\ge 2\) is a positive integer and \(X_1,X_2,\dots,X_n\) are non-empty sets, their cartesian product is defined by \begin{equation*} X_1\times X_2\times\dots\times X_n=\{(x_1,x_2,\dots,x_n): x_i\in X_i \text{ for } i = 1,2,\dots,n\} \end{equation*}

SectionB.4Binary Relations and Functions

A subset \(R\subseteq X\times Y\) is called a binary relation on \(X\times Y\text{,}\) and a binary relation \(R\) on \(X\times Y\) is called a function from \(X\) to \(Y\) when for every \(x\in X\text{,}\) there is exactly one element \(y\in Y\) for which \((x,y)\in R\text{.}\)

Many authors prefer to write the condition for being a function in two parts:

  1. For every \(x\in X\text{,}\) there is some element \(y\in Y\) for which \((x,y)\in R\text{.}\)

  2. For every \(x\in X\text{,}\) there is at most one element \(y\in Y\) for which \((x,y)\in R\text{.}\)

The second condition is often stated in the following alternative form: If \(x\in X\text{,}\) \(y_1,y_2\in Y\) and \((x,y_1),(x,y_2)\in R\text{,}\) then \(y_1=y_2\text{.}\)


For example, let \(X=[4]\) and \(Y=[5]\text{.}\) Then let \begin{align*} R_1\amp =\{(2,1),(4,2),(1,1),(3,1)\}\\ R_2\amp =\{(4,2),(1,5),(3,2)\}\\ R_3\amp=\{(3,2),(1,4),(2,2),(1,1),(4,5)\} \end{align*} Of these relations, only \(R_1\) is a function from \(X\) to \(Y\text{.}\)

In many settings (like calculus), it is customary to use letters like \(f\text{,}\) \(g\) and \(h\) to denote functions. So let \(f\) be a function from a set \(X\) to a set \(Y\text{.}\) In view of the defining properties of functions, for each \(x\in X\text{,}\) there is a unique element \(y\in Y\) with \((x,y)\in f\text{.}\) And in this case, the convention is to write \(y=f(x)\text{.}\) For example, if \(f=R_1\) is the function in Example B.4, then \(2=f(4)\) and \(f(3) =1\text{.}\)

The shorthand notation \(f:X\rightarrow Y\) is used to indicate that \(f\) is a function from the set \(X\) to the set \(Y\text{.}\)

In calculus, we study functions defined by algebraic rules. For example, consider the function \(f\) whose rule is \(f(x) = 5x^3-8x+7\text{.}\) This short hand notation means that \(X=Y=\reals\) and that \begin{equation*} f=\{(x,5x^3-8x+7):x\in\reals\} \end{equation*} In combinatorics, we sometimes study functions defined algebraically, just like in calculus, but we will frequently describe functions by other kinds of rules. For example, let \(f:\posints\rightarrow\posints\) be defined by \(f(n) = |n/2|\) if \(n\) is even and \(f(n)=3|n|+1\) when \(n\) is odd.

A function \(f:X\rightarrow Y\) is called an injection from \(X\) to \(Y\) when for every \(y\in Y\text{,}\) there is at most one element \(x\in X\) with \(y=f(x)\text{.}\)

When the meaning of \(X\) and \(Y\) is clear, we just say \(f\) is an injection. An injection is also called a \(1\)–\(1\) function (read this as “one to one”) and this is sometimes denoted as \(f:X\injection Y\text{.}\)

A function \(f:X\rightarrow Y\) is called a surjection from \(X\) to \(Y\) when for every \(y\in Y\text{,}\) there is at least one \(x\in X\) with \(y=f(x)\text{.}\)

Again, when the meaning of \(X\) and \(Y\) is clear, we just say \(f\) is a surjection. A surjection is also called an onto function and this is sometimes denoted as \(f:X\surjection Y\text{.}\)

A function \(f\) from \(X\) to \(Y\) which is both an injection and a surjection is called a bijection. Alternatively, a bijection is referred to as a \(1\)–\(1\text{,}\) onto function, and this is sometimes denoted as \(f:X \bijection Y\). A bijection is also called a \(1\)–\(1\)-correspondence.


Let \(X=Y=\reals\text{.}\) Then let \(f\text{,}\) \(g\) and \(h\) be the functions defined by

  1. \(f(x)=3x-7\text{.}\)
  2. \(g(x)=3(x-2)(x+5)(x-7)\text{.}\)
  3. \(h(x)=6x^2-5x+13\text{.}\)

Then \(f\) is a bijection; \(g\) is a surjection but not an injection (Why?); and \(h\) is neither an injection nor a surjection (Why?).

SectionB.5Finite Sets

A set \(X\) is said to be finite when either (1) \(X=\emptyset\text{;}\) or (2) there exists positive integer \(n\) and a bijection \(f:[n]\bijection X\text{.}\) When \(X\) is not finite, it is called infinite. For example, \(\{a,\emptyset,(3,2),\posints\}\) is a finite set as is \(\posints\times\emptyset\text{.}\) On the other hand, \(\posints\times \{\emptyset\}\) is infinite. Of course, \([n]\) and \(\bfn\) are finite sets for every \(n\in\posints\text{.}\)

In some cases, it may take some effort to determine whether a set is finite or infinite. Here is a truly classic result.

Here's a famous example of a set where no one knows if the set is finite or not.

This conjecture is known as the Twin Primes Conjecture. Guaranteed \(\text{A} ++\) for any student who can settle it!

When \(X\) is a finite non-empty set, the cardinality of \(X\text{,}\) denoted \(|X|\) is the unique positive integer \(n\) for which there is a bijection \(f:[n]\bijection X\text{.}\) Intuitively, \(|X|\) is the number of elements in \(X\text{.}\) For example, \begin{equation*} |\{(6,2), (8,(4,\emptyset)), \{3,\{5\}\}\}|=3. \end{equation*} By convention, the cardinality of the empty set is taken to be zero, and we write \(|\emptyset|=0\text{.}\)

We note that the statement in Proposition B.11 is an example of “operator overloading”, a technique featured in several programming languages. Specifically, the times sign \(\times\) is used twice but has different meanings. As part of \(X\times Y\text{,}\) it denotes the cartesian product, while as part of \(|X|\times |Y|\text{,}\) it means ordinary multiplication of positive integers. Programming languages can keep track of the data types of variables and apply the correct interpretation of an operator like \(\times\) depending on the variables to which it is applied.

We also have the following general form of Proposition B.11: \begin{equation*} |X_1\times X_2\times\dots\times X_n|= |X_1|\times |X_2|\times\dots\times |X_n| \end{equation*}

SectionB.6Notation from Set Theory and Logic

In set theory, it is common to deal with statements involving one or more elements from the universe as variables. Here are some examples:

  1. For \(n\in\posints\text{,}\) \(n^2-6n+8=0\text{.}\)

  2. For \(A\subseteq[100]\text{,}\) \(\{2,8,25,58,99\}\subseteq A\text{.}\)

  3. For \(n\in \ints\text{,}\) \(|n|\) is even.

  4. For \(x\in \reals\text{,}\) \(1+1=2\text{.}\)

  5. For \(m,n\in \posints\text{,}\) \(m(m+1)+2n\) is even.

  6. For \(n\in \posints\text{,}\) \(2n+1\) is even.

  7. For \(n\in \posints\) and \(x\in\reals\text{,}\) \(n+x\) is irrational.

These statements may be true for some values of the variables and false for others. The fourth and fifth statements are true for all values of the variables, while the sixth is false for all values.

Implications are frequently abbreviated using with a double arrow \(\Longrightarrow\text{;}\) the quantifier \(\forall\) means “for all” (or “for every”); and the quantifier \(\exists\) means “there exists” (or “there is”). Some writers use \(\wedge\) and \(\vee\) for logical “and” and “or”, respectively. For example, \begin{equation*} \forall A,B\subseteq[4]\quad \bigl((1,2\in A) \wedge |B|\ge 3)\bigr) \Longrightarrow\bigl((A\subseteq B)\vee (\exists n\in A\cup B, n^2=16)\bigr) \end{equation*} The double arrow \(\iff\) is used to denote logical equivalence of statements (also “if and only if”). For example \begin{equation*} \forall A\subseteq[7]\quad A\cap\{1,3,6\}\neq\emptyset\iff A\nsubseteq\{2,4,5,7\} \end{equation*} We will use these notational shortcuts except for the use of \(\wedge\) and \(\vee\text{,}\) as we will use these two symbols in another context: binary operators in lattices.

SectionB.7Formal Development of Number Systems

Up to this point, we have been discussing number systems in an entirely informal manner, assuming everyone knew all that needed to be known. Now let's pause and put things on a more firm foundation. So for the time being, do a memory dump and forget everything you have ever learned about numbers and arithmetic. The set of natural numbers has just been delivered on our door step in a big box with a warning label saying Assembly Required. We open the box and find a single piece of paper on which the following “instructions” are printed. These defining properties of the natural numbers are known as the Peano Postulates.

  1. There is a non-empty set of elements called natural numbers. There is natural number called zero which is denoted \(0\text{.}\) The set of all natural numbers is denoted \(\nonnegints\)

  2. There is a one-to-one function \(s:\nonnegints \injection \nonnegints\) called the successor function. For each \(n\in \nonnegints\text{,}\) \(s(n)\) is called the successor of \(n\text{.}\)

  3. There is no natural number \(n\) for which \(0=s(n)\text{.}\)

  4. Let \(\mathbb{M}\subseteq \nonnegints\text{.}\) Then \(\mathbb{M} =\nonnegints\) if and only if

    1. \(0\in \mathbb{M}\text{;}\) and

    2. \(\forall k\in \nonnegints\quad(k\in \mathbb{M}) \Longrightarrow(s(k)\in \mathbb{M})\text{.}\)

Property iv in the list of Peano Postulates is called the Principle of Mathematical Induction, or just the Principle of Induction. As a first application of the Principle of Induction, we prove the following basic property of the natural numbers.

SubsectionB.7.1Addition as a Binary Operation

A binary operation \(*\) on set \(X\) is just a function \(*:X\times X\rightarrow X\text{.}\) So the image of the ordered pair \((x,y)\) would normally be denoted \(*((x,y))\text{.}\) However, this is usually abbreviated as \(*(x,y)\) or even more compactly as \(x*y\text{.}\) With this convention, we now define a binary operation \(+\) on the set \(\nonnegints\) of natural numbers. This operation is defined as follows for every natural number \(n\in\nonnegints\text{:}\)

  1. \(n+0=n\text{.}\)

  2. For all \(k\in\nonnegints\text{,}\) \(n+s(k) = s(n+k)\text{.}\)

We pause to make it clear why the preceding two statements define \(+\text{.}\) Let \(n\) be an arbitrary natural number. Then let \(\mathbb{M}\) denote the set of all natural numbers \(m\) for which \(n+m\) is defined. Note that \(0\in \mathbb{M}\) by part (i). Also note that for all \(k\in \nonnegints\text{,}\) \(s(k)\in \mathbb{M}\) whenever \(k\in \mathbb{M}\) by part (ii). This shows that \(\mathbb{M}=\nonnegints\text{.}\) Since \(n\) was arbitrary, this allows us to conclude that \(n+m\) is defined for all \(n,m\in \nonnegints\text{.}\)

We read \(n+m\) as \(n\) plus \(m\text{.}\) The operation \(+\) is also called addition.

Among the natural numbers, the successor of zero plays a very important role, so important that it deserves its own special symbol. Here we follow tradition and call the natural number \(s(0)\) one and denote it by \(1\text{.}\) Note that for every natural number \(n\text{,}\) we have \(n+1=n+s(0)=s(n)\text{.}\) In particular, \(0+1=1\text{.}\)

With this notation, the Principle of Induction can be restated in the following form.


In proofs to follow, we will trim out some of the wording and leave only the essential mathematical steps intact. In particular, we will (i) omit reference to the set \(\mathbb{M}\text{,}\) and (ii) drop the phrase “For all \(k\in\nonnegints\)” For example, to define addition, we will just write (i) \(n+0=n\text{,}\) and (ii) \(n+(k+1) = (n+k)+1\text{.}\)

We next prove the commutative property, a task that takes two steps. First, we prove the following special case.


SectionB.8Multiplication as a Binary Operation

We define a binary operation \(\times\text{,}\) called multiplication, on the set of natural numbers. When \(m\) and \(n\) are natural numbers, \(m\times n\) is also called the product of \(m\) and \(n\text{,}\) and it sometimes denoted \(m*n\) and even more compactly as \(mn\text{.}\) We will use this last convention in the material to follow. Let \(n\in \nonnegints\text{.}\) We define

  1. \(n0=0\text{,}\) and
  2. \(n(k+1)=nk +n\text{.}\)

Note that \(10=0\) and \(01=00+0=0\text{.}\) Also, note that \(11=10+1=0+1=1\text{.}\) More generally, from (ii) and Lemma B.19, we conclude that if \(m,n\neq0\text{,}\) then \(mn\neq0\text{.}\)


The commutative law requires some preliminary work.



We now define a binary operation called exponentiation which is defined only on those ordered pairs \((m,n)\) of natural numbers where not both are zero. The notation for exponentiation is non-standard. In books, it is written \(m^n\) while the notations \(m**n\text{,}\) \(m\wedge n\) and \(\exp(m,n)\) are used in-line. We will use the \(m^n\) notation for the most part.

When \(m=0\text{,}\) we set \(0^n=0\) for all \(n\in\nonnegints\) with \(n\neq0\text{.}\) Now let \(m\neq0\text{.}\) We define \(m^n\) by (i) \(m^0=1\) and (ii) \(m^{k+1}=mm^k\text{.}\)


SectionB.10Partial Orders and Total Orders

A binary relation \(R\) on a set \(X\) is just a subset of the cartesian product \(X\times X\text{.}\) In discussions of binary relations, the notation \((x,y)\in R\) is sometimes written as \(xRy\text{.}\)

A binary relation \(R\) is:

  1. reflexive if \((x,x)\in R\) for all \(x\in X\text{.}\)
  2. antisymmetric if \(x=y\) whenever \((x,y)\in R\) and \((y,x)\in R\text{,}\) for all \(x,y\in X\text{.}\)
  3. transitive if \((x,y)\in R\) and \((y,z)\in R\) imply \((x,z)\in R\text{,}\) for all \(x,y,z\in X\text{.}\)

A binary relation \(R\) on a set \(X\) is called a partial order on \(X\) when it is reflexive, antisymmetric, and transitive. Traditionally, symbols like \(\le\) and \(\subseteq\) are used to denote partial orders. As an example, recall that if \(X\) is a family of sets, we write \(A\subseteq B\) when \(A\) is a subset of \(B\text{.}\)

When using the ordered pair notation for binary relations, to indicate that a pair \((x,y)\) is not in the relation, we simply write \((x,y)\notin R\text{.}\) When using the alternate notation, this is usually denoted by using the negation symbol from logic and writing \(\lnot (xRy)\text{.}\) Most of the special symbols used to denote partial orders come with negative versions, e.g., \(x\not\le y\text{,}\) \(x\nsubseteq y\text{.}\)

A partial order is called a total order on \(X\) when for all \(x,y\in X\text{,}\) \((x,y)\in R\) or \((y,x)\in R\text{.}\) For example, if \begin{equation*} X=\{\emptyset,\{\emptyset\},\{\emptyset,\{\emptyset\}\}\} \end{equation*} then \(\subseteq\) is a total order on \(X\text{.}\)

When \(\le\) is a partial order on a set \(X\text{,}\) we write \(x\lt y\) when \(x\le y\) and \(x\neq y\text{.}\)

SectionB.11A Total Order on Natural Numbers

Let \(m,n\in \nonnegints\text{.}\) Define a binary relation \(\le\) on \(\nonnegints\) by setting \(m\le n\) if and only if there exists a natural number \(p\) so that \(m+p=n\text{.}\)

Note that if \(m,n\in\nonnegints\text{,}\) then \(m\lt n\) if and only if there exists a natural number \(p\neq 0\) so that \(m+p=n\text{.}\)


SectionB.12Notation for Natural Numbers

In some sense, we already have a workable notation for natural numbers. In fact, we really didn't need a special symbol for \(s(0)\text{.}\) The natural number \(0\) and the sucessor function \(s\) are enough. For example, the positive integer associated with the number of fingers (including the thumb) on one hand is \(s(s(s(s(s(0)))))\text{,}\) our net worth is \(0\text{,}\) and the age of Professor Trotter's son in years when this section was first written was \begin{equation*} s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(0)))))))))))))))))). \end{equation*} Admittedly, this is not very practical, especially if some day we win the lottery or want to discuss the federal deficit. So it is natural (ugh!) to consider alternative notations.

Here is one such scheme. First, let's decide on a natural \(b>s(0)\) as base. We will then develop a notation which is called the base \(b\) notation. We already have a special symbol for zero, namely \(0\text{,}\) but we need additional symbols for each natural number \(n\) with \(0\lt n\lt b\text{.}\) These symbols are called digits. For example, the positive integer \(b=s(s(s(s(s(s(s(s(0))))))))\) is called eight, and it makes a popular choice as a base. Here are the symbols (digits) customarily chosen for this base: \(1=s(0)\text{,}\) \(2=s(1)\text{;}\) \(3=s(2)\text{;}\) \(4=s(3)\text{;}\) \(5=s(4)\text{;}\) \(6=s(5)\text{;}\) and \(7=s(6)\text{.}\) Technically speaking, it is not necessary to have a separate symbol for \(b\text{,}\) but it might be handy regardless. In this case, most people prefer the symbol \(8\text{.}\) We like this symbol, unless and until it gets lazy and lays down sideways.

So the first \(8\) natural numbers are then \(0\text{,}\) \(1\text{,}\) \(2\text{,}\) \(3\text{,}\) \(4\text{,}\) \(5\text{,}\) \(6\) and \(7\text{.}\) To continue with our representation, we want to use the following basic theorem.


Now suppose that for some \(k\in \nonnegints\text{,}\) with \(k\ge 7\text{,}\) we have defined a base eight notation for the representation of \(k\text{,}\) for all \(n\) with \(0\le n\le k\text{,}\) and that in each case, this representation consists of a string of digits, written left to right, and selected from \(\{0,1,2,3,4,5,6,7\}\text{.}\) Write \(k+1=qb+r\) where \(0\le r\lt b\text{.}\) Note that \(q\le k\text{,}\) so that we already have a representation for \(q\text{.}\) To obtain a representation of \(k+1\text{,}\) we simply append \(r\) at the (right) end.

For example, consider the age of Professor Trotter's son. It is then written as 22. And to emphasize the base eight notation, most people would say \(22\text{,}\) base \(8\) and write \((22)_8\text{.}\)

Among the more popular bases are base 2, where only the digits 0 and 1 are used, and base sixteen, where sixteen is the popular word for \((20)_8\text{.}\) Here the digit symbols are \begin{equation*} 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F \end{equation*} Another popular choice, in fact the one in most widespread use in banks, shopping centers and movie theatres, is base ten. Ten is the natural number A, base sixteen. Also, ten is \((12)_8\text{.}\) Most folks use the digits \(0,1,2,3,4,5,6,7,8,9\) for base ten notation. And when no other designation is made, then it is assumed that the natural number is written base ten. So of course, Professor Trotter's son is 18 and is a freshman at Georgia Tech. Which explains why his hair is as white as it is.

For any base \(b>1\text{,}\) caution must be exercised when discussing multiplication, since writing the product \(m\times n\) in the abbreviated form \(mn\) causes us some grief. For example, if \(b = 8\text{,}\) then writing the product \(372\times4885\) as \(3724885\) is ambiguous. For this reason, when using base \(b\) notation, the product symbol \(\times\) (or some variation of \(\times\)) is always used.

SubsectionB.12.1Alternate Versions of Induction

Many authors prefer to start the development of number systems with the set of positive integers and defer the introduction of the concept of zero. In this setting, you have a non-empty set \(\posints\text{,}\) a one-to-one successor function \(s:\posints\injection\posints\) and a positive integer called one and denoted \(1\) that is not the successor of any positive integer. The Principle of Induction then becomes: If \(\mathbb{M}\subseteq\posints\text{,}\) then \(\mathbb{M} =\posints\) if and only if

  1. \(1\in \mathbb{M}\text{;}\) and
  2. \(\forall k\in \nonnegints\quad(k\in \mathbb{M}) \Longrightarrow(s(k)\in \mathbb{M})\text{.}\)

More generally, to show that a set \(\mathbb{M}\) contains all integers greater than or equal to an integer \(n\text{,}\) it is sufficient to show that (i) \(n\in\mathbb{M}\text{,}\) and (ii) For all \(k\in\ints,\,(k\in\mathbb{M}\Longrightarrow(k+1\in\mathbb{M})\text{.}\)

Here is another version of induction, one that is particularly useful in combinatorial arguments.

SectionB.13Equivalence Relations

A binary relation \(R\) is symmetric if \((x,y)\in R\) implies \((y,x)\in R\) for all \(x,y\in X\text{.}\)

A binary relation \(R\) on a set \(X\) is called an equivalence relation when it is reflexive, symmetric, and transitive. Typically, symbols like, \(=\text{,}\) \(\cong\text{,}\) \(\equiv\) and \(\sim\) are used to denote equivalence relations. An equivalence relation, say \(\cong\text{,}\) defines a partition on the set \(X\) by setting \begin{equation*} \langle x\rangle =\{y\in X: x\cong y\}. \end{equation*} Note that if \(x,y\in X\) and \(\langle x\rangle\cap\langle y\rangle \neq\emptyset\text{,}\) then \(\langle x\rangle=\langle y\rangle\text{.}\) The sets in this partition are called equivalence classes.

When using the ordered pair notation for binary relations, to indicate that a pair \((x,y)\) is not in the relation, we simply write \((x,y)\notin R\text{.}\) When using the alternate notation, this is usually denoted by using the negation symbol from logic and writing \(\lnot (xRy)\text{.}\) Many of the special symbols used to denote equivalence relations come with negative versions: \(x\neq y\text{,}\) \(x\ncong y\text{,}\) \(x\nsim y\text{,}\) etc.

SectionB.14The Integers as Equivalence Classes of Ordered Pairs

Define a binary relation \(\cong\) on the set \(Z=\nonnegints \times\nonnegints\) by \begin{equation*} (a,b)\cong (c,d)\quad\textit{iff}\quad a+d=b+c. \end{equation*}

Now that we know that \(\cong\) is an equivalence relation on \(Z\text{,}\) we know that \(\cong\) partitions \(Z\) into equivalence classes. For an element \((a,b)\in Z\text{,}\) we denote the equivalence class of \((a,b)\) by \(\langle (a,b)\rangle\text{.}\)

Let \(\ints\) denote the set of all equivalence classes of \(Z\) determined by the equivalence relation \(\cong\text{.}\) The elements of \(\ints\) are called integers.

SectionB.15Properties of the Integers

For the remainder of this chapter, most statements will be given without proof. Students are encouraged to fill in the details.

We define a binary operation \(+\) on \(\ints\) by the following rule: \begin{equation*} \langle(a,b)\rangle+\langle(c,d)\rangle = \langle(a+c,b+d)\rangle. \end{equation*} Note that the definition of addition is made in terms of representatives of the class, so we must pause to make sure that \(+\) is well defined, i.e., independent of the particular representatives.

In what follows, we use a single symbol, like \(x\text{,}\) \(y\) or \(z\) to denote an integer, but remember that each integer is in fact an entire equivalence class whose elements are ordered pairs of natural numbers.

Next, we define a second binary operation called multiplication, and denoted \(x\times y\text{,}\) \(x*y\) or just \(xy\text{.}\) When \(x=\langle(a,b)\rangle\) and \(y=\langle(c,d)\rangle\text{,}\) we define: \begin{equation*} xy =\langle(a,b)\rangle\langle(c,d)\rangle= \langle(ac+bd, ad+bc)\rangle. \end{equation*}

The integer \(\langle(0,0)\rangle\) has a number of special properties. Note that for all \(x\in\ints\text{,}\) \(x+\langle(0,0)\rangle= x\) and \(x\langle(0,0)\rangle=\langle(0,0)\rangle\text{.}\) So most folks call \(\langle(0,0)\rangle\) zero and denote it by \(0\text{.}\) This is a terrible abuse of notation, since we have already used the word zero and the symbol \(0\) to denote a particular natural number.

But mathematicians, computer scientists and even real people do this all the time. We use the same word and even the same phrase in many different settings expecting that the listener will make the correct interpretation. For example, how many different meanings do you know for You're so bad?

If \(x=\langle(a,b)\rangle\) is an integer and \(y= \langle(b,a)\rangle\text{,}\) then \(x+y=\langle(a+b,a+b)\rangle=0\text{.}\) The integer \(y\) is then called the additive inverse of \(x\) and is denoted \(-x\text{.}\) The additive inverse of \(x\) is also called minus \(x\). The basic property is that \(x + (-x) = 0\text{,}\) for every \(x\in \ints\text{.}\)

We can now define a new binary operation, called subtraction and denoted \(-\text{,}\) on \(\ints\) by setting \(x-y= x+(-y)\text{.}\) In general, subtraction is neither commutative nor associative. However, we do have the following basic properties.

Next, we define a total order on \(\ints\) by setting \(x\le y\) in \(\ints\) when \(x=\langle(a,b)\rangle\text{,}\) \(y=\langle(c,d) \rangle\) and \(a+d \le b+c\) in \(\nonnegints\text{.}\)

For multiplication, the situation is more complicated.

Now consider the function \(f:\nonnegints\longrightarrow \ints\) defined by \(f(n) = \langle(n,0)\rangle\text{.}\) It is easy to show that \(f\) is an injection. Furthermore, it respects addition and multiplication, i.e., \(f(n+m)=f(n)+f(m)\) and \(f(nm)=f(n)f(m)\text{.}\) Also, note that if \(x\in \ints\text{,}\) then \(x>0\) if and only if \(x=f(n)\) for some \(n\in \nonnegints\text{.}\) So, it is customary to abuse notation slightly and say that \(\nonnegints\) is a “subset” of \(\ints\text{.}\) Similarly, we can either consider the set \(\posints\) of positive integers as the set of natural numbers that are successors, or as the set of integers that are greater than \(0\text{.}\)

When \(n\) is a positive integer and \(0\) is the zero in \(\ints\text{,}\) we define \(0^n=0\text{.}\) When \(x\in\ints\text{,}\) \(x\neq 0\) and \(n\in\nonnegints\text{,}\) we define \(x^n\) inductively by (i) \(x^0=1\) and \(x^{k+1}=xx^k\text{.}\)

SectionB.16Obtaining the Rationals from the Integers

We consider the set \(\rats\) of all ordered pairs in \(\ints\times \ints\) of the form \((x,y)\) with \(y\neq 0\text{.}\) Elements of \(\rats\) are called rational numbers, or fractions. Define an equivalence relation, denoted \(=\text{,}\) on \(\rats\) by setting \((x,y)=(z,w)\) if and only if \(xw=yz\text{.}\) Here we should point out that the symbol \(=\) can be used (and often is) to denote an equivalence relation. It is not constrained to mean “identically the same.”

When \(q=(x,y)\) is a fraction, \(x\) is called the numerator and \(y\) is called the denominator of \(q\text{.}\) Remember that the denominator of a fraction is never zero.

Addition of fractions is defined by \begin{equation*} (a,b)+(c,d) = (ad+bc,bd), \end{equation*} while multiplication is defined by \begin{equation*} (a,b)(c,d) = (ac,bd). \end{equation*} As was the case with integers, it is important to pause and prove that both operations are well defined.

Addition and multiplication are both associative and commutative. Also, we have the distributive property.

The additive inverse of a fraction \((a,b)\) is just \((-a,b)\text{.}\) Using this, we define subtraction for fractions: \((a,b)-(c,d)=(a,b)+(-c,d)\text{.}\)

When \((a,b)\) is a fraction, and \(a\neq 0\text{,}\) the fraction \((b,a)\) is the reciprocal of \((a,b)\text{.}\) The reciprocal is also called the multiplicative inverse, and the reciprocal of \(x\) is denoted \(x^{-1}\text{.}\) When \(y\neq0\text{,}\) we can then define division by setting \(x/y=xy^{-1}\text{,}\) i.e., \((a,b)/(c,d)=(ad, bc)\text{.}\) Of course, division by zero is not defined, a fact that you probably already knew!

As was the case for both \(\nonnegints\) and \(\ints\text{,}\) when \(n\) is a positive integer, and \(0\) is the zero in \(\rats\text{,}\) we define \(0^n=0\text{.}\) When \(x=(a,b)\) is a fraction with \(x\neq0\) and \(n\) is a non-negative integer, we define \(x^n\) inductively by (i) \(x^0=1\) and (ii) \(x^{n+1}=xx^n\text{.}\)

Many folks prefer an alternate notation for fractions in which the numerator is written directly over the denominator with a horizontal line between them, so \((2,5)\) can also be written as \(\frac{2}{5}\text{.}\)

Via the map \(g(x) = (x,1)=\frac{x}{1}\text{,}\) we again say that the integers are a “subset” of the rationals. As before, note that \(g(x+y) = g(x)+g(y)\text{,}\) \(g(x-y)=g(x)-g(y)\) and \(g(xy)=g(x)g(y)\text{.}\)

In the third grade, you were probably told that \(5 =\frac{5}{1}\text{,}\) but by now you are realizing that this is not exactly true. Similarly, if you had told your teacher that \(\frac{3}{4}\) and \(\frac{6}{8}\) weren't really the same and were only “equal” in the broader sense of an equivalence relation defined on a subset of the cartesian product of the integers, you probably would have been sent to the Principal's office.

Try to imagine the trouble you would have gotten into had you insisted that the real meaning of \(\frac{1}{2}\) was \begin{equation*} \frac{1}{2} =\langle(\langle(s(s(0)),s(0))\rangle, \langle(s(s(0)),0)\rangle)\rangle \end{equation*} We can also define a total order on \(\rats\text{.}\) To do this, we assume that \((a,b),(c,d)\in\rats\) have \(b,d>0\text{.}\) (If \(b\lt 0\text{,}\) for example, we would replace it by \((a',b')=(-a,-b)\text{,}\) which is in the same equivalence class as \((a,b)\) and has \(b'>0\text{.}\)) Then we set \((a,b)\leq (c,d)\) in \(\rats\) if \(ad\leq bc\) in \(\ints\text{.}\)

SubsectionB.16.1Integer Exponents

When \(n\) is a positive integer and \(0\) is the zero in \(\rats\text{,}\) we define \(0^n=0\text{.}\) When \(x\in\rats\text{,}\) \(x\neq 0\) and \(n\in\nonnegints\text{,}\) we define \(x^n\) inductively by (i) \(x^0=1\) and \(x^{k+1}=xx^k\text{.}\) When \(n\in\ints\) and \(n\lt 0\text{,}\) we set \(x^n=1/x^{-n}\text{.}\)

SectionB.17Obtaining the Reals from the Rationals

A full discussion of this would take us far away from a discrete math class, but let's at least provide the basic definitions. A subset \(S\subset \rats\) of the rationals is called a cut (also, a Dedekind cut), if it satisfies the following properties:

  1. \(\emptyset\neq S\neq \rats\text{,}\) i.e, \(S\) is a proper non-empty subset of \(\rats\text{.}\)

  2. \(x\in S\) and \(y\lt x\) in \(\rats\) implies \(y\in S\text{,}\) for all \(x,y\in \rats\text{.}\)

  3. For every \(x\in S\text{,}\) there exists \(y\in S\) with \(x\lt y\text{,}\) i.e., \(S\) has no greatest element.

Cuts are also called real numbers, so a real number is a particular kind of set of rational numbers. For every rational number \(q\text{,}\) the set \(\bar{q}= \{p\in \rats: p\lt q\}\) is a cut. Such cuts are called rational cuts. Inside the reals, the rational cuts behave just like the rational numbers and via the map \(h(q)=\bar{q}\text{,}\) we abuse notation again (we are getting used to this) and say that the rational numbers are a subset of the real numbers.

But there are cuts which are not rational. Here is one: \(\{p\in \rats: p\le 0\}\cup \{p\in \rats: p^2\lt 2\}\text{.}\) The fact that this cut is not rational depends on the familiar proof that there is no rational \(q\) for which \(q^2=2\text{.}\)

The operation of addition on cuts is defined in the natural way. If \(S\) and \(T\) are cuts, set \(S+T=\{s+t:s\in S, t\in T\}\text{.}\) Order on cuts is defined in terms of inclusion, i.e., \(S\lt T\) if and only if \(S\subsetneq T\text{.}\) A cut is positive if it is greater than \(\bar{0}\text{.}\) When \(S\) and \(T\) are positive cuts, the product \(ST\) is defined by \begin{equation*} ST= \bar{0}\cup\{st:s\in S, t\in T, s\ge0, t\ge 0\}. \end{equation*} One can easily show that there is a real number \(r\) so that \(r^2=\bar{2}\text{.}\) You may be surprised, but perhaps not, to learn that this real number is denoted \(\sqrt2\text{.}\)

There are many other wonders to this story, but enough for one day.

SectionB.18Obtaining the Complex Numbers from the Reals

By now, the following discussion should be transparent. The complex number system \(\complexes\) is just the cartesian product \(\reals\times\reals\) with

  1. \((a,b) = (c,d)\) in \(\complexes\) if and only if \(a=c\) and \(b=d\) in \(\reals\text{.}\)

  2. \((a,b)+(c,d)=(a+c,b+d)\text{.}\)

  3. \((a,b)(c,d)=(ac-bd, ad+bc)\text{.}\)

Now the complex numbers of the form \((a,0)\) behave just like real numbers, so is natural to say that the complex number system contains the real number system. Also, note that \((0,1)^2=(0,1)(0,1)=(-1,0)\text{,}\) i.e., the complex number \((0,1)\) has the property that its square is the complex number behaving like the real number \(-1\text{.}\) So it is convenient to use a special symbol like \(i\) for this very special complex number and note that \(i^2=-1\text{.}\)

With this beginning, it is straightforward to develop all the familiar properties of the complex number system.

SubsectionB.18.1Decimal Representation of Real Numbers

Every real number has a decimal expansion—although the number of digits after the decimal point may be infinite. A rational number \(q=m/m\) from \(\rats\) has an expansion in which a certain block of digits repeats indefinitely. For example, \begin{equation*} \frac{2859}{35} = 81.6857142857142857142857142857142857142857142\dots \end{equation*} In this case, the block \(857142\) of size \(6\) is repeated forever.

Certain rational numbers have terminating decimal expansions. For example, we know that \(385/8= 48.125\text{.}\) If we chose to do so, we could write this instead as an infinite decimal by appending trailing \(0\)'s, as a repeating block of size \(1\text{:}\) \begin{equation*} \frac{385}{8} = 48.1250000000000000000000000000000000\dots \end{equation*} On the other hand, we can also write the decimal expansion of \(385/8\) as \begin{equation*} \frac{385}{8} = 48.12499999999999999999999999999999999\dots \end{equation*} Here, we intend that the digit \(9\text{,}\) a block of size \(1\text{,}\) be repeated forever. Apart from this anomaly, the decimal expansion of real numbers is unique.

On the other hand, irrational numbers have non-repeating decimal expansions in which there is no block of repeating digits that repeats forever.

You know that \(\sqrt{2}\) is irrational. Here is the first part of its decimal expansion: \begin{equation*} \sqrt{2} =1.41421356237309504880168872420969807856967187537694807317667973\dots \end{equation*} An irrational number is said to be algebraic if it is the root of polynomial with integer coefficients; else it is said to be transcendental. For example, \(\sqrt{2}\) is algebraic since it is the root of the polynomial \(x^2-2\text{.}\)

Two other famous examples of irrational numbers are \(\pi\) and \(e\text{.}\) Here are their decimal expansions: \begin{align*} \pi \amp =3.14159265358979323846264338327950288419716939937510582097494459\dots\\ \end{align*} and \begin{align*} e\amp=2.7182818284590452353602874713526624977572470936999595749669676277\dots \end{align*} Both \(\pi\) and \(e\) are transcendental.


Amanda and Bilal, both students at a nearby university, have been studying rational numbers that have large blocks of repeating digits in their decimal expansions. Amanda reports that she has found two positive integers \(m\) and \(n\) with \(n\lt 500\) for which the decimal expansion of the rational number \(m/n\) has a block of 1961 digits which repeats indefinitely. Not to be outdone, Bilal brags that he has found such a pair \(s\) and \(t\) of positive integers with \(t\lt 300\) for which the decimal expansion of \(s/t\) has a block of \(7643\) digits which repeats indefinitely. Bilal should be (politely) told to do his arithmetic more carefully, as there is no such pair of positive integers (Why?). On the other hand, Amanda may in fact be correct—although, if she has done her work with more attention to detail, she would have reported that the decimal expansion of \(m/n\) has a smaller block of repeating digits (Why?).

SectionB.19The Zermelo-Fraenkel Axioms of Set Theory

In the first part of this appendix, we put number systems on a firm foundation, but in the process, we used an intuitive understanding of sets. Not surprisingly, this approach is fraught with danger. As was first discovered more than 100 years ago, there are major conceptual hurdles in formulating consistent systems of axioms for set theory. And it is very easy to make statements that sound “obvious” but are not.

Here is one very famous example. Let \(X\) and \(Y\) be sets and consider the following two statements:

  1. There exists an injection \(f:X\rightarrow Y\text{.}\)

  2. There exists a surjection \(g:Y\rightarrow X\text{.}\)

If \(X\) and \(Y\) are finite sets, these statements are equivalent, and it is perhaps natural to surmise that the same is true when \(X\) and \(Y\) are infinite. But that is not the case.

Here is the system of axioms popularly known as ZFC, which is an abbreviation for Zermelo-Fraenkel plus the Axiom of Choice. In this system, the notion of set and the membership operator \(\in\) are undefined. However, if \(A\) and \(B\) are sets, then exactly one of the following statements is true: (i) \(A\in B\) is true; (ii) \(A\in B\) is false. When \(A\in B\) is false, we write \(A\notin B\text{.}\) Also, there is an equivalence relation \(=\) defined on sets.

A good source of additional (free) information on set theory is the collection of Wikipedia articles. Do a web search and look up the following topics and people:

  1. Zermelo-Fraenkel set theory.

  2. Axiom of Choice.

  3. Peano postulates.

  4. Georg Cantor, Augustus De Morgan, George Boole, Bertrand Russell and Kurt Gödel.