AppendixBBackground Material for Combinatorics¶ permalink
This appendix treats background material essential to the study of combinatorial mathematics. Many students will find that most—and perhaps all—of this material has been covered somewhere in their prior course work, and we expect that very few instructors will include this appendix in the syllabus. Nevertheless, students may find it convenient to consult this appendix from time to time, and we suspect that many instructors will encourage students to read this material to refresh their memories of key concepts.
Set theory is concerned with elements, certain collections of elements called sets and a concept of membership. For each element \(x\) and each set \(X\text{,}\) exactly one of the following two statements holds:
\(x\) is a member of \(X\text{.}\)
\(x\) is not a member of \(X\text{.}\)
It is important to note that membership cannot be ambiguous.
When \(x\) is an element and \(X\) is a set, we write \(x\in X\) when \(x\) is a member of \(X\text{.}\) Also, the statement \(x\) belongs to \(X\) means exactly the same thing as \(x\) is a member of \(X\text{.}\) Similarly, when \(x\) is not a member of \(X\text{,}\) we write \(x\notin X\) and say \(x\) does not belong to \(X\text{.}\)
Certain sets will be defined explicitly by listing the elements. For example, let \(X=\{a,b,d,g,m\}\text{.}\) Then \(b\in X\) and \(h\notin X\text{.}\) The order of elements in such a listing is irrelevant, so we could also write \(X=\{g,d,b,m,a\}\text{.}\) In other situations, sets will be defined by giving a rule for membership. As examples, let \(\posints\) denote the set of positive integers. Then let \(X=\{n\in\posints:5\le n\le 9\}\text{.}\) Note that \(6,8\in X\) while \(4,10,238\notin X\text{.}\)
Given an element \(x\) and a set \(X\text{,}\) it may at times be tedious and perhaps very difficult to determine which of the statements \(x\in X\) and \(x\notin X\) holds. But if we are discussing sets, it must be the case that exactly one is true.
ExampleB.1
Let \(X\) be the set consisting of the following \(12\) positive integers:
\begin{align*}
\amp 13232112332\\
\amp 13332112332\\
\amp 13231112132\\
\amp 13331112132\\
\amp 13232112112\\
\amp 13231112212\\
\amp 13331112212\\
\amp 13232112331\\
\amp 13231112131\\
\amp 13331112131\\
\amp 13331112132\\
\amp 13332112111\\
\amp 13231112131
\end{align*}
Note that one number is listed twice. Which one is it? Also, does \(13232112132\) belong to \(X\text{?}\) Note that the apparent difficulty of answering these questions stems from (1) the size of the set \(X\) and (2) the size of the integers that belong to \(X\text{.}\) Can you think of circumstances in which it is difficult to answer whether \(x\) is a member of \(X\) even when it is known that \(X\) contains exactly one element?
ExampleB.2
Let \(P\) denote the set of primes. Then \(35\notin P\) since \(35= 5\times 7\text{.}\) Also, \(19\in P\text{.}\) Now consider the number
\begin{equation*}
n = 77788467064627123923601532364763319082817131766346039653933
\end{equation*}
Does \(n\) belong to \(P\text{?}\) Alice says yes while Bob says no. How could Alice justify her affirmative answer? How could Bob justify his negative stance? In this specific case, I know that Alice is right. Can you explain why?
When \(X\) and \(Y\) are sets, the intersection of \(X\) and \(Y\text{,}\) denoted \(X\cap Y\), is defined by
\begin{equation*}
X\cap Y = \{x: x\in X, x\in Y\}
\end{equation*}
Note that this notation uses the convention followed by many programming languages. Namely, the “comma” in the definition means that both requirements for membership be satisfied. For example, if \(X=\{b,c,e,g,m\}\) and \(Y=\{a,c,d,h,m,n,p\}\text{,}\) then \(X\cap Y=\{c,m\}\text{.}\)
SubsectionB.2.1The Meaning of \(2\)-Letter Words
In the not too distant past, there was considerable discussion in the popular press on the meaning of the \(2\)-letter word is. For mathematicians and computer scientists, it would have been far more significant to have a discussion of the \(2\)-letter word or. The problem is that the English language uses or in two fundamentally different ways. Consider the following sentences:
A nearby restaurant has a dinner special featuring two choices for dessert: flan de casa or tirami-su.
A state university accepts all students who have graduated from in-state high schools and have SAT scores above \(1000\) or have grade point averages above \(3.0\text{.}\)
A local newspaper offers customers the option of paying their for their newspaper bills on a monthly or semi-annual basis.
In the first and third statement, it is clear that there are two options but that only one of them is allowed. However, in the second statement, the interpretation is that admission will be granted to students who satisfy at least one of the two requirements. These interpretations are called respectively the exclusive and inclusive versions of or. In this class, we will assume that whenever the word “or” is used, the inclusive interpretation is intended—unless otherwise stated.
For example, when \(X\) and \(Y\) are sets, the union of \(X\) and \(Y\text{,}\) denoted \(X\cup Y\text{,}\) is defined by
\begin{equation*}
X\cup Y = \{x: x\in x \text{ or } x\in Y\}.
\end{equation*}
For example, if \(X=\{b,c,e,g,m\}\) and \(Y=\{a,c,d,h,m,n,p\}\text{,}\) then
\begin{equation*}
X\cup Y=\{a,b,c,d,e,g,h,m,n,p\}.
\end{equation*}
Note that \(\cap\) and \(\cup\) are commutative and associative binary operations, as is the case with addition and multiplication for the set \(\posints\) of positive integers, i.e., if \(X\text{,}\) \(Y\) and \(Z\) are sets, then
\begin{equation*}
X\cap Y = Y\cap X \quad\text{ and } \quad X\cup Y = Y\cup X.
\end{equation*}
Also,
\begin{equation*}
X\cap(Y\cap Z)= (X\cap Y)\cap Z\quad\text{ and } \quad
X\cup(Y\cup Z)= (X\cup Y)\cup Z.
\end{equation*}
Also, note that each of \(\cap\) and \(\cup\) distributes over the other, i.e.,
\begin{equation*}
X\cap(Y\cup Z)= (X\cap Y)\cup (X\cap Z)\quad\text{ and } \quad
X\cup(Y\cap Z)= (X\cup Y)\cap (X\cup Z)
\end{equation*}
On the other hand, in \(\posints\text{,}\) multiplication distributes over addition but not vice-versa.
SubsectionB.2.2The Empty Set: Much To Do About Nothing
The empty set, denoted \(\emptyset\) is the set for which \(x\notin \emptyset\) for every element \(x\text{.}\) Note that \(X\cap \emptyset =\emptyset\) and \(X\cup \emptyset=X\text{,}\) for every set \(X\text{.}\)
The empty set is unique in the sense that if \(x\notin X\) for every element \(x\text{,}\) then \(X=\emptyset\text{.}\)
SubsectionB.2.3The First So Many Positive Integers
In this text, we will use the symbols \(\posints\text{,}\) \(\ints\text{,}\) \(\rats\) and \(\reals\) to denote respectively the set of positive integers, the set of all integers (positive, negative and zero), the set of rational numbers (fractions) and the set of real numbers (rationals and irrationals). On occasion, we will discuss the set \(\nonnegints\) of non-negative integers. When \(n\) is a positive integer, we will use the abbreviation \([n]\) for the set \(\{1,2,\dots,n\}\) of the first \(n\) positive integers. For example, \([5]=\{1,2,3,4,5\}\text{.}\) For reasons that may not be clear at the moment but hopefully will be transparent at the right moment, we use the notation \(\bfn\) to denote the \(n\)-element set \(\{0,1,2,\dots,n-1\}\text{.}\) Of course, \(\bfn\) is just the set of the first \(n\) non-negative integers. For example, \(\mathbf{5}=\{0,1,2,3,4\}\text{.}\)
SubsectionB.2.4Subsets, Proper Subsets and Equal Sets
When \(X\) and \(Y\) are sets, we say \(X\) is a subset of \(Y\) and write \(X\subseteq Y\) when \(x\in Y\) for every \(x\in X\text{.}\) When \(X\) is a subset of \(Y\) and there exists at least one element \(y\in Y\) with \(y\notin X\text{,}\) we say \(X\) is a proper subset of \(Y\) and write \(X\subsetneq Y\). For example, the \(P\) of primes is a proper subset of the set \(\posints\) of positive integers.
Surprisingly often, we will encounter a situation where sets \(X\) and \(Y\) have different rules for membership yet both are in fact the same set. For example, let \(X=\{0,2\}\) and \(Y=\{z\in\ints: z+z=z\times z\}\text{.}\) Then \(X=Y\text{.}\) For this reason, it is useful to have a test when sets are equal. If \(X\) and \(Y\) are sets, then
\begin{equation*}
X = Y \quad\text{ if and only if } \quad X\subseteq Y \text{ and } Y\subseteq X.
\end{equation*}
When \(X\) and \(Y\) are sets, the cartesian product of \(X\) and \(Y\text{,}\) denoted \(X\times Y\), is defined by
\begin{equation*}
X\times Y=\{(x,y): x\in X \text{ and } y\in Y\}
\end{equation*}
For example, if \(X=\{a,b\}\) and \(Y=[3]\text{,}\) then
\begin{equation*}
X\times Y=\{(a,1),(b,1),(a,2),(b,2),(a,3),(b,3)\}.
\end{equation*}
Elements of \(X\times Y\) are called ordered pairs. When \(p=(x,y)\) is an ordered pair, the element \(x\) is referred to as the first coordinate of \(p\) while \(y\) is the second coordinate of \(p\text{.}\) Note that if either \(X\) or \(Y\) is the empty set, then \(X\times Y=\emptyset\text{.}\)
ExampleB.3
Let \(X=\{\emptyset,(1,0),\{\emptyset\}\}\) and \(Y=\{(\emptyset,0)\}\text{.}\) Is \(((1,0),\emptyset)\) a member of \(X\times Y\text{?}\)
Cartesian products can be defined for more than two factors. When \(n\ge 2\) is a positive integer and \(X_1,X_2,\dots,X_n\) are non-empty sets, their cartesian product is defined by
\begin{equation*}
X_1\times X_2\times\dots\times X_n=\{(x_1,x_2,\dots,x_n): x_i\in X_i
\text{ for } i = 1,2,\dots,n\}
\end{equation*}
A subset \(R\subseteq X\times Y\) is called a binary relation on \(X\times Y\text{,}\) and a binary relation \(R\) on \(X\times Y\) is called a function from \(X\) to \(Y\) when for every \(x\in X\text{,}\) there is exactly one element \(y\in Y\) for which \((x,y)\in R\text{.}\)
Many authors prefer to write the condition for being a function in two parts:
For every \(x\in X\text{,}\) there is some element \(y\in Y\) for which \((x,y)\in R\text{.}\)
For every \(x\in X\text{,}\) there is at most one element \(y\in Y\) for which \((x,y)\in R\text{.}\)
The second condition is often stated in the following alternative form: If \(x\in X\text{,}\) \(y_1,y_2\in Y\) and \((x,y_1),(x,y_2)\in R\text{,}\) then \(y_1=y_2\text{.}\)
ExampleB.4
For example, let \(X=[4]\) and \(Y=[5]\text{.}\) Then let
\begin{align*}
R_1\amp =\{(2,1),(4,2),(1,1),(3,1)\}\\
R_2\amp =\{(4,2),(1,5),(3,2)\}\\
R_3\amp=\{(3,2),(1,4),(2,2),(1,1),(4,5)\}
\end{align*}
Of these relations, only \(R_1\) is a function from \(X\) to \(Y\text{.}\)
In many settings (like calculus), it is customary to use letters like \(f\text{,}\) \(g\) and \(h\) to denote functions. So let \(f\) be a function from a set \(X\) to a set \(Y\text{.}\) In view of the defining properties of functions, for each \(x\in X\text{,}\) there is a unique element \(y\in Y\) with \((x,y)\in f\text{.}\) And in this case, the convention is to write \(y=f(x)\text{.}\) For example, if \(f=R_1\) is the function in Example B.4, then \(2=f(4)\) and \(f(3) =1\text{.}\)
The shorthand notation \(f:X\rightarrow Y\) is used to indicate that \(f\) is a function from the set \(X\) to the set \(Y\text{.}\)
In calculus, we study functions defined by algebraic rules. For example, consider the function \(f\) whose rule is \(f(x) = 5x^3-8x+7\text{.}\) This short hand notation means that \(X=Y=\reals\) and that
\begin{equation*}
f=\{(x,5x^3-8x+7):x\in\reals\}
\end{equation*}
In combinatorics, we sometimes study functions defined algebraically, just like in calculus, but we will frequently describe functions by other kinds of rules. For example, let \(f:\posints\rightarrow\posints\) be defined by \(f(n) = |n/2|\) if \(n\) is even and \(f(n)=3|n|+1\) when \(n\) is odd.
A function \(f:X\rightarrow Y\) is called an injection from \(X\) to \(Y\) when for every \(y\in Y\text{,}\) there is at most one element \(x\in X\) with \(y=f(x)\text{.}\)
When the meaning of \(X\) and \(Y\) is clear, we just say \(f\) is an injection. An injection is also called a \(1\)–\(1\) function (read this as “one to one”) and this is sometimes denoted as \(f:X\injection Y\text{.}\)
A function \(f:X\rightarrow Y\) is called a surjection from \(X\) to \(Y\) when for every \(y\in Y\text{,}\) there is at least one \(x\in X\) with \(y=f(x)\text{.}\)
Again, when the meaning of \(X\) and \(Y\) is clear, we just say \(f\) is a surjection. A surjection is also called an onto function and this is sometimes denoted as \(f:X\surjection Y\text{.}\)
A function \(f\) from \(X\) to \(Y\) which is both an injection and a surjection is called a bijection. Alternatively, a bijection is referred to as a \(1\)–\(1\text{,}\) onto function, and this is sometimes denoted as \(f:X \bijection Y\). A bijection is also called a \(1\)–\(1\)-correspondence.
ExampleB.5
Let \(X=Y=\reals\text{.}\) Then let \(f\text{,}\) \(g\) and \(h\) be the functions defined by
- \(f(x)=3x-7\text{.}\)
- \(g(x)=3(x-2)(x+5)(x-7)\text{.}\)
- \(h(x)=6x^2-5x+13\text{.}\)
Then \(f\) is a bijection; \(g\) is a surjection but not an injection (Why?); and \(h\) is neither an injection nor a surjection (Why?).
PropositionB.6
Let \(X\) and \(Y\) be sets. Then there is a bijection from \(X\) to \(Y\) if and only if there is a bijection from \(Y\) to \(X\text{.}\)
A set \(X\) is said to be finite when either (1) \(X=\emptyset\text{;}\) or (2) there exists positive integer \(n\) and a bijection \(f:[n]\bijection X\text{.}\) When \(X\) is not finite, it is called infinite. For example, \(\{a,\emptyset,(3,2),\posints\}\) is a finite set as is \(\posints\times\emptyset\text{.}\) On the other hand, \(\posints\times \{\emptyset\}\) is infinite. Of course, \([n]\) and \(\bfn\) are finite sets for every \(n\in\posints\text{.}\)
PropositionB.7
If \(X\) be a non-empty finite set, then there is a unique positive integer \(n\) for which there is a bijection \(f:[n]\bijection X\text{.}\)
In some cases, it may take some effort to determine whether a set is finite or infinite. Here is a truly classic result.
PropositionB.8
The set \(P\) of primes is infinite.
Suppose that the set \(P\) of primes is finite. It is non-empty since \(2\in P\text{.}\) Let \(n\) be the unique positive integer for which there exists a bijection \(f:[n]\rightarrow P\text{.}\) Then let
\begin{equation*}
p=1+f(1)\times f(2)\times f(3)\times \dots\times f(n)
\end{equation*}
Then \(p\) is not divisible by any of the primes in \(P\) but is larger than any element of \(P\text{.}\) Thus, either \(p\) is prime or there is a prime that does not belong to \(P\text{.}\) The contradiction completes the proof.
Here's a famous example of a set where no one knows if the set is finite or not.
ConjectureB.9
It is conjectured that the following set is infinite:
\begin{equation*}
T=\{n\in\posints:n \text{ and } n+2 \text{ are both
primes } \}.
\end{equation*}
This conjecture is known as the Twin Primes Conjecture. Guaranteed \(\text{A} ++\) for any student who can settle it!
PropositionB.10
Let \(X\) and \(Y\) be finite sets. If there exists an injection \(f:X\injection Y\) and an injection \(g:Y \injection X\text{,}\) then there exists a bijection \(h:X \bijection Y\text{.}\)
When \(X\) is a finite non-empty set, the cardinality of \(X\text{,}\) denoted \(|X|\) is the unique positive integer \(n\) for which there is a bijection \(f:[n]\bijection X\text{.}\) Intuitively, \(|X|\) is the number of elements in \(X\text{.}\) For example,
\begin{equation*}
|\{(6,2), (8,(4,\emptyset)), \{3,\{5\}\}\}|=3.
\end{equation*}
By convention, the cardinality of the empty set is taken to be zero, and we write \(|\emptyset|=0\text{.}\)
PropositionB.11
If \(X\) and \(Y\) are finite non-empty sets, then \(|X\times Y| =|X|\times |Y|\text{.}\)
We note that the statement in Proposition B.11 is an example of “operator overloading”, a technique featured in several programming languages. Specifically, the times sign \(\times\) is used twice but has different meanings. As part of \(X\times Y\text{,}\) it denotes the cartesian product, while as part of \(|X|\times |Y|\text{,}\) it means ordinary multiplication of positive integers. Programming languages can keep track of the data types of variables and apply the correct interpretation of an operator like \(\times\) depending on the variables to which it is applied.
We also have the following general form of Proposition B.11:
\begin{equation*}
|X_1\times X_2\times\dots\times X_n|=
|X_1|\times |X_2|\times\dots\times |X_n|
\end{equation*}
TheoremB.12
There is a bijection between any two of the following infinite sets \(\posints\text{,}\) \(\ints\) and \(\rats\text{.}\)
There is an injection from \(\rats\) to \(\reals\text{.}\)
There is no surjection from \(\rats\) to \(\reals\text{.}\)
SectionB.6Notation from Set Theory and Logic¶ permalink
In set theory, it is common to deal with statements involving one or more elements from the universe as variables. Here are some examples:
For \(n\in\posints\text{,}\) \(n^2-6n+8=0\text{.}\)
For \(A\subseteq[100]\text{,}\) \(\{2,8,25,58,99\}\subseteq A\text{.}\)
For \(n\in \ints\text{,}\) \(|n|\) is even.
For \(x\in \reals\text{,}\) \(1+1=2\text{.}\)
For \(m,n\in \posints\text{,}\) \(m(m+1)+2n\) is even.
For \(n\in \posints\text{,}\) \(2n+1\) is even.
For \(n\in \posints\) and \(x\in\reals\text{,}\) \(n+x\) is irrational.
These statements may be true for some values of the variables and false for others. The fourth and fifth statements are true for all values of the variables, while the sixth is false for all values.
Implications are frequently abbreviated using with a double arrow \(\Longrightarrow\text{;}\) the quantifier \(\forall\) means “for all” (or “for every”); and the quantifier \(\exists\) means “there exists” (or “there is”). Some writers use \(\wedge\) and \(\vee\) for logical “and” and “or”, respectively. For example,
\begin{equation*}
\forall A,B\subseteq[4]\quad \bigl((1,2\in A) \wedge |B|\ge 3)\bigr)
\Longrightarrow\bigl((A\subseteq B)\vee (\exists n\in A\cup B,
n^2=16)\bigr)
\end{equation*}
The double arrow \(\iff\) is used to denote logical equivalence of statements (also “if and only if”). For example
\begin{equation*}
\forall A\subseteq[7]\quad A\cap\{1,3,6\}\neq\emptyset\iff
A\nsubseteq\{2,4,5,7\}
\end{equation*}
We will use these notational shortcuts except for the use of \(\wedge\) and \(\vee\text{,}\) as we will use these two symbols in another context: binary operators in lattices.
SectionB.8Multiplication as a Binary Operation¶ permalink
We define a binary operation \(\times\text{,}\) called multiplication, on the set of natural numbers. When \(m\) and \(n\) are natural numbers, \(m\times n\) is also called the product of \(m\) and \(n\text{,}\) and it sometimes denoted \(m*n\) and even more compactly as \(mn\text{.}\) We will use this last convention in the material to follow. Let \(n\in \nonnegints\text{.}\) We define
- \(n0=0\text{,}\) and
- \(n(k+1)=nk +n\text{.}\)
Note that \(10=0\) and \(01=00+0=0\text{.}\) Also, note that \(11=10+1=0+1=1\text{.}\) More generally, from (ii) and Lemma B.19, we conclude that if \(m,n\neq0\text{,}\) then \(mn\neq0\text{.}\)
TheoremB.21Left Distributive Law
\(m(n+p)=mn + mp\text{,}\) for all \(m,n,p\in \nonnegints\text{.}\)
Proof
Let \(m,n\in \nonnegints\text{.}\) Then
\begin{equation*}
m(n+0)=mn = mn +0 = mn+ m0.
\end{equation*}
Now assume \(m(n+k) = mn + mk\text{.}\) Then
\begin{align*}
m[n+(k+1)] \amp = m[(n+k)+1]=m(n+k)+m\\
\amp =(mn+mk)+m=mn+(mk+m)= mn+m(k+1).
\end{align*}
TheoremB.22Right Distributive Law
\((m+n)p=mp + np\text{,}\) for all \(m,n,p\in \nonnegints\text{.}\)
Proof
Let \(m,n\in \nonnegints\text{.}\) Then
\begin{equation*}
(m+n)0 =0 = 0+0 = m0 + n0.
\end{equation*}
Now assume \((m+n)k = mk + nk\text{.}\) Then
\begin{align*}
(m+n)(k+1)\amp =(m+n)k+(m+n)= (mk+nk) +(m+n)\\
\amp =(mk+m)+(nk+n)=m(k+1)+n(k+1).
\end{align*}
TheoremB.23Associative Law of Multiplication
\(m(np) = (mn)p\text{,}\) for all \(m,n,p\in \nonnegints\text{.}\)
Proof
Let \(m,n\in \nonnegints\text{.}\) Then
\begin{equation*}
m(n0)= m0 = 0 = (mn)0.
\end{equation*}
Now assume that \(m(nk)=(mn)k\text{.}\) Then
\begin{equation*}
m[n(k+1)]= m(nk + n)= m(nk) + mn =(mn)k + mn = (mn)(k+1).
\end{equation*}
The commutative law requires some preliminary work.
LemmaB.24
\(n0= 0n=0\text{,}\) for all \(n\in \nonnegints\text{.}\)
The lemma holds trivially when \(n=0\text{.}\) Assume \(k0=
0k=0\text{.}\) Then
\begin{equation*}
(k+1)0 =0 = 0+0= 0k+0=0(k+1).
\end{equation*}
LemmaB.25
\(n1 =1n=n\text{,}\) for every \(n\in \nonnegints\text{.}\)
\(01=00+0=0 =10\text{.}\) Assume \(k1=1k=k\text{.}\) Then
\begin{equation*}
(k+1)1=k1+11=1k+1=1(k+1).
\end{equation*}
TheoremB.26Commutative Law of Multiplication
\(mn=nm\text{,}\) for all \(m,n\in \nonnegints\text{.}\)
Proof
Let \(m\in \nonnegints\text{.}\) Then \(m0=0m\text{.}\) Assume \(mk=km\text{.}\) Then
\begin{equation*}
m (k+1) = mk +m = km+m= km +1m=(k+1)m.
\end{equation*}
We now define a binary operation called exponentiation which is defined only on those ordered pairs \((m,n)\) of natural numbers where not both are zero. The notation for exponentiation is non-standard. In books, it is written \(m^n\) while the notations \(m**n\text{,}\) \(m\wedge n\) and \(\exp(m,n)\) are used in-line. We will use the \(m^n\) notation for the most part.
When \(m=0\text{,}\) we set \(0^n=0\) for all \(n\in\nonnegints\) with \(n\neq0\text{.}\) Now let \(m\neq0\text{.}\) We define \(m^n\) by (i) \(m^0=1\) and (ii) \(m^{k+1}=mm^k\text{.}\)
TheoremB.27
For all \(m,n,p\in\nonnegints\) with \(m\neq0\text{,}\) \(m^{n+p}=m^n\,m^p\text{.}\)
Proof
Let \(m,n\in\nonnegints\) with \(m\neq0\text{.}\) Then \(m^{n+0}=m^n=m^n\,1=m^n\,m^0\text{.}\) Now suppose that \(m^{n+k}=m^n\,m^k\text{.}\) Then
\begin{equation*}
m^{n+(k+1)}=m^{(n+k)+1}=m\,m^{n+k}
= m(m^n\,m^k)=m^n(m\,m^k)=m^n\,m^{k+1}.
\end{equation*}
TheoremB.28
For all \(m,n,p\in\nonnegints\) with \(m\neq0\text{,}\) \((m^n)^p=m^{np}\text{.}\)
Proof
Let \(m,n\in\nonnegints\) with \(m\neq0\text{.}\) Then \((m^n)^0=1=m^0=m^{n0}\text{.}\) Now suppose that \((m^n)^k=m^{nk}\text{.}\) Then
\begin{equation*}
(m^n)^{k+1}=m^n(m^n)^k=m^n(m^{nk})=m^{n+nk}=m^{n(k+1)}.
\end{equation*}
A binary relation \(R\) on a set \(X\) is just a subset of the cartesian product \(X\times X\text{.}\) In discussions of binary relations, the notation \((x,y)\in R\) is sometimes written as \(xRy\text{.}\)
A binary relation \(R\) is:
-
reflexive if \((x,x)\in R\) for all \(x\in X\text{.}\)
-
antisymmetric if \(x=y\) whenever \((x,y)\in R\) and \((y,x)\in R\text{,}\) for all \(x,y\in X\text{.}\)
-
transitive if \((x,y)\in R\) and \((y,z)\in R\) imply \((x,z)\in R\text{,}\) for all \(x,y,z\in X\text{.}\)
A binary relation \(R\) on a set \(X\) is called a partial order on \(X\) when it is reflexive, antisymmetric, and transitive. Traditionally, symbols like \(\le\) and \(\subseteq\) are used to denote partial orders. As an example, recall that if \(X\) is a family of sets, we write \(A\subseteq B\) when \(A\) is a subset of \(B\text{.}\)
When using the ordered pair notation for binary relations, to indicate that a pair \((x,y)\) is not in the relation, we simply write \((x,y)\notin R\text{.}\) When using the alternate notation, this is usually denoted by using the negation symbol from logic and writing \(\lnot (xRy)\text{.}\) Most of the special symbols used to denote partial orders come with negative versions, e.g., \(x\not\le y\text{,}\) \(x\nsubseteq y\text{.}\)
A partial order is called a total order on \(X\) when for all \(x,y\in X\text{,}\) \((x,y)\in R\) or \((y,x)\in R\text{.}\) For example, if
\begin{equation*}
X=\{\emptyset,\{\emptyset\},\{\emptyset,\{\emptyset\}\}\}
\end{equation*}
then \(\subseteq\) is a total order on \(X\text{.}\)
When \(\le\) is a partial order on a set \(X\text{,}\) we write \(x\lt y\) when \(x\le y\) and \(x\neq y\text{.}\)
Let \(m,n\in \nonnegints\text{.}\) Define a binary relation \(\le\) on \(\nonnegints\) by setting \(m\le n\) if and only if there exists a natural number \(p\) so that \(m+p=n\text{.}\)
PropositionB.29
\(\le\) is a total order on \(\nonnegints\text{.}\)
\(\le\) is reflexive since \(n+0=n\) and therefore \(n\le n\text{,}\) for all \(n\in \nonnegints\text{.}\) Next, we show that \(\le\) is antisymmetric. Let \(m,n\in\nonnegints\) and suppose that \(m\le n\) and \(n\le m\text{.}\) Then there exist natural numbers \(p\) and \(q\) so that \(m+p=n\) and \(n+q=m\text{.}\) It follows that
\begin{equation*}
m+(p+q)= (m+p)+q=n+q =m=m+0
\end{equation*}
Therefore \(p+q=0\text{,}\) which implies that \(p=q=0\text{.}\) Thus \(m+p=m+0=m=n\text{.}\)
Next, we show that \(\le\) is transitive. Suppose that \(m,n,p\in \nonnegints\text{,}\) \(m\le n\) and \(n\le p\text{.}\) Then there exist natural numbers \(q\) and \(r\) so that \(m+q=n\) and \(n+r=p\text{.}\) Then
\begin{equation*}
m+(q+r)=(m+q)+r=n+r=p.
\end{equation*}
Thus \(m\le p\text{,}\) and we have now shown that \(\le\) is a partial order on \(\nonnegints\text{.}\)
Finally, we show that \(\le\) is a total order. To accomplish this, we choose an arbitrary element \(m\in\nonnegints\) and show that for every \(n\in\nonnegints\text{,}\) either \(m\le n\) or \(n\le m\text{.}\) We do this by induction on \(n\text{.}\) Suppose first that \(n=0\text{.}\) Since \(0+m=m\text{,}\) we conclude that \(0\le m\text{.}\) Now suppose that for some \(k\in\nonnegints\text{,}\) we have \(m\le k\text{.}\) Then there is a natural number \(p\) so that \(m+p=k\text{.}\) Then \(m+(p+1) =(m+p)+1=k+1\text{,}\) so \(m\le k+1\text{.}\)
On the other hand, suppose that for some \(k\in\nonnegints\text{,}\) we have \(k\le m\text{.}\) If \(k=m\text{,}\) then \(m\le k\) and \(m\le k+1\) as above. Now suppose that \(k\le m\) and \(k\neq m\text{.}\) Since \(k\le m\text{,}\) there exists a natural number \(p\) so that \(k+p=m\text{.}\) Since \(k\neq m\text{,}\) we know \(p\neq 0\text{.}\) Therefore, there is a natural number \(q\) so that \(p=q+1\text{.}\) Then \(m=k+p=k+(q+1)=(k+1)+q\) which shows that \(k+1\le m\text{.}\)
Note that if \(m,n\in\nonnegints\text{,}\) then \(m\lt n\) if and only if there exists a natural number \(p\neq 0\) so that \(m+p=n\text{.}\)
TheoremB.30Monotonic Law for Addition
Let \(m,n,p\in \nonnegints\text{.}\) If \(m\le n\text{,}\) then \(m+p\le n+p\text{.}\) Furthermore, if \(m\lt n\text{,}\) then \(m+p\lt n+p\text{.}\)
Proof
It suffices to prove that if \(m,n\in \nonnegints\) with \(m\lt n\text{,}\) then \(m+p\lt n+p\) for every \(p\in\nonnegints\text{.}\) Let \(q\neq0\) be the natural number so that \(m+q =n\text{.}\) Now let \(p\in \nonnegints\text{.}\) Then \((m+p)+q=(m+q)+p=n+p\text{,}\) so \(m+p \lt n+p\text{.}\)
LemmaB.31
If \(m,n\in \nonnegints\text{,}\) \(m\neq0\) and \(n\neq0\text{,}\) then \(mn\neq0\text{.}\)
Assume to the contrary, that \(m,n\in \nonnegints\text{,}\) \(m\neq0\text{,}\) \(n\neq0\) and \(mn=0\text{.}\) Let \(n=s(p)\text{.}\) Then \(0=mn= ms(p)+m\) which requires \(m=0\text{.}\) This is a contradiction.
TheoremB.32Monotonic Law for Multiplication
Let \(m,n,p\in \nonnegints\text{.}\) If \(m\le n\text{,}\) then \(mp\le np\text{.}\) Furthermore, if \(m\lt n\) and \(p\neq0\text{,}\) then \(mp\lt np\text{.}\)
Proof
Only the last statement requires proof. Let \(m,n\in\nonnegints\) with \(m\lt n\text{.}\) Then \(m+q=n\) for some \(q\neq0\text{.}\) Then \(np=(m+q)p=mp+pq\text{.}\) Since \(pq\neq0\text{,}\) we conclude \(mp\lt np\text{.}\)
CorollaryB.33Cancellation Law of Multiplication
If \(m,n,p\in \nonnegints\text{,}\) \(mp=np\text{,}\) and \(p\neq 0\text{,}\) then \(m=n\text{.}\)
If \(m\lt n\text{,}\) then \(mp\lt np\text{,}\) and if \(n\lt m\text{,}\) then \(np\lt mp\text{.}\) We conclude that \(m=n\text{.}\)
In some sense, we already have a workable notation for natural numbers. In fact, we really didn't need a special symbol for \(s(0)\text{.}\) The natural number \(0\) and the sucessor function \(s\) are enough. For example, the positive integer associated with the number of fingers (including the thumb) on one hand is \(s(s(s(s(s(0)))))\text{,}\) our net worth is \(0\text{,}\) and the age of Professor Trotter's son in years when this section was first written was
\begin{equation*}
s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(s(0)))))))))))))))))).
\end{equation*}
Admittedly, this is not very practical, especially if some day we win the lottery or want to discuss the federal deficit. So it is natural (ugh!) to consider alternative notations.
Here is one such scheme. First, let's decide on a natural \(b>s(0)\) as base. We will then develop a notation which is called the base \(b\) notation. We already have a special symbol for zero, namely \(0\text{,}\) but we need additional symbols for each natural number \(n\) with \(0\lt n\lt b\text{.}\) These symbols are called digits. For example, the positive integer \(b=s(s(s(s(s(s(s(s(0))))))))\) is called eight, and it makes a popular choice as a base. Here are the symbols (digits) customarily chosen for this base: \(1=s(0)\text{,}\) \(2=s(1)\text{;}\) \(3=s(2)\text{;}\) \(4=s(3)\text{;}\) \(5=s(4)\text{;}\) \(6=s(5)\text{;}\) and \(7=s(6)\text{.}\) Technically speaking, it is not necessary to have a separate symbol for \(b\text{,}\) but it might be handy regardless. In this case, most people prefer the symbol \(8\text{.}\) We like this symbol, unless and until it gets lazy and lays down sideways.
So the first \(8\) natural numbers are then \(0\text{,}\) \(1\text{,}\) \(2\text{,}\) \(3\text{,}\) \(4\text{,}\) \(5\text{,}\) \(6\) and \(7\text{.}\) To continue with our representation, we want to use the following basic theorem.
TheoremB.34
Let \(n,d\in \nonnegints\) with \(d>0\text{.}\) Then there exist unique natural numbers \(q\) and \(r\) so that \(n=qd+r\) and \(0\le r \lt d\text{.}\)
Proof
Let \(d\in \nonnegints\) with \(d>0\text{.}\) We first show that for each \(n\in\nonnegints\text{,}\) there exists \(q,r\in\nonnegints\) so that \(n=qd+r\) and \(0\le r\lt d\text{.}\) If \(n=0\text{,}\) we can take \(q=0\) and \(r=0\text{.}\) Now suppose that \(k=qd+r\) and \(0\le r\lt m\) for some \(k\in \nonnegints\text{.}\)
Note that \(r\lt d\) implies \(r+1\le d\text{.}\) If \(r+1\lt d\text{,}\) then \(k+1=qd+(r+1)\text{.}\) On the other hand, if \(r+1=d\text{,}\) then \(k+1=(q+1)d+0\text{.}\)
Now that existence has been settled, we note that the uniqueness of \(q\) and \(r\) follow immediately from the cancellation properties.
Now suppose that for some \(k\in \nonnegints\text{,}\) with \(k\ge 7\text{,}\) we have defined a base eight notation for the representation of \(k\text{,}\) for all \(n\) with \(0\le n\le k\text{,}\) and that in each case, this representation consists of a string of digits, written left to right, and selected from \(\{0,1,2,3,4,5,6,7\}\text{.}\) Write \(k+1=qb+r\) where \(0\le r\lt b\text{.}\) Note that \(q\le k\text{,}\) so that we already have a representation for \(q\text{.}\) To obtain a representation of \(k+1\text{,}\) we simply append \(r\) at the (right) end.
For example, consider the age of Professor Trotter's son. It is then written as 22. And to emphasize the base eight notation, most people would say \(22\text{,}\) base \(8\) and write \((22)_8\text{.}\)
Among the more popular bases are base 2, where only the digits 0 and 1 are used, and base sixteen, where sixteen is the popular word for \((20)_8\text{.}\) Here the digit symbols are
\begin{equation*}
0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F
\end{equation*}
Another popular choice, in fact the one in most widespread use in banks, shopping centers and movie theatres, is base ten. Ten is the natural number A, base sixteen. Also, ten is \((12)_8\text{.}\) Most folks use the digits \(0,1,2,3,4,5,6,7,8,9\) for base ten notation. And when no other designation is made, then it is assumed that the natural number is written base ten. So of course, Professor Trotter's son is 18 and is a freshman at Georgia Tech. Which explains why his hair is as white as it is.
For any base \(b>1\text{,}\) caution must be exercised when discussing multiplication, since writing the product \(m\times n\) in the abbreviated form \(mn\) causes us some grief. For example, if \(b = 8\text{,}\) then writing the product \(372\times4885\) as \(3724885\) is ambiguous. For this reason, when using base \(b\) notation, the product symbol \(\times\) (or some variation of \(\times\)) is always used.
SubsectionB.12.1Alternate Versions of Induction
Many authors prefer to start the development of number systems with the set of positive integers and defer the introduction of the concept of zero. In this setting, you have a non-empty set \(\posints\text{,}\) a one-to-one successor function \(s:\posints\injection\posints\) and a positive integer called one and denoted \(1\) that is not the successor of any positive integer. The Principle of Induction then becomes: If \(\mathbb{M}\subseteq\posints\text{,}\) then \(\mathbb{M} =\posints\) if and only if
- \(1\in \mathbb{M}\text{;}\) and
- \(\forall k\in \nonnegints\quad(k\in \mathbb{M})
\Longrightarrow(s(k)\in \mathbb{M})\text{.}\)
More generally, to show that a set \(\mathbb{M}\) contains all integers greater than or equal to an integer \(n\text{,}\) it is sufficient to show that (i) \(n\in\mathbb{M}\text{,}\) and (ii) For all \(k\in\ints,\,(k\in\mathbb{M}\Longrightarrow(k+1\in\mathbb{M})\text{.}\)
Here is another version of induction, one that is particularly useful in combinatorial arguments.
TheoremB.35
Let \(\mathbb{M}\subseteq\posints\text{.}\) If \(\mathbb{M}\neq\posints\text{,}\) then there is a unique least positive integer \(n\) that does not belong to \(\mathbb{M}\text{.}\)
A binary relation \(R\) is symmetric if \((x,y)\in R\) implies \((y,x)\in R\) for all \(x,y\in X\text{.}\)
A binary relation \(R\) on a set \(X\) is called an equivalence relation when it is reflexive, symmetric, and transitive. Typically, symbols like, \(=\text{,}\) \(\cong\text{,}\) \(\equiv\) and \(\sim\) are used to denote equivalence relations. An equivalence relation, say \(\cong\text{,}\) defines a partition on the set \(X\) by setting
\begin{equation*}
\langle x\rangle =\{y\in X: x\cong y\}.
\end{equation*}
Note that if \(x,y\in X\) and \(\langle x\rangle\cap\langle y\rangle \neq\emptyset\text{,}\) then \(\langle x\rangle=\langle y\rangle\text{.}\) The sets in this partition are called equivalence classes.
When using the ordered pair notation for binary relations, to indicate that a pair \((x,y)\) is not in the relation, we simply write \((x,y)\notin R\text{.}\) When using the alternate notation, this is usually denoted by using the negation symbol from logic and writing \(\lnot (xRy)\text{.}\) Many of the special symbols used to denote equivalence relations come with negative versions: \(x\neq y\text{,}\) \(x\ncong y\text{,}\) \(x\nsim y\text{,}\) etc.
SectionB.14The Integers as Equivalence Classes of Ordered Pairs¶ permalink
Define a binary relation \(\cong\) on the set \(Z=\nonnegints \times\nonnegints\) by
\begin{equation*}
(a,b)\cong (c,d)\quad\textit{iff}\quad a+d=b+c.
\end{equation*}
LemmaB.36
\(\cong\) is reflexive.
Let \((a,b)\in Z\text{.}\) Then \(a+b=b+a\text{,}\) so \((a,b)\cong(b,a)\text{.}\)
LemmaB.37
\(\cong\) is symmetric.
Let \((a,b),(c,d)\in Z\) and suppose that \((a,b)\cong (c,d)\text{.}\) Then \(a+d=b+c\text{,}\) so that \(c+b=d+a\text{.}\) Thus \((c,d)\cong (a,b)\text{.}\)
LemmaB.38
\(\cong\) is transitive.
Let \((a,b), (c,d), (e,f)\in Z\text{.}\) Suppose that
\begin{equation*}
(a,b)\cong(c,d)\quad\text{and} \quad (c,d)\cong (e,f).
\end{equation*}
Then \(a+d=b+c\) and \(c+f=d+e\text{.}\) Therefore,
\begin{equation*}
(a+d)+(c+f) =(b+c)+(d+e).
\end{equation*}
It follows that
\begin{equation*}
(a+f)+(c+d) =(b+e)+(c+d).
\end{equation*}
Thus \(a+f = b+e\) so that \((a,b)\cong(e,f)\text{.}\)
Now that we know that \(\cong\) is an equivalence relation on \(Z\text{,}\) we know that \(\cong\) partitions \(Z\) into equivalence classes. For an element \((a,b)\in Z\text{,}\) we denote the equivalence class of \((a,b)\) by \(\langle (a,b)\rangle\text{.}\)
Let \(\ints\) denote the set of all equivalence classes of \(Z\) determined by the equivalence relation \(\cong\text{.}\) The elements of \(\ints\) are called integers.
For the remainder of this chapter, most statements will be given without proof. Students are encouraged to fill in the details.
We define a binary operation \(+\) on \(\ints\) by the following rule:
\begin{equation*}
\langle(a,b)\rangle+\langle(c,d)\rangle = \langle(a+c,b+d)\rangle.
\end{equation*}
Note that the definition of addition is made in terms of representatives of the class, so we must pause to make sure that \(+\) is well defined, i.e., independent of the particular representatives.
LemmaB.39
If \(\langle(a,b)\rangle =\langle(c,d)\rangle\) and \(\langle(e,f)\rangle=\langle(g,h)\rangle\text{,}\) then \(\langle(a,b)\rangle+\langle(e,f)\rangle= \langle(c,d)\rangle+\langle(g,h)\rangle\text{.}\)
Since \((a,b)\cong(c,d)\text{,}\) we know \(a+d=b+c\text{.}\) Since \((e,f)\cong(g,h)\text{,}\) we know \(e+h=f+g\text{.}\) It follows that \((a+d)+(e+h) = (b+c)+ (f+g)\text{.}\) Thus \((a+e)+(d+h)= (b+f)+(c+g)\text{,}\) which implies that \(\langle(a,b)\rangle+\langle(e,f)\rangle= \langle(c,d)\rangle+\langle(g,h)\rangle\text{.}\)
In what follows, we use a single symbol, like \(x\text{,}\) \(y\) or \(z\) to denote an integer, but remember that each integer is in fact an entire equivalence class whose elements are ordered pairs of natural numbers.
TheoremB.40
For all \(x,y,z\in\ints\text{,}\)
\(x+y=y+x\text{;}\)
\(x+(y+z)= (x+y)+z\text{;}\) and
\(x+y= x+z\) implies \(y=z\text{.}\)
Next, we define a second binary operation called multiplication, and denoted \(x\times y\text{,}\) \(x*y\) or just \(xy\text{.}\) When \(x=\langle(a,b)\rangle\) and \(y=\langle(c,d)\rangle\text{,}\) we define:
\begin{equation*}
xy =\langle(a,b)\rangle\langle(c,d)\rangle= \langle(ac+bd, ad+bc)\rangle.
\end{equation*}
TheoremB.41
Multiplication is well defined. Furthermore,
\(xy=yx\text{,}\) for every \(x,y\in \ints\text{.}\)
\(x(yz)=(xy)z\text{,}\) for every \(x,y,z\in \ints\text{.}\)
\(x(y+z)=xy+xz\text{,}\) for every \(x,y,z\in \ints\text{.}\)
The integer \(\langle(0,0)\rangle\) has a number of special properties. Note that for all \(x\in\ints\text{,}\) \(x+\langle(0,0)\rangle= x\) and \(x\langle(0,0)\rangle=\langle(0,0)\rangle\text{.}\) So most folks call \(\langle(0,0)\rangle\) zero and denote it by \(0\text{.}\) This is a terrible abuse of notation, since we have already used the word zero and the symbol \(0\) to denote a particular natural number.
But mathematicians, computer scientists and even real people do this all the time. We use the same word and even the same phrase in many different settings expecting that the listener will make the correct interpretation. For example, how many different meanings do you know for You're so bad?
If \(x=\langle(a,b)\rangle\) is an integer and \(y= \langle(b,a)\rangle\text{,}\) then \(x+y=\langle(a+b,a+b)\rangle=0\text{.}\) The integer \(y\) is then called the additive inverse of \(x\) and is denoted \(-x\text{.}\) The additive inverse of \(x\) is also called minus \(x\). The basic property is that \(x + (-x) = 0\text{,}\) for every \(x\in \ints\text{.}\)
We can now define a new binary operation, called subtraction and denoted \(-\text{,}\) on \(\ints\) by setting \(x-y= x+(-y)\text{.}\) In general, subtraction is neither commutative nor associative. However, we do have the following basic properties.
TheoremB.42
For all \(x,y,z\in\ints\text{,}\)
\(x(-y)=-xy\text{;}\)
\(x(y-z)= xy-xz\text{;}\) and
\(-(x-y)=y-x\text{.}\)
Next, we define a total order on \(\ints\) by setting \(x\le y\) in \(\ints\) when \(x=\langle(a,b)\rangle\text{,}\) \(y=\langle(c,d) \rangle\) and \(a+d \le b+c\) in \(\nonnegints\text{.}\)
TheoremB.43Monotonic Law for Addition
Let \(x,y,z\in \ints\text{.}\) If \(x\le y\text{,}\) then \(x+z\le y+z\text{.}\) Furthermore, if \(x\lt y\text{,}\) then \(x+z\lt y+z\text{.}\)
For multiplication, the situation is more complicated.
TheoremB.44Monotonic Law for Multiplication
Let \(x,y,z\in \ints\text{.}\) If \(x\lt y\text{,}\) then
\(xz\lt yz\text{,}\) if \(z>0\text{,}\)
\(xz=yz=0\text{,}\) if \(z=0\text{,}\) and
\(xz>yz\text{,}\) if \(z\lt 0\text{.}\)
Now consider the function \(f:\nonnegints\longrightarrow \ints\) defined by \(f(n) = \langle(n,0)\rangle\text{.}\) It is easy to show that \(f\) is an injection. Furthermore, it respects addition and multiplication, i.e., \(f(n+m)=f(n)+f(m)\) and \(f(nm)=f(n)f(m)\text{.}\) Also, note that if \(x\in \ints\text{,}\) then \(x>0\) if and only if \(x=f(n)\) for some \(n\in \nonnegints\text{.}\) So, it is customary to abuse notation slightly and say that \(\nonnegints\) is a “subset” of \(\ints\text{.}\) Similarly, we can either consider the set \(\posints\) of positive integers as the set of natural numbers that are successors, or as the set of integers that are greater than \(0\text{.}\)
When \(n\) is a positive integer and \(0\) is the zero in \(\ints\text{,}\) we define \(0^n=0\text{.}\) When \(x\in\ints\text{,}\) \(x\neq 0\) and \(n\in\nonnegints\text{,}\) we define \(x^n\) inductively by (i) \(x^0=1\) and \(x^{k+1}=xx^k\text{.}\)
TheoremB.45
If \(x\in\ints\text{,}\) \(x\neq0\text{,}\) and \(m,n\in\nonnegints\text{,}\) then \(x^mx^n=x^{m+m}\) and \((x^m)^n=x^{mn}\text{.}\)
SectionB.16Obtaining the Rationals from the Integers¶ permalink
We consider the set \(\rats\) of all ordered pairs in \(\ints\times \ints\) of the form \((x,y)\) with \(y\neq 0\text{.}\) Elements of \(\rats\) are called rational numbers, or fractions. Define an equivalence relation, denoted \(=\text{,}\) on \(\rats\) by setting \((x,y)=(z,w)\) if and only if \(xw=yz\text{.}\) Here we should point out that the symbol \(=\) can be used (and often is) to denote an equivalence relation. It is not constrained to mean “identically the same.”
When \(q=(x,y)\) is a fraction, \(x\) is called the numerator and \(y\) is called the denominator of \(q\text{.}\) Remember that the denominator of a fraction is never zero.
Addition of fractions is defined by
\begin{equation*}
(a,b)+(c,d) = (ad+bc,bd),
\end{equation*}
while multiplication is defined by
\begin{equation*}
(a,b)(c,d) = (ac,bd).
\end{equation*}
As was the case with integers, it is important to pause and prove that both operations are well defined.
TheoremB.46
Let \(x,y,z,w\in\rats\text{.}\) If \(x=y\) and \(z=w\text{,}\) then \(x+z=y+w\) and \(xz=yw\text{.}\)
Addition and multiplication are both associative and commutative. Also, we have the distributive property.
TheoremB.47
Let \(x,y,z\in Q\text{.}\) Then
\(x+y=y+x\) and \(xy=yx\text{.}\)
\(x+(y+z) = (x+y)+z\) and \(x(yz)=(xy)z\text{.}\)
\(x(y+z)=xy+xz\text{.}\)
The additive inverse of a fraction \((a,b)\) is just \((-a,b)\text{.}\) Using this, we define subtraction for fractions: \((a,b)-(c,d)=(a,b)+(-c,d)\text{.}\)
When \((a,b)\) is a fraction, and \(a\neq 0\text{,}\) the fraction \((b,a)\) is the reciprocal of \((a,b)\text{.}\) The reciprocal is also called the multiplicative inverse, and the reciprocal of \(x\) is denoted \(x^{-1}\text{.}\) When \(y\neq0\text{,}\) we can then define division by setting \(x/y=xy^{-1}\text{,}\) i.e., \((a,b)/(c,d)=(ad, bc)\text{.}\) Of course, division by zero is not defined, a fact that you probably already knew!
As was the case for both \(\nonnegints\) and \(\ints\text{,}\) when \(n\) is a positive integer, and \(0\) is the zero in \(\rats\text{,}\) we define \(0^n=0\text{.}\) When \(x=(a,b)\) is a fraction with \(x\neq0\) and \(n\) is a non-negative integer, we define \(x^n\) inductively by (i) \(x^0=1\) and (ii) \(x^{n+1}=xx^n\text{.}\)
TheoremB.48
If \(x\in\rats\text{,}\) \(x\neq0\text{,}\) and \(m,n\in\ints\text{,}\) then \(x^mx^n=x^{m+m}\) and \((x^m)^n=x^{mn}\text{.}\)
Many folks prefer an alternate notation for fractions in which the numerator is written directly over the denominator with a horizontal line between them, so \((2,5)\) can also be written as \(\frac{2}{5}\text{.}\)
Via the map \(g(x) = (x,1)=\frac{x}{1}\text{,}\) we again say that the integers are a “subset” of the rationals. As before, note that \(g(x+y) = g(x)+g(y)\text{,}\) \(g(x-y)=g(x)-g(y)\) and \(g(xy)=g(x)g(y)\text{.}\)
In the third grade, you were probably told that \(5 =\frac{5}{1}\text{,}\) but by now you are realizing that this is not exactly true. Similarly, if you had told your teacher that \(\frac{3}{4}\) and \(\frac{6}{8}\) weren't really the same and were only “equal” in the broader sense of an equivalence relation defined on a subset of the cartesian product of the integers, you probably would have been sent to the Principal's office.
Try to imagine the trouble you would have gotten into had you insisted that the real meaning of \(\frac{1}{2}\) was
\begin{equation*}
\frac{1}{2} =\langle(\langle(s(s(0)),s(0))\rangle,
\langle(s(s(0)),0)\rangle)\rangle
\end{equation*}
We can also define a total order on \(\rats\text{.}\) To do this, we assume that \((a,b),(c,d)\in\rats\) have \(b,d>0\text{.}\) (If \(b\lt 0\text{,}\) for example, we would replace it by \((a',b')=(-a,-b)\text{,}\) which is in the same equivalence class as \((a,b)\) and has \(b'>0\text{.}\)) Then we set \((a,b)\leq (c,d)\) in \(\rats\) if \(ad\leq bc\) in \(\ints\text{.}\)
SubsectionB.16.1Integer Exponents
When \(n\) is a positive integer and \(0\) is the zero in \(\rats\text{,}\) we define \(0^n=0\text{.}\) When \(x\in\rats\text{,}\) \(x\neq 0\) and \(n\in\nonnegints\text{,}\) we define \(x^n\) inductively by (i) \(x^0=1\) and \(x^{k+1}=xx^k\text{.}\) When \(n\in\ints\) and \(n\lt 0\text{,}\) we set \(x^n=1/x^{-n}\text{.}\)
TheoremB.49
If \(x\in\rats\text{,}\) \(x\neq0\text{,}\) and \(m,n\in\ints\text{,}\) then \(x^mx^n=x^{m+m}\) and \((x^m)^n=x^{mn}\text{.}\)
SectionB.17Obtaining the Reals from the Rationals¶ permalink
A full discussion of this would take us far away from a discrete math class, but let's at least provide the basic definitions. A subset \(S\subset \rats\) of the rationals is called a cut (also, a Dedekind cut), if it satisfies the following properties:
\(\emptyset\neq S\neq \rats\text{,}\) i.e, \(S\) is a proper non-empty subset of \(\rats\text{.}\)
\(x\in S\) and \(y\lt x\) in \(\rats\) implies \(y\in S\text{,}\) for all \(x,y\in \rats\text{.}\)
For every \(x\in S\text{,}\) there exists \(y\in S\) with \(x\lt y\text{,}\) i.e., \(S\) has no greatest element.
Cuts are also called real numbers, so a real number is a particular kind of set of rational numbers. For every rational number \(q\text{,}\) the set \(\bar{q}= \{p\in \rats: p\lt q\}\) is a cut. Such cuts are called rational cuts. Inside the reals, the rational cuts behave just like the rational numbers and via the map \(h(q)=\bar{q}\text{,}\) we abuse notation again (we are getting used to this) and say that the rational numbers are a subset of the real numbers.
But there are cuts which are not rational. Here is one: \(\{p\in \rats: p\le 0\}\cup \{p\in \rats: p^2\lt 2\}\text{.}\) The fact that this cut is not rational depends on the familiar proof that there is no rational \(q\) for which \(q^2=2\text{.}\)
The operation of addition on cuts is defined in the natural way. If \(S\) and \(T\) are cuts, set \(S+T=\{s+t:s\in S, t\in T\}\text{.}\) Order on cuts is defined in terms of inclusion, i.e., \(S\lt T\) if and only if \(S\subsetneq T\text{.}\) A cut is positive if it is greater than \(\bar{0}\text{.}\) When \(S\) and \(T\) are positive cuts, the product \(ST\) is defined by
\begin{equation*}
ST= \bar{0}\cup\{st:s\in S, t\in T, s\ge0, t\ge 0\}.
\end{equation*}
One can easily show that there is a real number \(r\) so that \(r^2=\bar{2}\text{.}\) You may be surprised, but perhaps not, to learn that this real number is denoted \(\sqrt2\text{.}\)
There are many other wonders to this story, but enough for one day.
SectionB.18Obtaining the Complex Numbers from the Reals¶ permalink
By now, the following discussion should be transparent. The complex number system \(\complexes\) is just the cartesian product \(\reals\times\reals\) with
\((a,b) = (c,d)\) in \(\complexes\) if and only if \(a=c\) and \(b=d\) in \(\reals\text{.}\)
\((a,b)+(c,d)=(a+c,b+d)\text{.}\)
\((a,b)(c,d)=(ac-bd, ad+bc)\text{.}\)
Now the complex numbers of the form \((a,0)\) behave just like real numbers, so is natural to say that the complex number system contains the real number system. Also, note that \((0,1)^2=(0,1)(0,1)=(-1,0)\text{,}\) i.e., the complex number \((0,1)\) has the property that its square is the complex number behaving like the real number \(-1\text{.}\) So it is convenient to use a special symbol like \(i\) for this very special complex number and note that \(i^2=-1\text{.}\)
With this beginning, it is straightforward to develop all the familiar properties of the complex number system.
SubsectionB.18.1Decimal Representation of Real Numbers
Every real number has a decimal expansion—although the number of digits after the decimal point may be infinite. A rational number \(q=m/m\) from \(\rats\) has an expansion in which a certain block of digits repeats indefinitely. For example,
\begin{equation*}
\frac{2859}{35} = 81.6857142857142857142857142857142857142857142\dots
\end{equation*}
In this case, the block \(857142\) of size \(6\) is repeated forever.
Certain rational numbers have terminating decimal expansions. For example, we know that \(385/8= 48.125\text{.}\) If we chose to do so, we could write this instead as an infinite decimal by appending trailing \(0\)'s, as a repeating block of size \(1\text{:}\)
\begin{equation*}
\frac{385}{8} = 48.1250000000000000000000000000000000\dots
\end{equation*}
On the other hand, we can also write the decimal expansion of \(385/8\) as
\begin{equation*}
\frac{385}{8} = 48.12499999999999999999999999999999999\dots
\end{equation*}
Here, we intend that the digit \(9\text{,}\) a block of size \(1\text{,}\) be repeated forever. Apart from this anomaly, the decimal expansion of real numbers is unique.
On the other hand, irrational numbers have non-repeating decimal expansions in which there is no block of repeating digits that repeats forever.
You know that \(\sqrt{2}\) is irrational. Here is the first part of its decimal expansion:
\begin{equation*}
\sqrt{2} =1.41421356237309504880168872420969807856967187537694807317667973\dots
\end{equation*}
An irrational number is said to be algebraic if it is the root of polynomial with integer coefficients; else it is said to be transcendental. For example, \(\sqrt{2}\) is algebraic since it is the root of the polynomial \(x^2-2\text{.}\)
Two other famous examples of irrational numbers are \(\pi\) and \(e\text{.}\) Here are their decimal expansions:
\begin{align*}
\pi \amp =3.14159265358979323846264338327950288419716939937510582097494459\dots\\
\end{align*}
and
\begin{align*}
e\amp=2.7182818284590452353602874713526624977572470936999595749669676277\dots
\end{align*}
Both \(\pi\) and \(e\) are transcendental.
ExampleB.50
Amanda and Bilal, both students at a nearby university, have been studying rational numbers that have large blocks of repeating digits in their decimal expansions. Amanda reports that she has found two positive integers \(m\) and \(n\) with \(n\lt 500\) for which the decimal expansion of the rational number \(m/n\) has a block of 1961 digits which repeats indefinitely. Not to be outdone, Bilal brags that he has found such a pair \(s\) and \(t\) of positive integers with \(t\lt 300\) for which the decimal expansion of \(s/t\) has a block of \(7643\) digits which repeats indefinitely. Bilal should be (politely) told to do his arithmetic more carefully, as there is no such pair of positive integers (Why?). On the other hand, Amanda may in fact be correct—although, if she has done her work with more attention to detail, she would have reported that the decimal expansion of \(m/n\) has a smaller block of repeating digits (Why?).
PropositionB.51
There is no surjection from \(\posints\) to the set \(X= \{x\in\reals:0\lt x\lt 1\}\text{.}\)
Let \(f\) be a function from \(\posints\) to \(X\text{.}\) For each \(n\in \posints\text{,}\) consider the decimal expansion(s) of the real number \(f(n)\text{.}\) Then choose a positive integer \(a_n\) so that (1) \(a_n\le 8\text{,}\) and (2) \(a_n\) is not the \(n^{th}\) digit after the decimal point in any decimal expansion of \(f(n)\text{.}\) Then the real number \(x\) whose decimal expansion is \(x=.a_1a_2a_3a_4a_5\dots\) is an element of \(X\) which is distinct from \(f(n)\text{,}\) for every \(n\in\posints\text{.}\) This shows that \(f\) is not a surjection.
SectionB.19The Zermelo-Fraenkel Axioms of Set Theory¶ permalink
In the first part of this appendix, we put number systems on a firm foundation, but in the process, we used an intuitive understanding of sets. Not surprisingly, this approach is fraught with danger. As was first discovered more than 100 years ago, there are major conceptual hurdles in formulating consistent systems of axioms for set theory. And it is very easy to make statements that sound “obvious” but are not.
Here is one very famous example. Let \(X\) and \(Y\) be sets and consider the following two statements:
There exists an injection \(f:X\rightarrow Y\text{.}\)
There exists a surjection \(g:Y\rightarrow X\text{.}\)
If \(X\) and \(Y\) are finite sets, these statements are equivalent, and it is perhaps natural to surmise that the same is true when \(X\) and \(Y\) are infinite. But that is not the case.
Here is the system of axioms popularly known as ZFC, which is an abbreviation for Zermelo-Fraenkel plus the Axiom of Choice. In this system, the notion of set and the membership operator \(\in\) are undefined. However, if \(A\) and \(B\) are sets, then exactly one of the following statements is true: (i) \(A\in B\) is true; (ii) \(A\in B\) is false. When \(A\in B\) is false, we write \(A\notin B\text{.}\) Also, there is an equivalence relation \(=\) defined on sets.
AxiomB.52Zermelo-Fraenkel Axioms with Axiom of Choice
- Axiom of extensionality
Two sets are equal if and only if they have the same elements.
- Axiom of empty set
There is a set \(\emptyset\) with no elements.
- Axiom of pairing
If \(x\) and \(y\) are sets, then there exists a set containing \(x\) and \(y\) as its only elements, which we denote by \(\{x,y\}\text{.}\) Note: If \(x=y\text{,}\) then we write only \(\{x\}\text{.}\)
- Axiom of union
For any set \(x\text{,}\) there is a set \(y\) such that the elements of \(y\) are precisely the elements of the elements of \(x\text{.}\)
- Axiom of infinity
There exists a set \(x\) such that \(\emptyset\in x\) and whenever \(y\in x\text{,}\) so is \(\{y ,\{y\}\}\text{.}\)
- Axiom of power set
Every set has a power set. That is, for any set \(x\text{,}\) there exists a set \(y\text{,}\) such that the elements of \(y\) are precisely the subsets of \(x\text{.}\)
- Axiom of regularity
Every non-empty set \(x\) contains some element \(y\) such that \(x\) and \(y\) are disjoint sets.
- Axiom of separation (or subset axiom)
Given any set and any proposition \(P(x)\text{,}\) there is a subset of the original set containing precisely those elements \(x\) for which \(P(x)\) holds.
- Axiom of replacement
Given any set and any mapping, formally defined as a proposition \(P(x,y)\) where \(P(x,y_1)\) and \(P(x,y_2)\) implies \(y_1 = y_2\text{,}\) there is a set containing precisely the images of the original set's elements.
- Axiom of choice
Given any set of mutually exclusive non-empty sets, there exists at least one set that contains exactly one element in common with each of the non-empty sets.
A good source of additional (free) information on set theory is the collection of Wikipedia articles. Do a web search and look up the following topics and people:
Zermelo-Fraenkel set theory.
Axiom of Choice.
Peano postulates.
Georg Cantor, Augustus De Morgan, George Boole, Bertrand Russell and Kurt Gödel.