##### Theorem5.39Cayley's Formula

The number \(T_n\) of labeled trees on \(n\) vertices is \(n^{n-2}\).

How many trees are there with vertex set \([n]=\{1,2,\dots,n\}\)? Let \(T_n\) be this number. For \(n=1\), there is clearly only one tree. Also, for \(n=2\), there is only one tree, which is isomorphic to \(\bfK_2\). In determining \(T_3\), we finally have some work to do; however, there's not much, since all trees on \(3\) vertices are isomorphic to \(\bfP_3\). Thus, there are \(T_3=3\) *labeled trees* on \(3\) vertices, corresponding to which vertex is the one of degree \(2\). When \(n=4\), we can begin by counting the number of nonisomorphic trees and consider two cases depending on whether the tree has a vertex of degree \(3\). If there is a vertex of degree \(3\), the tree is isomorphic to \(\bfK_{1,3}\) or it does not have a vertex of degree three, in which case it is isomorphic to \(\bfP_4\), since there must be precisely two vertices of degree \(2\) in such a graph. There are four labelings by \([4]\) for \(\bfK_{1,3}\) (choose the vertex of degree three). How many labelings by \([4]\) are there for \(\bfP_4\)? There are \(C(4,2)\) ways to choose the labels \(i,j\) given to the vertices of degree \(2\) and two ways to select one of the remaining labels to be made adjacent to \(i\). Thus, there are \(12\) ways to label \(\bfP_4\) by \([4]\) and so \(T_4=16\).

To this point, it looks like maybe there's a pattern forming. Perhaps it is the case that for all \(n\geq 1\), \(T_n = n^{n-2}\). This is in fact the case, but let's see how it works out for \(n=5\) before proving the result in general. What are the nonisomorphic trees on five vertices? Well, there's \(\bfK_{1,4}\) and \(\bfP_5\) for sure, and there's also the third tree shown in Figure 5.38. After thinking for a minute or two, you should be able to convince yourself that this is all of the possibilities. How many labelings by \([5]\) does each of these have? There are \(5\) for \(\bfK_{1,4}\) since there are \(5\) ways to choose the vertex of degree \(4\). For \(\bfP_5\), there are \(5\) ways to choose the middle vertex of the path, \(C(4,2)=6\) ways to label the two remaining vertices of degree \(2\) once the middle vertex is labeled, and then \(2\) ways to label the vertices of degree \(1\). This gives \(60\) labelings. For the last tree, there are \(5\) ways to label the vertex of degree \(3\), \(C(4,2)=6\) ways to label the two leaves adjacent to the vertex of degree \(3\), and \(2\) ways to label the remaining two vertices, giving \(60\) labelings. Therefore, \(T_5=125=5^3=5^{5-2}\).

It turns out that we are in fact on the right track, and we will now set out to prove the following:

The number \(T_n\) of labeled trees on \(n\) vertices is \(n^{n-2}\).

This result is usually referred to as Cayley's Formula, although equivalent results were proven earlier by James J. Sylvester (1857) and Carl W. Borchardt (1860). The reason that Cayley's name is most often affixed to this result is that he was the first to state and prove it in graph theoretic terminology (in 1889). (Although one could argue that Cayley really only proved it for \(n=6\) and then claimed that it could easily be extended for all other values of \(n\), and whether such an extension can actually happen is open to some debate.) Cayley's Formula has many different proofs, most of which are quite elegant. If you're interested in presentations of several proofs, we encourage you to read the chapter on Cayley's Formula in Proofs from THE BOOK by Aigner, Ziegler, and Hofmann, which contains four different proofs, all using different proof techniques. Here we give a fifth proof, due to Prüfer and published in 1918. Interestingly, even though Prüfer's proof came after much of the terminology of graph theory was established, he seemed unaware of it and worked in the context of permutations and his own terminology, even though his approach clearly includes the ideas of graph theory. We will use a recursive technique in order to find a bijection between the set of labeled trees on \(n\) vertices and a natural set of size \(n^{n-2}\), the set of strings of length \(n-2\) where the symbols in the string come from \([n]\).

We define a recursive algorithm that takes a tree \(\bfT\) on \(k\geq 2\) vertices labeled by elements of a set \(S\) of positive integers of size \(k\) and returns a string of length \(k-2\) whose symbols are elements of \(S\). (The set \(S\) will usually be \([k]\), but in order to define a recursive procedure, we need to allow that it be an arbitrary set of \(k\) positive integers.) This string is called the *Prüfer code* of the tree \(\bfT\). Let \(\prufer(\bfT)\) denote the Prüfer code of the tree \(\bfT\), and if \(v\) is a leaf of \(\bfT\), let \(\bfT-v\) denote the tree obtained from \(\bfT\) by removing \(v\) (i.e., the subgraph induced by all the other vertices). We can then define \(\prufer(\bfT)\) recursively by the following procedure.

If \(\bfT\cong \bfK_2\), return the empty string.

Else, let \(v\) be the leaf of \(\bfT\) with the smallest label and let \(u\) be its unique neighbor. Let \(i\) be the label of \(u\). Return \((i,\prufer(\bfT-v))\).

Before using Prüfer codes to prove Cayley's Formula, let's take a moment to make sure we understand how they are computed given a tree. Consider the \(9\)-vertex tree \(\bfT\) in Figure 5.41.

How do we compute \(\prufer(\bfT)\)? Since \(\bfT\) has more than two vertices, we use the second step and find that \(v\) is the vertex with label \(2\) and \(u\) is the vertex with label \(6\), so \(\prufer(\bfT)=(6,\prufer(\bfT-v))\). The graph \(\bfT-v\) is shown in Figure 5.42.

The recursive call \(\prufer(\bfT-v)\) returns \((6,\prufer(\bfT-v-v'))\), where \(v'\) is the vertex labeled \(5\). Continuing recursively, the next vertex deleted is \(6\), which appends a \(4\) to the string. Then \(7\) is deleted, appending \(3\). Next \(8\) is deleted, appending \(1\). This is followed by the deletion of \(1\), appending \(4\). Finally \(4\) is deleted, appending \(3\), and the final recursive call has the subtree isomorphic to \(\bfK_2\) with vertices labeled \(3\) and \(9\), and an empty string is returned. Thus, \(\prufer(\bfT)=6643143\).

We're now prepared to give a proof of Cayley's Formula.

We close this section with an example of how to take a Prüfer code and use it to construct a labeled tree. Consider the string \(\bfs=75531\) as a Prüfer code. Then the tree \(\bfT\) corresponding to \(\bfs\) has \(7\) vertices, and its leaves are labeled \(2\), \(4\), and \(6\). The inductive step in our proof attaches the vertex labeled \(2\) to the vertex labeled \(7\) in the tree \(\bfT'\) with Prüfer code \(5531\) and vertex labels \(\{1,3,4,5,6,7\}\), since \(2\) is used to label the last vertex added. What are the leaves of \(\bfT'\)? The symbols in \(\{4,6,7\}\) do not appear in \(5531\), so they must be the labels of leaves, and the construction says that we would attach the vertex labeled \(4\) to the vertex labeled \(5\) in the tree we get by induction. In Table 5.44, we show how this recursive process continues.

Prüfer code | Label set | Edge added |

75531 | \(\{1,2,3,4,5,6,7\}\) | 2–7 |

5531 | \(\{1,3,4,5,6,7\}\) | 4–5 |

531 | \(\{1,3,5,6,7\}\) | 6–5 |

31 | \(\{1,3,5,7\}\) | 5–3 |

1 | \(\{1,3,7\}\) | 3–1 |

(empty string) | \(\{1,7\}\) | 1–7 |

We form each row from the row above it by removing the first label used on the edge added from the label set and removing the first symbol from the Prüfer code. Once the Prüfer code becomes the empty string, we know that the two remaining labels must be the labels we place on the ends of \(\bfK_2\) to start building \(\bfT\). We then work back up the edge added column, adding a new vertex and the edge indicated. The tree we construct in this manner is shown in Figure 5.45.