Connectivity

Which nodes are connected, directly or indirectly? Between which nodes is there a path from one node to another? Those are the questions we deal with now. This is part of the topic of graph traversal, encountering a series of connected nodes in a graph or subgraph.

9.3 Connectivity

This section is devoted to a question that, when posed in relation to the graphs that we have examined, seems trivial. That question is: Given two vertices, s and t, of a graph, is there a path from s to t? If s = t, this question is interpreted as asking whether there is a circuit of positive length starting at s. Of course, for the graphs we have seen up to now, this question can be answered after a brief examination.

9.3.1 Preliminaries

There are two situations under which a question of this kind is nontrivial. One is where the graph is very large and an "examination" of the graph could take a considerable amount of time. Anyone who has tried to solve a maze may have run into a similar problem. The second interesting situation is when we want to pose the question to a machine. If only the information on the edges between the vertices is part of the data structure for the graph, how can you put that information together to determine whether two vertices can be connected by a path?

Note 9.3.1 Connectivity Terminology. Let v and w be vertices of a directed graph. Vertex v is connected to vertex w if there is a path from v to w. Two vertices are strongly connected if they are connected in both directions to one another. A graph is connected if, for each pair of distinct vertices, v and w, v is connected to w or w is connected to v. A graph is strongly connected if every pair of its vertices is strongly connected. For an undirected graph, in which edges can be used in either direction, the notions of strongly connected and connected are the same.

Theorem 9.3.2 Maximal Path Theorem. If a graph has n vertices and vertex u is connected to vertex w, then there exists a path from u to w of length no more than n.

Proof. (Indirect): Suppose u is connected to w, but the shortest path from u to w has length m, where m > n. A vertex list for a path of length m will have m + 1 vertices. This path can be represented as (v_0, v_1, . . . , v_m), where v_0 = u and v_m = w. Note that since there are only n vertices in the graph and m vertices are listed in the path after v_0, we can apply the pigeonhole principle and be assured that there must be some duplication in the last m vertices of the vertex list, which represents a circuit in the path. This means that our path of minimum length can be reduced, which is a contradiction.

Algorithm 9.3.3 Adjacency Matrix Method. Suppose that the information about edges in a graph is stored in an adjacency matrix, G. The relation, r, that G defines is vrw if there is an edge connecting v to w. Recall that the composition of r with itself, r^2, is defined by v r^2 w if there exists a vertex y such that vry and yrw; that is, v is connected to w by a path of length 2. We could prove by induction that the relation r^k, k ≥ 1, is defined by v r^k w if and only if there is a path of length k from v to w. Since the transitive closure, r^+, is the union of r, r^2, r^3, . . ., we can answer our connectivity question by determining the transitive closure of r, which can be done most easily by keeping our relation in matrix form. Theorem 9.3.2 is significant in our calculations because it tells us that we need only go as far as G^n to determine the matrix of the transitive closure.

The main advantage of the adjacency matrix method is that the transitive closure matrix can answer all questions about the existence of paths between any vertices. If G^+ is the matrix of the transitive closure, v_i is connected to v_j if and only if (G^+)_{ij} = 1. A directed graph is connected if (G^+)_{ij} = 1 or (G^+)_{ji} = 1 for each i ≠ j. A directed graph is strongly connected if its transitive closure matrix has no zeros.
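The matrix computation can be sketched in Python. This is a minimal illustration rather than code from the text: the function names and the 4-vertex digraph G are assumptions made for the example, and vertices are numbered from 0. Following Theorem 9.3.2, the closure is built as the Boolean union of G, G^2, . . . , G^n.

```python
def bool_product(a, b):
    """Boolean matrix product: entry (i, j) is 1 if some y has a[i][y] = b[y][j] = 1."""
    n = len(a)
    return [[int(any(a[i][y] and b[y][j] for y in range(n))) for j in range(n)]
            for i in range(n)]


def transitive_closure(g):
    """Return the Boolean union of G, G^2, ..., G^n for an n-by-n 0/1 matrix g."""
    n = len(g)
    closure = [row[:] for row in g]
    power = [row[:] for row in g]
    for _ in range(n - 1):
        power = bool_product(power, g)  # next Boolean power of G
        closure = [[closure[i][j] | power[i][j] for j in range(n)]
                   for i in range(n)]
    return closure


def is_connected(closure):
    """For each pair i != j, i reaches j or j reaches i."""
    n = len(closure)
    return all(closure[i][j] or closure[j][i]
               for i in range(n) for j in range(n) if i != j)


def is_strongly_connected(closure):
    """The closure matrix has no zeros."""
    return all(all(row) for row in closure)


# Hypothetical digraph: a cycle 0 -> 1 -> 2 -> 0 plus the edge 2 -> 3.
G = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [1, 0, 0, 1],
     [0, 0, 0, 0]]
G_plus = transitive_closure(G)
```

In this assumed graph, vertex 3 has no outgoing edges, so its row of G_plus is all zeros: the graph is connected in the sense of Note 9.3.1 but not strongly connected.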

A disadvantage of the adjacency matrix method is that the transitive closure matrix tells us whether a path exists, but not what the path is. The next algorithm will solve this problem.

We will describe the Breadth-First Search Algorithm first with an example.

The football team at Mediocre State University (MSU) has had a bad year, 2 wins and 9 losses. Thirty days after the end of the football season, the university trustees are meeting to decide whether to rehire the head coach; things look bad for him. However, on the day of the meeting, the coach issues the following press release with results from the past year:

List 9.3.4 Press Release: MSU completes successful season

The Mediocre State University football team compared favorably with national champion Enormous State University this season.

• Mediocre State defeated Local A and M.
• Local A and M defeated City College.
• City College defeated Corn State U.
• ... (25 results later)
• Tough Tech defeated Enormous State University (ESU).

...and ESU went on to win the national championship!

The trustees were so impressed that they rehired the coach with a raise! How did the coach come up with such a list?

In reality, such lists exist occasionally and have appeared in newspapers from time to time. Of course they really don't prove anything since each team that defeated MSU in our example above can produce a similar, shorter chain of results. Since college football records are readily available, the coach could have found this list by trial and error. All that he needed to start with was that his team won at least one game. Since ESU lost one game, there was some hope of producing the chain.

The problem of finding this list is equivalent to finding a path in the tournament graph for last year's football season that initiates at MSU and ends at ESU. Such a graph is far from complete and is likely to be represented using edge lists. To make the coach's problem interesting, let's imagine that only the winner of any game remembers the result of the game. The coach's problem has now taken on the flavor of a maze. To reach ESU, he must communicate with the various teams along the path. One way that the coach could have discovered his list in time is by sending the following messages to the coaches of the two teams that MSU defeated during the season:

Note 9.3.5 When this example was first written, we commented that ties should be ignored. Most recent NCAA rules call for a tiebreaker in college football and so ties are no longer an issue. Email was also not common and we described the process in terms of letters, not email messages. Another change is that the coach could also have asked the MSU math department to use Mathematica or Sage to find the path!

List 9.3.6 The Coach's Letter

Dear Football Coach:

1. If you are the coach at ESU, contact the coach at MSU now and tell him who sent you this message.
2. If you are not the coach at ESU and this is the first message of this type that you have received, then:
• Remember from whom you received this message.
• Forward a copy of this message, signed by you, to each of the coaches whose teams you defeated during the past year.
3. Ignore this message if you have received one like it already.

Signed,

Coach of MSU

List 9.3.7 Observations

From the conditions of this message, it should be clear that if everyone cooperates and if coaches participate within a day of receiving the message:

1. If a path of length n exists from MSU to ESU, then the coach will know about it in n days.
2. By making a series of phone calls, the coach can construct a path that he wants by first calling the coach who defeated ESU (the person who sent ESU's coach that message). This coach will know who sent him a letter, and so on. Therefore, the vertex list of the desired path is constructed in reverse order.
3. If a total of M football games were played, no more than M messages will be sent out.
4. If a day passes without any message being sent out, no path from MSU to ESU exists.
5. This method could be extended to construct a list of all teams that a given team can be connected to. Simply imagine a series of letters like the one above sent by each football coach and targeted at every other coach.

The general problem of finding a path between two vertices in a graph, if one exists, can be solved exactly as we solved the problem above. The following algorithm, commonly called a breadth-first search, uses a stack.

Stacks. A stack is a fundamental data structure in computer science. A common analogy used to describe stacks is a stack of plates. If you put a plate on the top of a stack and then want to use a plate, it's natural to use that top plate. So the last plate in is the first plate out. "Last in, first out" is the short description of the rule for stacks. This is in contrast with a queue, which uses a "First in, first out" rule.
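The two rules can be compared in a few lines of Python. This is only an illustrative sketch: the plate labels are made up, a plain list plays the role of the stack, and collections.deque plays the role of the queue.

```python
from collections import deque

plates = ["a", "b", "c"]

# Stack: last in, first out.
stack = []
for plate in plates:
    stack.append(plate)                       # push onto the top
popped = [stack.pop() for _ in plates]        # pop from the top

# Queue: first in, first out.
queue = deque()
for plate in plates:
    queue.append(plate)                       # join at the back
served = [queue.popleft() for _ in plates]    # leave from the front
```

Here popped comes out in the reverse of insertion order, while served preserves it.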

Algorithm 9.3.8 Breadth-first Search. A broadcasting algorithm for finding a path between vertex i and vertex j of a graph having n vertices. Each item V_k of a list V = {V_1, V_2, . . . , V_n} consists of a Boolean field V_k.found and an integer field V_k.from. The sets D_1, D_2, . . ., called depth sets, have the property that if k ∈ D_r, then the shortest path from vertex i to vertex k is of length r. In Step 5, a stack is used to put the vertex list for the path from the vertex i to vertex j in the proper order. That stack is the output of the algorithm.

1. Set the value V_k.found equal to False, k = 1, 2, . . . , n
2. r = 0
3. D_0 = {i}
4. while (¬V_j.found) and (D_r ≠ ∅):
   • D_{r+1} = ∅
   • for each k in D_r:
       • for each edge (k, t):
           • if V_t.found == False:
               • V_t.found = True
               • V_t.from = k
               • D_{r+1} = D_{r+1} ∪ {t}
   • r = r + 1
5. if V_j.found:
   • S = EmptyStack
   • k = j
   • while V_k.from ≠ i:
       • Push k onto S
       • k = V_k.from
   • Push k onto S
   • Push i onto S
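The steps of the algorithm can be sketched in Python as follows. This is one possible rendering under stated assumptions, not the text's own implementation: the function name breadth_first_path, the dictionary-of-lists edge representation, and the small test graph are all hypothetical. Only the current depth set is kept, and the stack is popped at the end to return the vertex list in path order.

```python
def breadth_first_path(edges, n, i, j):
    """Return the vertex list of a shortest path from i to j, or None.

    edges maps a vertex k to the list of vertices t with an edge (k, t);
    vertices are numbered 1..n.
    """
    found = {k: False for k in range(1, n + 1)}      # V_k.found (Step 1)
    came_from = {k: None for k in range(1, n + 1)}   # V_k.from
    depth = [i]                                      # D_0 = {i} (Steps 2-3)
    while not found[j] and depth:                    # Step 4
        next_depth = []                              # D_{r+1}
        for k in depth:
            for t in edges.get(k, []):
                if not found[t]:
                    found[t] = True
                    came_from[t] = k
                    next_depth.append(t)
        depth = next_depth                           # r = r + 1
    if not found[j]:
        return None
    stack = []                                       # Step 5
    k = j
    while came_from[k] != i:
        stack.append(k)
        k = came_from[k]
    stack.append(k)
    stack.append(i)
    path = []
    while stack:                                     # pop to get i first, j last
        path.append(stack.pop())
    return path


# Hypothetical digraph on 6 vertices; vertex 6 is isolated.
edges = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: [1]}
```

Because vertex i is not marked as found at the start, the same routine also returns a shortest circuit when i = j, matching the interpretation given at the beginning of this section.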

List 9.3.9 Notes on Breadth-first Search

• This algorithm will produce one path from vertex i to vertex j, if one exists, and that path will be as short as possible. If more than one path of this length exists, then the one that is produced depends on the order in which the edges are examined and the order in which the elements of D_r are examined in Step 4.
• The condition D_r ≠ ∅ fails exactly when no mail is sent in a given stage of the process, in which case MSU cannot be connected to ESU.
• This algorithm can be easily revised to find paths to all vertices that can be reached from vertex i. Step 5 would be put off until a specific path to a vertex is needed since the information in V contains an efficient list of all paths. The algorithm can also be extended further to find paths between any two vertices.

Example 9.3.10 A simple example. Consider the graph below. The existence of a path from vertex 2 to vertex 3 is not difficult to determine by examination. After a few seconds, you should be able to find two paths of length four. Algorithm 9.3.8 will produce one of them.

Figure 9.3.11 A simple example of breadth-first search

Suppose that the edges from each vertex are sorted in ascending order by terminal vertex. For example, the edges from vertex 3 would be in the order (3, 1), (3, 4), (3, 5). In addition, assume that in the body of Step 4 of the algorithm, the elements of Dr are used in ascending order. Then at the end of Step 4, the value of V will be

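$\begin{matrix} k & 1 & 2 & 3 & 4 & 5 & 6 \\ V_k.\mathrm{found} & T & T & T & T & T & T \\ V_k.\mathrm{from} & 2 & 4 & 6 & 1 & 1 & 4 \\ \mathrm{Depth\ set} & 1 & 3 & 4 & 2 & 2 & 3 \end{matrix}$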

Therefore, the path (2, 1, 4, 6, 3) is produced by the algorithm. Note that if we wanted a path from 2 to 5, the information in V produces the path (2, 1, 5) since V_5.from = 1 and V_1.from = 2. A shortest circuit that initiates at vertex 2 is also available by noting that V_2.from = 4, V_4.from = 1, and V_1.from = 2; thus the circuit (2, 1, 4, 2) is the output of the algorithm.
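These readings of the table can be checked mechanically. The sketch below applies Step 5 of Algorithm 9.3.8 to the "from" row of the table; the names V_FROM and path_from_table are assumptions introduced for the illustration.

```python
# The V_k.from row of the table, for the search started at vertex 2.
V_FROM = {1: 2, 2: 4, 3: 6, 4: 1, 5: 1, 6: 4}


def path_from_table(v_from, i, j):
    """Follow from-values back from j until the start vertex i is reached."""
    stack = []
    k = j
    while v_from[k] != i:
        stack.append(k)
        k = v_from[k]
    stack.append(k)
    stack.append(i)
    return tuple(reversed(stack))  # reversed stack order: i first, j last


path_to_3 = path_from_table(V_FROM, 2, 3)
path_to_5 = path_from_table(V_FROM, 2, 5)
circuit_at_2 = path_from_table(V_FROM, 2, 2)
```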

9.3.4 Graph Measurements

If we were to perform a breadth-first search from each vertex in a graph, we could proceed to determine several key measurements relating to the general connectivity of that graph. From each vertex v, the distance from v to any other vertex w, d(v, w), is the number of edges in the shortest path from v to w. This number is also the index of the depth set to which w belongs in a breadth-first search starting at v.

d(v, w) = i ⇐⇒ w ∈ D_v(i)

where D_v is the family of depth sets starting at v.

If the vector of "from-values" is known from the breadth-first search, then the distance can be determined recursively as follows:

d(v, v) = 0

d(v, w) = 1 + d(v, w.from) if w ≠ v

Example 9.3.12 Computing Distances.

Figure 9.3.13 Graph Measurements Example

Consider Figure 9.3.13. If we perform a breadth first search of this graph starting at vertex 2, for example, we get the following "from data" telling us from what vertex each vertex is reached.

$\begin{matrix} \mathrm{vertex} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 \\ \mathrm{vertex.from} & 7 & 2 & 10 & 6 & 9 & 7 & 2 & 4 & 2 & 7 & 9 & 2 \end{matrix}$

For example, 4.from has a value of 6. We can compute d(2, 4):

\begin{align*} d(2,4) &= 1 + d(2, 4.from) = 1 + d(2, 6)\\ &= 2 + d(2, 6.from) = 2 + d(2, 7)\\ &= 3 + d(2, 7.from) = 3 + d(2, 2)\\ &= 3 \end{align*}
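The same recursion can be written as a short Python function. This is a sketch using the "from data" listed above for the search starting at vertex 2; the names FROM_2 and distance are assumptions made for the example.

```python
# vertex.from values for the breadth-first search starting at vertex 2.
FROM_2 = dict(zip(range(1, 13), [7, 2, 10, 6, 9, 7, 2, 4, 2, 7, 9, 2]))


def distance(v, w, from_values):
    """d(v, v) = 0 and d(v, w) = 1 + d(v, w.from) for w != v."""
    if w == v:
        return 0
    return 1 + distance(v, from_values[w], from_values)


d_2_4 = distance(2, 4, FROM_2)
# Distances from vertex 2 to every vertex 1..12, in order.
row_2 = [distance(2, w, FROM_2) for w in range(1, 13)]
```

The recursion terminates because each from-value lies one depth set closer to the starting vertex.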

Once we know distances between any two vertices, we can determine the eccentricity of each vertex; and the graph's diameter, radius and center. First, we define these terms precisely.

Eccentricity of a Vertex The maximum distance from a vertex to all other vertices, e(v) = max_w d(v, w).

Diameter of a Graph The maximum eccentricity of vertices in a graph, denoted d(G).

Radius of a Graph The minimum eccentricity of vertices in a graph, denoted r(G).

Center of a Graph The set of vertices with minimal eccentricity, C(G) = {v ∈ V | e(v) = r(G)}

Example 9.3.14 Measurements from distance matrices. If we compute all distances between vertices, we can summarize the results in a distance matrix, where the entry in row i, column j is the distance from vertex i to vertex j. For the graph in Example 9.3.12, that matrix is

$\begin{pmatrix} 0 & 2 & 2 & 2 & 3 & 1 & 1 & 3 & 3 & 1 & 2 & 2\\ 2 & 0 & 3 & 3 & 2 & 2 & 1 & 4 & 1 & 2 & 2 & 1\\ 2 & 3 & 0 & 2 & 5 & 3 & 2 & 3 & 4 & 1 & 4 & 3\\ 2 & 3 & 2 & 0 & 3 & 1 & 2 & 1 & 3 & 1 & 2 & 3\\ 3 & 2 & 5 & 3 & 0 & 2 & 3 & 4 & 1 & 4 & 1 & 3\\ 1 & 2 & 3 & 1 & 2 & 0 & 1 & 2 & 2 & 2 & 1 & 2\\ 1 & 1 & 2 & 2 & 3 & 1 & 0 & 3 & 2 & 1 & 2 & 1\\ 3 & 4 & 3 & 1 & 4 & 2 & 3 & 0 & 4 & 2 & 3 & 4\\ 3 & 1 & 4 & 3 & 1 & 2 & 2 & 4 & 0 & 3 & 1 & 2\\ 1 & 2 & 1 & 1 & 4 & 2 & 1 & 2 & 3 & 0 & 3 & 2\\ 2 & 2 & 4 & 2 & 1 & 1 & 2 & 3 & 1 & 3 & 0 & 3\\ 2 & 1 & 3 & 3 & 3 & 2 & 1 & 4 & 2 & 2 & 3 & 0 \end{pmatrix}$

If we scan the matrix, we can see that the maximum distance is the distance between vertices 3 and 5, which is 5 and is the diameter of the graph. If we focus on individual rows and identify the maximum values, which are the eccentricities, their minimum is 3, which is the graph's radius. This eccentricity value is attained by vertices in the set {1, 4, 6, 7}, which is the center of the graph.
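The scan of the matrix is easy to automate. The sketch below copies the distance matrix of this example into a Python list of rows and computes the four measurements directly from their definitions; the variable names are assumptions chosen for the illustration.

```python
# Distance matrix from Example 9.3.14; row i, column j holds d(i, j).
D = [
    [0, 2, 2, 2, 3, 1, 1, 3, 3, 1, 2, 2],
    [2, 0, 3, 3, 2, 2, 1, 4, 1, 2, 2, 1],
    [2, 3, 0, 2, 5, 3, 2, 3, 4, 1, 4, 3],
    [2, 3, 2, 0, 3, 1, 2, 1, 3, 1, 2, 3],
    [3, 2, 5, 3, 0, 2, 3, 4, 1, 4, 1, 3],
    [1, 2, 3, 1, 2, 0, 1, 2, 2, 2, 1, 2],
    [1, 1, 2, 2, 3, 1, 0, 3, 2, 1, 2, 1],
    [3, 4, 3, 1, 4, 2, 3, 0, 4, 2, 3, 4],
    [3, 1, 4, 3, 1, 2, 2, 4, 0, 3, 1, 2],
    [1, 2, 1, 1, 4, 2, 1, 2, 3, 0, 3, 2],
    [2, 2, 4, 2, 1, 1, 2, 3, 1, 3, 0, 3],
    [2, 1, 3, 3, 3, 2, 1, 4, 2, 2, 3, 0],
]

# e(v): the largest entry in row v (vertices are numbered from 1).
eccentricity = {v: max(row) for v, row in enumerate(D, start=1)}
diameter = max(eccentricity.values())   # d(G)
radius = min(eccentricity.values())     # r(G)
center = {v for v, e in eccentricity.items() if e == radius}   # C(G)
```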